openstack-manuals/doc/admin-guide-cloud/source/baremetal.rst
venkatamahesh fe67b3842c [admin-guide] Fix rst mark-ups for baremetal service
Change-Id: I1d3d2b9b79ea06968e95cd59d98100117af8761a
2015-12-15 08:23:07 +00:00

Bare Metal

The Bare Metal service provides physical hardware management features.

Introduction

The Bare Metal service provisions physical hardware rather than virtual machines. It provides several reference drivers that leverage common technologies, such as PXE and IPMI, to cover a wide range of hardware. The pluggable driver architecture also allows vendor-specific drivers to be added for improved performance or for functionality that the reference drivers do not provide. The Bare Metal service makes physical servers as easy to provision as virtual machines in a cloud, which in turn opens up new avenues for enterprises and service providers.

System architecture

The Bare Metal service is composed of the following components:

  1. An admin-only RESTful API service, by which privileged users, such as cloud operators and other services within the cloud control plane, may interact with the managed bare-metal servers.
  2. A conductor service, which conducts all activity related to bare-metal deployments. Functionality is exposed via the API service. The Bare Metal service conductor and API service communicate via RPC.
  3. Various drivers that support heterogeneous hardware, which enable features specific to unique hardware platforms and leverage divergent capabilities via a common API.
  4. A message queue, such as RabbitMQ, which is a central hub for passing messages. It should use the same implementation as that of the Compute service.
  5. A database for storing information about the resources. Among other things, this includes the state of the conductors, nodes (physical servers), and drivers.

When a user requests to boot an instance, the request is passed to the Compute service via the Compute service API and scheduler. The Compute service hands the request over to the Bare Metal service, where it passes from the Bare Metal service API to the conductor, which invokes a driver to provision a physical server for the user.
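
The request path through the API service, the conductor, and a driver can be pictured with a small Python toy model. This is only an illustrative sketch and not Bare Metal service code: the class names `ApiService`, `Conductor`, and `FakeDriver` and their methods are hypothetical, and the RPC hop between the API service and the conductor is reduced to a direct method call.

```python
class FakeDriver:
    """Hypothetical stand-in for a hardware driver (for example, a PXE/IPMI one)."""

    def deploy(self, node):
        # A real driver would power-cycle the machine and write an image to it.
        return "deployed " + node


class Conductor:
    """Owns the drivers and carries out the work that the API service requests."""

    def __init__(self, drivers):
        self.drivers = drivers  # maps a driver name to a driver object

    def provision(self, node, driver_name):
        # Look up the driver registered under this name and delegate to it.
        return self.drivers[driver_name].deploy(node)


class ApiService:
    """Admin-only entry point; the real service reaches the conductor over RPC."""

    def __init__(self, conductor):
        self.conductor = conductor

    def request_provision(self, node, driver_name):
        # A direct call stands in for the RPC hop in this sketch.
        return self.conductor.provision(node, driver_name)


api = ApiService(Conductor({"fake": FakeDriver()}))
result = api.request_provision("node-1", "fake")
print(result)
```

The point of the layering is that the caller only ever talks to the API service; which driver does the work is a lookup inside the conductor, so supporting new hardware means registering another driver rather than changing the request path.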

Bare Metal deployment

  1. PXE deploy process
  2. Agent deploy process

Use Bare Metal

  1. Install the Bare Metal service.
  2. Set up the Bare Metal driver in the compute node's nova.conf file.
  3. Set up the TFTP folder and prepare the PXE boot loader file.
  4. Prepare the bare metal flavor.
  5. Register the nodes with the correct drivers.
  6. Configure the driver information.
  7. Register the port information.
  8. Use nova boot to kick off the bare metal provisioning.
  9. Check the nodes' provision state and power state.
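
Steps 5 through 9 above map onto a handful of `ironic` and `nova` client calls. The following Python sketch only builds the command lines as argument lists and does not run anything. The helper name `enrollment_commands` is hypothetical, and the driver name, IPMI settings, MAC address, flavor, image, and instance name are placeholder values assumed for illustration; check them, and the exact client options, against the client version you have installed.

```python
def enrollment_commands(node_uuid, driver, driver_info, mac, flavor, image, name):
    """Return the client calls for steps 5 to 9 as argv lists, without running them.

    node_uuid is the UUID that `ironic node-create` reports for the new node.
    """
    # Turn {"ipmi_address": "..."} into "driver_info/ipmi_address=..." arguments.
    info_args = [
        "driver_info/{}={}".format(key, value)
        for key, value in sorted(driver_info.items())
    ]
    return [
        # Step 5: register the node with its driver.
        ["ironic", "node-create", "-d", driver],
        # Step 6: configure the driver information on that node.
        ["ironic", "node-update", node_uuid, "add"] + info_args,
        # Step 7: register the port (the MAC address of the node's NIC).
        ["ironic", "port-create", "-n", node_uuid, "-a", mac],
        # Step 8: kick off provisioning through the Compute service.
        ["nova", "boot", "--flavor", flavor, "--image", image, name],
        # Step 9: inspect the node, including provision and power state.
        ["ironic", "node-show", node_uuid],
    ]


# Placeholder values, assumed for illustration only.
commands = enrollment_commands(
    node_uuid="NODE_UUID",
    driver="pxe_ipmitool",
    driver_info={"ipmi_address": "192.0.2.10", "ipmi_username": "admin"},
    mac="52:54:00:12:34:56",
    flavor="baremetal",
    image="my-image",
    name="instance-1",
)
for argv in commands:
    print(" ".join(argv))
```

Keeping each command as a list makes it straightforward to hand to `subprocess.run` later without shell quoting problems, once the values have been reviewed.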

Troubleshooting

No valid host found error

Sometimes /var/log/nova/nova-conductor.log contains the following error:

NoValidHost: No valid host was found. There are not enough hosts available.

The message No valid host was found means that the Compute service scheduler could not find a bare metal node suitable for booting the new instance.

This means there is a mismatch between the resources that the Compute service expects to find and the resources that the Bare Metal service advertises to the Compute service.

If you get this message, check the following:

  1. Introspection must have succeeded, or you must have entered the required bare-metal node properties manually. For each node listed by ironic node-list, use:

    $ ironic node-show <IRONIC-NODE-UUID>

    and make sure that the properties JSON field has valid values for the keys cpus, cpu_arch, memory_mb, and local_gb.

  2. Make sure that the Compute service flavor you are using does not exceed the bare-metal node properties above for the required number of nodes. Use:

    $ nova flavor-show <FLAVOR NAME>

  3. Make sure that enough nodes are in the available state according to ironic node-list. A node in the manageable state has usually failed introspection.

  4. Make sure that the nodes you are going to deploy to are not in maintenance mode. Use ironic node-list to check. A node that goes into maintenance mode automatically usually has incorrect credentials. Check them and then remove maintenance mode:

    $ ironic node-set-maintenance <IRONIC-NODE-UUID> off

  5. It takes some time for node information to propagate from the Bare Metal service to the Compute service after introspection. Our tooling usually accounts for this, but if you performed some steps manually, there may be a period of time when nodes are not yet available to the Compute service. Check that nova hypervisor-stats correctly shows the total amount of resources in your system.
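
The five checks above can be summarized in a short Python sketch. It works on plain dictionaries that you would fill in by hand from the ironic node-show, nova flavor-show, and ironic node-list output; it does not call any client library. The helper names are hypothetical, the sample node and flavor values are made up for illustration, and the flavor keys `vcpus`, `ram`, and `disk` are assumed to correspond to the node properties `cpus`, `memory_mb`, and `local_gb`.

```python
REQUIRED_KEYS = ("cpus", "cpu_arch", "memory_mb", "local_gb")


def missing_properties(node):
    """Check 1: list the required property keys that are absent or empty."""
    props = node.get("properties", {})
    return [key for key in REQUIRED_KEYS if not props.get(key)]


def flavor_fits(flavor, node):
    """Check 2: the flavor must not exceed the node properties."""
    props = node["properties"]
    return (
        flavor["vcpus"] <= int(props["cpus"])
        and flavor["ram"] <= int(props["memory_mb"])
        and flavor["disk"] <= int(props["local_gb"])
    )


def is_deployable(node):
    """Checks 3 and 4: available state and not in maintenance mode."""
    return node["provision_state"] == "available" and not node["maintenance"]


def expected_totals(nodes):
    """Check 5: totals to compare against nova hypervisor-stats for these nodes."""
    return {
        "vcpus": sum(int(n["properties"]["cpus"]) for n in nodes),
        "memory_mb": sum(int(n["properties"]["memory_mb"]) for n in nodes),
        "local_gb": sum(int(n["properties"]["local_gb"]) for n in nodes),
    }


# Made-up sample data: node-b is missing local_gb and is still manageable.
nodes = [
    {"uuid": "node-a", "provision_state": "available", "maintenance": False,
     "properties": {"cpus": 8, "cpu_arch": "x86_64",
                    "memory_mb": 16384, "local_gb": 200}},
    {"uuid": "node-b", "provision_state": "manageable", "maintenance": False,
     "properties": {"cpus": 8, "cpu_arch": "x86_64", "memory_mb": 16384}},
]
flavor = {"vcpus": 4, "ram": 8192, "disk": 100}

# Only nodes with complete properties are tested against the flavor.
candidates = [
    n["uuid"] for n in nodes
    if not missing_properties(n) and is_deployable(n) and flavor_fits(flavor, n)
]
print(candidates)
print(missing_properties(nodes[1]))
print(expected_totals(nodes[:1]))
```

If the list of candidates is shorter than the number of instances you are trying to boot, the scheduler is likely to report the No valid host was found error. The totals are only a reference for comparison; which nodes the Compute service actually counts depends on your deployment.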