taikun.cloud

Taikun OCP Guide

Table of Contents

Hardware Burn-in

Overview

Workflows to onboard new hardware often include a stress-testing step
to provoke early failures and to avoid that these load-triggered issues
only occur when the nodes have already moved to production. These
burn-in tests typically include CPU, memory, disk, and
network. With the Xena release, Ironic supports such tests as part of
the cleaning framework.

The burn-in steps rely on standard tools such as stress-ng
for CPU and memory, or fio for disk and
network. The burn-in cleaning steps are part of the generic hardware
manager in the Ironic Python Agent (IPA) and therefore the agent ramdisk
does not need to be bundled with a specific IPA hardware manager
<admin/hardware_managers.html>
to have them available.

Each burn-in step accepts (or in the case of network: needs) some
basic configuration options, mostly to limit the duration of the test
and to specify the amount of resources to be used. The options are set
on a node’s driver-info and prefixed with
agent_burnin_. The options available for the individual
tests will be outlined below.

CPU burn-in

The options, following a agent_burnin_
+ stress-ng stressor (cpu) + stress-ng
option schema, are:

  • agent_burnin_cpu_timeout (default: 24 hours)
  • agent_burnin_cpu_cpu (default: 0, meaning all
    CPUs)

to limit the overall runtime and to pick the number of CPUs to
stress.

For instance, in order to limit the time of the CPU burn-in to 10
minutes do:

baremetal node set --driver-info agent_burnin_cpu_timeout=600 \
    $NODE_NAME_OR_UUID

Then launch the test with:

baremetal node clean --clean-steps '[{"step": "burnin_cpu", \
    "interface": "deploy"}]' $NODE_NAME_OR_UUID

Memory burn-in

The options, following a agent_burnin_
+ stress-ng stressor (vm) + stress-ng
option schema, are:

  • agent_burnin_vm_timeout (default: 24 hours)
  • agent_burnin_vm_vm-bytes (default: 98%)

to limit the overall runtime and to set the fraction of RAM to
stress.

For instance, in order to limit the time of the memory burn-in to 1
hour and the amount of RAM to be used to 75% run:

baremetal node set --driver-info agent_burnin_vm_timeout=3600 \
    $NODE_NAME_OR_UUID
baremetal node set --driver-info agent_burnin_vm_vm-bytes=75% \
    $NODE_NAME_OR_UUID

Then launch the test with:

baremetal node clean --clean-steps '[{"step": "burnin_memory", \
    "interface": "deploy"}]' $NODE_NAME_OR_UUID

Disk burn-in

The options, following a agent_burnin_
+ fio stressor (fio_disk) + fio option
schema, are:

  • agent_burnin_fio_disk_runtime (default: 0, meaning no time
    limit)
  • agent_burnin_fio_disk_loops (default: 4)

to set the time limit and the number of iterations when going over
the disks.

For instance, in order to limit the number of loops to 2 set:

baremetal node set --driver-info agent_burnin_fio_disk_loops=2 \
    $NODE_NAME_OR_UUID

Then launch the test with:

baremetal node clean --clean-steps '[{"step": "burnin_disk", \
    "interface": "deploy"}]' $NODE_NAME_OR_UUID

Network burn-in

Burning in the network needs a little more config, since we need a
pair of nodes to perform the test. This pairing can be done either in a
static way, i.e. pairs are defined upfront, or dynamically via a
distributed coordination backend which orchestrates the pair matching.
While the static approach is more predictable in terms of which nodes
test each other, the dynamic approach avoids nodes being blocked in case
there are issues with servers and simply pairs all available nodes.

Static network burn-in
configuration

To define pairs of nodes statically, each node can be assigned a
agent_burnin_fio_network_config JSON which requires a
role field (values: reader,
writer) and a partner field (value is the
hostname of the other node to test), like:

baremetal node set --driver-info agent_burnin_fio_network_config= \
    '{"role": "writer", "partner": "$HOST2"}' $NODE_NAME_OR_UUID1
baremetal node set --driver-info agent_burnin_fio_network_config= \
    '{"role": "reader", "partner": "$HOST1"}' $NODE_NAME_OR_UUID2

Dynamic network burn-in
configuration

In order to use dynamic pair matching, a coordination backend is used
via tooz. The
corresponding backend URL then needs to be added to the node, e.g. for a
Zookeeper backend it would look similar to:

baremetal node set --driver-info \
    agent_burnin_fio_network_pairing_backend_url= \
    'zookeeper://zk1.xyz.com:2181,zk2.xyz.com:2181,zk3.xyz.com:2181' \
    $NODE_NAME_OR_UUID1
baremetal node set --driver-info \
    agent_burnin_fio_network_pairing_backend_url= \
    'zookeeper://zk1.xyz.com:2181,zk2.xyz.com:2181,zk3.xyz.com:2181' \
    $NODE_NAME_OR_UUID2
...
baremetal node set --driver-info \
    agent_burnin_fio_network_pairing_backend_url= \
    'zookeeper://zk1.xyz.com:2181,zk2.xyz.com:2181,zk3.xyz.com:2181' \
    $NODE_NAME_OR_UUIDN

Different deliveries or network ports can be separated by creating
different rooms on the backend with:

baremetal node set --driver-info \
agent_burnin_fio_network_pairing_group_name=$DELIVERY $NODE_NAME_OR_UUID

This allows to control which nodes (or interfaces) connect with which
other nodes (or interfaces).

Launching network burn-in

In addition and similar to the other tests, there is a runtime option
to be set (only on the writer):

baremetal node set --driver-info agent_burnin_fio_network_runtime=600 \
    $NODE_NAME_OR_UUID

The actual network burn-in can then be launched with:

baremetal node clean --clean-steps '[{"step": "burnin_network",\
    "interface": "deploy"}]' $NODE_NAME_OR_UUID1
baremetal node clean --clean-steps '[{"step": "burnin_network",\
    "interface": "deploy"}]' $NODE_NAME_OR_UUID2

Both nodes will wait for the other node to show up and block while
waiting. If the partner does not show up, the cleaning timeout will step
in.

Logging

Since most of the burn-in steps are also providing information about
the performance of the stressed components, keeping this information for
verification or acceptance purposes may be desirable. By default, the
output of the burn-in tools goes to the journal of the Ironic Python
Agent and is therefore sent back as an archive to the conductor. In
order to consume the output of the burn-in steps more easily, or even in
real-time, the nodes can be configured to store the output of the
individual steps to files in the ramdisk (from where they can be picked
up by a logging pipeline).

The configuration of the outpout file is done via one of
agent_burnin_cpu_outputfile,
agent_burnin_vm_outputfile,
agent_burnin_fio_disk_outputfile, and
agent_burnin_fio_network_outputfile parameters which need
to be added to a node like:

baremetal node set --driver-info agent_burnin_cpu_outputfile=\
    '/var/log/burnin.cpu' $NODE_NAME_OR_UUID

Additional Information

All tests can be aborted at any moment with

baremetal node abort $NODE_NAME_OR_UUID

One can also launch multiple tests which will be run in sequence,
e.g.:

baremetal node clean --clean-steps '[{"step": "burnin_cpu",\
   "interface": "deploy"}, {"step": "burnin_memory",\
   "interface": "deploy"}]' $NODE_NAME_OR_UUID

If desired, configuring fast-track may be helpful here
as it allows to keep the node up between consecutive calls of
baremetal node clean.