d9913370de
One of the biggest frustrations for larger operators arises when they trigger a massive number of concurrent deployments. As one would expect, the memory utilization of the conductor goes up. However, even with the default number of worker threads, if we are asked to convert 80 images at the same time, or to perform the write-out to the remote node at the same time, we will consume a large amount of system RAM. More specifically, qemu-img will consume a large amount of memory. If the amount of available memory drops too low, the system can trigger the OOM killer, which will kill processes using RAM. Ideally, we do not want this to happen to our conductor process, much less to the work being performed, so we need to add some guard rails to keep us from entering situations where we may compromise the conductor by taking on too much work. Adds a guard in the conductor to prevent multiple parallel deployment operations from running the conductor out of memory. With the defaults, the conductor will attempt to throttle back automatically and hold worker threads, which also slows the amount of work proceeding through the conductor, since we are in a memory condition where we should be careful about the work. The defaults allow this to occur with a wait of 15 seconds between re-checks of available RAM, for a total of six retries. The default minimum is 1024 (MB), as this is the amount of memory qemu-img allocates when trying to write images. This quite literally means no additional qemu-img process can spawn until the low-memory situation has resolved itself. Change-Id: I69db0169c564c5b22abd0cb1b890f409c13b0ac2 |
||
---|---|---|
api-ref | ||
devstack | ||
doc | ||
etc | ||
ironic | ||
playbooks/ci-workarounds | ||
releasenotes | ||
tools | ||
zuul.d | ||
.gitignore | ||
.gitreview | ||
.mailmap | ||
.stestr.conf | ||
bindep.txt | ||
CONTRIBUTING.rst | ||
driver-requirements.txt | ||
LICENSE | ||
README.rst | ||
reno.yaml | ||
requirements.txt | ||
setup.cfg | ||
setup.py | ||
test-requirements.txt | ||
tox.ini |
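
The memory guard described in the commit message at the top of this listing can be illustrated with a minimal sketch. This is not ironic's actual implementation: the function name `wait_for_memory`, the exception `InsufficientMemory`, and the injectable `get_available_mb` and `sleep` callables are all hypothetical, chosen only for illustration. The sketch also assumes that "six retries" means one initial check followed by up to six re-checks; only the 1024 MB, 15 second, and six retry values come from the commit message.

```python
import time
from typing import Callable

# Values taken from the commit message; the constant names are hypothetical.
MIN_REQUIRED_MB = 1024   # memory qemu-img is said to allocate when writing images
WAIT_SECONDS = 15        # pause between re-checks of available RAM
MAX_RETRIES = 6          # number of re-checks before giving up


class InsufficientMemory(Exception):
    """Raised when memory never recovers within the retry budget."""


def wait_for_memory(
    get_available_mb: Callable[[], int],
    min_required_mb: int = MIN_REQUIRED_MB,
    wait_seconds: float = WAIT_SECONDS,
    max_retries: int = MAX_RETRIES,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Hold the calling worker until enough memory is available.

    Returns the number of retries that were needed (0 if the first
    check passed). Raises InsufficientMemory if the initial check and
    all retries find less than min_required_mb available.
    """
    for attempt in range(max_retries + 1):
        if get_available_mb() >= min_required_mb:
            return attempt
        if attempt < max_retries:
            # Holding the worker here is what throttles the conductor.
            sleep(wait_seconds)
    raise InsufficientMemory(
        f"less than {min_required_mb} MB available after {max_retries} retries"
    )
```

A caller would invoke `wait_for_memory` before starting a memory-hungry step such as an image conversion, passing a callable that reports available memory in MB. In a real deployment that reading might come from the operating system (for example via a library such as psutil), which is left as an assumption here.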
Ironic
Overview
Ironic consists of an API and plug-ins for managing and provisioning physical machines in a security-aware and fault-tolerant manner. It can be used with nova as a hypervisor driver, or as a standalone service using bifrost. By default, it will use PXE and IPMI to interact with bare metal machines. Ironic also supports vendor-specific plug-ins which may implement additional functionality.
Ironic is distributed under the terms of the Apache License, Version 2.0. The full terms and conditions of this license are detailed in the LICENSE file.
Project resources
- Documentation: https://docs.openstack.org/ironic/latest
- Source: https://opendev.org/openstack/ironic
- Bugs: https://storyboard.openstack.org/#!/project/943
- Wiki: https://wiki.openstack.org/wiki/Ironic
- APIs: https://docs.openstack.org/api-ref/baremetal/index.html
- Release Notes: https://docs.openstack.org/releasenotes/ironic/
- Design Specifications: https://specs.openstack.org/openstack/ironic-specs/
Project status, bugs, and requests for feature enhancements (RFEs) are tracked in StoryBoard: https://storyboard.openstack.org/#!/project/943
For information on how to contribute to ironic, see https://docs.openstack.org/ironic/latest/contributor