While investigating a very curious report, I discovered that
if the power to a node was somehow *already* turned off, say
through an incorrect BMC *or* human action, and Ironic were
to pick it up (as it does by default, because it checks before
applying the power state), then it would not wipe the token
information, preventing the agent from connecting on the next
action/attempt/operation.
We now remove the token on all calls to the conductor
utilities ``node_power_action`` method when appropriate, even
if no other work is required.
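A minimal sketch of the intended behavior, not the actual Ironic code: the node is modeled as a plain dict, and the function body, the ``agent_secret_token`` key, and the state constant are assumptions made for illustration only.

```python
POWER_OFF = 'power off'  # assumed state name, for illustration only


def node_power_action(node, new_state):
    """Hypothetical, simplified stand-in for the conductor utility.

    Returns True if a power change was applied, False if the node was
    already in the requested state.
    """
    already_in_state = node['power_state'] == new_state
    if new_state == POWER_OFF:
        # Wipe the token even when no power change is needed, so the
        # agent can connect again on the next operation.
        node['driver_internal_info'].pop('agent_secret_token', None)
    if already_in_state:
        return False
    node['power_state'] = new_state
    return True


# A node that was powered off behind Ironic's back still loses its token.
node = {'power_state': POWER_OFF,
        'driver_internal_info': {'agent_secret_token': 'abc'}}
changed = node_power_action(node, POWER_OFF)
```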
Change-Id: Ie89e8be9ad2887467f277772445d4bef79fa5ea1
This change adds the capability for the ironic-conductor
and standalone service process to transmit timer and counter
metrics to the message bus notifier, which may be consumed by
ceilometer, ironic-prometheus-exporter, or another consumer of
metrics event data on the message bus.
This functionality is not presently supported on dedicated API
services such as those running as an ``ironic-api`` application
process, or Ironic WSGI application. This is due to the lack of
an internal trigger mechanism to transmit the data in a metrics
update to the message bus and/or notifier plugin.
This change requires ironic-lib 5.4.0 to collect and ship metrics via
the message bus.
Depends-On: https://review.opendev.org/c/openstack/ironic-lib/+/865311
Change-Id: If6941f970241a22d96e06d88365f76edc4683364
While developing some internal metrics collection capability,
we realized that a lock was needed and that the lock activity
itself would be a bit noisy. Image actions also get lock logging,
which is really noisy but not very helpful for troubleshooting.
So, set it to WARNING instead.
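As a rough illustration of the effect using only the standard library: the logger name below is an assumption, and the actual change may adjust default log levels through configuration rather than a direct call.

```python
import logging

# Assumed name of the logger that emits lock acquire/release messages.
LOCK_LOGGER = 'oslo_concurrency.lockutils'

lock_log = logging.getLogger(LOCK_LOGGER)
# Suppress DEBUG/INFO lock chatter while keeping warnings and errors.
lock_log.setLevel(logging.WARNING)
```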
For the discussion, see:
https://review.opendev.org/c/openstack/ironic-lib/+/865311
Change-Id: I3ab14ee5b5cc063784d26e3c760f1422c692060d
Follow-up to I6b830e5cc30f1fa1f1900e7c45e6f246fa1ec51c
The original change introduced some errors, such as mismatched
arguments for exceptions.
Story: 2010275
Task: 46204
Change-Id: I550e048ab22a6cd25502b41d1c579819df369249
Follow-up to I74b19f7a42c1326d7ec04e6320176e81639ebfb4
Mention that maintenance mode is needed to orphan Swift
objects during node clean up.
Story: 2010275
Task: 46204
Change-Id: Ie95a5bd333b0dab3e97254dfb4eb532bdbfd2650
The tl;dr is that we changed ``inspecting`` to include an
``inspect wait`` state. Unfortunately we never spotted the logic
inside of the db API. We never spotted it because our testing in
inspection code uses a mocked task manager... and we *really* don't
have intense db testing because we expect the objects and higher
level interactions to validate the lowest db level.
Unfortunately, because of the out of band inspection workflow,
we have to cover both cases in terms of what the starting state
and ending state could be, but we've added tests to
validate this is handled as we expect.
Change-Id: Icccbc6d65531e460c55555e021bf81d362f5fc8b
The dynamically allocated console port for a node is saved
into the database and reused on subsequent console operations.
In certain code paths the port record cannot be trusted and
we should do a re-allocation.
This patch fixes the issue by ignoring the previous allocation
record. The extra cleanup in the takeover is no longer required
and is removed as well.
Change-Id: I1a07ea9b30a2c760af7a6a4e39f3ff227df28fff
Story: 2010489
Task: 47061
Recently we hit an issue where the pid file was missing. The current
logic simply removes the pid file if the corresponding process is not
found, but if the pid file is lost then the console can never be
stopped and, furthermore, restarted, regardless of whether the process
is there or not.
This patch adds FileNotFound to the exception handling to allow
console recovery.
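The idea can be sketched as follows. The function names and return values are hypothetical and do not mirror the real console utilities; the point is only that a missing pid file is handled instead of aborting the stop.

```python
import os
import tempfile


def read_console_pid(pid_file):
    """Return the PID recorded in pid_file, or None if the file is gone."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        # A lost pid file is treated as "no process recorded" rather
        # than as a fatal error, so the console can be recovered.
        return None


def stop_console(pid_file):
    """Hypothetical stop routine; returns True if a record was cleared."""
    pid = read_console_pid(pid_file)
    if pid is None:
        return False
    # A real implementation would signal the process here; this sketch
    # only clears the record.
    try:
        os.remove(pid_file)
    except FileNotFoundError:
        pass
    return True


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, 'console.pid')
    stopped_missing = stop_console(missing)  # does not raise
    with open(missing, 'w') as f:
        f.write('1234\n')
    recorded_pid = read_console_pid(missing)
    stopped_present = stop_console(missing)
```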
Change-Id: I1a0b8347e960c6cff8aca10a22c67b710f7d617e
Follow-up to Ie174904420691be64ce6ca10bca3231f45a5bc58
which enables storage of inventory in Swift, but does not delete
the Swift entry when the node whose inventory is stored is deleted
Story: 2010275
Task: 46204
Change-Id: I74b19f7a42c1326d7ec04e6320176e81639ebfb4
- Basic support and testing for CRUD for node.shard.
- Policy checking for update node.shard.
- New API endpoint: GET /v1/shards
- Policy checking for GET /v1/shards
- Support for querying for nodes in a list of shards
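For illustration, a request URL for listing nodes in several shards might be built like this. The ``shard`` query parameter name, the comma-separated encoding, and the host below are assumptions, not taken from this change.

```python
from urllib.parse import urlencode


def nodes_in_shards_url(base_url, shards):
    """Build a node-list URL filtered by a list of shards.

    Assumes the filter is passed as a comma-separated ``shard``
    query parameter.
    """
    query = urlencode({'shard': ','.join(shards)})
    return f'{base_url}/v1/nodes?{query}'


# Hypothetical endpoint and shard names.
url = nodes_in_shards_url('http://ironic.example.com:6385',
                          ['rack-a', 'rack-b'])
```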
Story: 2010378
Task: 46624
Change-Id: I385594339028c20cfc83fdcc4cbbec107efdacff
This request parameter will allow an operator to ask the question
"Do I need to assign shards to any of my nodes?".
Change-Id: I26b745e5ef2b320a8d8a0667ac61c080fcdcd576
The format string is expecting a dictionary with keys matching
those used in the format string. Any unused parameters will
cause a "not all arguments converted during string formatting"
exception.
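The failure mode can be reproduced with plain ``%`` formatting. The messages and values below are made up for illustration and are not the actual logging statements touched by this change.

```python
def render(template, params):
    """Apply %-formatting; return (text, error message)."""
    try:
        return template % params, None
    except TypeError as exc:
        return None, str(exc)


# Named placeholders with a matching dictionary format cleanly.
good = render('Node %(node)s failed: %(error)s',
              {'node': 'node-1', 'error': 'timeout'})

# One positional placeholder but two values: the extra value is unused.
bad = render('Node %s failed', ('node-1', 'timeout'))
```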
The quote style is also changed from double to single quotes to
match the other logging statements in the code.
Change-Id: Ic9dea4f51d82866be8ac16242a79237c789b9745
Ports cannot be filtered by node, node_uuid, or portgroup at the same
time as other potential filters. Explicitly document this.
Change-Id: Ia875a6543eb8871ce70028c055de2f1832c3ecdb
Per discussion in IRC, the retirement documentation sets forth
an understanding that sensitive data will be removed from the
baremetal node. However, this is performed through cleaning, which
inherently sets forth a requirement for automated cleaning.
Explicitly note this, and provide options should an operator wish
to utilize the feature.
Change-Id: I6755433b97cacd6ebf6a8f7eb5b404697e0a4349
In its current place, reno config changes will not cause the
build-openstack-releasenotes job to run, which means changes can land to
that config without being tested. Yikes!
Also fixes an error in the regexp which was preventing this from actually
fixing the build-openstack-releasenotes job.
Change-Id: I4d46ba06ada1afb5fd1c63db5850a1983e502a6c
The anaconda job is failing as we're getting a redirect issued back
upon attempting to validate URLs. The servers are now directing us
to use HTTPS instead.
Change-Id: Iac8e6e58653ac616250f4ce3ab3ae7f5164e5b03