2290 Commits

Author SHA1 Message Date
Jay Faulkner
1b2a35afdf Add release note for node sharding
Release note covers changes in the previous 4 commits in this chain.

Change-Id: I5388e82e958acd930295215c9f9427080650866d
2023-02-17 09:38:20 -08:00
Zuul
c9c595f235 Merge "Add service role RBAC policy support" 2023-01-31 21:54:19 +00:00
Zuul
e73c3c9c22 Merge "Fix grub config path default" 2023-01-27 22:09:15 +00:00
Zuul
571d0223ba Merge "[iRMC] Handle IPMI incompatibility in iRMC S6 2.x" 2023-01-18 16:46:10 +00:00
Zuul
8cb5ba9ff8 Merge "[iRMC] identify BMC firmware version" 2023-01-18 16:32:31 +00:00
Julia Kreger
bad3790e8a Add service role RBAC policy support
This change adds support for the ``service`` role, which is intended
largely for service-to-service communication, such as if one wanted to
utilize a "nova" project, and have an ironic service user within it,
and then configure the ``nova-compute`` service utilizing those credentials.

Or vice versa, an "ironic" project, with a nova user.

In this case, access is exceptionally similar to the rights afforded to
a "project scoped manager" or an "owner-admin".

Change-Id: Ifd098a4567d60c90550afe5236ae2af143b6bac2
2023-01-18 07:59:35 -08:00
Zuul
a48af6b5f1 Merge "Fix selinux context of published image hardlink" 2023-01-17 16:57:59 +00:00
Vanou Ishii
d23f72ee50 [iRMC] Handle IPMI incompatibility in iRMC S6 2.x
Since iRMC S6 2.00, iRMC firmware disables IPMI over LAN
with default iRMC firmware configuration.

To deal with this firmware incompatibility, this commit
modifies the driver's methods which use IPMI to first try
IPMI and, if IPMI fails, try to use the Redfish API.
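
The fallback described above can be sketched as follows. This is a minimal illustration, not the iRMC driver's code: ``IPMIFailure``, ``get_with_fallback`` and both stand-in calls are hypothetical names.

```python
class IPMIFailure(Exception):
    """Hypothetical error raised when an IPMI call fails."""


def get_with_fallback(ipmi_call, redfish_call):
    """Try the IPMI path first; use Redfish only if IPMI fails."""
    try:
        return ipmi_call()
    except IPMIFailure:
        return redfish_call()


def ipmi_disabled():
    # Simulates firmware where IPMI over LAN is disabled by default.
    raise IPMIFailure("IPMI over LAN is disabled")


def redfish_power_state():
    # Stand-in for a query against the Redfish API.
    return "power on"


state = get_with_fallback(ipmi_disabled, redfish_power_state)
print(state)
```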

Story: 2010396
Task: 46746
Change-Id: I1730279d2225f1248ecf7fe403a5e503b6c3ff87
2023-01-17 09:36:27 +09:00
Vanou Ishii
eae33a0acb [iRMC] identify BMC firmware version
Since iRMC S6 2.00, iRMC firmware doesn't support HTTP
connection to REST API.

To deal with this firmware incompatibility, this commit
adds a verify step to check the connection to the REST API and adds
a node vendor passthru to fetch and cache the version of iRMC firmware.

Story: 2010396
Task: 46745
Change-Id: Ib04b66b0c7b1ef1c4175841689c16a7fbc0b1e54
2023-01-16 18:38:57 +09:00
Jakub Jelinek
2e80ea9099 API for node inventory
Add API to access node inventory

Story: 2010275
Task: 46204
Change-Id: If50f665da5fbb16f7646f3d6195a6e14e7325b0a
2023-01-12 15:09:18 +00:00
Riccardo Pittau
c05c09fd3a Fix selinux context of published image hardlink
If the published image is a hardlink, the source selinux context is
preserved. This could cause an access-denied error when retrieving the
image using its URL.

Change-Id: I550dac9d055ec30ec11530f18a675cf9e16063b5
2023-01-11 16:00:01 +01:00
Julia Kreger
2f4a156d29 Fix grub config path default
Grub2 looks for files in different paths depending on the boot mode
of the binary. Previously the grub_config_path setting was defaulted
to the path used exclusively for BIOS booting, which meant anyone
using it had to override the setting. Now, we've set the default
to the default for UEFI booting, and the world should be a happier,
and less override filled place.

Change-Id: Id6723e92efb62f8ca03099f15c90580cec887ddd
2023-01-11 06:59:45 -08:00
Zuul
4f6a456334 Merge "Fix "'NoneType' object is not iterable" in RAID" 2023-01-05 11:44:09 +00:00
Aija Jauntēva
17c9e58c9e Fix "'NoneType' object is not iterable" in RAID
Do not update `raid_configs` if the operation is synchronous.
First, it is not needed; second, it will not be cleaned
up by async periodics. As a result, the data remains
on the node and causes errors the next time the node is in
the cleaning state.

Story: 2010476
Task: 47037

Change-Id: Ib1850c58d1670c3555ac9b02eb7958a1b440a339
2022-12-16 06:32:37 -05:00
Zuul
f96b258709 Merge "Catch any exception for Cleaning" 2022-12-13 21:37:03 +00:00
Zuul
cccc4483b0 Merge "Fixes anaconda deploy for PXE boot" 2022-12-12 16:58:55 +00:00
Julia Kreger
aca8ebc064 Catch any exception for Cleaning
No exception is used to communicate back; the exceptions
are used to catch failures, and if we don't catch other
possible exceptions when leaving cleaning states, we may not clean
up state properly.

So instead of specific exceptions, we just catch any exception,
as is done earlier in the same method.
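
A minimal sketch of the pattern, assuming a node represented as a plain dict and a hypothetical ``finish_cleaning`` helper (the state strings are illustrative only, not taken from the change itself):

```python
def finish_cleaning(node, tear_down):
    """Leave the cleaning state, recording a failure on any error."""
    try:
        tear_down(node)
    except Exception as exc:  # deliberately broad, see the message above
        node["last_error"] = f"Failed to tear down cleaning: {exc}"
        node["provision_state"] = "clean failed"
        return False
    node["provision_state"] = "available"
    return True


def broken_tear_down(node):
    # An error type that a narrow ``except`` clause would have missed.
    raise RuntimeError("unexpected driver error")


node = {"provision_state": "cleaning", "last_error": None}
ok = finish_cleaning(node, broken_tear_down)
print(ok, node["provision_state"])
```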

Inspired by https://review.opendev.org/c/openstack/ironic/+/866856
and investigation through the code base as a result of inability
to clean the node.

Change-Id: I2a6bca3550819b98adbaffe315f77427b8a43d62
2022-12-12 07:22:21 -08:00
Zuul
c04344ca60 Merge "Align iRMC driver with Ironic's default boot_mode" 2022-11-25 16:32:42 +00:00
Vanou Ishii
071cf9b2dd Align iRMC driver with Ironic's default boot_mode
This commit modifies the iRMC driver to use ironic.conf [deploy]
default_boot_mode as the default value of boot_mode.
Before this commit, the iRMC driver assumed Legacy BIOS as the default
boot_mode, and the value of default_boot_mode had no effect
on the iRMC driver's behavior.
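
As a rough sketch of the new lookup order, assuming a hypothetical ``resolve_boot_mode`` helper and a constant standing in for the configured ``[deploy]default_boot_mode`` value:

```python
# Assumption: stands in for the operator's [deploy]default_boot_mode setting.
CONFIGURED_DEFAULT_BOOT_MODE = "uefi"


def resolve_boot_mode(node_boot_mode, default=CONFIGURED_DEFAULT_BOOT_MODE):
    """Prefer the node's explicit boot mode, else the configured default."""
    return node_boot_mode or default


print(resolve_boot_mode(None))
print(resolve_boot_mode("bios"))
```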

Story: 2010381
Task: 46643
Change-Id: Ic5a235785a1a2bb37fef38bd3a86f40125acb3d9
2022-11-06 21:57:11 -05:00
Vanou Ishii
2200f931de Change boot_interface order of iRMC driver
This change aligns the boot interface order of the irmc
hardware type to match the other hardware type interface
order lists.

This change is a result of an operator reporting inconsistent
behavior of ironic when they are adding nodes using the irmc
hardware type, where they would default to the "irmc-pxe" boot
interface, whereas the other interfaces would end up defaulting
to "ipxe".

Change-Id: I017c6560f9de884eefb2c1925321380cc1c721e2
2022-11-06 21:55:58 -05:00
Zuul
5c01f7f7b4 Merge "Fix the anaconda deploy for the ISO mounted" 2022-11-03 16:09:44 +00:00
Zuul
c06cb281f9 Merge "Add support auth protocols for iRMC" 2022-10-19 23:10:56 +00:00
Julia Kreger
49e085583d Phase 1 - SQLAlchemy 2.0 Compatability
One of the major changes in SQLAlchemy 2.0 is the removal
of autocommit support. It turns out Ironic was using this quite
aggressively without even really being aware of it.

* Moved the declarative_base to ORM, as noted in the SQLAlchemy 2.0
  changes[0].

* Console testing caused us to become aware of issues around locking
  where session synchronization, when autocommit was enabled, was
  defaulted to False. The result of this is that two sessions
  could see different results, possibly on different
  threads, and one could still attempt to lock based upon prior
  information. Inherently, while this basically worked, it was
  also sort of broken behavior. This resulted in locking being
  rewritten to use the style mandated in SQLAlchemy 2.0 migration
  documentation. This ultimately is due to locking, which is *heavily*
  relied upon in Ironic, and in unit testing with sqlite, there are
  no transactions, which means we can get some data inconsistency
  in unit testing as well if we're reliant upon the database to
  precisely and exactly return what we committed.[1]

* Begins changing the query.one()/query.all() style to use explicit
  select statements as part of the new style mandated for migration
  to SQLAlchemy 2.0.

* Instead of using field label strings for joined queries, use the
  object format, which makes much more sense now, and is part of
  the items required for eventual migration to 2.0.

* DB queries involving Traits are now loaded using SelectInLoad
  as opposed to Joins. The now deprecated ORM queries were quietly
  and silently de-duplicating rows and providing consistent sets
  from the resulting joined table responses, however putting much
  higher CPU load on the processing of results on the client.
  Prior performance testing has informed us this should be a minimal
  overhead impact, however these queries should no longer be in
  transactions with the Database Servers which should offset the
  shift in load pattern. The reason we cannot continue to deduplicate
  locally in our code is because we carry Dict data sets which cannot
  be hashed for deduplication. Most projects have handled this by
  treating them as Text and then converting, but without a massive
  rewrite, this seems to be the viable middle ground.

* Adds an explicit mapping for traits and tags on the Node object
  to point directly to the NodeTrait and NodeTag classes. This
  supersedes the prior usage of a backref to make the association.

* Splits SQLAlchemy class model Node into Node and NodeBase, which
  allows for high performance queries to skip querying for ``tags``
  and ``traits``. Otherwise the aforementioned lookups would
  always execute, as they are now properties as well on the Node
  class. This is a more common SQLAlchemy model pattern, but Ironic's
  model has been a bit more rigid to date.

* Adds a ``start_consoles`` and ``start_allocations`` option to the
  conductor ``init_host`` method. This allows unit tests to be
  executed and launched with the service context, while *not* also
  creating race conditions which resulted in failed tests.

* The db API ``_paginate_query`` wrapper now contains additional
  logic to handle traditional ORM query responses and the newer style
  of unified query responses. This is due to differences in queries and
  handling, which were also part of the driver for the creation of
  ``NodeBase``, as SQLAlchemy will only create an object if a base
  object is referenced. Also, by default, everything returned is a
  tuple in 1.4 with the unified interface.

* Also modified one unit test which counted time.sleep calls, which is
  a known pattern which can create failures which are ultimately noise.
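
The deduplication constraint mentioned in the traits item above can be illustrated with plain Python. This sketch uses made-up rows and a hypothetical ``dedupe_as_text`` helper; it shows the "treat as text" approach the message attributes to other projects, not what this change does.

```python
import json

# Made-up joined rows: the same node appears twice.
rows = [
    {"uuid": "node-1", "driver_info": {"address": "10.0.0.1"}},
    {"uuid": "node-1", "driver_info": {"address": "10.0.0.1"}},
    {"uuid": "node-2", "driver_info": {"address": "10.0.0.2"}},
]


def is_hashable(value):
    """Return True if the value can be placed in a set."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


def dedupe_as_text(items):
    """Deduplicate dict rows by comparing their serialized text form."""
    seen = set()
    unique = []
    for item in items:
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


dict_is_hashable = is_hashable(rows[0]["driver_info"])
unique_rows = dedupe_as_text(rows)
print(dict_is_hashable, len(unique_rows))
```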

Ultimately, I have labelled the remaining places where SQLAlchemy
raises warnings for deprecated or removed functionality,
which need to be addressed.

[0] https://docs.sqlalchemy.org/en/14/changelog/migration_20.html
[1] https://docs.sqlalchemy.org/en/14/dialects/sqlite.html#transaction-isolation-level-autocommit

Change-Id: Ie0f4b8a814eaef1e852088d12d33ce1eab408e23
2022-10-13 21:21:40 +00:00
Julia Kreger
1435a15ce3 Fix allocations default table type
In trying to figure out why I was unable to run
all of the test_migrations tests, I realized we need
to fix and clean up our unicode declarations.

Specifically, the way I found this was my local mysql
install was defaulted to using 4 Byte Unicode characters,
however some of our fields are 255 characters, which do not
fit inside of InnoDB tables.

They do, however, fit with the "utf8" storage alias, which is
presently short for UTF8MB3, as opposed to UTF8MB4 which is
what my local database server was configured for. Because this
was in opportunistic tests, I wasn't able to really sort out
what was going on and thought we needed to shorten the fields.
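
The arithmetic behind this can be sketched as below. The 767-byte figure is an assumption: it is the index key prefix limit of older InnoDB row formats, and the message does not say which limit was actually hit.

```python
# Assumption: index key prefix limit of older InnoDB row formats.
INDEX_KEY_LIMIT_BYTES = 767
FIELD_LENGTH_CHARS = 255


def max_index_bytes(chars, bytes_per_char):
    """Worst-case byte size of an indexed character column."""
    return chars * bytes_per_char


def fits(chars, bytes_per_char, limit=INDEX_KEY_LIMIT_BYTES):
    """Return True if the worst-case size stays within the limit."""
    return max_index_bytes(chars, bytes_per_char) <= limit


utf8mb3_bytes = max_index_bytes(FIELD_LENGTH_CHARS, 3)
utf8mb4_bytes = max_index_bytes(FIELD_LENGTH_CHARS, 4)
print(utf8mb3_bytes, fits(FIELD_LENGTH_CHARS, 3))
print(utf8mb4_bytes, fits(FIELD_LENGTH_CHARS, 4))
```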

In reality, it turns out we never defined the allocations
table to use UTF8 and InnoDB for storage.

Storage engine wise, this is not a big deal, but may mean a
DBA will one day need to dump and reload the allocation table
of a deployment.

Character set wise... It is not great, but there is not a good
way for us to do this programmatically. In my opinion, an operator
is unlikely to encounter an issue, and that small chance is
outweighed by the risk and impact of dumping the entire table,
deleting the table, recreating the table with the updated schema
and then repopulating the entries. Of course, if operators are not
using allocations, then it really doesn't matter for them.

Along the way, I discovered we had used the "UTF8" type alias,
which may change one day, which would break Ironic. As such,
I've also updated the definitions used to create databases
and updated our documentation.

Recommended reading:
https://docs.sqlalchemy.org/en/14/dialects/mysql.html#unicode
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb4.html

Story: 2010348
Task: 46492

Change-Id: I4103152489bf61e2d614eaa297da858f7b2112a3
2022-10-13 21:21:24 +00:00
Julia Kreger
9344eb22d1 Add upgrade check warning for allocations db
Adding an upgrade check to provide awareness of the state of
the database with regard to whether an unexpected engine is in use
or the character set encoding is not UTF8.

These will raise non-fatal warnings on the upgrade status
check.

Change-Id: Ide0eb4690a056be557e5ea7d5ba5f6be37b50d0a
Story: 2010384
2022-10-13 10:54:55 -07:00
Nisha Agarwal
0215d3cd76 Fixes anaconda deploy for PXE boot
Fixes the anaconda deploy (URL based) and adds
anaconda_boot entry to pxe_grub_config.template so
that ProLiants can also be deployed in PXE mode.

Story: 2010347
Task: 46490

Change-Id: I4b9e3a2060d9d73de5cab31cc08d3a764dc56e90
2022-10-07 11:31:09 +00:00
Nisha Agarwal
ca54c4df26 Fix the anaconda deploy for the ISO mounted
Fix the anaconda deploy for the ISO mounted
on a webserver.

Story: 2010322
Task: 46429
Change-Id: I2860faa7322116ffef1255709fe12f806257b069
2022-09-29 15:13:53 +00:00
Shukun Song
233c640838 Add support auth protocols for iRMC
This patch adds new SNMPv3 auth protocols to iRMC which are supported
from iRMC S6.

Change-Id: Id2fca59bebb0745e6b16caaaa7838d1f1a2717e1
Story: 2010309
Task: 46353
2022-09-29 20:12:17 +09:00
Zuul
fa2c3aa58c Merge "Zed: Add a prelude for the release notes" 2022-09-23 00:31:09 +00:00
Julia Kreger
38a170dd6a Zed: Add a prelude for the release notes
The Zed cycle is coming to a close, and we need a release notes
prelude.

Contributors, please edit, I just put this together so we wouldn't
forget.

Change-Id: I3a5ca31bf3648c9f8a956f4592c305d2b23f419e
2022-09-22 20:22:27 -03:00
Zuul
0773a80f91 Merge "Implement a DHCP driver backed by dnsmasq" 2022-09-22 13:22:33 +00:00
Zuul
37a0e97712 Merge "Fix idrac-redfish RAID controller mode conversion" 2022-09-22 09:38:22 +00:00
Zuul
eeeaa274cf Merge "Concurrent Distructive/Intensive ops limits" 2022-09-21 16:38:35 +00:00
Zuul
b767a92dd8 Merge "increase disk_erasure_coconcurrency" 2022-09-20 20:56:35 +00:00
Zuul
0d71a2562e Merge "Fix nodes stuck at cleaning on Network Service issues" 2022-09-20 20:56:31 +00:00
Kaifeng Wang
31c8087408 Fix nodes stuck at cleaning on Network Service issues
Ironic validates the network interface before the cleaning process.
Currently, invalid parameter errors are captured, but not others.
There is a chance that a node could be stuck in the cleaning
state on networking issues or a temporary outage of the neutron
service.

This patch adds NetworkError to the exception handling to cover
such cases.
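
A minimal sketch of the widened handler. The two exception classes are local stand-ins defined here for illustration, and ``validate_before_cleaning`` is a hypothetical helper operating on a dict.

```python
class InvalidParameterValue(Exception):
    """Local stand-in for the invalid parameter error."""


class NetworkError(Exception):
    """Local stand-in for the network service error."""


def validate_before_cleaning(node, validate):
    """Run validation; fail the node instead of leaving it stuck."""
    try:
        validate(node)
    except (InvalidParameterValue, NetworkError) as exc:
        node["last_error"] = f"Failed to validate networking: {exc}"
        node["provision_state"] = "clean failed"
        return False
    return True


def neutron_unavailable(node):
    # Simulates a temporary outage of the network service.
    raise NetworkError("network service unavailable")


stuck_node = {"provision_state": "cleaning", "last_error": None}
validated = validate_before_cleaning(stuck_node, neutron_unavailable)
print(validated, stuck_node["provision_state"])
```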

Change-Id: If20de2ad4ae4177dea10b7ebfc9a91ca6fbabdb9
2022-09-20 09:31:51 -07:00
Julia Kreger
9a8b1d149c Concurrent Distructive/Intensive ops limits
Provide the ability to limit resource intensive or potentially
wide scale operations which could be a symptom of a highly
destructive and unplanned operation in progress.

The idea behind this change is to help guard the overall deployment
to prevent an overall resource exhaustion situation, or prevent an
attacker with valid credentials from putting an entire deployment
into a potentially disastrous cleaning situation, since ironic
otherwise only limits concurrency based upon running tasks by conductor.
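
One way to picture such a guard is the sketch below. The exception name, the helper and the limit value are all hypothetical, and nodes are plain dicts rather than Ironic objects.

```python
class ConcurrencyLimitReached(Exception):
    """Hypothetical error raised when too many nodes share a state."""


def check_concurrency(nodes, state, limit):
    """Refuse to start another operation once the limit is reached."""
    in_state = sum(1 for node in nodes if node["provision_state"] == state)
    if in_state >= limit:
        raise ConcurrencyLimitReached(
            f"{in_state} nodes already in state {state!r}; limit is {limit}"
        )
    return in_state


fleet = [
    {"uuid": "node-1", "provision_state": "cleaning"},
    {"uuid": "node-2", "provision_state": "cleaning"},
    {"uuid": "node-3", "provision_state": "active"},
]

try:
    check_concurrency(fleet, "cleaning", limit=2)
    blocked = False
except ConcurrencyLimitReached:
    blocked = True
print(blocked)
```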

Story: 2010007
Task: 45140

Change-Id: I642452cd480e7674ff720b65ca32bce59a4a834a
2022-09-20 06:47:38 -07:00
Zuul
33d1d439d0 Merge "Correct Image properties lookup for paths" 2022-09-20 01:07:45 +00:00
Aija Jauntēva
397e49a5e6 Fix idrac-redfish RAID controller mode conversion
PERC 9 and PERC 10 controllers might not be in RAID mode, leaving them
with no or limited RAID support. This fix converts any eligible
controllers to RAID mode during the delete_configuration clean step
or deploy step.

Story: 2010272
Task: 46199

Change-Id: I5e85df95a66aed9772ae0660b2c85ca3a39b96c7
2022-09-15 03:36:31 -04:00
Zuul
aae524a46c Merge "Adds create_csr and add_https_certificate clean step" 2022-09-13 11:51:23 +00:00
Zuul
a171e588fd Merge "Enables event subscription methods for ilo and ilo5 hardware types" 2022-09-12 15:49:33 +00:00
Alexander Lingo
4415c55028 Cleanup submitted SNMP driver code for additional PDUs
* Resolved PEP8 issues
* Trimmed comments to remove extraneous information
* Changed rfc1902.Integer() calls to the correct snmp.Integer() calls
* Fixed power state logic checking for new PDUs that don't have transitional states (e.g., 'pendingOn')
* Removed redundant warning messages
* Added unit tests for Raritan PD2, ServerTech Sentry 3/4, and Vertiv Geist drivers
* Updated documentation to list tested PDUs for the new drivers
* Updated release notes

Change-Id: I9da7b9042b817c346f75a44cd8287e1f63efcb56
2022-09-09 16:47:47 -07:00
ankit
9c19dd6ef3 Adds create_csr and add_https_certificate clean step
This commit adds new clean steps create_csr and add_https_certificate
to allow users to create a certificate signing request and add
an HTTPS certificate to the iLO.

Story: 2009118
Task: 43016
Change-Id: I1e2da0e0da5e397b6e519e817e0bf60a02bbf007
2022-09-09 07:44:02 +00:00
Zuul
d5df494ad5 Merge "CI: anaconda: permit tls certificate validation bypass" 2022-09-05 17:32:37 +00:00
mallikarjuna.kolagatla
166bd1697a Enables event subscription methods for ilo and ilo5 hardware types
Enables event subscription methods by inheriting RedfishVendorPassthru
for ilo and ilo5 hardware types

Story: 2010207
Task: 45931
Change-Id: I96f7e44069402e3f1d25bcd527408008ca5e77cb
2022-09-05 11:58:44 +00:00
Zuul
7f933a1bed Merge "Redfish: Consider password part of the session cache" 2022-09-05 09:26:57 +00:00
Steve Baker
754e6bb662 Implement a DHCP driver backed by dnsmasq
The ``[dhcp]dhcp_provider`` configuration option can now be set to
``dnsmasq`` as an alternative to ``none`` for standalone deployments.
This enables the same node-specific DHCP capabilities as the
``neutron`` provider. See the ``[dnsmasq]`` section for configuration
options.
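
A small sketch of what selecting the provider looks like, parsed with the standard library purely for illustration. The sample text contains only the option named above; ``[dnsmasq]`` section options are deliberately left out because the message does not list them.

```python
import configparser

# Providers named in the message above.
KNOWN_PROVIDERS = {"none", "dnsmasq", "neutron"}

SAMPLE_CONFIG = """
[dhcp]
dhcp_provider = dnsmasq
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONFIG)

provider = parser.get("dhcp", "dhcp_provider")
if provider not in KNOWN_PROVIDERS:
    raise ValueError(f"unexpected dhcp_provider: {provider}")
print(provider)
```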

Change-Id: I3ab86ed68c6597d4fb4b0f2ae6d4fc34b1d59f11
Story: 2010203
Task: 45922
2022-09-05 13:57:39 +12:00
Jay Faulkner
9eec746660 Update releasenote for proper formatting
Step names are monospaced by convention in Ironic renos.

Change-Id: I274e9ecd7237f899298c309dd6f86029ecd2b3a0
2022-09-02 15:21:19 -07:00
Zuul
7f15710bc4 Merge "Allow project scoped admins to create/delete nodes" 2022-08-31 14:00:03 +00:00
Zuul
644ed94f48 Merge "Fix ilo boot interface order" 2022-08-31 13:59:44 +00:00