12487 Commits

Author SHA1 Message Date
Julia Kreger
6d3c4ced5f Disable spanning tree
So, I've long wondered if we still have some spanning tree behavior
going on in CI. Turns out we might, but we just rely upon the defaults
which creates a variable.

Anyway, regardless, I found some details in the ovs-vsctl manual[0], and
well, let's set the options!
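
As a rough illustration only: the message does not say which options were set, so the sketch below assumes the `stp_enable` and `rstp_enable` columns of the Open vSwitch Bridge table. The bridge name and the helper function are hypothetical, and the command is only built, not executed.

```python
def disable_spanning_tree_cmd(bridge):
    """Build an ovs-vsctl command line that turns STP and RSTP off.

    Assumption: stp_enable/rstp_enable are the options meant by the
    commit; the bridge name passed in is purely illustrative.
    """
    return [
        "ovs-vsctl", "set", "Bridge", bridge,
        "stp_enable=false",
        "rstp_enable=false",
    ]


# Hypothetical bridge name; pass the list to subprocess.run() to apply it.
print(" ".join(disable_spanning_tree_cmd("br-example")))
```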

[0]: http://www.openvswitch.org/support/dist-docs/ovs-vsctl.8.html

Change-Id: I8f229fa6e738a69a668d4b891723431b2da362fa
2023-07-06 17:16:33 +00:00
Zuul
db5d6ada36 Merge "Use jammy for base jobs" 2023-07-03 15:11:27 +00:00
Riccardo Pittau
f69e9da1d0 Fix multiple things in CI
Disable irmc virtual media test_prepare_instance_with_secure_boot
and test_clean_up_instance_with_secure_boot
Related-Bug: #2025424

Explicitly close out test connection
When creating a test database, we should follow the same
pattern we know to be good, where an orphaned handler is not
left in memory to close out a connection.
Change connection style in dbTestBase to mirror how we do
database connections so they close out when we are done
with them.

Change migrations timeout to be >60 seconds
In local testing, I found the migrations tended to take an
average of 70 seconds. Granted, my test machine is old, and slow
but the performance is very similar to a busy cloud provider.
As such, increase the timeout to a larger value so we can enable
the double migration test again.
Also use BASE_TEST_TIMEOUT as time limit for unit tests, failing
hard if that's passed.

Co-Authored-By: Julia Kreger <juliaashleykreger@gmail.com>
Change-Id: I84802be2e75751fe44ba2e1b60e60563cd276483
2023-06-30 14:45:45 +02:00
Zuul
1d9e9b6e77 Merge "Use tox env variables in coverage tests" 2023-06-29 16:48:22 +00:00
Zuul
52ffbc92a0 Merge "Execute tests by class, not randomly" 2023-06-29 16:12:25 +00:00
Riccardo Pittau
048bff1daf Use tox env variables in coverage tests
We should try to run coverage tests in the same conditions as
unit tests using the same tox environment.

Change-Id: I07341a5e16264964f4d165b9b5ef80dc35293a17
2023-06-29 15:53:09 +00:00
Julia Kreger
a600895ba3 Execute tests by class, not randomly
The default test runner behavior is to distribute various classes
across many runners, which is fine. But when you have lots of setup
to ensure an environment is correct for tests in a class, you
generally want to execute that together instead of separately.

As such, set the --parallel-class execution option for stestr.
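
To make the idea concrete, here is a minimal sketch of class-level grouping. It is not stestr's scheduler; the function name and the round-robin policy are assumptions chosen only to show that all tests of one class end up on the same worker.

```python
from collections import defaultdict
from itertools import cycle


def partition_by_class(test_ids, workers):
    """Assign whole test classes to workers, round-robin (illustrative)."""
    by_class = defaultdict(list)
    for test_id in test_ids:
        # "pkg.module.Class.test_method" -> "pkg.module.Class"
        by_class[test_id.rsplit(".", 1)[0]].append(test_id)

    buckets = [[] for _ in range(workers)]
    # Hand out one class at a time so a class is never split across workers.
    for bucket, cls in zip(cycle(buckets), sorted(by_class)):
        bucket.extend(by_class[cls])
    return buckets
```

With this grouping, expensive per-class setup runs on one worker instead of being repeated wherever individual tests happen to land.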

Change-Id: I5a65accfb7e2033690b2934d874141db7f4bf383
2023-06-29 02:03:55 +00:00
Julia Kreger
cf49d54e6f CI: minor fix to irmc driver clean_up_instance testing
Change-Id: I5d73879cfb747c8357f823d1ac8fbcb08c5addaa
2023-06-29 01:51:46 +00:00
Julia Kreger
c392814ca8 CI: Fix PXE Anaconda cleanup test
The PXE Anaconda dhcp cleanup test triggers the dhcp_factory clean
up code by default. Which is good! Problem is, if you don't have
dnsmasq installed, things blow up.

Specifically because it was called in such a way where it was
trying to clean up dhcp records for nodes. Example:

ironic.common.exception.InstanceDeployFailure: An error occurred
    after deployment, while preparing to reboot the node
    1be26c0b-03f2-4d2e-ae87-c02d7f33c123: [Errno 2] No such file
    or directory:
    '/etc/dnsmasq.d/hostsdir.d/ironic-52:54:00:cf:2d:31.conf'

Instead of executing that far, we just now check that we did, indeed
call for dhcp cleanup.
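
The pattern of asserting that cleanup was requested, without running it, might look like the sketch below. `FakeDeploy` and its `clean_up` method are hypothetical stand-ins, not the actual Ironic classes; only `unittest.mock` is real here.

```python
from unittest import mock


class FakeDeploy:
    """Hypothetical stand-in for the deploy code under test."""

    def __init__(self, dhcp_provider):
        self.dhcp_provider = dhcp_provider

    def clean_up(self, node_uuid):
        # The real code would touch dnsmasq files; the test never gets there.
        self.dhcp_provider.clean_dhcp(node_uuid)


def test_clean_up_requests_dhcp_cleanup():
    provider = mock.Mock()
    FakeDeploy(provider).clean_up("node-1")
    # Only verify the request was made; nothing on disk is touched.
    provider.clean_dhcp.assert_called_once_with("node-1")
```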

This was discovered while trying to fix unit test race conditions
and random failures in CI.

Change-Id: Id7b1e2e9ca97aeff786e9df06f35eca67dd36b58
2023-06-28 17:57:57 +00:00
Zuul
76a920aed2 Merge "Handle SAWarning around allocations FK Constraints" 2023-06-28 01:26:22 +00:00
Zuul
8746ba0f62 Merge "Disable WAL Pragma for Unit Testing" 2023-06-28 01:24:01 +00:00
Zuul
654c85c0cc Merge "Fix SQLAlchemy engine connection listener" 2023-06-27 23:45:21 +00:00
Jay Faulkner
d5e4f013c8 Skip tests that fail occasionally in CI
These tests fail in CI some 5-10% of the time, leading to unneeded CI
churn and rechecks.

The problem with the intermittent failure is related to how the test
is constructed, not to the code itself.

https://bugs.launchpad.net/ironic/+bug/2024994 has been filed to track
true resolution of this issue, but until then, we should not abuse CI
resources by continuing to run tests.

Depends-On: https://review.opendev.org/c/openstack/ironic/+/886881
Related-Bug: https://bugs.launchpad.net/ironic/+bug/2024994
Change-Id: I9f124b005d346f961f9c95c917d5014988a7f45e
2023-06-26 22:33:07 +00:00
Julia Kreger
402c32094b Handle SAWarning around allocations FK Constraints
We have started to notice an SAWarning from sqlalchemy indicating:

  SAWarning: Cannot correctly sort tables; there are unresolvable
      cycles between tables "allocations, nodes", which is usually
      caused by mutually dependent foreign key constraints.
      Foreign key constraints involving these tables will not be
      considered; this warning may raise an error in a future release.

Hunting this down, it appears to be the two data consistency Foreign
Key constraints in the "allocations" table where an allocation would
try to have a conductor_affinity value mapped to conductors.id
and also have a direct association to a node, which *also* had the
same constraint.

And then similarly, mapping in reverse, asserting an FK constraint,
when nodes also had its own constraint back on allocations.

Sort of a circular loop.

Anyhow, removes it, and adds a db migration to remove the two
constraints.

Change-Id: I5596008e4971a29c635c45b24cb85db2d0d13ed3
2023-06-26 14:27:59 -07:00
Julia Kreger
46e4f447ff Disable WAL Pragma for Unit Testing
In all the concern about the WAL pragma change with unit test issues,
I've decided that maybe the path is to just turn it off for unit testing.

Change-Id: I67bbdb158ad1c8350f1e613ac0afb861ccea00a0
2023-06-26 14:27:42 -07:00
Jay Faulkner
4a570042c9 Fully ensure counts are out of scope of cxtmgr
With counts declared outside the context manager, it means values
in counts that come from the sqlalchemy results object are still in
scope when the context manager exits! Oh no!

Co-Authored-By: Clif Houck <me@clifhouck.com>
Related-Bug: https://bugs.launchpad.net/ironic/+bug/2023316
Change-Id: Ifd66444f73355d912c52905a0f04748284b25c1b
2023-06-26 14:26:27 -07:00
Jay Faulkner
dd79ae5e24 Ensure all sqla objects descoped before ending txn
Another of our continuing attempts to find the magic stop sign plaguing
our CI runs.

Related-bug: https://bugs.launchpad.net/ironic/+bug/2024941
Change-Id: I3626a12bad3299ace2991550cafd92f4f0142434
2023-06-26 17:47:09 +00:00
Zuul
d869dd3a15 Merge "Revert "Disabling test_upgrade_twice temporarily for CI fix"" 2023-06-26 15:49:58 +00:00
Julia Kreger
3d869bca26 Fix test_migrations with firmware information.
Two issues existed in our test migrations:

1) We took a sqlalchemy orm-ey object and returned it. The handler closed
   but the connection stays open.
2) In the test we mocked data, saved data, checked data, but there is a
   constraint that would prevent nodes from being deleted before the
   firmware_information data.

Also fixes some additional assignment of object value
pattern of usage elsewhere in the test migrations as pointed
out in code review.

Co-Authored-By: Jay Faulkner <jay@jvf.cc>
Change-Id: I91429f33ce01f40ecd665b2f76b27a162354bc6f
2023-06-26 06:57:56 -07:00
Jay Faulkner
ee963d3f0d Revert "Disabling test_upgrade_twice temporarily for CI fix"
This reverts commit b4f8209b99af32d8d2a646591af9b62436aad3d8.

Change-Id: I3d29f347cd943858df0d85358073daf5575a07ad
Depends-on: https://review.opendev.org/c/openstack/ironic/+/886881
2023-06-23 23:33:50 +00:00
Julia Kreger
9be84608f1 Fix SQLAlchemy engine connection listener
Change-Id: Ifa156d61329d61aac05b3f8c01f7c9c51bd386b1
2023-06-23 10:24:25 -07:00
Iury Gregory Melo Ferreira
2f8ee2cf40 Fix IRONIC_IMAGE_NAME=non-existent-image
Our jobs started failing after a possible change in tempest
that introduced "non-existent-image" [1]

[1] https://review.opendev.org/c/openstack/tempest/+/831018

Change-Id: Iff7943446741e499100561a79c9f4930beab3da2
2023-06-22 22:02:29 -03:00
Zuul
24943402a1 Merge "Migrate the inspector's /continue API" 2023-06-21 14:37:16 +00:00
Riccardo Pittau
dfb7f05ff0 Allow setting migrations timeout value from tox
The MIGRATIONS_TIMEOUT value is the timeout for the migrations tests
in seconds.
It can now be set as an environment variable and defaults to 60 seconds.
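
A minimal sketch of reading such a setting, assuming the value is a whole number of seconds; the helper name is hypothetical and the actual test code may read the variable differently.

```python
import os


def migrations_timeout(environ=os.environ, default=60):
    """Return the migrations test timeout in seconds (illustrative helper)."""
    # Fall back to 60 seconds when MIGRATIONS_TIMEOUT is not set.
    return int(environ.get("MIGRATIONS_TIMEOUT", default))
```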

Change-Id: I20045a7c0c08a37038b4707791f6b67f715c4877
2023-06-20 17:34:28 +02:00
Zuul
6c05e99c8d Merge "Add test timeout to tox config" 2023-06-20 04:15:29 +00:00
Zuul
ce1abd4007 Merge "Handle duplicate node inventory entries per node" 2023-06-14 16:33:52 +00:00
Riccardo Pittau
f052bde94f Add test timeout to tox config
Fails a test after OS_TEST_TIMEOUT seconds, defaults to 30 seconds.
Based on OS_TEST_TIMEOUT from oslotest
See https://github.com/openstack/oslotest/blob/master/oslotest/base.py#L35-L45
for more info

Also increase log output for improved troubleshooting.

Change-Id: Ibcdca2c449970b5a2ecdf2e3bb3bb900881b6d7c
2023-06-12 09:17:29 +02:00
Jay Faulkner
b4f8209b99 Disabling test_upgrade_twice temporarily for CI fix
This test has been taking an inordinate amount of time to complete. We
should figure out the root cause and fix it, but in the meantime it only
causes us harm to be unable to land patches.

Related-Bug: 2023316
Change-Id: I604369e000c80914cf0c584c9deab7245c66b1b4
2023-06-08 10:21:29 -07:00
Dmitry Tantsur
ca5b2feeee Mock sleep in unit tests that rely on it
This change fixes the worst offenders, potentially reducing the test
runs by more than 10 seconds.

I could not fix iLO tests though because they heavily rely on time.sleep
being precise.
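
The general technique can be sketched as follows. `wait_until` is a hypothetical polling helper, not code from this change; the point is only that patching `time.sleep` lets a retry loop finish without real delays.

```python
import time
from unittest import mock


def wait_until(predicate, retries=5, delay=2.0):
    """Poll predicate, sleeping between attempts (hypothetical helper)."""
    for _ in range(retries):
        if predicate():
            return True
        time.sleep(delay)
    return False


def test_wait_until_gives_up_quickly():
    # With sleep mocked, three "2 second" waits take no wall-clock time.
    with mock.patch.object(time, "sleep") as mock_sleep:
        assert wait_until(lambda: False, retries=3) is False
    assert mock_sleep.call_count == 3
    mock_sleep.assert_called_with(2.0)
```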

Change-Id: I10d7845700275d9d03b98ebadd0f12540f1e7656
2023-06-07 16:10:07 +02:00
Mahnoor Asghar
fa2d6685f3 Handle duplicate node inventory entries per node
When a node is inspected more than one time and the database is
configured as a storage backend, a new entry is made in the database
for each inspection result (node inventory). This patch handles this
behaviour as follows:

* By deleting previous inventory entries for the same node before adding
  a new entry in the database.
* By retrieving the most recent node inventory from the database when the
  database is queried.
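
A small sketch of both behaviours, using the standard library's sqlite3 rather than Ironic's database layer. The table layout, the function names, and the use of an autoincrementing id as a proxy for "most recent" are all assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical inventory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_inventory ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, node_id INTEGER, data TEXT)"
    )
    return conn


def save_inventory(conn, node_id, data):
    # "with conn" commits both statements together or rolls back on error.
    with conn:
        conn.execute(
            "DELETE FROM node_inventory WHERE node_id = ?", (node_id,))
        conn.execute(
            "INSERT INTO node_inventory (node_id, data) VALUES (?, ?)",
            (node_id, data))


def latest_inventory(conn, node_id):
    # Even if older rows exist, return only the newest one.
    row = conn.execute(
        "SELECT data FROM node_inventory WHERE node_id = ? "
        "ORDER BY id DESC LIMIT 1", (node_id,)).fetchone()
    return row[0] if row else None
```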

Change-Id: Ic3df86f395601742d2fea2bcde62f7547067d8e4
2023-06-07 08:08:37 -04:00
Dmitry Tantsur
0370f5ac97 Migrate the inspector's /continue API
This change creates all necessary parts for processing inspection data:

* New API /v1/continue_inspection

Depending on the API version, either behaves like the inspector's API
or (new version) adds the lookup functionality on top.

The lookup process is migrated from ironic-inspector with minor changes.
It takes MAC addresses, BMC addresses and (optionally) a node UUID and
tries to find a single node in INSPECTWAIT state that satisfies all
of these. Any failure results in HTTP 404.
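
The lookup rules above could be sketched roughly like this. It is not the migrated ironic-inspector code: the `Node` type, the `NotFound` error, the state string, and the "any overlap counts as a match" rule for MAC and BMC addresses are assumptions for illustration.

```python
from dataclasses import dataclass, field

INSPECTWAIT = "inspect wait"  # assumed string form of the state


class NotFound(Exception):
    """Hypothetical error that an API layer would map to HTTP 404."""


@dataclass
class Node:
    uuid: str
    provision_state: str
    macs: set = field(default_factory=set)
    bmc_addresses: set = field(default_factory=set)


def lookup(nodes, macs, bmc_addresses, node_uuid=None):
    """Return the single waiting node matching every supplied criterion."""
    candidates = [
        n for n in nodes
        if n.provision_state == INSPECTWAIT
        and (not macs or n.macs & set(macs))
        and (not bmc_addresses or n.bmc_addresses & set(bmc_addresses))
        and (node_uuid is None or n.uuid == node_uuid)
    ]
    # Zero matches and ambiguous matches are both treated as failures.
    if len(candidates) != 1:
        raise NotFound("no unique node matches the inspection data")
    return candidates[0]
```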

To make lookup faster, the resolved BMC addresses are cached in advance.

* New RPC continue_inspection

Essentially, checks the provision state again and delegates to the
inspect interface.

* New inspect interface call continue_inspection

The base version does nothing. Since we don't yet have in-band
inspection in Ironic proper, the only actual implementation is added
to the existing "inspector" interface that works by doing a call
to ironic-inspector.

Story: #2010275
Task: #46208
Change-Id: Ia3f5bb9d1845d6b8fab30232a72b5a360a5a56d2
2023-06-07 10:57:08 +02:00
Zuul
97f7177495 Merge "execute on child node support" 2023-06-07 04:04:45 +00:00
Zuul
a648ac22d8 Merge "Be explicit about bugfix branches support lifetime" 2023-06-06 17:28:45 +00:00
Riccardo Pittau
f434643293 Use jammy for base jobs
Leave the snmp job on focal for the time being as it's failing on jammy
and we need to move forward with the migration.

Change-Id: I0b9b600c3eb10761054abdb9c13d7107269001b9
2023-06-06 17:09:42 +02:00
Zuul
8ef69aaa6a Merge "Prepare [inspector]require_managed_boot to change to True in the future" 2023-06-05 14:36:59 +00:00
Zuul
964a82db18 Merge "Add to Redfish hardware inventory collection" 2023-06-01 10:25:14 +00:00
Riccardo Pittau
c0643e9d05 Be explicit about bugfix branches support lifetime
Also fix new release model link

Change-Id: I1c9b3b1c8481a315199070468298a73936ae93a7
2023-05-31 15:57:38 +02:00
Zuul
2bd69444d9 Merge "[iRMC] Fix IPMI incompatibility handling error" 22.0.0 bugfix-22.0-eol 2023-05-30 13:20:39 +00:00
Mahnoor Asghar
b3d7ba88d2 Add to Redfish hardware inventory collection
Add to the information collected by Redfish hardware inspection from
sushy, and store it in the documented hardware inventory format

Change-Id: I651599b84e6b8901647960b719626489b000b65f
2023-05-30 05:58:00 -04:00
Zuul
b0f76a2cf1 Merge "Make metal3 job voting" 2023-05-26 14:57:20 +00:00
Zuul
19567077d5 Merge "Don't return the in-flight SQL handler" 2023-05-25 15:40:14 +00:00
Jay Faulkner
bf850cad14 Make metal3 job voting
Now that we have autocommit disabled, we need to make the metal3 job
voting to ensure we don't accidentally break sqlite support in the
future.

Change-Id: I4915dc23b1101b9b799f392434f237e5ccb323e4
2023-05-25 16:36:15 +02:00
Zuul
443cbdebce Merge "Add DB model for Firmware" 2023-05-25 14:23:37 +00:00
Zuul
c119e0f722 Merge "Add ironic-grenade-skip-level Job" 2023-05-25 10:45:09 +00:00
Zuul
209e1a70a7 Merge "Remove unused get_not_versions from dbapi" 2023-05-25 04:43:49 +00:00
Zuul
3c3188e7b0 Merge "Remove model_query use from general dbapi calls" 2023-05-25 04:43:46 +00:00
Zuul
d107252caa Merge "follow-up on DPU change api-ref" 2023-05-25 01:58:35 +00:00
Iury Gregory Melo Ferreira
d665304940 Add DB model for Firmware
* firmware_information table
* `firmware_interface` field in node table
* database migration
* tests

Story: 2010659
Task: 47976

Change-Id: I593d20d3b58ab202c32c31213121b5a2d90934c5
2023-05-24 20:31:38 -03:00
Zuul
32532eeda5 Merge "DPU modeling - parent_node DB/Model/API" 2023-05-24 23:18:33 +00:00
Julia Kreger
013ac0cb41 execute on child node support
Allows steps to be executed on child nodes, and adds
the reserved power_on, power_off, and reboot step names.

Change-Id: I4673214d2ed066aa8b95a35513b144668ade3e2b
2023-05-24 15:42:46 -07:00