So, I've long wondered if we still have some spanning tree behavior
going on in CI. Turns out we might, but we just rely upon the defaults,
which introduces a variable.
Anyway, I found some details in the ovs-vsctl manual[0], and well,
let's set the options!
[0]: http://www.openvswitch.org/support/dist-docs/ovs-vsctl.8.html
Change-Id: I8f229fa6e738a69a668d4b891723431b2da362fa
Disable irmc virtual media test_prepare_instance_with_secure_boot
and test_clean_up_instance_with_secure_boot
Related-Bug: #2025424
Explicitly close out test connection
When creating a test database, we should follow the same
pattern we know to be good, where an orphaned handler is not
left in memory holding an open connection.
Change the connection style in dbTestBase to mirror how we do
database connections elsewhere, so they are closed out when we
are done with them.
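As a minimal sketch of the pattern (plain sqlalchemy, not the actual
dbTestBase code), the connection is scoped to a with block so nothing
is left holding it open once the test database has been created:

    from sqlalchemy import create_engine, text

    engine = create_engine('sqlite:///test.db')
    with engine.connect() as connection:
        connection.execute(text('SELECT 1'))
    # the connection is closed here; no orphaned handler remains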
Change migrations timeout to be >60 seconds
In local testing, I found the migrations tended to take an
average of 70 seconds. Granted, my test machine is old and slow,
but the performance is very similar to that of a busy cloud provider.
As such, increase the timeout to a larger value so we can enable
the double migration test again.
Also use BASE_TEST_TIMEOUT as the time limit for unit tests, failing
hard if it is exceeded.
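Roughly, such a per-test limit can be applied with the fixtures
library, as in the sketch below; the value and the exact failure mode
used in the real base class may differ:

    import fixtures
    from oslotest import base

    BASE_TEST_TIMEOUT = 60  # illustrative value only

    class TestCase(base.BaseTestCase):
        def setUp(self):
            super().setUp()
            # Fail the test once BASE_TEST_TIMEOUT seconds have elapsed.
            self.useFixture(
                fixtures.Timeout(BASE_TEST_TIMEOUT, gentle=True))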
Co-Authored-By: Julia Kreger <juliaashleykreger@gmail.com>
Change-Id: I84802be2e75751fe44ba2e1b60e60563cd276483
We should try to run coverage tests under the same conditions as
unit tests, using the same tox environment.
Change-Id: I07341a5e16264964f4d165b9b5ef80dc35293a17
The default test runner behavior is to distribute various classes
across many runners, which is fine. But when you have lots of setup
to ensure an environment is correct for tests in a class, you
generally want to execute those tests together instead of separately.
As such, set the --parallel-class execution option for stestr.
Change-Id: I5a65accfb7e2033690b2934d874141db7f4bf383
The PXE Anaconda dhcp cleanup test triggers the dhcp_factory cleanup
code by default. Which is good! Problem is, if you don't have
dnsmasq installed, things blow up.
Specifically, because it was called in such a way that it was
trying to clean up dhcp records for nodes. Example:
ironic.common.exception.InstanceDeployFailure: An error occurred
after deployment, while preparing to reboot the node
1be26c0b-03f2-4d2e-ae87-c02d7f33c123: [Errno 2] No such file
or directory:
'/etc/dnsmasq.d/hostsdir.d/ironic-52:54:00:cf:2d:31.conf'
Instead of executing that far, we now just check that we did,
indeed, call for dhcp cleanup.
This was discovered while trying to fix unit test race conditions
and random failures in CI.
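The shape of the test change, as a hedged sketch (the patch target is
ironic's dhcp_factory, but the test method and helper names here are
illustrative):

    from unittest import mock

    @mock.patch('ironic.common.dhcp_factory.DHCPFactory.clean_dhcp',
                autospec=True)
    def test_dhcp_cleanup_requested(self, mock_clean_dhcp):
        # Drive the code path that should request dhcp cleanup ...
        self._deploy_and_tear_down()  # hypothetical helper
        # ... and only assert the request was made, so dnsmasq config
        # files are never touched on the machine running the tests.
        mock_clean_dhcp.assert_called_once()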
Change-Id: Id7b1e2e9ca97aeff786e9df06f35eca67dd36b58
We have started to notice an SAWarning from sqlalchemy indicating:
SAWarning: Cannot correctly sort tables; there are unresolvable
cycles between tables "allocations, nodes", which is usually
caused by mutually dependent foreign key constraints.
Foreign key constraints involving these tables will not be
considered; this warning may raise an error in a future release.
Hunting this down, it appears to be the two data consistency Foreign
Key constraints in the "allocations" table where an allocation would
try to have a conductor_affinity value mapped to conductors.id
and also have a direct association to a node, which *also* had the
same constraint.
And then similarly, mapping in reverse, asserting an FK constraint,
when nodes also had its own constraint back on allocations.
Sort of a circular loop.
Anyhow, this removes the circular dependency, and adds a db
migration to remove the two constraints.
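For reference, such a migration is roughly this shape with alembic;
the constraint names below are made up for illustration and are not
the names used in the actual migration:

    from alembic import op

    def upgrade():
        # Break the cycle by dropping the mutually dependent foreign
        # keys between the allocations and nodes tables.
        op.drop_constraint('allocations_node_id_fkey', 'allocations',
                           type_='foreignkey')
        op.drop_constraint('nodes_allocation_id_fkey', 'nodes',
                           type_='foreignkey')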
Change-Id: I5596008e4971a29c635c45b24cb85db2d0d13ed3
Amid all the concern about the WAL pragma change and its unit test
issues, I've decided that maybe the path forward is to just turn it
off for unit testing.
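A minimal sketch of what "turn it off" can look like, assuming the
pragma is applied from a sqlalchemy connect event listener; the guard
and its environment variable name are purely illustrative:

    import os
    from sqlalchemy import create_engine, event

    engine = create_engine('sqlite:///ironic-test.db')

    def _enable_wal(dbapi_conn, conn_record):
        dbapi_conn.execute('PRAGMA journal_mode=WAL')

    # Skip the WAL pragma entirely when running the unit tests.
    if not os.environ.get('IRONIC_UNIT_TESTING'):  # hypothetical flag
        event.listen(engine, 'connect', _enable_wal)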
Change-Id: I67bbdb158ad1c8350f1e613ac0afb861ccea00a0
With counts declared outside the context manager, values in counts
that come from the sqlalchemy results object are still in scope when
the context manager exits! Oh no!
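Roughly, the fix is to materialize plain Python values while the
session is still open; this is an illustrative query, not the actual
one:

    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import Session

    engine = create_engine('sqlite:///:memory:')

    with Session(engine) as session:
        result = session.execute(text('SELECT 1 AS total'))
        # Build counts *inside* the context manager so nothing from
        # the sqlalchemy result object outlives the session.
        counts = [row.total for row in result]
    # counts is now just a list of plain integers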
Co-Authored-By: Clif Houck <me@clifhouck.com>
Related-Bug: https://bugs.launchpad.net/ironic/+bug/2023316
Change-Id: Ifd66444f73355d912c52905a0f04748284b25c1b
Another of our continuing attempts to find the magic stop sign plaguing
our CI runs.
Related-Bug: https://bugs.launchpad.net/ironic/+bug/2024941
Change-Id: I3626a12bad3299ace2991550cafd92f4f0142434
Two issues existed in our test migrations:
1) We took a sqlalchemy orm-ey object and returned it. The handler
closed, but the connection stayed open.
2) In the test we mocked data, saved data, and checked data, but
there is a constraint that prevents nodes from being deleted before
the firmware_information data.
Also fixes some additional uses of the object value assignment
pattern elsewhere in the test migrations, as pointed out in code
review.
Co-Authored-By: Jay Faulkner <jay@jvf.cc>
Change-Id: I91429f33ce01f40ecd665b2f76b27a162354bc6f
The MIGRATIONS_TIMEOUT value is the timeout for the migrations tests
in seconds.
Can now be set as an environment variable, defaulting to 60 seconds.
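In sketch form (how the value is actually read in the test code may
differ slightly):

    import os

    # Timeout for the migrations tests, in seconds; overridable via
    # the MIGRATIONS_TIMEOUT environment variable, defaulting to 60.
    MIGRATIONS_TIMEOUT = int(os.environ.get('MIGRATIONS_TIMEOUT', 60))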
Change-Id: I20045a7c0c08a37038b4707791f6b67f715c4877
Fails a test after OS_TEST_TIMEOUT seconds, defaulting to 30 seconds.
Based on OS_TEST_TIMEOUT from oslotest; see
https://github.com/openstack/oslotest/blob/master/oslotest/base.py#L35-L45
for more info.
Also increase log output for improved troubleshooting.
Change-Id: Ibcdca2c449970b5a2ecdf2e3bb3bb900881b6d7c
This test has been taking an inordinate amount of time to complete. We
should figure out the root cause and fix it, but in the meantime it only
causes us harm to be unable to land patches.
Related-Bug: #2023316
Change-Id: I604369e000c80914cf0c584c9deab7245c66b1b4
This change fixes the worst offenders, potentially reducing the test
runs by more than 10 seconds.
I could not fix the iLO tests, though, because they heavily rely on
time.sleep being precise.
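The usual trick, sketched with the fixtures library (the base class
and the exact patch targets in the real change may differ):

    import fixtures
    from oslotest import base

    class FastRetryTestCase(base.BaseTestCase):
        def setUp(self):
            super().setUp()
            # Replace time.sleep with a no-op mock for the whole test
            # case, so retry loops do not actually wait.
            self.useFixture(fixtures.MockPatch('time.sleep'))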
Change-Id: I10d7845700275d9d03b98ebadd0f12540f1e7656
When a node is inspected more than one time and the database is
configured as a storage backend, a new entry is made in the database
for each inspection result (node inventory). This patch handles this
behaviour as follows:
* By deleting previous inventory entries for the same node before
  adding a new entry in the database.
* By retrieving the most recent node inventory from the database when
  the database is queried.
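A loose sketch of the two behaviours, assuming a minimal sqlalchemy
model; the real Ironic model and db api differ in detail:

    import datetime
    from sqlalchemy import Column, DateTime, Integer, JSON
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class NodeInventory(Base):
        __tablename__ = 'node_inventory'
        id = Column(Integer, primary_key=True)
        node_id = Column(Integer, nullable=False)
        inventory_data = Column(JSON)
        created_at = Column(DateTime, default=datetime.datetime.utcnow)

    def store_inventory(session, node_id, inventory):
        # Delete previous inventory entries for the same node first.
        session.query(NodeInventory).filter_by(node_id=node_id).delete()
        session.add(NodeInventory(node_id=node_id,
                                  inventory_data=inventory))

    def get_inventory(session, node_id):
        # Always return the most recent entry for the node.
        return (session.query(NodeInventory)
                .filter_by(node_id=node_id)
                .order_by(NodeInventory.created_at.desc())
                .first())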
Change-Id: Ic3df86f395601742d2fea2bcde62f7547067d8e4
This change creates all the necessary parts for processing inspection data:
* New API /v1/continue_inspection
Depending on the API version, either behaves like the inspector's API
or (new version) adds the lookup functionality on top.
The lookup process is migrated from ironic-inspector with minor changes.
It takes MAC addresses, BMC addresses and (optionally) a node UUID and
tries to find a single node in INSPECTWAIT state that satisfies all
of these. Any failure results in HTTP 404 (see the sketch after this
list).
To make lookup faster, the resolved BMC addresses are cached in advance.
* New RPC continue_inspection
Essentially, checks the provision state again and delegates to the
inspect interface.
* New inspect interface call continue_inspection
The base version does nothing. Since we don't yet have in-band
inspection in Ironic proper, the only actual implementation is added
to the existing "inspector" interface that works by doing a call
to ironic-inspector.
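The sketch referenced above, purely as an illustration of the lookup
rules; the field names and the NotFound type are stand-ins, not the
real implementation:

    INSPECTWAIT = 'inspect wait'

    class NotFound(Exception):
        """Stand-in for the HTTP 404 returned on any lookup failure."""

    def lookup(nodes, macs, bmc_addresses, node_uuid=None):
        def matches(node):
            # Only nodes waiting for inspection are eligible.
            if node['provision_state'] != INSPECTWAIT:
                return False
            # Every hint that was provided has to match the node.
            if node_uuid and node['uuid'] != node_uuid:
                return False
            if macs and not set(macs) & set(node['macs']):
                return False
            if bmc_addresses and not (set(bmc_addresses)
                                      & set(node['bmc_addresses'])):
                return False
            return True

        candidates = [node for node in nodes if matches(node)]
        if len(candidates) != 1:
            # No candidate, or an ambiguous match: both are failures.
            raise NotFound()
        return candidates[0]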
Story: #2010275
Task: #46208
Change-Id: Ia3f5bb9d1845d6b8fab30232a72b5a360a5a56d2
Leave the snmp job on focal for the time being as it's failing on jammy
and we need to move forward with the migration.
Change-Id: I0b9b600c3eb10761054abdb9c13d7107269001b9
Add to the information collected by Redfish hardware inspection from
sushy, and store it in the documented hardware inventory format
Change-Id: I651599b84e6b8901647960b719626489b000b65f
Now that we have autocommit disabled, we need to make the metal3 job
voting to ensure we don't accidentally break sqlite support in the
future.
Change-Id: I4915dc23b1101b9b799f392434f237e5ccb323e4
Allows steps to be executed on child nodes, and adds
the reserved power_on, power_off, and reboot step names.
Change-Id: I4673214d2ed066aa8b95a35513b144668ade3e2b