225 Commits

Author SHA1 Message Date
Zuul
f5e33812bd Merge "CI: Use focal dnsmasq" 2023-07-11 09:08:41 +00:00
Zuul
6233276b20 Merge "Unit tests: Isolate mysql test migrations" 2023-07-11 01:13:06 +00:00
Julia Kreger
0a11855d3f CI: Use focal dnsmasq
Investigation of our standalone test job issues, where jobs would
fail, hosts would not get DHCP updates, and ultimately iPXE would
fail prior to getting a valid or expected response,
revealed that dnsmasq was crashing often when
port updates were going through. This ultimately prevented
the multi-scenario test jobs from running, as the standalone
jobs represent a number of different scenarios which are
executed across a pool of test machines.

In this case, the path forward appears to be to downgrade
dnsmasq to stabilize our CI and allow us to otherwise upgrade.

This patch adds the focal updates as a package source,
and installs the dnsmasq package.

Related-Bug: #2026757
Change-Id: Iacfd1ab677c612525601afcaeee5e5b067206ff3
2023-07-10 12:57:16 -07:00
Iury Gregory Melo Ferreira
6c35a44424 Move standalone jobs to focal
We are seeing a lot of failures in our standalone jobs
after we switched to jammy, see [1].
Let's pin the jobs to focal to isolate the problem and
fix it in a separate patch.

[1] https://zuul.opendev.org/t/openstack/builds?job_name=ironic-standalone-redfish&project=openstack%2Fironic&branch=master&skip=0

Change-Id: I1a85c0c9285359ee92fb676ec56c817cbe350367
2023-07-07 12:29:34 -03:00
Julia Kreger
5cad8ac773 Unit tests: Isolate mysql test migrations
All database migration testing in openstack is done through
an opportunistic worker model, where if the database is available
and correctly configured for testing, i.e. the openstack-citest user
exists and access is appropriately granted, then the tests will create and
test migrations.

However, this has been problematic with mysql recently, as we
have seen a long-standing migration issue boil to the surface often
with tests.

As a result, we're isolating that test down to its own job so we
can limit the blast radius. This also helps us determine whether the
problem affects all of the tests or is solely isolated to the mysql test
run class, which is an additional data point.

By default, we continue to run Postgres migration tests in the
main jobs, as they haven't been impacted by this issue.

Change-Id: Iefc044c31ef029e400a7dad294504175a4462638
2023-06-29 09:23:34 -07:00
Riccardo Pittau
f434643293 Use jammy for base jobs
Leave the snmp job on focal for the time being as it's failing on jammy
and we need to move forward with the migration.

Change-Id: I0b9b600c3eb10761054abdb9c13d7107269001b9
2023-06-06 17:09:42 +02:00
Iury Gregory Melo Ferreira
da0d7494e7 Add ironic-grenade-skip-level Job
Change-Id: I359ac822f2fe39b38f29a47fd2e71c8c91476030
2023-05-24 19:59:51 +00:00
Julia Kreger
124ad571fd Explicitly pin CIRROS_VERSION
It appears the push to Cirros 0.6.1 has reoccurred, and we now
have things failing as a result.

Specifically ironic-grenade is trying to run with Cirros 0.5.2,
yet the file is not found later on.

Anyhow, an explicit pin should resolve this.

Change-Id: I97a1403820c8dbe633cf1d529adc79e8af463e80
2023-05-23 15:10:00 -07:00
Julia Kreger
8b98dfafd8 CI: Disable mysql counters for grenade
Disabling the performance counters as we suspect they are causing
database interaction to freeze on the grenade CI job.

Change-Id: Id951815ab9bfd1ca16aa66fa4c87c0e1b3e788f6
2023-05-18 17:27:38 -07:00
Julia Kreger
a5a737e388 Set ironic-grenade to wait 120 seconds
Launching test VMs can take a while, and grenade can fail
if the VM's networking is not quite online in under sixty
seconds. As such, it is reasonable to use a larger window
so the failure rate of ironic-grenade will hopefully decline.

Depends-On: https://review.opendev.org/c/openstack/grenade/+/879674
Change-Id: I07aead4b09ccb7e427a0d3d04e7a580cf4b00a95
2023-04-05 09:58:36 -07:00
Julia Kreger
692a383fdc [CI] Swap anaconda urls
The anaconda job is failing as we're getting a redirect issued back
upon attempting to validate URLs. The servers are now directing us
to use HTTPS instead.

Change-Id: Iac8e6e58653ac616250f4ce3ab3ae7f5164e5b03
2023-01-26 13:58:12 -08:00
Zuul
e011922bac Merge "CI: Reset VM footprint to 2.6GB" 2023-01-11 00:48:16 +00:00
Julia Kreger
0230d361f4 CI: Reset VM footprint to 2.6GB
This commit partially reverts change set
I0bfef09a5312a17be54ce5c09805f06b7c349026
where the amount of memory for test VMs was
increased to 4GB. This was because of excess
junk getting stuck in the staged ramdisk
images used by CI.

Change-Id: Ia0c74cbeecdb9febf9f7a4e76db84e0f378a97fc
2023-01-03 17:42:55 -08:00
Julia Kreger
1d07be8237 Use centos grub artifacts with centos ramdisk for vmedia
It appears we are getting an opcode error when attempting to boot
Centos 9-stream utilizing the EFI artifacts from Ubuntu.

Technically this should work, however further artifacts in the boot
chain may be signed with other key credentials that Ubuntu's
grub does not know about, because the chain of trust is
MSFT -> Vendor shim (slow change rate) -> Vendor GRUB -> Kernel

Where vendor differences should never work is when Secure Boot
is enforcing.

Exception on launch:
 X64 Exception Type - 06(#UD - Invalid Opcode)  CPU Apic ID - 00000000 !!!!

A similar Debian bug is open for a very similar issue:

https://groups.google.com/g/linux.debian.bugs.dist/c/BOiLLeROrmo

However, no additional comments or information have been posted in
follow-up to that reported issue. So in the meantime, we're going to try
to do what those smarter than I recommend: use the vendor's
binaries for their distribution.

There is one further, potentially far more depressing possibility,
that centos9's kernel doesn't support the type of hardware
we're getting. This is suggested by the precise opcode error, UD,
https://xem.github.io/minix86/manual/intel-x86-and-64-manual-vol3/o_fe12b1e2a880e0ce-212.html
But again, easiest possibility first.

Change-Id: Id9bd30bc3c2f1076555317e4a3f277725fa7c1f4
2023-01-03 17:05:04 -08:00
Riccardo Pittau
6b84fbf8f2 Fix CI
- Remove skipsdist, as it was never supported and causes breakage
when used with usedevelop.
- Add script to allowlist for pep8 test
- Disable setuptools autodiscovery
- Increase base VM memory according to new requirements for CS9
based IPA

Change-Id: I0bfef09a5312a17be54ce5c09805f06b7c349026
2022-12-29 17:10:53 +01:00
Jay Faulkner
d7c95306d6 Ironic doesn't use metering; don't start it in CI
We don't use metering. We do use every byte of ram we can get our hands
on.

Change-Id: I839c7fd4cb6fe8661a25e6b4e00650575ae17520
2022-12-13 13:51:11 -08:00
Sławek Kapłoński
7c47ad04fc [grenade] Explicitly enable Neutron ML2/OVS services in the CI job
As with [1] the basic grenade job will be switched to run with OVN as
the Neutron backend, which is the default in Devstack, we need to explicitly
disable ML2/OVN neutron services in the ironic-grenade job and use
ML2/OVS related services in that job.

Depends-On: https://review.opendev.org/c/openstack/devstack/+/867065

[1] https://review.opendev.org/c/openstack/grenade/+/862475

Change-Id: I2ef96d1b3e19004f05253dfae508e9f07ae58f63
2022-12-09 11:52:08 +00:00
Riccardo Pittau
ad0b8e4dce Cross test sushy with python 3.10
We don't test python 3.8 anymore in antelope

Change-Id: I4748f14f7a75ae9da204ffafb61c8e495822f040
2022-10-20 18:04:45 +02:00
Julia Kreger
d8fc96fd1f CI: Changes to support Anaconda CI jobs
Introduces additional job configuration to enable automated
integration testing via tempest of the anaconda deployment
interface.

Also, configures a private subnet with DNS, which anaconda requires
at execution time in order to process URLs.

Change-Id: I61b5205cf2c9f83dfcabf4314247c76fb6a56acd
2022-09-06 07:38:11 -07:00
Dmitry Tantsur
f0a1778766 Finally remove support for netboot and the boot_option capability
Instance network boot (not to be confused with ramdisk, iSCSI or
anaconda deploy methods) is insecure, underused and difficult to
maintain. This change removes a lot of related code from Ironic.

The so called "netboot fallback" is still supported for legacy boot when
boot device management is not available or is unreliable.

Change-Id: Ia8510e4acac6dec0a1e4f5cb0e07008548a00c52
2022-08-02 12:47:31 +02:00
Julia Kreger
af838cca79 CI: Pull in diskimage-builder until new release is cut
Change-Id: I88a4863cd24258eb0b395303738c23e3468615c0
2022-06-30 16:29:01 -07:00
Zuul
936414a3cc Merge "Remove netboot jobs from the gate" 2022-06-25 00:21:16 +00:00
Dmitry Tantsur
5bbcabbabe Remove netboot jobs from the gate
The netboot option will be removed soon; this change stops covering it.
Some jobs have been renamed to reflect the new reality.

Change-Id: I7e248c3deb4778fcf59bc64821833987653fbbcd
2022-05-31 10:02:56 +02:00
Dmitry Tantsur
81f583f69b devstack: use CentOS 9 for DIB IPA builds
Additionally bumps CPU model to host-model as centos9 builds now
require CPU models which include advanced features.
Host-model also allows for the VM to still start when running with
pure qemu, as opposed to KVM passthrough.

https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level#architectural_considerations_for_rhel_9

Change-Id: Ic261efd4bf6f5929687df5e7b1b51b541554af18
2022-05-25 08:57:15 -07:00
Zuul
f73639d72c Merge "Fix names of two jobs" 2022-05-09 19:45:25 +00:00
Julia Kreger
a9f4acfdb0 Fix v6 CI job - Return it to normal non-voting status
* Fixes the IPv6 job by utilizing HOST_IPV6 instead of
  SERVICE_IPV6, as Devstack now automatically wraps
  SERVICE_IPV6 with brackets as if it is for a URL.
* Locks ipv6 job to bios mode. Ubuntu Focal OVMF/EDK2 does not
  support IPv6 PXE boot by default.
* Split from Devstack in terms of IP usage, since full explicit
  V6 usage is not a thing anymore. 4+6 is the default in devstack
  and regardless of what we set on the job we see both now used.
  So we delineate apart our usage for our own sanity.
* Reduce VM Interface count for IPv6 in an attempt to eliminate
  in-kernel routing confusion by two interfaces on the same physical
  network.
* Set IPv6 mode to dhcpv6-stateless due to fun issues in dhcp clients.
  When we move to UEFI, this will need to be changed to stateful as
  stateless is not supported in general by OVMF/EDK2.
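
The bracket wrapping mentioned in the first bullet can be illustrated with a
small Python sketch. The helper name url_host is hypothetical and is not part
of Devstack; it only shows why an address prepared for use in a URL differs
from a bare host address, which is presumably why a separate variable was
needed. The port in the example is illustrative.

```python
import ipaddress


def url_host(address: str) -> str:
    """Return the address in the form used inside a URL.

    IPv6 literals are wrapped in brackets; IPv4 addresses are unchanged.
    Hypothetical helper for illustration only.
    """
    ip = ipaddress.ip_address(address)
    if ip.version == 6:
        return f"[{ip}]"
    return str(ip)


# A bare address suits tools that expect a plain IP,
# while the bracketed form is what belongs inside a URL.
bare = "2001:db8::1"
print(url_host(bare))
print(f"http://{url_host(bare)}:6385/")  # example port only
```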

Once the job has run in normal non-voting for a while, and we
ensure that it seems to be stable, we can make it voting again.

Change-Id: Ia833bfb64c6c3cc8e48cbe34ed200536652a8adf
2022-05-04 11:32:29 -07:00
Riccardo Pittau
b77a5d67da Fix names of two jobs
Make job names less misleading

This should impact sushy and sushy-tools only

sushy change https://review.opendev.org/c/openstack/sushy/+/838662
sushy-tools change https://review.opendev.org/c/openstack/sushy-tools/+/838664

Change-Id: I83f3ac7ddc0662e32c205cd8ec0fab073aeaec56
2022-04-20 08:56:55 +00:00
Julia Kreger
9df7e67e69 Grenade: Change to use bios because we have funky networking
Grenade, for some confusing reason, creates a separate network,
and uses that for upgrade testing as opposed to the original network
the VMs were bound to. If Julia's memory is correct, this was for
multinode upgrade testing.

Anyway, when in UEFI mode, it appears that the TFTP packets
don't get tracked nor cross the boundary. We likely need to
explicitly address this, but first, let's get the job working as
it was and can then update it.

Also, update requirements because markupsafe removed the soft_unicode
method that had been deprecated for a while. Jinja2 has used the
new soft_str method since version 3.0.0

Change-Id: Iaebe966569962b0d3d43774d57b570469479f159
2022-04-04 14:13:58 +02:00
Dmitry Tantsur
5a9dd8b092 Deprecate instance network boot
It's insecure and not very popular. See this post for details:
http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026224.html

Change-Id: I9a2df47bb8c08cc991b3c615a9eb533aba3171f4
2022-02-23 12:15:33 +01:00
Dmitry Tantsur
2f09b7b102 CI: force config drive on the multinode job's subnode
We need configdrives to pass information reliably, and the new cirros
image does not work without them.

Change-Id: I6cafa050d5c1c8289483f968d26c50485fd4528a
2022-02-21 11:57:31 +01:00
Zuul
8452de687e Merge "CI: use a custom cirros partition image instead of the default" 2022-02-16 16:53:59 +00:00
Dmitry Tantsur
bbceca562e CI: use a custom cirros partition image instead of the default
Cirros partition images are not compatible with local boot since they
don't ship grub (nor a normal root partition). This change adds a script
that builds a partition image with UEFI artifacts present. It still
cannot be booted in legacy mode, but it's progress.

Set the tempest plugin's partition_netboot option. We need it to inform
the tempest plugin about the ability to do local boot. This option
already exists but is never set.

Also set the new default_boot_option parameter, which will be introduced
and used in Iaba563a2ecbca029889bc6894b2a7f0754d27b88.

Remove netboot from most of the UEFI jobs.

Change-Id: I15189e7f5928126c6b336b1416ce6408a4950062
2022-02-16 10:12:06 +01:00
Dmitry Tantsur
f67bbeb9f6 Clean up jobs with legacy names
Not everyone on the team even knows what pxe_ipmitool used to mean :)
Furthermore, we don't need "ipa" in job names, everything has used IPA
for... even longer than pxe_ipmitool has not existed.

While here, one job was clearly meant to use BIOS boot, but it does not.

Change-Id: I8a37efa0f222361f30ddb7fa621548685a40f961
2022-02-03 19:01:28 +01:00
Zuul
f6f6ce1a31 Merge "CI: reduce api worker processes to 1" 2021-12-12 18:20:10 +00:00
Julia Kreger
cdc3b9538f CI: Lower test VM memory by 200MB
We're seeing OOM events in CI, hopefully this helps.

Change-Id: Id8c0e4830011ca2fa526df461ed5b9b01f769cf9
2021-12-08 22:43:10 +00:00
Julia Kreger
24184a449b CI: reduce api worker processes to 1
CI is memory intensive, and we realistically don't need 2 or
more API workers running for every single WSGI process which
does not implement its own specific override value.

This should reduce the memory footprint by an average of six processes
which consume 60-90 MB each.
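
As a rough check of the estimate, the arithmetic can be written out in a few
lines of Python. The figures come straight from this commit message; the
variable names are illustrative only.

```python
# Figures from the commit message: about six processes removed,
# each consuming 60-90 MB.
PROCESSES_REMOVED = 6
PER_PROCESS_MB = (60, 90)

low_mb = PROCESSES_REMOVED * PER_PROCESS_MB[0]
high_mb = PROCESSES_REMOVED * PER_PROCESS_MB[1]
print(f"Estimated savings: {low_mb}-{high_mb} MB")
```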

Change-Id: Ia0a986152c2b9fc9c5ff54cf698a351db452fbbd
2021-12-08 09:17:46 -08:00
Dmitry Tantsur
b37ee7c911 devstack: provide a default for OS_CLOUD
Not having it breaks the inspector grenade job.

Change-Id: I7ee28a85cb2005dd69e6711b301cd029b8ca40cc
2021-12-08 09:49:26 +01:00
Dmitry Tantsur
028448afe4 Add a unit test job with Sushy from source
The final goal is to run it on sushy itself to make sure there are
no regressions.

Change-Id: I6f4bee9a3fa439b1477c41c82304652a801ea55e
2021-11-25 10:00:03 +01:00
Julia Kreger
350c2f7a50 CI: Fix devstack plugin with RBAC changes
Changes a neutron call to be project scoped as system
scoped can't create a resource, and removes the unset
which no longer makes sense now that
I86ffa9cd52454f1c1c72d29b3a0e0caa3e44b829
has merged, removing the legacy vars from devstack.

Also renames the internal use setting of OS_CLOUD to IRONIC_OS_CLOUD,
as some services were still working with system scope, or some sort
of mixed state was occurring previously because some of the environment
variables were still present; however, they have been removed from devstack.

This change *does* explicitly set an OS_CLOUD variable as well on
the base ironic job. This is because things like grenade for Xena
will expect the variable to be present.

Depends-On: https://review.opendev.org/c/openstack/devstack/+/818449
Change-Id: I912527d7396a9c6d8ee7e90f0c3fd84461d443c1
2021-11-19 08:22:22 -08:00
Julia Kreger
493b4f0caf Yoga: Change default boot mode to uefi
Change the default boot mode to UEFI, as discussed during the end
of the Wallaby release cycle and previously agreed a very long time
ago by the Ironic community.

Change-Id: I6d735604d56d1687f42d0573a2eed765cbb08aec
2021-10-04 11:57:55 -07:00
Julia Kreger
8e173b88d1 Disable Neutron firewall
Neutron's firewall initialization with OVS seems
to be the source of our pain with ports not being found
by ironic jobs. This is because firewall startup errors
crash out the agent with a RuntimeError while it is deep
in its initial __init__ sequence.

This ultimately seems to be rooted with communication
with OVS itself, but perhaps the easiest solution is
to just disable the firewall....

Related: https://bugs.launchpad.net/neutron/+bug/1944201
Change-Id: I303989a825a7e35f1cb7b401134fd63553f6791c
2021-09-20 14:09:20 +00:00
Julia Kreger
34fd84560a Dial back gate job memory allocation
Observed an OOM incident causing
ironic-tempest-ipa-partition-pxe_ipmitool to fail.

One vm started, the other seemed to try to start twice, but both times
stopped shortly into the run and the base OS had recorded in it an OOM
failure.

It appears the actual QEMU memory footprint being consumed when
configured at 3GB is upwards of 4GB, which obviously is too big to
fit in our 8GB VM instance.

Dialing back slightly, in hopes it stabilizes the job.
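
The memory pressure described here can be sketched as simple arithmetic in
Python. The two-VM count is an assumption read from the description of one VM
starting and the other failing; the 4GB and 8GB figures come from this commit
message, and 4GB is treated as a lower bound since the message says "upwards
of".

```python
# Assumption: two test VMs run on the CI instance at the same time.
TEST_VMS = 2
OBSERVED_FOOTPRINT_GB = 4   # QEMU footprint when a VM is configured at 3GB
HOST_MEMORY_GB = 8          # size of the CI instance

needed_gb = TEST_VMS * OBSERVED_FOOTPRINT_GB
headroom_gb = HOST_MEMORY_GB - needed_gb
print(f"VMs need at least {needed_gb} GB, leaving {headroom_gb} GB for everything else")
```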

Change-Id: Id8cef722ed305e96d89b9960a8f60f751f900221
2021-09-15 13:58:19 -07:00
Steve Baker
6af0eb374e Set postgresql password encryption for FIPS compliance
This is part of the work to add jobs which confirm ironic works with
FIPS enabled, but this change is also appropriate for non-FIPS jobs.

Change-Id: I4af4e811104088d28d7be6df53c26e72db039e08
2021-08-05 11:47:11 +12:00
Zuul
c71583fc8a Merge "Scoped RBAC Devstack Plugin support" 2021-07-21 11:27:17 +00:00
Julia Kreger
b5872c9032 Set glance limit for baremetal friendly images
The devstack default limit enforcement for glance defaults
to 1GB, and unfortunately this is too small for many to use
larger images such as centos which includes hardware firmware
images for execution on baremetal where drivers need the vendor
blobs in order to load/run.

Sets ironic-base to 5GB, and updates examples accordingly.

Depends-On: https://review.opendev.org/c/openstack/devstack/+/801309
Change-Id: I41294eb571d07a270a69e5b816cdbad530749a94
2021-07-19 10:32:43 -07:00
Julia Kreger
2cd6468346 Scoped RBAC Devstack Plugin support
Adds support to the ironic devstack plugin to configure
ironic to be used in a scope-enforcing mode in line with
the Secure RBAC effort. This change also defines two new
integration jobs *and* changes one of the existing
integration jobs.

In these cases, we're testing functional CRUD interactions,
integration with nova, and integration with ironic-inspector.

As other services come online with their plugins and
devstack code being able to set the appropriate scope
enforcement configuration, we will be able to change the
overall operating default for all of ironic's jobs and
exclude the differences.

This effort identified issues in ironic-tempest-plugin,
tempest, devstack, and required plugin support in
ironic-inspector as well, and is ultimately required
to ensure we do not break the Secure RBAC.

Luckily, it all works.

Change-Id: Ic40e47cb11a6b6e9915efcb12e7912861f25cae7
2021-07-15 21:58:31 +00:00
Dmitry Tantsur
5a8a77ec5b Don't run the inspector job on changes to inspector tests
The current irrelevant-files contains the ironic tests directory, but
not the inspector one.

Change-Id: I1654e1333f95d4c3ffd25daa2b80ae4cf1033d91
2021-06-17 19:01:24 +02:00
Zuul
848ba44608 Merge "CI: Collect a snapshot of network connections" 2021-05-25 11:20:45 +00:00
Dmitry Tantsur
a7f4b4a21c Stop testing the iscsi deploy interface
Change-Id: If876d5bbb7e18f293673d56912ecab3170fe5dfb
2021-04-30 15:54:23 +02:00
Vanou Ishii
d6dd05ab12 Enable Reuse of Zuul Job in 3rd Party CI Environment
In the current Zuul jobs in zuul.d/ironic-jobs.yaml, items of
required-projects look like this (without leading hostname)

    required-projects:
      - openstack/ironic
      - openstack/ABCD

but not like this (with leading hostname)

    required-projects:
      - opendev.org/openstack/ironic
      - opendev.org/openstack/ABCD

With the first format, if we have two openstack/ironic entries in
Zuul's tenant configuration file (the Zuul tenant config file in a 3rd
party CI environment usually has 2 entries: one to fetch upstream
code, another for the Gerrit event stream to trigger Zuul jobs), we get
a warning in zuul-scheduler's log

    Project name 'openstack/ironic' is ambiguous,
    please fully qualify the project with a hostname

With the second format, that warning doesn't appear, and Zuul running in a
3rd party CI environment can reuse the Zuul jobs in zuul.d/ironic-jobs.yaml
in its own jobs.

This commit modifies all Zuul jobs in zuul.d/ironic-jobs.yaml
to use second format.
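
The rewrite from the first format to the second can be sketched in Python as
follows. The function name qualify and its hostname parameter are
hypothetical and are not part of Zuul or of this change; the sketch only
illustrates the transformation applied to each required-projects entry.

```python
def qualify(projects, hostname="opendev.org"):
    """Prefix each project name with a hostname unless already qualified.

    Hypothetical helper for illustration only.
    """
    prefix = hostname + "/"
    return [p if p.startswith(prefix) else prefix + p for p in projects]


# Entries already carrying the hostname are left untouched.
print(qualify(["openstack/ironic", "opendev.org/openstack/ABCD"]))
```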

Story: 2008724
Task: 42068
Change-Id: I85adf3c8b3deaf0d1b2d58dcd82724c7e412e2db
2021-03-17 19:01:07 +09:00