Grenade is currently failing because it cannot find the neutron command.
We should likely not be using it anyway, since the deprecation message
says it may disappear after Z.
Change-Id: Ic24d59379bafcc5a630fe5c074fcc13303902965
Adds basic testing for PXE/iPXE boot scenarios where the OVN
DHCP service is used instead of dnsmasq.
Also adds a release note and documentation to cover the details
and caveats of using OVN that we have discovered through this process.
Change-Id: I28cd20a7f271220d8ca335895ca9e302452fd069
Remove the $ in the condition so that we don't attempt to
execute the output from ping (i.e. PING - unknown command).
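A minimal sketch of the difference, assuming a hypothetical stand-in
function `fake_ping` (not part of the plugin) so that the example runs
without network access:

```shell
#!/usr/bin/env bash
# fake_ping is a hypothetical stand-in for ping: it prints a line that
# starts with "PING" and exits 0, as a successful ping would.
fake_ping() {
    echo "PING 2001:db8::1 56 data bytes"
    return 0
}

# Broken form: $(...) substitutes the command's output into the condition,
# so the shell then tries to run "PING" as a command and fails.
if $(fake_ping) 2>/dev/null; then
    broken_result="reachable"
else
    broken_result="unreachable"
fi

# Fixed form: the condition tests the command's exit status directly.
if fake_ping > /dev/null; then
    fixed_result="reachable"
else
    fixed_result="unreachable"
fi

echo "broken: $broken_result, fixed: $fixed_result"
```

With the substitution in place the check reports the address as
unreachable even though the command succeeded, which is the failure
this change avoids.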
Change-Id: Ic90f7c93d9a7b86fbf3f2cdef46bc1b2bbea489d
Investigation of our standalone test job issues, where jobs would
fail, hosts would not get DHCP updates, and ultimately iPXE would
fail before getting a valid or expected response, revealed that
dnsmasq was often crashing while port updates were going through.
This ultimately prevented the multi-scenario test jobs from running,
as the standalone jobs represent a number of different scenarios
which are executed across a pool of test machines.
In this case, the path forward appears to be to downgrade
dnsmasq to stabilize our CI and allow us to otherwise upgrade.
This patch adds the focal updates as a package source,
and installs the dnsmasq package.
Related-Bug: #2026757
Change-Id: Iacfd1ab677c612525601afcaeee5e5b067206ff3
So, I've long wondered if we still have some spanning tree behavior
going on in CI. Turns out we might, but we just rely upon the defaults,
which introduces a variable.
Anyway, regardless, I found some details in the ovs-vsctl manual[0], and
well, let's set the options!
[0]: http://www.openvswitch.org/support/dist-docs/ovs-vsctl.8.html
Change-Id: I8f229fa6e738a69a668d4b891723431b2da362fa
This reverts commit 2f8ee2cf40b594e3da10f883b453bd81bad6d0ab.
Reason for revert: Tempest removed the setting of DEFAULT_IMAGE_NAME to 'non-existent-image', that should fix the issue
- https://review.opendev.org/c/openstack/tempest/+/886796/2
Change-Id: I4767518b3306a8c4da08d1f0650d78ef8d78ca9c
It appears nova's policies have changed, or to be more precise,
they have turned on new policy enforcement[0] and our plugin
was wrong.
+++ /opt/stack/ironic/devstack/lib/ironic:
ironic_configure_tempest:3205 :
oscwrap --os-cloud devstack-system-admin flavor show baremetal -f value -c id
ForbiddenException: 403: Client Error for url:
https://173.231.254.232/compute/v2.1/flavors/
e4312534-b349-4f70-9a1b-5806debff275/os-extra_specs,
Policy doesn't allow os_compute_api:os-flavor-extra-specs:index to be performed.
[0]: dfd7aeaf6c
Change-Id: I8070852fbe9346e346c50088537797f353753d02
We've been able to observe one of the scenario test jobs failing
due to tinycorelinux being inaccessible, possibly on an IPv6-only
test VM. Turns out tinycorelinux's main page is only accessible via
IPv4.
As such, I've changed the mirror to one which is accessible via
IPv6 and which I've verified works for me.
Change-Id: I2b4ccd16189038ce2f054d7403775b012796aea3
In the recent change to cinder, to address CVE-2023-2088,
cinder changed the policy rules and behavior for unbinding,
or "detaching" a volume. This was because of a vulnerability
in compute nodes where a volume which was in use by a VM
could be detached outside of Nova, and nova wouldn't become
aware the volume was detached, and the volume could be accessible
to the next VM.
This vulnerability doesn't apply to bare metal operations as
volumes are attached to whole baremetal nodes with Ironic.
We now generate and use a service token when interacting with
Cinder, which allows Cinder to recognize "this request is
coming from a fellow OpenStack service" and bypass
checking with Nova whether the "instance" is managed by Nova
or not. This allows the volumes to be attached and detached
as needed as part of the power operation flow and overall
set of lifecycle operations.
Related-Bug: 2004555
Closes-Bug: 2019892
Change-Id: Ib258bc9650496da989fc93b759b112d279c8b217
The troubleshooting kernel command line option nomodeset
unfortunately changes the way framebuffer interactions work
with graphics devices, which in some cases can result in kernel
memory being used for graphics updates. When this happens on
some specific hardware common in rack mount servers with baseboard
management controllers, this can cause the memory bus to become
locked for a brief time while the graphics update is occurring.
This locked memory bus means disk IO can become blocked,
and network cards can overflow their buffers resulting in
packet loss on top of the latency incurred by the graphics
update executing.
As such, we've removed the nomodeset option from default usage and
added a note describing its removal to the documentation along
with a release note.
Change-Id: I9084d88c3ec6f13bd64b8707892758fa87dd7f86
It appears we are getting an opcode error when attempting to boot
CentOS 9-stream utilizing the EFI artifacts from Ubuntu.
Technically this should work, however further artifacts in the boot
chain may be signed with other key credentials that Ubuntu's
grub does not know about, because the chain of trust is
MSFT -> Vendor shim (slow change rate) -> Vendor GRUB -> Kernel
Where mixing vendors should never work is when Secure Boot
is enforcing.
Exception on launch:
X64 Exception Type - 06(#UD - Invalid Opcode) CPU Apic ID - 00000000 !!!!
A similar Debian bug is open for a very similar issue:
https://groups.google.com/g/linux.debian.bugs.dist/c/BOiLLeROrmo
However, no additional comments or information have been posted in
follow-up to that reported issue. So in the meantime, we're going to
try to do what those smarter than I recommend: use the vendor's
binaries for their distribution.
There is one further, potentially far more depressing possibility,
that centos9's kernel doesn't support the type of hardware
we're getting. This is suggested by the precise opcode error, UD,
https://xem.github.io/minix86/manual/intel-x86-and-64-manual-vol3/o_fe12b1e2a880e0ce-212.html
But again, easiest possibility first.
Change-Id: Id9bd30bc3c2f1076555317e4a3f277725fa7c1f4
The IRONIC_VM_MACS_CSV_FILE is generated only if we execute the
ironic basic ops, so when IRONIC_BAREMETAL_BASIC_OPS is True.
In some jobs we set IRONIC_BAREMETAL_BASIC_OPS to False but we
still look for that file, causing a "file not found" error which
does not trigger a trap on releases up to focal, but does on jammy.
Let's create the file if it does not exist.
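A minimal sketch of the guard, assuming a temporary directory and file
name chosen purely for illustration (the real path is whatever the
plugin assigns to IRONIC_VM_MACS_CSV_FILE):

```shell
#!/usr/bin/env bash
# Hypothetical location used only for this example.
workdir=$(mktemp -d)
IRONIC_VM_MACS_CSV_FILE="$workdir/ironic_macs.csv"

# Create an empty file only when it is missing, so later reads do not
# fail and an existing file is left untouched.
if [[ ! -f "$IRONIC_VM_MACS_CSV_FILE" ]]; then
    touch "$IRONIC_VM_MACS_CSV_FILE"
fi

# Reading the (possibly empty) file now succeeds instead of raising
# a "file not found" error.
line_count=$(wc -l < "$IRONIC_VM_MACS_CSV_FILE")
echo "entries: $((line_count))"
```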
Change-Id: Ib938abe0723072419f336159cbffff33e46ea39b
RC_DIR does not exist (and it never existed; it was SRC_DIR).
Change that to TOP_DIR, which is what we commonly use in other
sections.
Change-Id: I4a400fd434a20938cd38c0bb876da21fec7473a1
We've been building tinyipa using tinycore 13.x for a while; we should
use the same version for the base ramdisk image.
Change-Id: I9d144f122c20f717ff946282ef7ffa16d82812f5
In [1] we finally got rid of the unfinished lib/neutron module and kept
only lib/neutron-legacy. It's renamed to lib/neutron now and it's the
only neutron related module in Devstack.
So this patch removes leftovers related to the old lib/neutron-legacy.
[1] https://review.opendev.org/c/openstack/devstack/+/865014
Change-Id: Id938deab7188743e754d028dee8e0b2591ab6f7b
All services except Ironic (and keystone, to support ironic
with system scope deployment) will not have system scope in
their API policy; instead they default to project scope.
Change-Id: Id13a359086f9b24dbfcd2b565a42c50d0dab7736
Adds an upgrade check to provide awareness of the state of
the database: whether an unexpected engine is in use and
whether the character set encoding is not UTF8.
These will raise non-fatal warnings on the upgrade status
check.
Change-Id: Ide0eb4690a056be557e5ea7d5ba5f6be37b50d0a
Story: 2010384
Introduces additional job configuration to enable automated
integration testing via tempest of the anaconda deployment
interface.
Also configures a private subnet with DNS, which anaconda requires
when executing in order to process URLs.
Change-Id: I61b5205cf2c9f83dfcabf4314247c76fb6a56acd
Instance network boot (not to be confused with ramdisk, iSCSI or
anaconda deploy methods) is insecure, underused and difficult to
maintain. This change removes a lot of related code from Ironic.
The so-called "netboot fallback" is still supported for legacy boot when
boot device management is not available or is unreliable.
Change-Id: Ia8510e4acac6dec0a1e4f5cb0e07008548a00c52
In the case of CI test nodes natively supporting and using IPv6,
we don't need to actually set up a fake IPv6 network for ports
to bind to on the local system. So before doing so, let's check
to see if we can ping the address first. If not, then set it up.
Change-Id: Ib68c706c1f9ef0ad0cf27e7a6acffd2c50ff37ea
q35 is recommended as emulating modern baremetal with better support
for UEFI in general, and Secure Boot in particular.
Old pc type usage is removed, such as the IDE controller, PS/2 mouse,
and manual PCI addresses.
Change-Id: Ic33e0f23c5c514a45541534ddd68329d7b4d0480
* Fixes the IPv6 job by utilizing HOST_IPV6 instead of
SERVICE_IPV6, as Devstack now automatically wraps
SERVICE_IPV6 with brackets as if it is for a URL.
* Locks ipv6 job to bios mode. Ubuntu Focal OVMF/EDK2 does not
support IPv6 PXE boot by default.
* Split from Devstack in terms of IP usage, since full explicit
V6 usage is not a thing anymore. 4+6 is the default in devstack
and regardless of what we set on the job we see both now used.
So we delineate apart our usage for our own sanity.
* Reduce VM Interface count for IPv6 in an attempt to eliminate
in-kernel routing confusion by two interfaces on the same physical
network.
* Set IPv6 mode to dhcpv6-stateless due to fun issues in dhcp clients.
When we move to UEFI, this will need to be changed to stateful as
stateless is not supported in general by OVMF/EDK2.
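To illustrate the first point above, a small sketch of why a bracketed
address suits URLs but not bare-address contexts. The address, prefix
length and port are example values assumed for illustration, not values
taken from the job:

```shell
#!/usr/bin/env bash
# Example values only (documentation prefix); real jobs get these from
# devstack.
HOST_IPV6="2001:db8::10"
SERVICE_IPV6="[${HOST_IPV6}]"   # bracketed form, as devstack now provides it

# The bracketed form is what a URL needs ...
api_url="http://${SERVICE_IPV6}:6385"

# ... but a bare address is needed where brackets are not valid syntax,
# for example when building an address/prefix pair.
bare_cidr="${HOST_IPV6}/64"
wrong_cidr="${SERVICE_IPV6}/64"

echo "$api_url"
echo "$bare_cidr"
echo "$wrong_cidr"
```

The last value shows the malformed result that using SERVICE_IPV6
outside of a URL would produce.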
Once the job has run in normal non-voting for a while, and we
ensure that it seems to be stable, we can make it voting again.
Change-Id: Ia833bfb64c6c3cc8e48cbe34ed200536652a8adf
So... We can't do this in a single patch, and we *actually*
need to merge the vxlan fix before the subnode will ever pick up
the configuration.
From the logs, I can observe the vxlan tunnel connects between
the nodes. Awesome.
Where things break is in the local setup of the local bridges
used to wire everything together.
setup_vxlan_network:3274 : sudo ovs-vsctl add-port sub1brbm phy-brbm-infra
ovs-vsctl: Error detected while setting up 'phy-brbm-infra':
could not open network device phy-brbm-infra (No such device).
See ovs-vswitchd log for details.
Basically, with the same change on a separate patch, we're able to
observe the controller node work perfectly. It is the subnode
connectivity which is just broken.
So, activating the bridge interfaces seems ideal. This likely broke
at some point due to behavior changes in Open vSwitch.
Change-Id: I11dbba1957d67187d859a1ef60563c0301da9812
The standalone job changes boot_option at runtime, so local boot
can be used even when the default boot option is netboot.
Change-Id: Ia538907f3662e8cd84d988ea5d862c7f488558e1
Cirros partition images are not compatible with local boot since they
don't ship grub (nor a normal root partition). This change adds a script
that builds a partition image with UEFI artifacts present. It still
cannot be booted in legacy mode, but it is progress.
Set the tempest plugin's partition_netboot option. We need it to inform
the tempest plugin about the ability to do local boot. This option
already exists but is never set.
Also set the new default_boot_option parameter, which will be introduced
and used in Iaba563a2ecbca029889bc6894b2a7f0754d27b88.
Remove netboot from most of the UEFI jobs.
Change-Id: I15189e7f5928126c6b336b1416ce6408a4950062
The standalone job is failing because of a bug in IPA. To fix it we need
to make DIB jobs operational, and they're failing because of CentOS repos.
Change-Id: I8bd051ea709d328cb5efa2c2cbd5a226bdb4cfd3
This change makes it easier to configure power and management interfaces
(and thus vendor drivers) by figuring out reasonable defaults.
Story: #2009316
Task: #43717
Change-Id: I8779603e566be5a84daf6f680c0bbe2f191923d9
This change removes the documentation to copy master_grub_cfg.txt to
/tftpboot/grub/grub.cfg and instead writes it on conductor startup.
This grub config is a simple redirect config requested by grub network
boot. "master" has been renamed to "initial" as a more accurate label
of its function.
New configuration option [pxe]initial_grub_template allows the deployer
to specify a different initial grub template.
Change-Id: I71191dd399a6c49607f91d69b5b1673799a38624