10460 Commits

Author SHA1 Message Date
Julia Kreger
6dfc409133 Force RAX hosts to run tinyipa
The CPU overhead of nested virtualization on RAX hosts is simply
too much for Ironic's CI to justify using full-size IPA images.

The failure rate is simply too high. As a result, let's use TinyIPA
images when we are not building a ramdisk to reduce that failure rate.

Change-Id: Ifa81397519833201b737cff89f61178c8835e3ca
2020-07-23 16:33:34 +00:00
Julia Kreger
67e51af6d5 Extend PXE boot retry timeout for RAX hosts
When extending the timeouts for jobs to execute within,
we've observed a case where RAX hosts are cutting off at
the time limit of 900 seconds (as being asserted by another
change set). This is both good and bad. We know the timeout
feature works, but the agent was not quite online yet.

As such, we should also auto-extend base retry timeouts
so there is hope for the job to complete.

Change-Id: I8efa3a52188de558a7964d1daafd2225e102e251
2020-07-22 10:41:07 -07:00
Zuul
b5ae75a406 Merge "Use native oslo.concurrency execution timeout in ipmitool" 2020-07-22 15:58:19 +00:00
Zuul
1f63525a1f Merge "Iso booting via redfish virtual media" 2020-07-22 04:55:59 +00:00
Zuul
c2b8e3f80e Merge "Stop running test_schedule_to_all_nodes in the multinode job" 2020-07-21 14:45:25 +00:00
Zuul
a1e2fc35b4 Merge "Remove Link type" 2020-07-21 14:45:19 +00:00
Zuul
f85c23fe4c Merge "Remove File type" 2020-07-21 14:45:13 +00:00
Zuul
15211fd886 Merge "Add wsme core types, remove WSME" 2020-07-21 09:05:14 +00:00
Zuul
bb6fad941c Merge "Update number of VM on ironic-base" 2020-07-21 01:02:39 +00:00
Zuul
6ea66bea67 Merge "Set min version of tox to 3.2.1" 2020-07-20 23:03:16 +00:00
Dmitry Tantsur
1cb1df76d9 Stop running test_schedule_to_all_nodes in the multinode job
After the recent changes we're already running 5 tests, some of them
using several VMs. This should cover scheduling to different conductors
well enough; the nova test just adds random failures on top.

This allows reducing the number of test VMs to 3 per testing node
(6 in total), reducing the resource pressure and allowing each VM
to get a bit more RAM.

Also adding missing VM_SPECS_DISK to the subnode configuration.

Change-Id: Idde2891b2f15190f327e4298131a6069c58163c0
2020-07-20 12:35:16 +02:00
Steve Baker
b1328fa996 Remove Link type
This type is only used for output response formatting, not for input
validation, so it can be replaced with a basic dict equivalent without
disruption.

This results in fields in WSME types which shouldn't be handled by
WSME because they are already in a dict format. This is handled by
relaxing the validation in the (ex-WSME) types so that a None type
means that WSME shouldn't serialize that attribute. This will allow
old style type serialization to be mixed with plain dicts during the
transition period.

Story: 1651346
Task: 10551

Change-Id: Ifae9bd005fb7cf951b069ade0c92b8d61e095e0f
2020-07-20 08:58:32 +12:00
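A minimal sketch of the relaxed validation described in b1328fa996, assuming a serializer driven by a field-to-type mapping. The function name `serialize_fields` and the mapping layout are hypothetical and are not taken from the Ironic code; the only point illustrated is that a `None` type leaves an attribute untouched so plain dicts can sit next to typed fields.

```python
def serialize_fields(obj, field_types):
    """Serialize the attributes of obj named in field_types.

    field_types maps an attribute name to a callable that converts the
    value, or to None, meaning "already in its final form, do not touch".
    This layout is an assumption made for illustration only.
    """
    result = {}
    for name, field_type in field_types.items():
        value = getattr(obj, name, None)
        if field_type is None:
            # Plain dicts and lists pass through without conversion.
            result[name] = value
        else:
            result[name] = field_type(value)
    return result
```

With this shape, old-style typed attributes and plain dict attributes can be mixed in one response during a transition.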
Steve Baker
5e12d502fe Remove File type
The only use of the File type is for wrapping passthru responses in a
io.BytesIO, so this change does this wrapping directly and removes the
File type.

Change-Id: I6759bc304839bd89a50fc3bf9e26b1cd20537a0a
Story: 1651346
Task: 10551
2020-07-20 08:58:32 +12:00
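As a rough illustration of what 5e12d502fe describes, wrapping a passthru response directly in `io.BytesIO` could look like the sketch below. The helper name `wrap_passthru_response` and the handling of `str` payloads are assumptions for the example, not the code from the change.

```python
import io


def wrap_passthru_response(payload):
    """Return a file-like object holding the passthru payload.

    Hypothetical helper: text payloads are encoded as UTF-8 first so the
    result is always a bytes stream.
    """
    if isinstance(payload, str):
        payload = payload.encode("utf-8")
    return io.BytesIO(payload)
```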
Zuul
d867488e90 Merge "Auto extend the timeout for RAX hosts" 2020-07-19 16:47:23 +00:00
Zuul
c269b36304 Merge "Replace oslo_utils.netutils type compares with ipaddress" 2020-07-19 16:05:48 +00:00
Zuul
f5828077bc Merge "Allow disabling retries in AgentClient.get_command_statuses" 2020-07-19 16:05:42 +00:00
Iury Gregory Melo Ferreira
dc87a189cb Update number of VM on ironic-base
Since we merged the change to have partition and wholedisk
testing on basic_ops, most of the jobs started requiring 2 VMs
to run the tempest tests.

Let's increase it on ironic-base so all jobs will default to 2.

Removing IRONIC_VM_COUNT=2 from jobs that use ironic-base as parent.

Change-Id: I13da6275c04ffc6237a7f2edf25c03b4ddee936a
2020-07-18 10:57:21 +02:00
Zuul
c9a0bce01b Merge "Follow-up on blocking port deletions" 2020-07-18 04:22:06 +00:00
Zuul
674ed29347 Merge "Add missing agent RAID compatibility for ilo5 and idrac" 2020-07-18 03:51:00 +00:00
Zuul
521d796037 Merge "Explicitly set jobs to ML2/OVS" 2020-07-18 03:35:08 +00:00
Zuul
0238034827 Merge "Ironic to use DevStack's neutron"-legacy" module" 2020-07-18 03:17:03 +00:00
Julia Kreger
b1dd9147d2 Replace oslo_utils.netutils type compares with ipaddress
We used netutils earlier on to have a backportable change;
however, the longer-term goal was to replace that usage with
the Python native ipaddress module directly.

For the cases where we can change IP version type compares,
we change them with this change.

Note: other uses of netutils still exist, and we should
eventually see if we can phase them out, however the remaining
uses are around MAC address validations.

Change-Id: I44336423194eed99f026c44b6390030a94ed0522
2020-07-17 17:05:54 -07:00
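For reference, an IP version comparison with the standard library `ipaddress` module, as mentioned in b1dd9147d2, might be sketched as follows. The helper names `ip_version` and `is_ipv6` are made up for this example; only `ipaddress.ip_address` and its `version` attribute come from the standard library.

```python
import ipaddress


def ip_version(address):
    """Return 4 or 6 for a valid IP address string, or None otherwise."""
    try:
        return ipaddress.ip_address(address).version
    except ValueError:
        # ip_address() raises ValueError for anything that is not an IP.
        return None


def is_ipv6(address):
    """True only when address parses as an IPv6 address."""
    return ip_version(address) == 6
```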
Julia Kreger
3750ba62df Auto extend the timeout for RAX hosts
RAX hosts use qemu software-emulated VMs without leveraging the
magic within the processors to help ensure speedy execution.

As such, they can be substantially slower in some operations, such
as decompressing ramdisks. This adds an unpredictable element into our
CI and causes job failures when they should have succeeded, which
causes more rechecks, which consumes more resources... and the cycle
continues.

So instead, we'll extend the timeout a little, to hopefully give the
job time to complete without causing failures.

Change-Id: I0cd08e527763f0626fd1e43cc3b87163a4b0d018
2020-07-17 16:16:59 -07:00
Dmitry Tantsur
3b6163afd2 Allow disabling retries in AgentClient.get_command_statuses
For the agent power interface it will be required to check if the agent
is running without making too many retries.

Change-Id: I63b5348ecfd55e9ac889fb12b0212d76785edaca
Story: #2007771
Task: #40380
2020-07-17 17:30:06 +02:00
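The idea in 3b6163afd2 of letting a caller opt out of retries can be sketched as below. Everything here is hypothetical (the function name, the `retry` flag, the `fetch` callable and the exception class) and does not reflect the actual `AgentClient` signature; it only shows a retry loop that collapses to a single attempt on request.

```python
import time


class AgentUnreachable(Exception):
    """Raised when every attempt to contact the agent has failed."""


def fetch_command_statuses(fetch, retry=True, attempts=3, delay=0.0):
    """Call fetch() and return its result, retrying on ConnectionError.

    With retry=False only one attempt is made, which suits a quick
    "is the agent running?" check.
    """
    tries = attempts if retry else 1
    last_error = None
    for attempt in range(tries):
        try:
            return fetch()
        except ConnectionError as exc:
            last_error = exc
            if attempt + 1 < tries:
                time.sleep(delay)
    raise AgentUnreachable(str(last_error))
```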
Zuul
c871622ff5 Merge "Add json and param parsing to args" 2020-07-17 14:47:04 +00:00
Zuul
2876fd1790 Merge "Decompose the core deploy step on iscsi and ansible deploy" 2020-07-17 14:46:55 +00:00
Jakub Libosvar
02d3efbd7e Explicitly set jobs to ML2/OVS
Devstack is changing the Neutron default to OVN backend. This patch is
to make sure Ironic gate will not get broken by this change as currently
OVN doesn't support baremetal nodes.

Change-Id: I0745e07d32e3455fad2a2249c31f279fd1d38b5b
Signed-off-by: Jakub Libosvar <libosvar@redhat.com>
2020-07-17 14:44:29 +02:00
Zuul
96f3904dbc Merge "Change non-tinyipa jobs to use multiple cores" 2020-07-16 22:22:24 +00:00
Zuul
203bf50ae9 Merge "Stop wiping driver_internal_info on node.driver updates" 2020-07-16 21:06:51 +00:00
Zuul
8b67330c45 Merge "Do not validate driver on changing non-driver fields" 2020-07-16 20:50:46 +00:00
Julia Kreger
ba0dc574bc Follow-up on blocking port deletions
A recent comment on https://review.opendev.org/#/c/665835
pointed out that we should likely make some changes and fix
a missing check for the introspection_vif_port_id, which was
likely introduced after this functionality was originally
written.

Also adds some documentation on the subject since we lack
docs even pointing out how to delete a port. :\

Change-Id: I0ba8a3741eefa80eb56e25a1b339f8433b3fc0dc
2020-07-16 12:47:07 -07:00
Zuul
1062567531 Merge "Wipe agent token during reboot or power off" 2020-07-16 17:27:03 +00:00
Zuul
44533d7b49 Merge "Implement get_deploy_steps for AgentRAID" 2020-07-16 17:27:00 +00:00
Zuul
4d9407f99b Merge "Fixes to skip validation of in-band deploy steps before agent boot" 2020-07-16 15:20:38 +00:00
Zuul
59e27224d9 Merge "Add get_node_network_data to Neutron NetworkInterface" 2020-07-16 15:18:19 +00:00
Dmitry Tantsur
5f557f47f4 Stop wiping driver_internal_info on node.driver updates
It brings more harm than good, e.g. it breaks fast-track. Since
driver-specific fields are name-spaced, there should not be much
harm in keeping them around.

Change-Id: I397d23af64dfd56074cb563eedbe2d1ef8efff53
2020-07-16 17:18:17 +02:00
Lucas Alvares Gomes
bec00bd85d Ironic to use DevStack's neutron"-legacy" module
In the last PTG the Neutron team discussed and decided to undeprecate
the neutron-legacy module in DevStack because that's the module being
used (almost) everywhere and it works. The lib/neutron was an attempt
to refactor the old module but, in the last few years it hasn't gained
any traction and due to the lack of features and people to work on it,
it's going to be removed from DevStack eventually.

Below is a snippet from the PTG summary email [0] about this topic:

<snippet>
In Devstack there are currently 2 modules which can configure
Neutron. Old one called "lib/neutron-legacy" and the new one called
"lib/neutron". It is like that since many cycles that "lib/neutron-legacy"
is deprecated. But it is still used everywhere. New module isn't still
finished and isn't working fine.  This is very confusing for users as
really maintained and recommended is still "lib/neutron-legacy" module.

During the discussion Sean Collins explained us that originally this
new module was created as an attempt to refactor old module, and to
make Neutron in the Devstack better to maintain. But now we see that
this process failed as new module isn't still used and we don't have
any cycles to work on it. So our final conclusion is to "undeprecate"
old "lib/neutron-legacy" and get rid of the new module.
</snippet>

This patch changes the Ironic jobs to use the old Neutron module in
DevStack.

[0]
http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015368.html
[1]
http://codesearch.openstack.org/?q=neutron-api%3A%20true&i=nope&files=&repos=

Change-Id: Ief043a0a01a800ea2d01a602000f0854df9e629f
Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
2020-07-16 15:45:35 +01:00
Zuul
b8f2745b2d Merge "Allow deleting nodes with a broken driver" 2020-07-16 14:07:53 +00:00
Zuul
cb76f7115b Merge "Use default timeout for all jobs" 2020-07-16 14:05:36 +00:00
Shivanand Tendulker
8c191ceb5a Fixes to skip validation of in-band deploy steps before agent boot
Validation should not fail even if a deploy step requested in the deployment
template is not available, unless in-band deploy steps have been retrieved.

Change-Id: I173e6b1a8037698d41f355c7ef55f7389594be1e
2020-07-16 02:41:01 -04:00
Dmitry Tantsur
a7976b3491 Implement get_deploy_steps for AgentRAID
This allows using software RAID as an in-band deploy step.

Change-Id: I66103598cf58267010a09b1bd654dc90f714c202
2020-07-15 16:31:43 +02:00
Riccardo Pittau
d430d1bdb9 Set min version of tox to 3.2.1
As recommended, since version 3.2.0 tox switches pip invocations
to use the module -m pip instead of direct invocation.
We set min version to 3.2.1 [1] to also fix the behavior of
--parallel--safe-build

[1] https://tox.readthedocs.io/en/latest/changelog.html#v3-2-1-2018-08-10

Change-Id: Ia7209934d30f6a55879319ab6ca94d8bf8886073
2020-07-15 15:25:09 +02:00
Zuul
079b22a800 Merge "Use min_command_interval when ironic does IPMI retries" 2020-07-15 12:43:47 +00:00
Iury Gregory Melo Ferreira
f27fcda53b Use default timeout for all jobs
Let's use the default timeout from ironic-base for all jobs
so we can avoid job timeout in our CI.

Change-Id: I5e753c4bbcb8075a1889754a468d9c3dd8310a08
2020-07-15 11:02:34 +02:00
Dmitry Tantsur
2d4d375358 Wipe agent token during reboot or power off
Otherwise a reboot during fast-track will leave the newly booted
agent without an ability to request a token.

Change-Id: I963276efae5599bfed6cbb4df18e8dd3bd1b9839
2020-07-15 10:54:31 +02:00
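A minimal sketch of the token wipe described in 2d4d375358, assuming the token lives in the node's `driver_internal_info` dict. The key names `agent_secret_token` and `agent_secret_token_pregenerated` are assumptions for this example, as is the helper name `wipe_agent_token`.

```python
def wipe_agent_token(driver_internal_info):
    """Return a copy of driver_internal_info without agent token fields.

    Key names are assumed for illustration. Removing them lets a freshly
    booted agent request a new token after a reboot or power off.
    """
    info = dict(driver_internal_info)
    info.pop("agent_secret_token", None)
    info.pop("agent_secret_token_pregenerated", None)
    return info
```

The function is meant to be called from the reboot and power-off paths; other keys in the dict are left as they are.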
Dmitry Tantsur
c376cb3219 Add missing agent RAID compatibility for ilo5 and idrac
Software RAID relies on it.

Change-Id: Id220ce3a2c2821ad1841add20f2138e65d1786bf
2020-07-14 16:26:16 +02:00
Steve Baker
44cc6dd792 Add wsme core types, remove WSME
The header for the file types.py denotes its dual-licensed status as
MIT with copyright to the original WSME authors, plus Apache-licensed
as part of Ironic.

Story: 1651346
Task: 10551

Change-Id: I986cc4a936c8679e932463ff3c91d1876a713196
2020-07-14 10:34:13 +12:00
Steve Baker
8006c9dfd2 Add json and param parsing to args
Some unused HTTP param to arg parsing has not been implemented to
reduce code complexity. This includes the following types:
- DictType
- complex types

Asserts are added to confirm these param types are not used in ironic
currently, and to prevent them being used in future development.

Story: 1651346
Task: 10551

Change-Id: Idfcf99216f10e8928fe4ba6202a7d69bfa916459
2020-07-14 10:34:13 +12:00
Julia Kreger
44d56c559b Change non-tinyipa jobs to use multiple cores
A large ramdisk image tends to take an undesirable amount
of time performing the initial decompression into memory before
the system is booted and available. This sets the default number of
CPU cores for all jobs to 2, and only sets that back to 1 where
TinyIPA is being used.

Change-Id: I88c57a1345edb1b14c760753638ad927641b34a2
2020-07-13 14:29:05 -07:00
Julia Kreger
3d778db0c4 Add knob for read-only and "erase_devices"
In https://review.opendev.org/#/c/704725 we merged a change to
allow the agent to navigate read-only block devices. By default
we always failed on the more secure "erase_devices" clean step as
meta-data only erasure still leaves any sensitive information on
the storage medium.

That being said, it may be operationally okay for read-only devices
to be ignored during the "erase_devices" clean step. Only the
operator can make that call, and we should enable them to be able
to assert that in the configuration to IPA.

Change-Id: I475f0215eb0bd149c2d21e6962429181b63e8bdb
2020-07-13 10:04:37 -07:00
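The behaviour described in 3d778db0c4 can be sketched as a simple selection step: fail on read-only devices by default, and skip them only when the operator has opted in. The function name, the `skip_read_only` parameter and the device dict layout are hypothetical and are not the actual IPA configuration option or data model.

```python
def devices_to_erase(devices, skip_read_only=False):
    """Return the names of devices the "erase_devices" step should wipe.

    devices is a list of dicts with "name" and "read_only" keys (an
    assumed layout). Read-only devices raise an error unless the
    operator explicitly allowed skipping them.
    """
    selected = []
    for dev in devices:
        if dev["read_only"]:
            if skip_read_only:
                continue
            raise RuntimeError(
                "cannot securely erase read-only device %s" % dev["name"])
        selected.append(dev["name"])
    return selected
```

Keeping the failing behaviour as the default preserves the stricter guarantee, and the opt-in leaves the decision with the operator.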