1829 Commits

Author SHA1 Message Date
Dmitry Tantsur
30a85bd0ce API to force manual cleaning without booting IPA
Adds a new argument disable_ramdisk to the manual cleaning API.
Only steps that are marked with requires_ramdisk=False can be
run in this mode. Cleaning prepare/tear down is not done.

Some steps (like redfish BIOS) currently require IPA to detect
a successful reboot. They are not marked with requires_ramdisk
just yet.

Change-Id: Icacac871603bd48536188813647bc669c574de2a
Story: #2008491
Task: #41540
2021-03-16 16:08:46 +01:00
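As a rough illustration of the disable_ramdisk change above (30a85bd0ce), the sketch below builds a manual cleaning request body and refuses steps that still need the ramdisk. The disable_ramdisk and requires_ramdisk names come from the commit message; the "target" and "clean_steps" keys are assumed to match the provisioning API, and the helper names, the step table and the step names are hypothetical, not taken from ironic's code.

```python
def build_manual_clean_body(clean_steps, disable_ramdisk=False):
    """Build a request body for manual cleaning (illustrative only)."""
    body = {"target": "clean", "clean_steps": list(clean_steps)}
    if disable_ramdisk:
        body["disable_ramdisk"] = True
    return body


def steps_needing_ramdisk(clean_steps, known_steps):
    """Return the (interface, step) pairs that cannot run without the ramdisk."""
    blocked = []
    for step in clean_steps:
        key = (step["interface"], step["step"])
        # Assumption: a step needs the ramdisk unless explicitly marked otherwise.
        if known_steps.get(key, True):
            blocked.append(key)
    return blocked


# Hypothetical step table: (interface, step) -> requires_ramdisk
KNOWN_STEPS = {
    ("management", "example_out_of_band_step"): False,
    ("deploy", "example_in_band_step"): True,
}

steps = [{"interface": "management", "step": "example_out_of_band_step"}]
if not steps_needing_ramdisk(steps, KNOWN_STEPS):
    body = build_manual_clean_body(steps, disable_ramdisk=True)
```

Checking the steps before building the body mirrors the rule in the commit message that only steps marked requires_ramdisk=False can run in this mode.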
Zuul
441ed4fe9a Merge "Rework the standalone guide" 2021-03-10 11:46:11 +00:00
Zuul
a0f940bf5a Merge "Volume targets/connectors Project Scoped RBAC" 2021-03-08 20:20:37 +00:00
Dmitry Tantsur
49fcbd4910 Rework the standalone guide
Split the monolithic guide into several pages: configuration, enrollment
and deployment. Merge duplicated docs into the common locations.
Use code-block for nicer highlighting.

Change-Id: Iaeef9e0cf8deba20a125d3cfacd4ca8ca2f52e84
2021-03-08 18:40:26 +01:00
Dmitry Tantsur
472ffca269 docs: move overriding interfaces to the standalone documentation
Chances are much higher the users will find it there. Also correct some
wording (node interfaces -> hardware interfaces), use double ticks for
field names and mention the Wallaby release.

Story: #2008652
Task: #42015
Change-Id: I33956976a9420ade836ab8d37a9488b9a207cef0
2021-03-08 18:01:50 +01:00
Zuul
4865511ad2 Merge "Add support for using NVMe specific cleaning" 2021-03-08 12:24:55 +00:00
Zuul
d3dd6b29b2 Merge "Revert "Update iDRAC doc with missing interfaces"" 2021-03-08 10:09:34 +00:00
Zuul
42df92e7dd Merge "Allow instance_info to override node interface" 2021-03-08 03:19:40 +00:00
Tzu-Mainn Chen
a165fe3264 Allow instance_info to override node interface
This change allows instance_info values to override node interface
definitions, so non-admins can make temporary changes to various
interfaces.

Story: #2008652
Task: #41918
Change-Id: I6c3dc74705bde02bd02882d14838f184f8d4a5e3
2021-03-05 18:32:46 +00:00
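One way to picture the instance_info override described above (a165fe3264) is a lookup that prefers a value in instance_info and falls back to the node's own interface field. This is a minimal sketch: the node is modeled as a plain dict, and the "<type>_interface" key naming inside instance_info and the resolve_interface helper are assumptions, not ironic's actual implementation.

```python
def resolve_interface(node, iface_type):
    """Return the interface name to use for the given interface type."""
    key = f"{iface_type}_interface"
    # Assumption: a non-empty instance_info value takes precedence.
    override = node.get("instance_info", {}).get(key)
    return override or node.get(key)


node = {
    "deploy_interface": "direct",
    "boot_interface": "ipxe",
    "instance_info": {"deploy_interface": "ramdisk"},
}
chosen_deploy = resolve_interface(node, "deploy")  # temporary override applies
chosen_boot = resolve_interface(node, "boot")      # no override, node value used
```

Because the override lives in instance_info, it would go away with the instance data, which fits the "temporary changes" wording in the commit message.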
Aija Jauntēva
a8b5137db3 Revert "Update iDRAC doc with missing interfaces"
This reverts commit b0df0960e2c53a4fe6673ba0a1ed546ffd156dc7.

Reason for revert: Need to split in separate patches and backport virtual media boot part.

Change-Id: Ib182ee6f2894fcdcea369a60dc5bd922a16434e2
2021-03-05 11:35:33 +00:00
Julia Kreger
e870bd34d0 Volume targets/connectors Project Scoped RBAC
This patch adds project scoped access, as part of the work
to delineate system and project scope access.

Adds policies:
* baremetal:volume:list_all
* baremetal:volume:list
* baremetal:volume:view_target_properties

Change-Id: I898310b515195b7065a3b1c7998ef3f29f5e8747
2021-03-04 09:47:36 -08:00
Julia Kreger
e9dfe5ddaa Port/Portgroup project scoped access
This patch implements the project scoped rbac policies for a
system and project scoped deployment of ironic. Because of the
nature of Ports and Portgroups, along with the subcontroller
resources, this change was a little more invasive than was
originally anticipated. In that process, along with some
discussion in the #openstack-ironic IRC channel, it was decided
that it would be most security conscious to respond only with
404s if the user simply does not have access to the underlying
node object.

In essence, their view of the universe has been restricted as
they have fewer access rights, and we appropriately enforce that.
Not expecting that, or not consciously being aware of that, can
quickly lead to confusion though. Possibly a day or more of
Julia's life as well, but it comes down to perceptions and
awareness.

Change-Id: I68c5f2bae76ca313ba77285747dc6b1bc8b623b9
2021-03-02 15:45:03 -08:00
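The "respond only with 404s" behavior described above (e9dfe5ddaa) can be sketched as a two-stage check: visibility of the parent node first, then the policy for the action. The owner and lessee fields, the helper names and the exact ordering are assumptions made for illustration; ironic's real checks go through its policy engine.

```python
from http import HTTPStatus


def can_see_node(project_id, node):
    """Assumption: a project sees a node only as its owner or lessee."""
    if project_id is None:
        return False
    return project_id in (node.get("owner"), node.get("lessee"))


def port_request_status(project_id, node, action_allowed):
    """Pick the HTTP status for a port request on the given node."""
    if not can_see_node(project_id, node):
        # Hide the node's existence instead of admitting it with a 403.
        return HTTPStatus.NOT_FOUND
    if not action_allowed:
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK


node = {"uuid": "example-node", "owner": "project-a", "lessee": "project-b"}
outsider = port_request_status("project-c", node, action_allowed=True)
lessee_denied = port_request_status("project-b", node, action_allowed=False)
owner_ok = port_request_status("project-a", node, action_allowed=True)
```

A caller who cannot see the node gets the same answer as for a node that does not exist, which is the source of the confusion the commit message warns about.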
Julia Kreger
f1641468bb Project Scoping Node endpoint
* Adds additional policies:
  * baremetal:node:get:last_error
  * baremetal:node:get:reservation
  * baremetal:node:get:driver_internal_info
  * baremetal:node:get:driver_info
  * baremetal:node:update:driver_info
  * baremetal:node:update:properties
  * baremetal:node:update:chassis_uuid
  * baremetal:node:update:instance_uuid
  * baremetal:node:update:lessee
  * baremetal:node:update:driver_interfaces
  * baremetal:node:update:network_data
  * baremetal:node:update:conductor_group
  * baremetal:node:update:name

* With the new policies, filtering of responses and posted data is
  performed. Testing has been added to the RBAC testing files
  to align with this and the defaults where pertinent.

* Adds another variation of the common policy check method
  which may be useful in the long term. This is too soon to
  tell, but the overall purpose is to allow similar logic
  patterns to the authorize behavior. This is because the
  standard policies are, at present, also used to control
  behavior of response, and node response sanitization needs
  to be carefully navigated.

This change excludes linked resources such as /nodes/<uuid>/ports,
portgroups, volumes/[targets|connectors]. Those will be in later
changes, as the node itself is quite a bit.

Special note:
* The indicator endpoint code in the API appears to be broken
  and, given that, should be fixed in a separate patch.

Change-Id: I2869bf21f761cfc543798cf1f7d97c5500cd3681
2021-03-02 15:43:29 -08:00
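The response filtering mentioned above (f1641468bb) can be pictured as a per-field policy lookup applied to the node before it is returned. The policy names below are copied from the commit message; the mapping, the mask value and the sanitize_node helper are hypothetical, and whether ironic masks or removes a field is not stated in the source.

```python
# Field -> policy that must pass for the field to be shown (subset, for illustration).
FIELD_POLICIES = {
    "reservation": "baremetal:node:get:reservation",
    "driver_internal_info": "baremetal:node:get:driver_internal_info",
    "driver_info": "baremetal:node:get:driver_info",
}

MASK = "******"  # Assumption: hidden fields are masked rather than dropped.


def sanitize_node(node, allowed_policies):
    """Return a copy of the node with fields the caller may not read masked."""
    result = dict(node)
    for field, policy in FIELD_POLICIES.items():
        if field in result and policy not in allowed_policies:
            result[field] = MASK
    return result


node = {"name": "node-0", "driver_info": {"secret": "x"}, "reservation": "host-1"}
visible = sanitize_node(node, {"baremetal:node:get:reservation"})
```

The same table could drive both the access decision and the response shape, which is the coupling the commit message describes as needing careful navigation.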
Jacob Anders
aa42582ac4 Add support for using NVMe specific cleaning
This change adds support for utilising NVMe specific cleaning tools
on supported devices. This will remove the necessity of using shred to
securely delete the contents of an NVMe drive and enable using nvme-cli
tools instead, improving cleaning performance and reducing wear on the device.
(this specific change adds extra documentation to the earlier set of
patches implementing this).

Story: 2008290
Task: 41168
Change-Id: Ia6d34b31680967a0d14687e5a54d68a1f1644308
2021-03-03 07:23:05 +10:00
Zuul
148dda163e Merge "[doc-only] Add BFV basic flow and networking context" 2021-02-24 13:39:24 +00:00
Zuul
71ebba5cf3 Merge "Add some tuning documentation" 2021-02-15 15:41:20 +00:00
Zuul
a2cc1baa86 Merge "Address some rbac review feedback in merged patches" 2021-02-15 07:03:59 +00:00
Julia Kreger
bb30f9945c Add some tuning documentation
Change-Id: I56e3c45bf7ae89b3f96ee826565bf153908d1bf7
2021-02-13 14:28:07 +00:00
Zuul
4b6a18f24c Merge "Trivial: update version for deploy steps" 2021-02-12 18:12:30 +00:00
Zuul
52ff615c98 Merge "Guard conductor from consuming all of the ram" 2021-02-12 18:11:57 +00:00
Dmitry Tantsur
7eadc52403 Trivial: update version for deploy steps
Change-Id: I4aac0a9f2e9bd1ae40f41722ab75e92f2a09cfef
2021-02-12 17:04:06 +01:00
Zuul
766d8f11b4 Merge "Add 'deploy steps' parameter for provisioning API" 2021-02-12 16:01:33 +00:00
Julia Kreger
e3ccb9ec22 Address some rbac review feedback in merged patches
Some of the early test changes for the RBAC work have merged
which is awesome, but a couple minor follow-up items should be
addressed. They are so minor it doesn't really make sense to merge
in with one of the patches in the chain.

Change-Id: I85de4d953237f240c3c220f6a57169c633fb295f
2021-02-12 06:56:31 -08:00
Steve Baker
606549c1c9 Populate existing policy tests
Testing every combination of role, endpoint and policy rule would
result in a huge test count, so to make testing the existing policy
rules complete and practical, the following guidelines are suggested:

- Only the default policy is tested, so inactive rules such as
  is_node_owner, is_node_lessee are ignored.
- Each rule is tested completely on one endpoint which uses it.
- A rule (such as baremetal:node:list) which inherits a parent rule
  (baremetal:node:get) is considered covered by the parent test.
- All endpoints need at least one test, but other endpoints which share
  a fully tested rule only need one denied test which shows that they
  are covered by some policy.

Also adds the initial pass of contributor documentation on how the
rbac testing works to try and express the mechanics and what to
expect to aid in reviewing/updating/editing the rules.

Co-Authored-By: Julia Kreger <juliaashleykreger@gmail.com>
Change-Id: I1cd88210e40e42f86464e6a817354620f5ab1d9c
2021-02-11 10:34:52 -08:00
Zuul
4e5c034187 Merge "Make boot_mode more consistent with other capabilities" 2021-02-11 14:24:31 +00:00
Dmitry Tantsur
cf22604c58 Prevent redfish-virtual-media from being used with Dell nodes
Indicate that idrac-redfish-virtual-media must be used instead,
otherwise a confusing failure will happen.

Change-Id: I3b6ced6dcf03580903f5ea7237fc057f372999f9
2021-02-05 12:09:00 +01:00
Aija Jauntēva
3138acc836 Add 'deploy steps' parameter for provisioning API
Story: 2008043
Task: 40705
Change-Id: I3dc2d42b3edd2a9530595e752895e9d113f76ea8
2021-02-03 11:47:53 -05:00
Zuul
f4197a12ef Merge "Redfish secure boot management" 2021-02-03 14:43:06 +00:00
Dmitry Tantsur
ccc6c551c3 Make boot_mode more consistent with other capabilities
All capabilities, except for boot_mode, are read from instance_info.
This change makes instance_info.capabilities[boot_mode] work as well
and deprecates instance_info.deploy_boot_mode.

Note that the special handling of properties.capabilities[boot_mode]
is kept in this patch.

Change-Id: Ic2e7fd4c71b7a7bc2950d17f7e1bbdad73bbb8a7
2021-02-02 12:06:17 +01:00
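The boot_mode lookup described above (ccc6c551c3) might look roughly like the following. The instance_info keys come from the commit message; the "key:value,key:value" string form of capabilities, the precedence of the new key over the deprecated one, and both helper names are assumptions. The special handling of properties.capabilities[boot_mode] is deliberately not modeled.

```python
import warnings


def parse_capabilities(value):
    """Accept capabilities as a dict or as a 'k1:v1,k2:v2' string."""
    if isinstance(value, dict):
        return dict(value)
    caps = {}
    for item in (value or "").split(","):
        key, sep, val = item.partition(":")
        if sep:
            caps[key.strip()] = val.strip()
    return caps


def requested_boot_mode(instance_info):
    """Return the boot mode requested through instance_info, if any."""
    caps = parse_capabilities(instance_info.get("capabilities"))
    if "boot_mode" in caps:
        return caps["boot_mode"]
    legacy = instance_info.get("deploy_boot_mode")
    if legacy:
        # Deprecated key kept as a fallback in this sketch.
        warnings.warn("deploy_boot_mode is deprecated", DeprecationWarning)
    return legacy


new_style = requested_boot_mode({"capabilities": {"boot_mode": "uefi"}})
string_style = requested_boot_mode({"capabilities": "boot_mode:bios,secure_boot:false"})
```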
Dmitry Tantsur
a5f7d75ba2 Apply force_persistent_boot_device to all boot interfaces
For some (likely historical) reasons we only use it for PXE and iPXE,
but the same logic applies to any boot interface (since it depends
on how the management interface and the BMC work, not on the boot
method). This change moves its handling to conductor utils.

Change-Id: I948beb4053034d3c1b4c5b7c64100e41f6022739
2021-02-01 13:37:20 +01:00
Julia Kreger
d9913370de Guard conductor from consuming all of the ram
One of the biggest frustrations larger operators have is when they
trigger a massive number of concurrent deployments. As one would
expect, the memory utilization of the conductor goes up. Except,
even with the default number of worker threads, if we're requested
to convert 80 images at the same time, or to perform the write-out
to the remote node at the same time, we will consume a large amount
of system RAM. Or more specifically, qemu-img will consume a large
amount of memory.

If the amount of memory goes too low, the system can trigger
OOMKiller which will slay processes using ram. Ideally, we do not
want this to happen to our conductor process, much less the work
that is being performed, so we need to add some guard rails to help
keep us from entering into situations where we may compromise the
conductor by taking on too much work.

Adds a guard in the conductor to prevent multiple parallel
deployment operations from running the conductor out of memory.

With the defaults, the conductor will attempt to throttle back
automatically and hold worker threads which will slow down the
amount of work also proceeding through the conductor, as we are
in a memory condition where we should be careful about the work.

By default, the conductor waits 15 seconds between re-checks of
available RAM, for a total of six retries.
The minimum default is 1024 (MB), as this is the amount of memory
qemu-img allocates when trying to write images. This quite literally
means no additional qemu-img process can spawn until the default
memory situation has resolved itself.

Change-Id: I69db0169c564c5b22abd0cb1b890f409c13b0ac2
2021-01-29 14:33:57 -08:00
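The memory guard described above (d9913370de) amounts to a bounded wait loop. The sketch below uses the numbers from the commit message (1024 MB minimum, 15 seconds between checks, six retries); the function name, its parameters and the injected get_available_mb and sleep callables are hypothetical, chosen so the example runs without touching real system memory.

```python
import time


def wait_for_memory(get_available_mb, minimum_mb=1024, wait_seconds=15,
                    retries=6, sleep=time.sleep):
    """Return True once enough memory is free, False if retries run out."""
    if get_available_mb() >= minimum_mb:
        return True
    for _ in range(retries):
        # Hold the worker instead of starting another memory-hungry task.
        sleep(wait_seconds)
        if get_available_mb() >= minimum_mb:
            return True
    return False


# Simulated readings in MB: memory frees up on the third check.
readings = iter([512, 800, 2048])
sleeps = []
ok = wait_for_memory(lambda: next(readings), sleep=sleeps.append)
```

Holding the worker thread while waiting is what slows the overall intake of work, as the commit message notes.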
Zuul
fd34d3c437 Merge "Add centralized secure boot documentation" 2021-01-27 13:36:39 +00:00
Dmitry Tantsur
4c4c7a869a Add a few words about UEFI user images
Change-Id: I37a686e6f48a422d38ac5921a188d894519b7530
2021-01-26 21:22:56 +01:00
Dmitry Tantsur
33d51f221f Redfish secure boot management
Story: #2008270
Task: #41137
Change-Id: Ied53f8dc5b93522ac9ffc25ec93ad2347a7d1c7c
2021-01-26 17:15:46 +01:00
Dmitry Tantsur
04400eea47 Add centralized secure boot documentation
Move the bits from iLO and iRMC, clean them up a bit.

Change-Id: I5b6da854ae0214141ae25a17b8ea3c7874636372
2021-01-26 17:00:50 +01:00
Dmitry Tantsur
bb318008b9 redfish-virtual-media: allow a link to raw configdrive image
For historical reasons we always base64+gzip configdrives, even
when accessing them via a URL. This change allows binary images
to work for the redfish-virtual-media case.

Change-Id: If19144de800b67275e3f8fb297f0a5c4a54b2981
2021-01-22 16:26:44 +01:00
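For the configdrive change above (bb318008b9), the difference between the two formats can be shown with the standard library alone. The base64+gzip encoding follows the commit message; the load_configdrive helper and its detection heuristic (strict base64 decode, then the gzip magic bytes) are assumptions for illustration and not how ironic decides.

```python
import base64
import binascii
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def encode_configdrive(raw):
    """Encode raw image bytes the historical way: gzip, then base64."""
    return base64.b64encode(gzip.compress(raw))


def load_configdrive(data):
    """Return raw image bytes from either a raw or a base64+gzip payload."""
    try:
        decoded = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return data  # not base64, treat as a raw binary image
    if decoded[:2] == GZIP_MAGIC:
        return gzip.decompress(decoded)
    return data


raw_image = b"\x00\x01example image bytes\x00"
from_encoded = load_configdrive(encode_configdrive(raw_image))
from_raw = load_configdrive(raw_image)
```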
Zuul
5640860c81 Merge "Follow-up for ramdisk deploy configdrive support" 2021-01-21 14:06:14 +00:00
Aija Jauntēva
b0df0960e2 Update iDRAC doc with missing interfaces
Change-Id: I691b76879ba00fb5535d7016c9d6fb53e9dde462
2021-01-20 09:25:19 -05:00
Zuul
67c90e7e4f Merge "Policy json to yaml migration" 2021-01-19 02:11:28 +00:00
Zuul
07bdccea58 Merge "Do not enter maintenance if cleaning fails before running the 1st step" 2021-01-12 07:10:42 +00:00
Dmitry Tantsur
fe380bbbab Follow-up for ramdisk deploy configdrive support
1) Do not issue a warning if the boot interface supports configdrive
2) Implement missing support for Swift URLs in configdrives

Change-Id: I4b06478a14ab514d785f8e3972e5afbd79f8d3b5
2021-01-11 20:02:27 +01:00
Zuul
6af2e2d9d1 Merge "Support configdrive when doing ramdisk deploy with redfish-virtual-media" 2021-01-11 17:28:39 +00:00
Zuul
1c7b5f8259 Merge "docs: Add information on post-branch release tasks for bifrost" 2021-01-08 15:25:17 +00:00
Dmitry Tantsur
ad696c9bac Do not enter maintenance if cleaning fails before running the 1st step
We use maintenance mode to signal that hardware needs additional
intervention, because of potential damage or stuck long-running
processes. This is not the case for PXE booting or invalid requested
manual clean steps, so don't set maintenance if no clean step is
running when the failure occurs.

Change-Id: I8a7ce072359660fc6640e5f20ec2d3c452033557
2021-01-08 14:57:07 +01:00
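The rule described above (ad696c9bac) can be sketched as a failure handler that sets maintenance only when a clean step was actually in progress. The node is modeled as a plain dict; the handler name is hypothetical, and the use of the clean_step field to detect a running step is an assumption based on the commit message.

```python
def handle_clean_failure(node, error):
    """Record a cleaning failure; flag maintenance only if a step was running."""
    node["last_error"] = error
    node["provision_state"] = "clean failed"
    if node.get("clean_step"):
        # The hardware may be damaged or stuck mid-step, so ask for intervention.
        node["maintenance"] = True
        node["maintenance_reason"] = error
    return node


# Failure before the first step (for example PXE boot): no maintenance.
early = handle_clean_failure({"clean_step": None, "maintenance": False}, "boot failed")
# Failure while a step is running: maintenance is set.
late = handle_clean_failure(
    {"clean_step": {"step": "example_step"}, "maintenance": False}, "step failed"
)
```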
Zuul
d5f184ea16 Merge "Document using ramdisks with the ramdisk deploy interface" 2021-01-05 18:31:38 +00:00
Julia Kreger
2404d486ac Policy json to yaml migration
Adds the status upgrade check for the JSON to YAML migration
effort and updates the documentation where it seems appropriate
to move from "policy.json" to "policy.yaml"

Mostly shamelessly copied from https://review.opendev.org/#/c/748059/;
however, it is in line with ironic's configuration and patching methods.

Related Blueprint: policy-json-to-yaml

Change-Id: I1d5b3892451579ebfd4d75a0f7185e0ef3c984c8
2021-01-04 13:40:54 -08:00
Julia Kreger
1e96ecbdbc Add troubleshooting on changing ironic.conf default interfaces
Change-Id: If836d064ed7e8f6eaefbc0cfab8c404d2c3174fb
2021-01-04 09:40:41 -08:00
Zuul
fcf029a0ad Merge "Modify port group document for ironic" 2021-01-04 09:51:49 +00:00
Zuul
0112b33291 Merge "Mark the iSCSI deploy as deprecated in the docs" 2021-01-01 17:51:12 +00:00
Zuul
3864483a76 Merge "update python packages to python3 in quickstart.rst" 2021-01-01 04:08:24 +00:00