diff --git a/doc/source/admin/troubleshooting.rst b/doc/source/admin/troubleshooting.rst
index e7e07d1c9b..86859455ab 100644
--- a/doc/source/admin/troubleshooting.rst
+++ b/doc/source/admin/troubleshooting.rst
@@ -1283,3 +1283,62 @@ related to image files.
 Image safety checks are generally performed as the deployment process begins
 and stages artifacts, however a late stage check is performed when
 needed by the ironic-python-agent.
+
+Using /dev/sda does not write to the first disk
+===============================================
+
+Alternative name: I chose /dev/sda but I found it as /dev/sdb after rebooting.
+
+Historically, Linux users have grown accustomed to a context where /dev/sda is
+the first device in a physical machine. That is, if you look at the device's
+by-path information, HCTL, or LUN, the identifier ends with a zero.
+
+For example, a host with three disks and two controllers, with a single disk
+on the second controller, would look something like this:
+
+* /dev/sda maps to a device with lun 0, HCTL 0:0:0:0
+* /dev/sdb maps to a device with lun 1, HCTL 0:0:1:0
+* /dev/sdc maps to a device with lun 2, HCTL 0:1:0:0
+
+However, this was a pattern we grew accustomed to because the order of device
+discovery was sequential *and* synchronous. In other words, the kernel stepped
+through all possible devices one at a time. This breaks down when the kernel
+initializes devices asynchronously, a mode that some distributions have
+decided to adopt.
+
+The move to asynchronous initialization exposes that /dev/sda has always been
+the *first* device to initialize, *not* the first device in the system.
+As a result, we can end up with something looking like:
+
+* /dev/sda maps to a device with lun 1, HCTL 0:0:1:0
+* /dev/sdb maps to a device with lun 2, HCTL 0:1:0:0
+* /dev/sdc maps to a device with lun 0, HCTL 0:0:0:0
+
+Generally, most operators might then consider referencing the
+/dev/disk/by-path structure to match disk devices because that seems to imply
+a static order. *However*, a kernel operating with asynchronous device
+initialization will order *everything*, including PCI devices, the same way,
+meaning by-path can also be unreliable. Furthermore, if your server hardware
+is using multipath IO, you should be operating with multipath enabled such
+that the multipath device is used.
+
+The net result is that the best criteria to match on are:
+
+* Serial Number
+* World Wide Name
+* Device HCTL, which *does* appear to be static in these cases, but is not
+  applicable for hosts using multipathing. It may, ultimately, not be static
+  enough, depending on the hardware in use.
+
+.. NOTE:: Some RAID controllers will generate fake WWN and serial numbers for
+   "disks" being supplied by the RAID controller. Some may also use the same
+   WWN for *all* devices, which is a valid approach as the device Logical Unit
+   Numbers or device identifier numbers would be different. Ultimately this
+   means labels on disks may not be able to be matched to volumes through a
+   RAID controller, and operators will need to simply "know their hardware"
+   to navigate the best path depending on the configuration and behavior of
+   their hardware.
+
+.. NOTE:: CentOS Stream 9 appears to have a probe_type="sync" option which
+   reverts this behavior. For more information please see
+   this `centos stream-9 changeset `_.
diff --git a/doc/source/install/include/root-device-hints.inc b/doc/source/install/include/root-device-hints.inc
index e31bd225b4..89c2e770ec 100644
--- a/doc/source/install/include/root-device-hints.inc
+++ b/doc/source/install/include/root-device-hints.inc
@@ -9,6 +9,12 @@ which disk it should pick for the deployment. The list of supported hints is:
 * model (STRING): device identifier
 * vendor (STRING): device vendor
 * serial (STRING): disk serial number
+
+  .. note::
+     Some RAID controllers generate serial numbers for the volumes they
+     provide to the operating system; these do not match or align with the
+     physical disks in the system.
+
 * size (INT): size of the device in GiB
 
   .. note::
@@ -18,7 +24,9 @@ which disk it should pick for the deployment. The list of supported hints is:
      should be the actual size. For example, for a 128 GiB disk ``local_gb``
      will be 127, but size hint will be 128.
 
-* wwn (STRING): unique storage identifier
+* wwn (STRING): unique storage identifier, typically mapping to a device.
+  This can be a single device, a SAN storage controller,
+  or a RAID controller.
 * wwn_with_extension (STRING): unique storage identifier with the vendor extension appended
 * wwn_vendor_extension (STRING): unique vendor storage identifier
 * rotational (BOOLEAN): whether it's a rotational device or not. This
@@ -28,6 +36,11 @@ which disk it should pick for the deployment. The list of supported hints is:
   e.g '1:0:0:0'
 * by_path (STRING): the alternate device name corresponding to a particular
   PCI or iSCSI path, e.g /dev/disk/by-path/pci-0000:00
+
+  .. note::
+     Device identification by-path may not be reliable on Linux kernels using
+     asynchronous device initialization.
+
 * name (STRING): the device name, e.g /dev/md0
 
 
@@ -39,6 +52,13 @@ which disk it should pick for the deployment. The list of supported hints is:
      devices like /dev/sda and /dev/sdb `switching around at boot time
      `_.
 
+  .. warning::
+     Furthermore, the recent move to asynchronous device initialization among
+     some Linux distribution kernels means that the actual device name string
+     is entirely unreliable when multiple devices are present in the host, as
+     the device name is claimed by the device which responded first, as opposed
+     to the previous pattern where it was the first initialized device in
+     a synchronous process.
 
 To associate one or more hints with a node, update the node's properties
 with a ``root_device`` key, for example::