137 Commits

Author SHA1 Message Date
Michel Thebeau
9b4ba083ed Add metadata behaviour action for software deploy
Vault application should not be applied while platform-integ-apps, and
its other dependencies, are applying.

The condition is observed during USM upgrade software deploy activate.

This change replaces these earlier proposals, which were introduced
before the implementation of USM had started:
  https://review.opendev.org/c/starlingx/config/+/913405
  https://review.opendev.org/c/starlingx/config/+/913406

Test Plan:
PASS  vault sanity
PASS  lock/unlock test to assert vault is not reapplied (std 2+1)

Partial-Bug: 2058038

Change-Id: Ib484b0456e67b2aa1863015f0ef45b724e6d1dd6
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-09-03 13:25:13 +00:00
Michel Thebeau
4e0c5bfaf2 adjust livenessProbe intervals
A proposed set of solutions to high platform CPU usage requests
livenessProbe intervals to be greater than 10 and not a multiple of 5.
It is also requested that timeout values be more forgiving at greater
than 5 seconds.
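
The requested constraints can be written as a small check. This is only a sketch: it assumes "interval" maps to the Kubernetes probe field periodSeconds and "timeout" to timeoutSeconds, and the function name is hypothetical.

```python
def probe_timing_ok(period_seconds: int, timeout_seconds: int) -> bool:
    """Return True when the probe timing meets the requested constraints."""
    # Interval: greater than 10 and not a multiple of 5.
    interval_ok = period_seconds > 10 and period_seconds % 5 != 0
    # Timeout: more forgiving, greater than 5 seconds.
    timeout_ok = timeout_seconds > 5
    return interval_ok and timeout_ok
```

For example, an interval of 13 with a timeout of 6 passes, while an interval of 15 does not.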

livenessProbe is enabled by default for vault injector and vault
manager.  Vault-manager is not a high performance component, so use
rather large intervals.  Keep vault injector closer to the requested
threshold.

Test plan:
PASS  selected values are presented in pod spec during runtime

Story: 2011073
Task: 50887

Change-Id: I72a335957ec88fcc7f2a0c417da6815b363934ba
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-08-21 21:19:57 +00:00
Michel Thebeau
adc792cd48 do not rekey when vault server pods need upgrade
Changes in the upgrade procedure cause vault server pods to require a
restart in order to update to the new server version.  The work to
restart the pods is performed in another commit.

Defer a request for vault rekey until the server pods match the expected
version.  The rekey procedure will not proceed if vault pods are being
restarted, and so we should not start a rekey when it is anticipated
that vault pods will be restarted.
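
A minimal sketch of the deferral decision, assuming the pod versions have already been collected into a mapping. The function name, the pod names and the mapping shape are hypothetical and are not taken from vault-manager.

```python
from typing import Dict


def rekey_may_start(pod_versions: Dict[str, str], expected_version: str) -> bool:
    """Allow a rekey only when every server pod runs the expected version."""
    if not pod_versions:
        # Assumption: with no pods reported, defer rather than proceed.
        return False
    return all(v == expected_version for v in pod_versions.values())
```

A caller would retry the request later while any pod still reports the old version.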

Test Plan:
PASS  bashate
PASS  unit test
PASS  vault sanity master branch, rekey
PASS  simplex upgrade (manual server pod restart)
PASS  duplex 2+1 (vault ha, 3 replicas) application-update

Story: 2011073
Task: 50814

Change-Id: I91334d0577148c1e3f7bc674ab2a3edfaced1d1c
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-08-15 17:10:25 +00:00
Michel Thebeau
96c265ed20 mount-helper: run as root
When performing upgrade, conversion of the PVC storage to k8s secrets
requires the helper pod to run as root in order to delete the files in
the volume.  The old vault-manager pod was running as root.

Commit c849a3bb did not account for this and did not test
application-update.

Test Plan:
PASS  platform, application upgrade
PASS  unit test, manual execution of the procedure with pvc-attach.yaml

Story: 2011073
Task: 50522

Change-Id: I9ab340544408e9478e6fce68d78337d2fded9a09
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-08-09 22:06:54 +00:00
Michel Thebeau
0b50363b36 vault-manager: rebrand pause as a feature
Vault was added to backup/restore in ansible-playbooks backup.yml,
vault_backup.yml, etc.  That feature promotes vault-manager pause to
more than a debugging feature.

Rephrase the descriptions for the pause feature to include the current
use cases.

Test plan:
PASS  bashate
PASS  application-apply

Change-Id: I6ea571a9c4ae221acb15a4c34ed5ffaa3edda99d
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-07-31 13:22:36 +00:00
Zuul
5f18f9c889 Merge "HealthCheck Function and Enable Livenessprobe" 2024-07-30 19:00:52 +00:00
Tae Park
c91dba85b9 HealthCheck Function and Enable Livenessprobe
Add a new healthcheck function to the vault manager, and add a
livenessProbe to the vault manager configuration. The healthcheck
function fails in one of two ways: the last heartbeat took longer than
the threshold value, or a health_check_fail was found. Excuse files can
excuse healthcheck failures: a network excuse for vault API access, an
init excuse for vault manager initialization, and a pause excuse for
pausing vault manager. Each can be enabled or disabled by the
corresponding helm chart value.
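
The decision described above might look like the following sketch. It assumes any enabled excuse that is present excuses either kind of failure; the commit message does not say which excuse covers which failure, and the function and parameter names are hypothetical.

```python
from typing import Iterable, Set


def liveness_ok(heartbeat_age: float, threshold: float, fail_flag_found: bool,
                excuses_present: Iterable[str], excuses_enabled: Set[str]) -> bool:
    """Return True when the liveness check should pass."""
    # The check fails when the heartbeat is too old or a failure was flagged.
    failing = heartbeat_age > threshold or fail_flag_found
    if not failing:
        return True
    # Assumption: an excuse counts only when present and enabled by helm values.
    return any(name in excuses_enabled for name in excuses_present)
```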

Test Plan:
PASS	The vault manager pod contains liveness check
PASS	The stub function fails under correct conditions
PASS	The vault manager pods restarts once liveness fails
PASS	Unit tests
PASS	Vault manager sanity

Story: 2011073
Task: 50547

Change-Id: I727919efc4580641f18d11cadd17861e827e36c6
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-07-29 16:08:37 -04:00
Michel Thebeau
c849a3bb3d vault-manager: run as non-root user
Use the docker image with default USER 'manager'.

Test Plan:
PASS: vault application sanity
PASS: the correct image is used
PASS: vault-manager processes are run as 'manager' user

Depends-on: I52aafcdc86eb0e043b033fe163b6a71942f5d53d

Change-Id: If9bc0fb4d0701e459ca4cc98a80d3542f7d6244d
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-07-23 17:50:57 +00:00
Michel Thebeau
a700326eb0 change vault version to 1.14.0
Change the vault server version to match the chart's original content.
The 1.14.8 version contains BUSL-licensed code, backported as bug fixes
from the non-open-source version of vault.

Test Plan:
PASS  build and run the app
PASS vault sanity

Story: 2011073
Task: 50575

Change-Id: Ie4cad11c55b5515e098c302545a0ae62fd6ba4e2
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-07-16 12:55:18 +00:00
Michel Thebeau
b253c2a056 run vault-manager as non-root
Docker image security scan complains about running as root.  Add a
'manager' user/group for vault-manager.

Test Plan:
PASS  vault application sanity
PASS  Twistlock scan

Story: 2011073
Task: 50522

Change-Id: I87a00a8bc41a39a00e871dbe84aa32f76e8ec768
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-07-15 19:30:21 +00:00
Michel Thebeau
b792021365 Update vault-manager image tag
Update vault-manager to use the image tag for the StarlingX 10.0 release.

Test Plan:
PASS  Vault sanity
PASS  Check the image used

Story: 2011073
Task: 50468

Depends-On: I47e8c94f9b230ab4f2c49880b735c57297f4652f

Change-Id: Id3b97f938ac0521e3b15fb85461c8bb3d54ad486
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-07-04 17:12:53 +00:00
Michel Thebeau
f994829fcb override pre-existing anti-affinity
commit f7a37e6a, "Removing default injector anti-affinity rules",
disables anti-affinity for the injector pod.  This is sufficient for
future application updates.  However, during application-update an old
pod that still has anti-affinity will still prevent scheduling of a new
pod.  This is observed on AIO-SX when testing application-update in
preparation for USM.

With injector.strategy.rollingUpdate.maxUnavailable
(DeploymentStrategy) set to 100%, the old pod terminates immediately
while the new pod waits for its termination.
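
As an illustration only, the override can be pictured as a nested mapping built from the key path named above. How the chart consumes the value (map or string) is not shown in this commit message, and the helper name is hypothetical.

```python
def injector_strategy_override(max_unavailable: str = "100%") -> dict:
    """Build a nested override for injector.strategy.rollingUpdate.maxUnavailable."""
    return {
        "injector": {
            "strategy": {
                "rollingUpdate": {"maxUnavailable": max_unavailable},
            },
        },
    }
```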

This is the workaround described in the original StarlingX bug:
https://bugs.launchpad.net/starlingx/+bug/2030901.

Test Plan:
PASS  AIO-SX vault sanity
PASS  application-update

Partial-Bug: 2030901
Story: 2011073
Task: 50484

Change-Id: I66fe336ece7f1ccd68caa665aabc693f1b9a5c18
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-06-28 20:58:00 +00:00
Michel Thebeau
72d7b82fdf Updating supported kubectl version list
Update the list of installed kubectl versions within the vault manager
docker image.  Support versions 1.24 through 1.29 for StarlingX
release 10.  Update the kubectl versions for those releases previously
included within the image.

Also: capitalize "FROM", to fix docker build warning

Test Plan:
PASS  Manual build of the image
PASS  vault sanity with the new image
PASS  AIO-SX

Story: 2011073
Task: 50459

Change-Id: I4727466347035c9959b96f2b2520547b6d1addb5
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-06-27 12:59:13 +00:00
Edson Dias
44a56a155e Update helm/fluxcd api version.
Right now, vault app is using beta
versions of the Fluxcd and Helm APIs, and for
this reason, some warnings are being thrown.

This change aims to update api versions, removing
beta values following this logic:
Fluxcd:
  - source.toolkit.fluxcd.io/v1beta1
  + source.toolkit.fluxcd.io/v1

Helm:
  - helm.toolkit.fluxcd.io/v2beta1
  + helm.toolkit.fluxcd.io/v2

No changes to yaml file structure are required
for this change.
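
The substitution can be sketched as a lookup on the apiVersion field of a parsed manifest. The mapping comes from the list above; the function name and the dict-based manifest are assumptions made for illustration.

```python
API_VERSION_UPDATES = {
    "source.toolkit.fluxcd.io/v1beta1": "source.toolkit.fluxcd.io/v1",
    "helm.toolkit.fluxcd.io/v2beta1": "helm.toolkit.fluxcd.io/v2",
}


def update_api_version(manifest: dict) -> dict:
    """Return a copy of the manifest with a beta apiVersion replaced."""
    updated = dict(manifest)
    current = manifest.get("apiVersion")
    if current in API_VERSION_UPDATES:
        updated["apiVersion"] = API_VERSION_UPDATES[current]
    return updated
```

Only the apiVersion value changes; the rest of the manifest is left as it was.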

Test Plan:
PASS: Build ISO & Bootstrap AIO-SX
PASS: Upload and apply vault app
PASS: Confirm that sysinv.log does not have any
      warnings about beta versions related to
      vault.

Story: 2011129
Task: 50449

Change-Id: I4f5516f24ecfa23e5f5983e0917bca7f200da82c
Signed-off-by: Edson Dias <edson.dias@windriver.com>
2024-06-26 15:04:10 -03:00
Zuul
a675f1220d Merge "vault-manager: add functions for backup and restore" 2024-04-09 14:45:51 +00:00
Michel Thebeau
aa1f8b4afd vault-manager: add functions for backup and restore
Add functions for pre-backup/restore checks, taking a snapshot of the
vault, and restoring a snapshot from a tarball.  These functions will
support ansible-playbook for backup and restore of the vault.

Test Plan:
PASS  bashate
PASS  unit test
PASS snapshot and snapshot restore procedure, as presented in
     ansible-playbooks I324b270ec738f864410068c4ac661301ca8176fd

Change-Id: Id786105aa8ddba2e77085b3897c0c8efd7e98c9b
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-04-08 21:49:30 +00:00
Tae Park
53db02a48e Overwrite PSP enable option for kube ver. 1.25+
Adding a check under get_override function for vault. This checks if PSP
is enabled by the user for systems with kubernetes version 1.25 and
above, and if it is, then it will be disabled.
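
A sketch of the check, assuming the user overrides are a nested dict and the Kubernetes version is a string such as "1.25.3" or "v1.25.3". The key path global.psp.enable is taken from the test plan below; the function name is hypothetical and this is not the actual get_override code.

```python
import copy


def enforce_psp_override(user_overrides: dict, kube_version: str) -> dict:
    """Return overrides with PSP disabled on Kubernetes 1.25 and above."""
    overrides = copy.deepcopy(user_overrides)
    major, minor = (int(part) for part in kube_version.lstrip("v").split(".")[:2])
    psp = overrides.get("global", {}).get("psp", {})
    # Only touch the value when the user enabled PSP on a 1.25+ system.
    if (major, minor) >= (1, 25) and psp.get("enable"):
        psp["enable"] = False
    return overrides
```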

Test Plan:
PASS	Unit Tests
PASS	User override with global.psp.enable=true will be changed to
	false during first/repeated application-apply
PASS	User override with global.psp.enable=true will be changed to
	false during application-update
PASS	Vault application install
PASS	Vault application update after kubernetes upgrade from version
	1.24 to 1.25 or newer
PASS	AIO-SX vault sanity

Story: 2011073
Task: 49799

Change-Id: Ia78e5a0c4423ff110a31d002904e82dee2316d65
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-04-03 13:40:54 -04:00
Tae Park
b00a768784 Create vault manager class for lifecycle code
Creating a new vault manager class within vault.py, so that sysinv can
interact with the new vault manager helm chart. The get_override
function is modified so that the new vault manager fluxcd override is
properly applied.

Test Plan:
PASS	AIO-SX vault sanity
PASS	AIO-DX plus 1 worker vault fresh install and sanity
PASS	Vault HA test for AIO-DX plus 1 worker
PASS	Disable new vault manager helm chart with system
	helm-chart-attribute-modify

Story: 2010929
Task: 49600

Change-Id: I71f0050a9cfd1be1c867f13926c84827d74f71de
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-03-19 14:15:11 -04:00
Tae Park
96c4965be3 Separate vault-manager to a new package
Isolating all vault-manager helm chart and related content into a new
package. Per STX.APP.12, STX.APP.13, vault-manager should be allowed to
be disabled so that another solution can be used to manage vault. The
file structure is also changed, so that vault-helm is under
helm-charts/upstream, and vault-manager-helm is under helm-chart/custom

Test Plan:
PASS	build all vault-related packages
PASS	Create new vault application tarball
PASS 	test existing vault features:
PASS		AIO-SX vault sanity
PASS		Vault rekey feature test
PASS		vault application update and watch PVC conversion

Story: 2010929
Task: 49600

Change-Id: I87cce3466ad905d00da715ce582baa28371135c1
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-03-11 14:49:10 -04:00
Tae Park
05ccd6fea5 Remove warning log for PVC currently terminating
Adding an extra check in the post-convert PVC existence check. The old
vault manager pod may exist beyond the set wait time in the conversion,
preventing the PVC from finishing termination. This is intended
behaviour, so a separate debug log indicating such is issued instead.
Also include a 5-second wait after PVC conversion completes, so that
the PVC termination process has started before verification.

Test Plan:
PASS Bashate
PASS AIO-SX vault sanity
PASS During application update, the debug log is seen instead of the
warning log if the PVC has status "Terminating"
PASS No log is reported, if the PVC is correctly deleted before the
verification

Closes-bug: 2054824

Change-Id: Ib9cd45a93550d22dee9d45b5994e89ea2191849a
Signed-off-by: Tae Park <tae.park@windriver.com>
vf/bookworm vr/stx.9.0
2024-02-26 14:53:08 -05:00
Michel Thebeau
d921df347e chart version auto-increment scheme
Refer to the example for auto-increment presented by Bob Church:
https://review.opendev.org/c/starlingx/platform-armada-app/+/904464

Implement these specifics for vault-helm:

 - Use StarlingX debian git revcount packaging mechanisms to derive the
   semver BUILD version for upstream helm charts which maintains the
   upstream chart version and adds a versioned BUILD extension.

     <valid semver> ::= <version core> "+" <build>

   Chart version (MAJOR.MINOR.PATCH+STX.REV) is passed to 'helm package'
   command to force the version, where REV == 'git revcount'

 - Update the rules to automatically update the chart versions in the
   fluxCD helmrelease.yaml files.
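
The version string described above can be assembled as in this sketch. The function name is hypothetical, and the strict MAJOR.MINOR.PATCH check is an assumption about the upstream version format.

```python
import re


def stx_chart_version(upstream_version: str, revcount: int) -> str:
    """Append the STX build extension to an upstream chart version."""
    if not re.fullmatch(r"\d+\.\d+\.\d+", upstream_version):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {upstream_version!r}")
    # <valid semver> ::= <version core> "+" <build>
    return f"{upstream_version}+STX.{revcount}"
```

The result is the kind of value that would be handed to 'helm package' to force the chart version.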

Test Plan:
PASS  file byte level comparison of package before/after
PASS  AIO-SX vault sanity
PASS  application-update

Story: 2010929
Task: 49399

Change-Id: Id40547c1001ab8fa2d7c83abbcc5c9d44185ee2f
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-02-20 21:37:09 +00:00
Tae Park
fae21895d7 Update helm charts for new vm docker image
Updating helm charts with the latest vault manager docker image tag.

Test Plan:
PASS Vault sanity
PASS Check for installation of correct image

Story: 2010930
Task: 49526

Depends-On: https://review.opendev.org/c/starlingx/root/+/908336

Change-Id: I54d33f7c9c8d58df7424c51f9bf366d8746264f0
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-02-08 13:18:06 -05:00
Tae Park
4f504c064c Fix KUBE_LATEST_VERSION to installed version
After updating the kubectl versions, we noticed that there is a mismatch
between version in the KUBE_LATEST_VERSION variable, and the actual
installed version. This change fixes the mismatch, and makes
KUBE_LATEST_VERSION point to the correct version.

Test Plan:
PASS Manual build of the image
PASS verify in docker history for correct version

Story: 2010930
Task: 49526

Change-Id: I5055fd204527c49cc47478d62d01e1afa18d3556
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-02-07 15:13:29 -05:00
Zuul
969e0626b2 Merge "Add minimum Kubernetes version supported" 2024-02-06 19:45:27 +00:00
Tae Park
a6d4436e6e Updating supported kubectl version list
Updating the list of installed kubectl versions within the vault manager
docker image, to support the correct list of versions for the master
branch.

Test Plan:
PASS Manual build of the image
PASS vault sanity with the new image
PASS test rekey for each version of kubectl

Story: 2010930
Task: 49526

Change-Id: I7d2103ec62e587c4cc8a6725ab5f2e53f4e9e93d
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-02-06 13:37:31 -05:00
Igor Soares
0313845b34 Add minimum Kubernetes version supported
Add the supported minimum Kubernetes version into the application
metadata file.

The minimum Kubernetes version is set to 1.24.4 and should be changed
accordingly for future application updates.

The "supported_k8s_version:minimum" field is optional but it will become
mandatory in the near future.

This also contains a fix to properly trigger the Tox metadata checks.

Test Plan
PASS: build-pkgs && build-image
PASS: Apply application

Story: 2010929
Task: 49507

Change-Id: I6d698b94cf7008f574d4170e3bd1a8d494d5e619
Signed-off-by: Igor Soares <Igor.PiresSoares@windriver.com>
2024-02-06 15:14:34 -03:00
Tae Park
6fccda0818 Add configuration for pod termination wait time
Adding new configuration options for pod termination wait sequence. The
options set the number of times the new vault-manager pod will check
that the old vault-manager pod is still running, and the number of
seconds to wait between each check.
The total default wait time is now 60s.
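
The wait sequence can be sketched as a bounded polling loop. The function and parameter names are hypothetical; the commit message gives only the 60s default total, so any split such as 6 checks at 10 seconds is an illustrative assumption.

```python
import time
from typing import Callable


def wait_for_old_pod(old_pod_running: Callable[[], bool], tries: int,
                     interval_seconds: float,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once the old pod is gone, False if it outlives the wait."""
    for _ in range(tries):
        if not old_pod_running():
            return True
        sleep(interval_seconds)
    # One last look after the final sleep; total wait is tries * interval.
    return not old_pod_running()
```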

Test Plan:
PASS vault builds successfully with the changes
PASS vault sanity on AIO-SX
PASS Test the new helm values

Story: 2010930
Task: 49476

Change-Id: Ie0d4c1fffccf59618cb10bc1e201468f5ffceed0
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-01-31 09:40:29 -05:00
Tae Park
7c22500b16 Include kubectl v1.21 and v1.22 to supported list
Adding kubectl version 1.21.14 and 1.22.17 to the list of supported
kubectl versions in the vault manager container image. This is for
supporting platform upgrade from stx.6.0 to stx.8.0.

Test Plan:
PASS Manual build of the image
PASS vault sanity with the new image

Story: 2010930
Task: 49423

Change-Id: Ie10abc6473790cf44b9d69c4d706338d5063aa5b
Signed-off-by: Tae Park <tae.park@windriver.com>
vf/kernel-6.6
2024-01-18 15:06:52 +00:00
Zuul
9e4244e492 Merge "update vault helm chart to 0.25.0" 2024-01-11 18:01:20 +00:00
Sabyasachi Nayak
f61e33f6e1 update vault helm chart to 0.25.0
Replace references of 0.24.1 with 0.25.0.  Refresh the patches for
vault-manager and agent image reference.  Update the image tags to
match the new vault chart.  The vault helm chart uses vault server
version 1.14.0.  The latest version of the vault server in the 1.14.x
series is 1.14.8.  Most of the changes between the vault v1.14.0 and
v1.14.8 tags were verified to be backports or cherry-picks of commits,
i.e. bug fixes, so version 1.14.8 of the vault server is used.

Test plan:
 PASSED AIO-sx and Standard 2+2
 PASSED vault aware and un-aware applications
 PASSED HA tests
 PASSED test image pulls from private registry with external network
      restriction

story: 2010393
Task: 49391

Change-Id: I6bd022fed79ead6e1dc224e323a179d1dcd3ab0f
Signed-off-by: Sabyasachi Nayak <sabyasachi.nayak@windriver.com>
2024-01-10 17:47:38 +00:00
Igor Soares
e4504dd0e1 Application versioning based on build release
This change will automatically adjust versioning of the application
tarball and python plugins to reflect the same version reported by
SW_VERSION in /etc/build.info.

Test plan:
PASS: build-pkgs -a & build-image
PASS: Confirm that the tarball version matches the platform version
PASS: Apply application

Story: 2010929
Task: 49354

Change-Id: Ib7afcce8b43db358ed7fa6b9bf83c4d3abd8db64
Signed-off-by: Igor Soares <Igor.PiresSoares@windriver.com>
2023-12-29 12:45:46 -03:00
Zuul
bde0b6c4da Merge "Update app Zuul Check Jobs." 2023-12-20 16:19:49 +00:00
Tae Park
857fedecc6 Issue a Warning for Vault-Manager PVC Storage
This commit adds an additional check for PVC storage for vault-manager
after PVC-to-k8s conversion. If the storage is found then it will log a
warning during start-up of vault manager.

Test Plan:
PASS bashate
PASS AIO-SX vault sanity
PASS New code issues logs only when the PVC storage persists after
     conversion

Story: 2010930
Task: 49293

Change-Id: I2d669b06927b9d396ce5d6e582983ab78a3cc5fc
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-12-18 16:53:33 -05:00
Reed, Joshua
fd1d13a008 Update app Zuul Check Jobs.
Modify code to conform to flake8 and pylint.

Jobs are now flake8, pylint, py39 and metadata.

Test Plan
PASS - All zuul jobs pass as expected.

Story: 2010929
Task: 49283

Change-Id: I3e3f5191a2dac94e35b75bccdd563dc108f187bf
Signed-off-by: Reed, Joshua <Joshua.Reed@windriver.com>
2023-12-18 15:34:44 -06:00
Michel Thebeau
494edafaa9 Remove hardcoded vault and sva-vault
The vault namespace and full-name are in variables and should not have
been hardcoded.

Test Plan:
PASS  bashate of rendered init.sh
PASS  vault sanity
PASS  all affected code paths

Story: 2010930
Task: 49232

Change-Id: I1c4765b907ce8ce4200e98575922467edb34e9fd
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-15 15:40:47 +00:00
Michel Thebeau
1aa869135b Fix removal of rekey milestone secrets
When vault-manager is killed during finalizeRekey the k8s secrets may
not be deleted.  Especially: the kubectl command deleting multiple
secrets may be interrupted.

It is unclear in what order kubectl/k8s would delete the secrets when
they are specified in a single command - i.e., it is observed to be a
different order than what was specified.  Use one kubectl command for
each milestone secret.

Use cluster-rekey-audit as the final milestone.  Fix needsRekey to allow
the procedure to resume as long as cluster-rekey-audit persists.
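
One way to picture the per-secret deletion is the sketch below, which issues one kubectl command per secret and leaves cluster-rekey-audit for last. The function name, the injected runner and the other secret names used with it are hypothetical; deleting the final milestone last is a reading of the text above, not a copy of init.sh.

```python
from typing import Callable, List, Sequence

FINAL_MILESTONE = "cluster-rekey-audit"  # named in the commit message


def delete_milestones(secrets: Sequence[str], namespace: str,
                      run: Callable[[List[str]], None]) -> List[str]:
    """Delete milestone secrets one command at a time, final milestone last."""
    ordered = [s for s in secrets if s != FINAL_MILESTONE]
    if FINAL_MILESTONE in secrets:
        ordered.append(FINAL_MILESTONE)
    for name in ordered:
        # One kubectl invocation per secret keeps the deletion order explicit.
        run(["kubectl", "delete", "secret", name, "--namespace", namespace])
    return ordered
```

If the process is killed part way through, the final milestone still exists and the procedure can be resumed.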

Also adjust some comments and remove some chatty logs.

Test Plan:
PASS  bashate of rendered init.sh
PASS  vault sanity, including rekey
PASS  application-update
PASS  kubectl delete vault-manager pod tests
PASS  kill -9 vault-manager tests

Story: 2010930
Task: 49174

Change-Id: I2e5e15b4f89f9f9495381d33064c631cde6da193
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-15 15:40:37 +00:00
Tae Park
65b38b925d Prevent multiple vault-manager pods from acting
This commit adds a new check in the main loop of vault manager
for multiple instances of vault manager. Only one vault manager is
needed, so extra instances are put to sleep or
terminated until only one is left.

Story: 2010930
Task: 49199

Test Plan:
PASS Bashate
PASS Vault sanity test

Change-Id: I0fd881aa4078528ba3f804087db87069dae58f7e
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-12-13 19:58:07 +00:00
Michel Thebeau
be0e85ec77 stability fixes for vault-manager rekey
Continue/complete the rekey procedure when vault-manager is interrupted
(kill -9). Fixes include:
  - Refactor logic of rekeyRecover function
  - additionally handle specific failure scenarios to permit the rekey
    procedure to continue
  - correct return codes of procedure functions to fall through to the
    recovery procedure
  - re-sort the tests of needsShuffle
  - misc adjustment of logs and comments

The additional handling of failure scenarios includes:
  - partial deletion of cluster-rekey secrets after copying to
    cluster-key
  - restart rekey on failure during authentication

Test Plan:
PASS  vault sanity, ha sanity
PASS  IPv4 and IPv6
PASS  system application-update, and platform application update
PASS  rekey operation without interruption
PASS  bashate the rendered init.sh

Stability testing includes kubectl deleting pods and kill -9 processes
during rekey operation at intervals spread across the procedure, with
slight random time added to each interval

PASS  delete a standby vault server pod
PASS  delete the active vault server pod
PASS  delete the vault-manager pod
PASS  delete the vault-manager pod and a random vault server pod
PASS  delete the vault-manager pod and the active pod
PASS  delete the vault-manager pod and a standby pod
PASS  kill -9 vault-manager process
PASS  kill -9 active vault server process
PASS  kill -9 standby vault server process
PASS  kill -9 random selection of vault and vault-manager processes

Story: 2010930
Task: 49174

Change-Id: I508e93a36de9ca8b4c8fa1da7941fe49936de159
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-07 13:30:32 +00:00
Michel Thebeau
615d6e4657 use the vault-manager image stx.9.0-v1.28.4
This new image adds uuidgen and multiple versions of kubectl, which
vault-manager now supports.

Test Plan:
PASS  sanity test of vault application
PASS  watch vault-manager log over kubernetes upgrade

Depends-On: Ib0a105306cecb38379f9d28a70e83ed156681f08
Depends-On: I03e37af31514c3fa3b95e0560a6d6f83879ec9de

Story: 2010930
Task: 49177

Change-Id: I7f578ac7e8d2aab98fb1e104f336fd750d7d7933
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-06 21:23:55 +00:00
Michel Thebeau
733ca0e9a6 Add multiple version support of kubectl
Allow vault-manager to pick the version of kubectl that matches the
currently running server.  Add a helm override option to pick a
particular version available within the image.
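
The selection might work along the lines of this sketch: prefer the override when one is given, otherwise match the server's major.minor against the versions shipped in the image. The function name, the fallback parameter and the version string formats are assumptions.

```python
from typing import Optional, Sequence


def _major_minor(version: str) -> str:
    return ".".join(version.lstrip("v").split(".")[:2])


def pick_kubectl(server_version: str, available: Sequence[str],
                 override: Optional[str] = None,
                 fallback: Optional[str] = None) -> Optional[str]:
    """Choose a kubectl version from those available within the image."""
    if override is not None:
        # The helm override wins, but only if the image carries that version.
        return override if override in available else fallback
    wanted = _major_minor(server_version)
    for candidate in available:
        if _major_minor(candidate) == wanted:
            return candidate
    return fallback
```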

Refresh the helm chart patches on top of this change.

Test Plan:
PASS  Unit test the code
PASS  helm chart override
PASS  sanity of vault application
PASS  watch vault manager log during kubernetes upgrade

Story: 2010930
Task: 49177

Change-Id: I2459d0376efb6b7e47a25f59ee82ca74b277361f
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-04 20:46:29 +00:00
Michel Thebeau
65e8589183 add vault rekey option during upgrade
Allow the vault to be rekeyed after conversion from PVC storage to k8s
storage of the shard secrets.

Update the vault-manager patch to include rekey enable/disable and
timing parameters in helm values.yaml. Refresh the other patches
(include git long log descriptions in those patch files that omitted a
description).

Test Plan:
PASS  vault sanity, ha sanity
PASS  IPv4 and IPv6
PASS  system application-update, and platform application update
PASS  rekey operation without interruption
PASS  helm chart options
PASS  bashate the rendered init.sh

Stability testing includes kubectl deleting pods and kill -9 processes
during rekey operation at intervals spread across the procedure, with
slight random time added to each interval

PASS  delete a standby vault server pod
PASS  delete the active vault server pod
PASS  delete the vault-manager pod
PASS  delete the vault-manager pod and a random vault server pod
PASS  delete the vault-manager pod and the active pod
PASS  delete the vault-manager pod and a standby pod
TBD  kill -9 vault-manager process
TBD  kill -9 active vault server process
TBD  kill -9 standby vault server process
TBD  kill -9 random selection of vault and vault-manager processes

Story: 2010930
Task: 48850

Change-Id: I87911819c27caaf30be69b3c969a20ed97be42cb
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-04 19:21:11 +00:00
Michel Thebeau
dfcfa46061 improve error handling in vaultInitialized
A rare condition can result in vault servers not responding to this
early initialization status check.  The omission has no effect after
vault is initialized, but fails the application if it happens before
vault is initialized.

Test Plan:
PASS  Unit test the changes
PASS  vault sanity

Story: 2010930
Task: 49168

Change-Id: I6b5270f89ccea27f6c10edc6e1bc250b248f4054
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-04 19:21:07 +00:00
Zuul
2f047e86f3 Merge "update vault-manager docker image" 2023-12-04 18:11:31 +00:00
Michel Thebeau
8c6d86ea3b improve error handling of unsealVault
Add generic and specific error handling for unsealVault function.
Changes include:

  Recognize unseal success from the API response
  Recognize and stop unseal procedure if the response indicates
    authentication failure
  Always 'reset' unseal in progress, if any
  Recognize if the requested server is already unsealed
  Handle return code from vaultAPI function
  Remove key_error check as it is printed as DEBUG by vaultAPI
  Refactor reused variables to be less specific

Test Plan:
PASS  unit test the function
PASS  vault sanity including HA test

Story: 2010930
Task: 49167

Change-Id: If55589d207bbb374a6137922f62e2d494278e72c
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-30 21:17:34 +00:00
Michel Thebeau
394f20c28a update vault-manager docker image
Add uuidgen and add multiple versions of kubectl.

Test Plan:
PASS  Manual build of the image
PASS  Verify uuidgen and kubectl version in the running container
PASS  vault sanity with the new image

Story: 2010930
Task: 48849

Change-Id: Ib0a105306cecb38379f9d28a70e83ed156681f08
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-30 17:21:04 +00:00
Michel Thebeau
8669743ae2 add vault-manager pause debugging option
A debug feature to allow vault manager function to be paused.  Use case
may include setting up specific conditions for test.

Include a helm override for initial pause condition, which may be
difficult to reach as a pod starts.

Test Plan:
PASS  vault sanity
PASS  unit test the pause_on_trap code, helm override
PASS  misc usage of the option

Story: 2010930
Task: 49048

Change-Id: Icd69a79685427268d7d59b3fbe655b9b93e8ece8
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-13 19:04:30 +00:00
Michel Thebeau
f2d02300a9 add interactive mode for init.sh
Allow the init.sh script to be sourced by an author to permit
development and test activity.

Test Plan:
PASS vault sanity test
PASS enter vault-manager pod and source init.sh
PASS bashate on the rendered script

Story: 2010930
Task: 49047

Change-Id: I899dcf6df793ee69b51b63a8b214320282d091fa
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-13 19:02:36 +00:00
Michel Thebeau
c91580ebd2 add generic function for vault REST API calls
Replace curl REST API calls with a generic function. This prepares for
adding more functionality to vault manager; more REST API calls.

Main feature includes error/debug logging of the responses.

Also includes:
Define variables for server targets
Refactor ubiquitous global 'row' variable, convert to a parameter
Explicitly declare curl's default connect-timeout (120s)
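
As a rough picture of a generic call helper, the sketch below assembles a curl command line with an explicit connect timeout. It is not the vaultAPI function from init.sh: the function name, parameters and the example URL are hypothetical, while the curl options and the X-Vault-Token header are standard.

```python
import json
from typing import List, Optional


def vault_curl_args(method: str, url: str, token: Optional[str] = None,
                    data: Optional[dict] = None,
                    connect_timeout: int = 120) -> List[str]:
    """Build the argument list for one REST call made with curl."""
    args = ["curl", "--silent",
            "--connect-timeout", str(connect_timeout),
            "--request", method]
    if token is not None:
        args += ["--header", f"X-Vault-Token: {token}"]
    if data is not None:
        args += ["--data", json.dumps(data)]
    args.append(url)
    return args
```

Routing every call through one builder gives a single place to log the request and its response.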

Test plan:
PASS vault sanity
PASS vault HA test
PASS all code paths with REST API calls
PASS misc examples GET, POST, DELETE
PASS unit test the new function
PASS bashate of the rendered document

Story: 2010930
Task: 49042

Change-Id: Ic329f075ba1c0480f5d507f9768f76fa86fc2094
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-13 19:02:25 +00:00
Michel Thebeau
464f9d0e76 Conversion of storage during application update
Add lifecycle code to read secrets from PVC mounted to running
vault-manager, and vault-manager code for conversion of storage from PVC
to k8s secrets.

The lifecycle code is added because the previous version of
vault-manager does not respond to SIGTERM from kubernetes for
termination.  And yet the pod will be terminating when the new
vault-manager pod runs.  Reading the PVC data in lifecycle code before
helm updates the charts simplifies the process when vault-manager is
running during application-update.

The new vault-manager also handles the case where the application is not
running at the time the application is updated, such as if the
application is removed, deleted, uploaded and applied.

In general the procedure for conversion of the storage from PVC to k8s
secrets is:
 - read the data from PVC
 - store the data in k8s secrets
 - validate the data
 - confirm the stored data is the same as what was in PVC
 - delete the original data only when the copy is confirmed
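
The steps above can be sketched with plain dictionaries standing in for the PVC contents and the k8s secrets. The function name, the validation callback and the key names used with it are hypothetical; the real procedure lives in vault-manager's init.sh.

```python
from typing import Callable, Dict


def convert_storage(pvc: Dict[str, str], secrets: Dict[str, str],
                    is_valid: Callable[[str], bool]) -> bool:
    """Copy data from PVC to secrets; delete the original only when confirmed."""
    data = dict(pvc)                                   # read the data from PVC
    secrets.update(data)                               # store it in k8s secrets
    if not all(is_valid(secrets[k]) for k in data):    # validate the data
        return False
    if any(secrets[k] != v for k, v in data.items()):  # confirm it matches PVC
        return False
    pvc.clear()                                        # delete the original
    return True
```

Any failed check returns early and leaves the PVC data in place.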

The solution employs a 'mount-helper', an incarnation of init.sh,
that mounts the PVC resource so that vault-manager can read it.  The
mount-helper mounts the PVC resource and waits to be terminated.

Test plan:
PASS  vault sanity
PASS  vault sanity via application-update
PASS  vault sanity update via application remove, delete, upload, apply
      (update testing requires version bump similar to change 881754)
PASS  unit test of the code
PASS  bashate, flake8, bandit
PASS  tox

Story: 2010930
Task: 48846

Change-Id: Iace37dad256b50f8d2ea6741bca070b97ec7d2d2
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-02 15:12:47 +00:00
Michel Thebeau
cd165b8f5c fix flake8 and bandit complaints
Fixes based on local run of flake8 and bandit. Still passes pylint and
py39 unit tests.

Omit enabling the zuul jobs for flake8 and bandit, but fix the
complaints.  The bug https://bugs.launchpad.net/starlingx/+bug/2042457
will track getting those enabled.

Test plan: (with change 899277)
PASS  vault sanity
PASS  vault sanity via application-update
PASS  vault sanity update via application remove, delete, upload, apply
PASS  unit test of the code
PASS  bashate, flake8, bandit
PASS  tox

Story: 2010930
Task: 49026

Change-Id: Iab8c156be7bd5d32d420a500b7abf4f2ea2a2ac6
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-02 15:12:44 +00:00