Many exceptions are reported in gunicorn.log.
They are caused by a bug in SysLogHandler that was reported in the
cpython repo but is not fixed in python3.9.2.
This change adds the python3.9 package to the build system and patches
it to update SysLogHandler and fix the reconnection bug.
Test Plan:
PASS: Build python3.9 package. Install
libpython3.9-minimal_3.9.2-1.stx.1_amd64.deb and verify that the
exceptions are no longer present.
Depends-On: https://review.opendev.org/c/starlingx/root/+/873159
Closes-bug: 2006623
Signed-off-by: Enzo Candotti <enzo.candotti@windriver.com>
Change-Id: I6eb44544da5c05e712bc89e69193548667c8ab28
Problem: in rare situations add_interface may
fail with an RTNETLINK error.
Add logs that record the device link status and the
configured IP address to help the investigation.
Test plan ( Debian only )
PASS Fresh install of AIO-SX
PASS Fresh install of AIO-DX
Closes-Bug: #2002346
Signed-off-by: Fabiano Mercer <fabiano.correamercer@windriver.com>
Change-Id: Ice92d54cf87c0b58ff0d1917b2c4b61a277fb961
Make the openvswitch docker image stx-debian based,
following the new convention for StarlingX docker images.
Test Plan:
PASS - Build stx-ovs debian image
PASS - Manually upload stx-ovs built image to a Standard system,
use helm-override to change the openvswitch_db_server and
openvswitch_vswitchd container images, and
reapply stx-openstack.
PASS - Check if the openstack pods start successfully
Story: 2010072
Task: 46976
Signed-off-by: Rafael Cardoso Pereira <rafael.cardosopereira@windriver.com>
Change-Id: Ic43a47698881a51f0fe70c50365f27b94999228e
This change will allow this repo to pass zuul now
that this has merged:
https://review.opendev.org/c/zuul/zuul-jobs/+/866943
Tox 4 removed whitelist_externals, which was deprecated in tox 3.18.
Replace whitelist_externals with allowlist_externals.
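
The rename can be sketched as a small text rewrite. This is only an
illustration, not part of this change: the helper name migrate_tox_ini
and the sample tox.ini content are hypothetical.

```python
import re


def migrate_tox_ini(text: str) -> str:
    """Rename the whitelist_externals key to allowlist_externals."""
    # Only rewrite the key at the start of a line, so comments or values
    # that merely mention the old name are left alone.
    pattern = r"(?m)^([ \t]*)whitelist_externals([ \t]*=)"
    return re.sub(pattern, r"\1allowlist_externals\2", text)


# Hypothetical tox.ini fragment used only for demonstration.
sample = """[testenv]
whitelist_externals = bash
commands = bash -c "echo ok"
"""
print(migrate_tox_ini(sample))
```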
Partial-Bug: #2000399
Signed-off-by: Al Bailey <al.bailey@windriver.com>
Change-Id: Iceb323a8b7a4b6ec8af81cd1b07c8b98d1e4b3f2
The script in k8s-cni-cache-cleanup is failing to run
because 'WatchdogTimestamp' no longer exists.
Instead we use another timestamp, namely ActiveEnterTimestamp.
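
A minimal sketch of reading that property, for illustration only. The
actual cleanup script is not shown here; the function name is
hypothetical, and the "KEY=<weekday> <date> <time> <tz>" line format is
an assumption about what `systemctl show` prints. The timezone label is
ignored in this sketch.

```python
from datetime import datetime
from typing import Optional


def parse_active_enter_timestamp(show_output: str) -> Optional[datetime]:
    """Extract ActiveEnterTimestamp from KEY=VALUE style output."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if key != "ActiveEnterTimestamp" or not sep:
            continue
        fields = value.split()
        # Assumed shape: "<weekday> <YYYY-MM-DD> <HH:MM:SS> <timezone>".
        # An empty value is treated as "the unit has never been active".
        if len(fields) < 3:
            return None
        return datetime.strptime(" ".join(fields[1:3]), "%Y-%m-%d %H:%M:%S")
    return None


# Hypothetical sample output used only for demonstration.
sample = "ActiveEnterTimestamp=Tue 2022-12-13 10:15:30 UTC\n"
print(parse_active_enter_timestamp(sample))
```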
Test Plan:
PASS: Lock and unlock the controller and verify that the cleanup
works properly
PASS: Launch pods using extra cni interfaces and make sure
they work properly using specific test cases
Closes-Bug: 1999570
Signed-off-by: Mohammad Issa <mohammad.issa@windriver.com>
Change-Id: I36881a2802d150d7b36d204bf45511752d7f8401
This change enables building the stx-mariadb Docker image within
the Debian build framework. It is now based on stx-debian and
follows the new convention for StarlingX images.
Test Plan:
PASS - Build stx-mariadb debian image
PASS - Manually upload mariadb built image to a Standard system,
use helm-override to change the garbd container image and
reapply stx-openstack.
PASS - Check if the garbd pod starts successfully
PASS - Ensure that galera-arbitrator-3 is installed
Story: 2010072
Task: 46975
Signed-off-by: Romulo Leite <romulo.leite@windriver.com>
Change-Id: I1d5dbaa5b58dd2cb68e2dfcbe85d2943184f00ff
This commit adds the sssd (System Security Services Daemon) service
to each of the systemd preset trait files in order to enable it
automatically on startup.
Tests performed:
PASS: sssd service is successfully started.
PASS: sssd service status is enabled and active.
PASS: Kill sssd process multiple times and check if it gets restarted
successfully every time.
PASS: Verify sssd service connects successfully to local openldap server
Story: 2009834
Task: 47022
Signed-off-by: Carmen Rata <carmen.rata@windriver.com>
Change-Id: I0f1c1d63a661d0b15f2ed4d9d64c4b20d52bfdad
This reverts commit 15db2d6990a717f50cb7611b1e4ee76f3c626af7.
While this works fine if we trigger the control plane upgrade on
the active controller first, it fails miserably if we upgrade the
inactive controller first.
The fix is to revert this and instead do it in sysinv-conductor, as
covered in https://bugs.launchpad.net/starlingx/+bug/1999095
Partial-Bug: 1999095
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Change-Id: I8f9119ad0fa57bc337883a9263671048f5818c2f
The script was sending error logs only to stdout,
while scripts down the line (e.g. sysinv.helm.utils)
expect these error messages on stderr.
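
The stdout/stderr split can be illustrated with a short sketch. The
helper names and messages are hypothetical and are not taken from the
script changed here.

```python
import sys


def log_info(message: str) -> None:
    """Regular output stays on stdout, where callers parse it as data."""
    print(message)


def log_error(message: str) -> None:
    """Errors go to stderr so that callers reading stderr can see them."""
    print(f"ERROR: {message}", file=sys.stderr)


log_info("example: listing releases")
log_error("example: deployment not found")
```

A caller that captures both streams separately then sees the failure
text on stderr only, and stdout stays clean.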
Test Plan:
PASS Delete armada deployment and use helm v2 list.
Error msg on stderr.
Partial-Bug: 1999572
Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com>
Change-Id: Ic6cd2bd844382a47e9b1451ae7c3430951493da8
This change enables building the stx-libvirt Docker image within the
Debian build framework. It is now based on stx-debian and
follows the new convention for StarlingX images.
Test Plan:
PASS - Build libvirt debian package
PASS - Build stx-libvirt image
PASS - Manually upload built image to a system, use helm-override to
change the libvirt container image and apply stx-openstack
PASS - Ensure the libvirt Pod successfully starts and is running
PASS - Ensure libvirt Pod Liveness and Readiness probes are healthy
Story: 2010072
Task: 46974
Closes-Bug: 1998630
Depends-On: https://review.opendev.org/c/starlingx/tools/+/866411
Signed-off-by: Thales Elero Cervi <thaleselero.cervi@windriver.com>
Change-Id: I10112a0f1ab3a1f880ebc8b162c42b7b131d6aad
When the '-k/--keydir' option is used, the variable KEYDIR still
keeps its fixed default value. Allow the customized value of KEYDIR
to be passed.
This matches the handling of the '-s/--sourcedir' option and the
variable SRCDIR.
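
The intended behaviour, where a command-line value overrides the
built-in default, can be sketched with argparse. This is an
illustration only: the patched tool is not written in Python, and the
program name and both default paths below are hypothetical.

```python
import argparse

DEFAULT_SRCDIR = "/example/default/src"    # hypothetical default
DEFAULT_KEYDIR = "/example/default/keys"   # hypothetical default


def parse_args(argv):
    """Let -s/--sourcedir and -k/--keydir override their defaults."""
    parser = argparse.ArgumentParser(prog="example-build")
    parser.add_argument("-s", "--sourcedir", dest="srcdir",
                        default=DEFAULT_SRCDIR)
    parser.add_argument("-k", "--keydir", dest="keydir",
                        default=DEFAULT_KEYDIR)
    return parser.parse_args(argv)


args = parse_args(["-k", "kpatch-test/linux/usr/src/kernels/5.10.0-6-amd64"])
print(args.keydir)
```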
Test-Plan:
PASS: build-pkgs -c -p kpatch
PASS: build-pkgs -c -p kpatch-prebuilt
PASS: build-pkgs -a --parallel 30
PASS: build-image
PASS: Jenkins installation
PASS: Setup the bullseye repo in /etc/apt/sources.list
PASS: sudo ostree admin unlock --hotfix
PASS: sudo apt install bison flex libssl-dev libelf-dev gcc make
patch -y
PASS: Copy linux-source-5.10, linux-keys-5.10 and
linux-image-5.10.0-6-amd64-dbg debian packages to target
PASS:
mkdir -p kpatch-test/linux
dpkg -X linux-[source|keys]-5.10....deb kpatch-test/linux
dpkg -X linux-image-5.10.0-6-amd64-dbg...deb kpatch-test/linux
cd kpatch-test/linux
tar xf kpatch-test/linux/usr/src/linux-source-5.10.tar.xz
PASS: sudo kpatch-build
-s kpatch-test/linux/linux-source-5.10
-c /ostree/1/boot/config-5.10.0-6-amd64
-v kpatch-test/linux/usr/lib/debug/boot/vmlinux-5.10.0-6-amd64
-k kpatch-test/linux/usr/src/kernels/5.10.0-6-amd64
/var/lib/kpatch/test/meminfo-string.patch -R
Story: 2009221
Task: 44580
Signed-off-by: Zhixiong Chi <zhixiong.chi@windriver.com>
Change-Id: I62b973a50e149d51e6bb24223416351a641c10ba
The patch that checks the enable flag of the PCI device,
namely "Reject-device-configuration-if-not-enabled", was re-added
in the commit https://review.opendev.org/c/starlingx/integ/+/865422.
This patch is no longer necessary, and it causes a failure when
configuring the ACC200 device.
This commit removes that patch.
Test Plan:
Pass: Test host unlock with ACC100, ACC200 and N3000 devices.
Closes-Bug: 1999449
Signed-off-by: Teresa Ho <teresa.ho@windriver.com>
Change-Id: I5364c43d88d130948f55f2f60b1bb2b5f6f6ba77
Immediately after the system installation, fm-api service tries to
start and fails because it needs some files generated during the
bootstrap stage.
With this commit the service remains loaded, and it is enabled
during the bootstrap stage.
Test plan:
PASS: * Deploy a system until the install stage.
* Check that fm-api.service did not fail and remains in
loaded state.
Depends-on: https://review.opendev.org/c/starlingx/ansible-playbooks/+/866990
Closes-bug: 1999267
Signed-off-by: Agustin Carranza <agustin.carranza@windriver.com>
Change-Id: I3d4ffc66e742797453532f22479d7e960a128d3f
The set of changes here,
(k8s-1.22.5: remove feature-gates)
a6a5349d02,
(Add a puppet class to support k8s feature-gate update)
1cdfd78286
and
(apply feature-gate update during upgrade-activate)
cc3cdbd647
were added for stx 6.0 to stx 7.0 upgrade (CentOS) for changes in
feature-gates with respect to k8s 1.22.
Updating feature gates as required per k8s version is now handled
at k8s upgrade in a single script (upgrade-k8s-config) starting
from this change:
(update feature-gates for specific k8s version)
6e7736059a
So, update-k8s-feature-gates script is no longer required.
Closes-Bug: 1990880
Test Plan:
On AIO-SX
PASS: Check script is not present after stx7.0 to stx8.0 platform
upgrade.
PASS: AIO-SX stx7.0 to stx8.0 platform upgrade successful.
Signed-off-by: Kaustubh Dhokte <kaustubh.dhokte@windriver.com>
Change-Id: Id85628a677877048c6c5aa1747a33fa8c72056a3
The k8s-pod-recovery service failed to restore the SRIOV device
plugin, which pods that use SRIOV interfaces need in order to create
the resource. Those pods must carry the label 'restart-on-reboot=true'
to be restarted during boot. The failure was observed during an
upgrade and, although rare, it forced the operator to restart the
pods manually afterwards.
This change adds a wait for the pod to stabilize (it is considered
stable when its state stops changing) and, if it is still in failure,
makes 2 attempts to restore the plugin. Logs were added to better
record the pod state in case of an error.
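
The wait-then-retry flow can be sketched as follows. This is a
simplified illustration, not the service's code: the function names,
the "same state seen 3 times in a row" rule, the poll limit and the
"Running" target state are all assumptions.

```python
import time
from typing import Callable


def wait_until_stable(get_state: Callable[[], str], checks: int = 3,
                      interval: float = 0.0, max_polls: int = 30) -> str:
    """Poll until the same state is seen `checks` times in a row."""
    last = get_state()
    streak = 1
    for _ in range(max_polls):
        if streak >= checks:
            break
        time.sleep(interval)
        current = get_state()
        streak = streak + 1 if current == last else 1
        last = current
    return last


def recover_plugin(get_state: Callable[[], str],
                   restore: Callable[[], None], attempts: int = 2) -> bool:
    """Wait for a stable state, then try to restore at most `attempts` times."""
    state = wait_until_stable(get_state)
    for attempt in range(1, attempts + 1):
        if state == "Running":
            return True
        # Log the observed state before each restore attempt.
        print(f"plugin pod state is {state!r}; restore attempt {attempt}")
        restore()
        state = wait_until_stable(get_state)
    return state == "Running"


# Demonstration with a fake pod that recovers on the second restore.
fake = {"state": "Error", "restores": 0}


def fake_restore() -> None:
    fake["restores"] += 1
    if fake["restores"] == 2:
        fake["state"] = "Running"


print(recover_plugin(lambda: fake["state"], fake_restore))
```

A real implementation would use a non-zero polling interval; it is zero
here only so the sketch runs instantly.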
Test Plan:
[PASS] execute 7 upgrades in an AIO-SX lab
Closes-Bug: 1999074
Signed-off-by: Andre Fernando Zanella Kantek <AndreFernandoZanella.Kantek@windriver.com>
Change-Id: I838c35d3e0a3557c71344945a8e00f22ccb50eb4
During a K8s feature upgrade from 1.23 to 1.24 we need to remove
the "RemoveSelfLink=false" feature gate from kube-apiserver.
We had previously handled updating the kubeadm configmap, which
was sufficient to handle the running system. However, in order
to properly handle backup and restore after the K8s upgrade to
1.24 (and just for general tidiness) we need to also remove the
feature gate from the saved service parameters and from the
last_kube_extra_config_bootstrap.yaml file.
It's possible that there are other kube-apiserver feature gates
specified by the end user; this adds a bit of complexity to the
code.
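
Removing one gate while preserving any user-specified ones can be
sketched like this. It is an illustration only: the function name is
hypothetical, "ExampleGate" is a made-up gate, and the comma-separated
"Name=bool" string format is assumed.

```python
def remove_feature_gate(feature_gates: str, name: str) -> str:
    """Drop `name` from a comma-separated "Gate=bool" list, keep the rest."""
    kept = []
    for entry in feature_gates.split(","):
        entry = entry.strip()
        if not entry:
            continue
        gate_name = entry.split("=", 1)[0].strip()
        if gate_name != name:
            kept.append(entry)
    return ",".join(kept)


# "ExampleGate" stands in for a gate added by the end user.
print(remove_feature_gate("RemoveSelfLink=false,ExampleGate=true",
                          "RemoveSelfLink"))
```

When the result is an empty string, the caller would most likely want
to drop the whole feature-gates entry rather than store an empty value.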
Test Plan:
PASS: Test python script and bash script in isolation.
PASS: End-to-end test with k8s upgrade and backup/restore with
manual modification of service parameters and yaml file.
Tested with AIO-DX, AIO-SX unoptimised restore, and
AIO-SX optimised restore.
PASS: K8s upgrade using the new code, ensure service parameter
and last_kube_extra_config_bootstrap.yaml have been
updated with "RemoveSelfLink=false" feature gate removed.
Closes-Bug: 1999095
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Change-Id: I82ecd821d4e1745ab0f480f9f9c0178757521038
KUBE_ALLOW_PRIV results in trying to run kubelet with the
"--allow-privileged=true" flag, which has not been supported by
kubelet since K8s 1.15, and this causes the kubelet to error out.
The default kubelet.service contains the invalid KUBE_ALLOW_PRIV
setting because the upstream kubernetes-contrib package hasn't been
updated in years.
This change removes KUBE_ALLOW_PRIV from kubelet.service in the
kubernetes-unversioned package.
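
A quick way to confirm the result is to scan the unit file text for the
variable reference. The sketch below is illustrative only: the function
name is hypothetical and the sample unit content is made up, not copied
from the package.

```python
import re


def references_variable(unit_text: str,
                        variable: str = "KUBE_ALLOW_PRIV") -> bool:
    """Return True if the unit text uses $VARIABLE or ${VARIABLE}."""
    name = re.escape(variable)
    pattern = re.compile(r"\$(?:\{%s\}|%s\b)" % (name, name))
    for line in unit_text.splitlines():
        # Skip comment lines, which systemd ignores.
        if line.lstrip().startswith(("#", ";")):
            continue
        if pattern.search(line):
            return True
    return False


# Made-up unit fragments used only for demonstration.
old_unit = r"""[Service]
ExecStart=/usr/bin/kubelet \
    $KUBE_ALLOW_PRIV \
    $KUBELET_ARGS
"""
new_unit = r"""[Service]
ExecStart=/usr/bin/kubelet \
    $KUBELET_ARGS
"""
print(references_variable(old_unit), references_variable(new_unit))
```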
Closes-Bug: 1998629
Test-plan:
PASS - Install AIO-SX and ensure that
/lib/systemd/system/kubelet.service doesn't contain
"$KUBE_ALLOW_PRIV"
Signed-off-by: Ramesh Kumar Sivanandam <rameshkumar.sivanandam@windriver.com>
Change-Id: Ide0f9c8db180908cc9c6528f474214966655be95
Add the packages of "integ" to debian_iso_image.inc.
A subsequent commit will be sent to clean up stx-std.lst.
Test Plan:
Pass: build-pkgs -c -a
Pass: build-image
Pass: boot
Story: 2008862
Task: 47004
Signed-off-by: Yue Tao <yue.tao@windriver.com>
Change-Id: Ic3e5ec08ea3742ce56b9a2f36f06d88e94041122
This change modifies upgrade_k8s_config.sh to support updating
k8s feature-gates for different k8s versions. With every k8s release,
the default values of some feature-gates change and some
feature-gates are deprecated.
The script runs during each k8s control plane upgrade, before
the first master is upgraded. It modifies the kubeadm-config configmap
with the feature-gates required for the specific k8s version
we are upgrading to.
The set of changes here, a6a5349d02
(k8s-1.22.5: remove feature-gates), 1cdfd78286
(Add a puppet class to support k8s feature-gate update), and cc3cdbd647
(apply feature-gate update during upgrade-activate) were added for
stx 6.0 to stx 7.0 upgrade (CentOS) for changes in feature-gates with
respect to k8s 1.22. Instead of adding that script to Debian and
maintaining two different scripts, going forward we can maintain this
single script to accommodate any change in feature-gates
(or any other config in kubeadm-config) with respect to the specific
k8s version we are upgrading to.
Test Plan:
PASS: K8s upgrade 1.21.8 to 1.22.5
PASS: k8s upgrade 1.23.1 to 1.24.4
PASS: shellcheck run
PASS: replace_configmap function was unit tested separately.
Closes-Bug: 1996546
Closes-Bug: 1990880
Signed-off-by: Kaustubh Dhokte <kaustubh.dhokte@windriver.com>
Change-Id: Ib693d7892aee2da91d612789b64ff38a65da5ccb
A previous change set the cgroup cpu.cfs_quota_us value to -1 for
containers in pods in the Guaranteed QoS class.
We can only do this if we're allocating the entire CPU. For non-
integer CPU allocations we need to set the cpu.cfs_quota_us value
to enforce the CPU limit configured on the container.
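
The rule can be sketched as a small function. This is an illustration
under stated assumptions, not the patched kubelet code: the function
name is hypothetical, and the 100000 us period and 1000 us minimum
quota are assumed default values.

```python
CFS_PERIOD_US = 100_000  # assumed default CFS period (100 ms)
MIN_QUOTA_US = 1_000     # assumed lower bound for a quota


def guaranteed_cfs_quota_us(milli_cpu: int,
                            period_us: int = CFS_PERIOD_US) -> int:
    """Pick cpu.cfs_quota_us for a container in a Guaranteed QoS pod."""
    if milli_cpu <= 0:
        raise ValueError("a Guaranteed container must have a CPU limit")
    if milli_cpu % 1000 == 0:
        # Whole CPUs are allocated, so no quota is enforced.
        return -1
    # Fractional allocation: enforce the configured limit.
    quota = milli_cpu * period_us // 1000
    return max(quota, MIN_QUOTA_US)


for request in (2000, 1500, 500):
    print(request, guaranteed_cfs_quota_us(request))
```

With the assumed period, a 1500m limit maps to a quota of 150000 us per
100000 us period, while a 2000m limit keeps the quota at -1.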
Test Plan:
Verified that pods in the "Guaranteed" QoS class, on hosts that
have "kube-cpu-mgr-policy=static", have cpu.cfs_quota_us set to -1 for
integer CPU values.
Closes-Bug: 1997528
Signed-off-by: Boovan Rajendran <boovan.rajendran@windriver.com>
Change-Id: I33662e67706cee4cb0ce005bb09ce3b5fc717239
This reverts commit e24f687606d25424bfd09dd53c7195e6d180a069.
The commit in question has a bug that results in the code not
compiling. Reverting to avoid breaking tonight's build.
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Change-Id: I0d0ce0a096cdc5c6bceac39297da391d46a09d8d
The 5.10.74 preempt-rt kernel reports the following warning when
dumping vmcore files due to the use of kernel command line arguments
such as nohz_full=, isolcpus=, rcu_nocbs= with the kexec/kdump kernel.
[ 1.568059] WARNING: CPU: 0 PID: 0 at kernel/time/tick-sched.c:139
tick_sched_do_timer+0x5e/0x70
[ 1.568064] Modules linked in:
[ 1.568066] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I
5.10.74-200.1648.tis.rt.el7.x86_64 #1
[ 1.568068] Hardware name: Dell Inc. PowerEdge R740/0WRPXK, BIOS
2.10.2 02/24/2021
[ 1.568068] RIP: 0010:tick_sched_do_timer+0x5e/0x70
[ 1.568071] Code: 01 00 75 26 89 15 26 74 6f 01 48 8b 05 1b 87 d5 01
Commit 1655ee30e6 ("sched/isolation: really align nohz_full with
rcu_nocbs"), which fixes the warning, is included in the 5.10.112
kernel. So the warning will not be reproduced with 5.10.112 and
later kernel versions.
We can remove the irqaffinity, isolcpus, nohz_full, rcu_nocbs, and
kthread_cpus arguments from the kdump kernel's command line arguments,
which will also fix the issue.
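
Filtering those arguments out of a command line can be sketched as
below. This is only an illustration: the function name and the sample
command line are hypothetical, and arguments containing quoted spaces
are not handled.

```python
KDUMP_REMOVE_ARGS = ("irqaffinity", "isolcpus", "nohz_full",
                     "rcu_nocbs", "kthread_cpus")


def strip_kdump_args(cmdline: str, remove=KDUMP_REMOVE_ARGS) -> str:
    """Drop the listed name=value arguments from a kernel command line."""
    kept = []
    for arg in cmdline.split():
        name = arg.split("=", 1)[0]
        if name not in remove:
            kept.append(arg)
    return " ".join(kept)


# Hypothetical command line used only for demonstration.
sample = ("root=/dev/sda1 ro isolcpus=1-3 nohz_full=1-3 rcu_nocbs=1-3 "
          "kthread_cpus=0 irqaffinity=0 console=ttyS0")
print(strip_kdump_args(sample))
```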
Testing:
- An ISO image can be built successfully.
- There are no warnings after the fix with 5.10.74 kernel.
Closes-Bug: 1997932
Signed-off-by: M. Vefa Bicakci <Vefa.Bicakci@windriver.com>
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
Reported-by: M. Vefa Bicakci <Vefa.Bicakci@windriver.com>
Change-Id: I7d1dbd864fdfe2533197084d7274ef6ab70892db
Some patches for this application were missed during the port from
CentOS to Debian. This change adds them and updates them for use
with the current pf-bb-config version (22.07).
Test Plan:
[PASS] execute the application during FEC accelerator cards config
Closes-Bug: 1997878
Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>
Change-Id: I384aa358150e7306692f408d1ae16ef94d2566b8