Fix issues introduced by https://review.opendev.org/#/c/735648:
an extra 'ceph-' prefix in service_account and the security context
not being rendered for keyring generator containers.
Change-Id: Ie53b3407dbd7345d37c92c60a04f3badf735f6a6
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
Unrestrict the octal values rule, since the readability benefit of
literal file modes outweighs the possible issues with yaml 1.2 adoption
in future k8s versions. Those issues will be addressed if and when they
occur.
Also ensure osh-infra is a required project for the lint job, which
matters when the job is run against another project.
Change-Id: Ic5e327cf40c4b09c90738baff56419a6cef132da
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
This addresses an issue where py2 was used as the interpreter while
the required dependencies were installed with py3.
Also switch the kubeadm-aio image to bionic.
Change-Id: I5a9e6678c45fad8288aa6971f57988b46001c665
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
This makes ceph-volume the default deployment tool, since support for
ceph-disk was deprecated in the Nautilus release of Ceph.
Change-Id: I10f42fd0cb43a951f480594d269fd998de5678bf
Use the latest ara release that supports ansible 2.5.5.
Change-Id: Id44948986609093b709e23e0d9f9eddd690fa2b8
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
Waiting for kube-apiserver fails because the python executable cannot
be found.
Change-Id: Ib0ff95088c658fec3180f071269041faa7da2ecf
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
Recently, the PostgreSQL backups were modified to generate drop database
commands (the --clean pg_dumpall option). Also, for a single database
restore, a DROP DATABASE command was added before the restore so that the
database could be restored without duplicate rows. However, if there are
existing database connections (from the applications or other users),
the drop database commands will fail. So, for the duration of the restore
operation, the databases being restored need to have their existing
connections dropped and new connections prevented until the database(s)
are restored; connections should then be re-allowed.
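A minimal sketch of the idea, assuming a database named 'mydb' and a
psql session with superuser privileges (not the chart's exact code):
  # Block new connections to the database being restored
  psql -c "UPDATE pg_database SET datallowconn = false WHERE datname = 'mydb';"
  # Terminate existing connections so DROP DATABASE can proceed
  psql -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'mydb';"
  # ... restore the database ...
  # Re-allow connections once the restore is complete
  psql -c "UPDATE pg_database SET datallowconn = true WHERE datname = 'mydb';"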
A problem was also found where psql returns 0 (a success code) even
though there were errors during its execution. The solution is to check
the output for errors and, if there are any, dump the log file so the
user can see them and knows the operation had errors.
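As a rough illustration (the file names are placeholders, not the
script's actual variables):
  # psql can exit 0 even when individual statements fail, so scan its output
  psql -f "${RESTORE_FILE}" > "${LOG_FILE}" 2>&1
  if grep -iq "error" "${LOG_FILE}"; then
    cat "${LOG_FILE}"
    echo "Errors occurred during the restore; see the log output above."
  fi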
Lastly, a problem was found with single database restoration, where
the database dump for a single database was being incorrectly extracted
from the psql dump file, resulting in the database not being restored
correctly (most of the db being wiped out). This patch set fixes that
issue as well.
Change-Id: I4db3f6ac7e9fe7cce6a432dfba056e17ad1e3f06
In the database backup framework (_backup_main.sh.tpl), the
backup_databases function exits with code 1 if the store_backup_remotely
function fails to send the backup to the remote RGW. This causes the pod
to fail and be restarted by the cronjob over and over until the backoff
retries limit (6 by default) is reached, which creates many copies of
the same backup on the file system. The default k8s behavior is then to
delete the job/pods once the backoff limit has been exceeded, which makes
troubleshooting more difficult (although we may still have logs in
elasticsearch). This patch changes the return code to 0 so that the pod
will not fail in that scenario. The error logs generated should be
enough to flag the failure (via Nagios or whatever alerting system is
being used).
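A hedged sketch of the new behavior (only store_backup_remotely is a
real function name from the framework; the rest is illustrative):
  if ! store_backup_remotely "${TARBALL}"; then
    # Log the failure for alerting, but return success so the cronjob
    # does not retry and pile up local copies of the same backup.
    echo "Error: could not send the backup to the remote RGW."
  fi
  return 0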
Change-Id: Ie1c3a7aef290bf6de4752798821d96451c1f2fa5
- Change the docker image to point to the latest metacontroller image.
- Change the python image to point to version 3.7.
- Add updateStrategy to CompositeController.
- Add replicas config to DaemonJobController via the zuul gate.
Change-Id: I2a48bc6472017802267980fe474d81886113fcda
- This is to make use of loopback devices for ceph osds, since
support for directory-backed osds is going to be deprecated.
- Move to bluestore from filestore for ceph-osds.
- Separate DB and WAL partitions from data so that the gates validate
the scenario where a fast storage disk is used for DB and WAL (as
sketched below).
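As an illustration only (the device paths are examples), ceph-volume
can create a bluestore OSD with separate DB and WAL devices:
  ceph-volume lvm create --bluestore \
    --data /dev/loop0 \
    --block.db /dev/loop1 \
    --block.wal /dev/loop2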
Change-Id: Ief6de17c53d6cb57ef604895fdc66dc6c604fd89
OSDs fail the liveness probe if they can't make it to the 'active'
state. The noup flag keeps OSDs in the 'preboot' state, which
prevents the liveness probe from succeeding. This change adds an
additional check in the liveness probe to allow it to succeed if
the noup flag is set and OSDs are in the 'preboot' state.
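A hedged sketch of the probe logic (the socket path is an example, not
the chart's exact script):
  SOCK=/var/run/ceph/ceph-osd.0.asok
  if ceph --admin-daemon "${SOCK}" status | grep -q '"state": "active"'; then
    exit 0
  fi
  # If the noup flag is set, an OSD held in 'preboot' is still healthy
  if ceph osd dump | grep -q noup && \
     ceph --admin-daemon "${SOCK}" status | grep -q '"state": "preboot"'; then
    exit 0
  fi
  exit 1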
Change-Id: I8df5954f7bc4ef4374e19344b6e0a9130764d60c
Since mariadb 10.4.13 the definer of the mysql.user view is not root
but the mariadb.sys user, so when we remove that user we break
mysql_upgrade: it fails to fix the views. It is safe not to remove it
because the account is locked by default and cannot log in.
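This can be checked directly (a sketch, assuming a local root session):
  # The mysql.user view is now defined by mariadb.sys, not root
  mysql -e "SELECT DEFINER FROM information_schema.VIEWS WHERE TABLE_SCHEMA='mysql' AND TABLE_NAME='user';"
  # The mariadb.sys account is created locked and cannot log in
  mysql -e "SHOW CREATE USER 'mariadb.sys'@'localhost';"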
Change-Id: I5183d7cbb09e18d0e87e0aef8c59bb71ec2f1cb5
Related-Bug: https://jira.mariadb.org/browse/MDEV-22542
This patch set:
- allows options in the bootstrap job to load the TLS secret into the
appropriate envvar so the openstack client can connect to perform
bootstrap (sketched after this list);
- adds in certificates to make rally work properly with TLS endpoints;
- adds methods to handle TLS secret volume and volumeMount;
- updates ingress to handle secure backends.
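As a hedged illustration (the secret name and mount path are
hypothetical), the bootstrap idea is to mount the CA from the TLS
secret and point the openstack client at it:
  # CA bundle mounted from the TLS secret (path is illustrative)
  export OS_CACERT=/etc/ssl/certs/openstack-ca.crt
  openstack endpoint list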
Change-Id: I322cda393f18bfeed0b9f8b1827d101f60d6bdeb
Signed-off-by: Tin Lam <tin@irrational.io>
Some updates to the rgw config, such as zone or zonegroup changes made
during the bootstrap process, require an rgw restart.
Add a restart job which, when enabled, uses
'kubectl rollout restart deployment'
to restart rgw.
This is most useful in greenfield scenarios where we need to set up
zones/zonegroups right after the rgw service comes up, which requires
restarting the rgw service.
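For example (the namespace and deployment name are illustrative):
  kubectl -n ceph rollout restart deployment ceph-rgw
  kubectl -n ceph rollout status deployment ceph-rgw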
Change-Id: I6667237e92a8b87a06d2a59c65210c482f3b7302
The following enhancements are made to the Mariadb backup:
1) Used a new helm-toolkit function to send/retrieve Mariadb
backups to/from RGW via the OpenStack Swift API (sketched below).
2) Modified the backup script such that the database backup
tarball can be sent to RGW.
3) Added a keystone user for RGW access.
4) Added a secret for OpenStack Swift API access.
5) Changed the cronjob image and runAsUser.
6) Modified the restore script so that archives stored remotely
on RGW can be used for the restore data source.
7) Added functions to the restore script to retrieve data from an
archive for tables, table rows and the table schema of a database.
8) Added a secret containing all the backup/restore related
configuration needed for invoking the backup/restore operation
from a different application or namespace.
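A rough sketch of the send/retrieve flow over the Swift API (the
container and file names are illustrative; credentials come from the
new secret):
  cd /tmp
  openstack container create mariadb-backups
  openstack object create mariadb-backups mariadb.backup.tar.gz
  # Retrieval for a restore
  openstack object save mariadb-backups mariadb.backup.tar.gz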
Change-Id: Iadb9438fe419cded374897b43337039609077e61
This PS fixes:
1) Removes printing of the word "Done" after the restore/list command
executes, which is not needed and clutters the output.
2) Fixes a problem with list_tables related to its command output.
3) Fixes a parameter ordering problem with list_rows and list_schema.
4) Adds the missing menu/parameter parsing code for list_schema.
5) Fixes the backup-restore secret and the handling of PD_DUMPALL_OPTIONS.
6) Fixes single db restore, which wasn't dropping the database and
ended up adding duplicate rows.
7) Fixes cronjob deficiencies - added a security context and init
containers, and fixed backup-related service account typos.
8) Fixes get_schema so that it only finds the table requested, rather
than other tables that also start with the same substring.
9) Fixes a swift endpoint issue where the wrong endpoint was sometimes
returned, due to a bad grep command (see the sketch below).
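A hedged illustration of selecting the endpoint explicitly rather than
grepping raw output:
  openstack endpoint list --service swift --interface public -f value -c URL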
Change-Id: I0e3ab81732db031cb6e162b622efaf77bbc7ec25
This patchset is required for https://review.opendev.org/#/c/737629.
The Kubernetes python API requires these permissions for that script to
work properly.
Change-Id: I69f2ca40ab6068295a4cb2d85073183ca348af1e
This adds a chart for the node problem detector. This chart
will help provide additional insight into the status of the
underlying infrastructure of a deployment.
Updated the chart to conform to the new yamllint checks.
Change-Id: I21a24b67b121388107b20ab38ac7703c7a33f1c1
Signed-off-by: Steve Wilkerson <sw5822@att.com>
osh-infra currently has a duplicate linter playbook that is not
being used, since the other is used for both osh and osh-infra.
This change removes the duplicate entry and playbook.
Change-Id: If7040243a45f2166973dc5f0c8cd793431916942
Reverting this ps: it tried to solve the problem for old clients prior
to nautilus, but nautilus clients think it is a v2 port, try to
communicate with the server, and emit warnings as shown below.
Let's make the v2 port the default and override the mon_host config for
old clients prior to nautilus as we did in this ps
(https://review.opendev.org/#/c/711648/).
A better solution will be to move away from the old ceph clients by
changing the images wherever old ceph clients are installed.
log:
+ ceph auth get-or-create client.cinder mon 'profile rbd' osd
'profile rbd' -o /tmp/tmp.k9PBzKOyCq.keyring
2020-06-19 15:56:13.100 7febee088700 -1 --2-
172.29.0.139:0/2835096817 >> v2:172.29.0.141:6790/0 conn(0x7febe816b4d0
0x7febe816b990 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0
rx=0 tx=0)._handle_peer_banner peer v2:172.29.0.141:6790/0 is using msgr V1 protocol
This reverts commit acde91c87d5e233d1180544df919cb6603e306a9.
Change-Id: I08ef968b3e80c80b973ae4ec1f80ba1618f0e0a5
With the latest infra update, the images used no longer contain
python by default and projects are expected to use the new
ensure roles to install packages as needed.
This change adds some of the ensure roles to a few playbooks;
additional cleanup can be done using these roles in future changes.
Change-Id: Ie14ab297e71195d4fee070af253edf4d25ee5d27
Currently there are conditions that can prevent Bluestore OSDs
from deploying correctly if the disk used was previously deployed
as an OSD in another Ceph cluster. This change fixes the
ceph-volume OSD init script so it can handle these situations
correctly if OSD_FORCE_REPAIR is set.
Additionally, there is a race condition that may occur which
causes logical volumes to not get tagged with all of the
necessary metadata for OSDs to function. This change fixes
that issue as well.
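A hedged sketch of the kinds of commands involved (the device path is
an example, not the chart's exact repair logic):
  # With OSD_FORCE_REPAIR, a disk left over from a previous cluster can be wiped
  ceph-volume lvm zap --destroy /dev/sdb
  # Inspect the OSD metadata tags on the logical volumes
  lvs -o lv_name,vg_name,lv_tags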
Change-Id: I869ba97d2224081c99ed1728b1aaa1b893d47c87
This change modifies the linting job to not run when a patchset
only modifies openstack-helm documentation.
Change-Id: I0ed0fd5fff10d81dd34351b7da930d1a340b10d8
Currently OSDs are added by the ceph-osd chart with zero weight
and they get reweighted to proper weights in the ceph-client chart
after all OSDs have been deployed. This causes a problem when a
deployment is partially completed and additional OSDs are added
later. In this case the ceph-client chart has already run and the
new OSDs don't ever get weighted correctly. This change weights
OSDs properly as they are deployed instead. As noted in the
script, the noin flag may be set during the deployment to prevent
rebalancing as OSDs are added if necessary.
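For illustration (the OSD id and weight are examples):
  # Optionally prevent rebalancing while OSDs are being added
  ceph osd set noin
  # Weight the OSD as it is deployed (weight typically reflects device size in TiB)
  ceph osd crush reweight osd.7 1.8
  # Re-enable automatic 'in' marking afterwards
  ceph osd unset noin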
Added the ability to set and unset Ceph cluster flags in the
ceph-client chart.
Change-Id: Ic9a3d8d5625af49b093976a855dd66e5705d2c29