This adds a new check to make sure msgr2 is enabled if it is
supported by all of the mons. When mon quorum is lost the
mons revert to the v1 protocol, which results in a Ceph
warning state if v2 is supported by all of the available
mons.
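
A minimal sketch of such a check, assuming the cluster health is read as
JSON (for example from `ceph health --format json`) and that the condition
surfaces as the MON_MSGR2_NOT_ENABLED health check. The function name is
hypothetical, and the chart's actual script may detect this differently.

```python
import json


def msgr2_needs_enabling(health_json: str) -> bool:
    """Return True when the health report carries the msgr2 warning."""
    health = json.loads(health_json)
    # Assumption: health checks are keyed by their code under "checks".
    return "MON_MSGR2_NOT_ENABLED" in health.get("checks", {})


sample = json.dumps(
    {
        "status": "HEALTH_WARN",
        "checks": {"MON_MSGR2_NOT_ENABLED": {"severity": "HEALTH_WARN"}},
    }
)

if msgr2_needs_enabling(sample):
    # A real script would run `ceph mon enable-msgr2` at this point.
    print("msgr2 should be enabled")
```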
Change-Id: Ib85243d38f122c1993aba945b7ae943eed262dbf
Currently the postgresql database backup job fails because it does not
have the correct permissions on the mounted PVC. This patchset corrects
the permissions on the PVC mount so that the backup pods can write to the
/var/backup directory structure.
Another problem was that pg_dumpall was not able to get the correct
password from the admin_user.conf. This may be due to the extra lines
in the file, so this patchset reads it differently in order to find
the password. This was a change to the backup and restore scripts.
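
A rough illustration of reading the password while tolerating extra lines.
It assumes admin_user.conf uses a pgpass-style layout
(host:port:database:user:password), which this message does not state. The
function name and sample values are hypothetical, and escaped colons are
not handled.

```python
def read_admin_password(conf_text: str, user: str) -> str:
    """Return the password for `user`, skipping blank and comment lines."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # extra lines must not break the lookup
        fields = line.split(":")
        # Assumed layout: host:port:database:user:password
        if len(fields) == 5 and fields[3] == user:
            return fields[4]
    raise LookupError(f"no password entry for user {user!r}")


# Hypothetical file content with surrounding blank lines.
sample_conf = "\n*:*:*:postgres:s3cret\n\n"
print(read_admin_password(sample_conf, "postgres"))
```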
Also there are a number of small corrections made to the error handling
for both backup and restore scripts, to be consistent with the MariaDB
backup/restore scripts.
Change-Id: Ica361764c591099e16d03a0988f73c6976583ceb
This patch set adds the override needed to support the OpenStack Train
release by moving the libvirt version to > 3.0.0.
Change-Id: I36097544024df5c6dfc87a032bd8383be98f1a3a
Signed-off-by: Tin Lam <tin@irrational.io>
This patchset fixes a serious database restoration problem where the
user is trying to restore a single database, but in the process of
restoring the database, the script inadvertently also removes all
tables from the other databases.
The root cause was that the mysql "--one-database" restore option
achieves the single database restoration, but somehow corrupts the
other databases. The new approach taken in this patchset is to
create a temporary database user which only has permission to
restore the chosen database, and that will leave the other databases
unharmed. This approach, which can be applied for restoring
individual databases and even database tables, was recommended in (1).
After the database is restored, the temporary user is deleted.
(1) https://mariadb.com/kb/en/restoring-data-from-dump-files/
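
The temporary-user approach can be sketched as the SQL a script might
issue before and after the restore. This is an illustration only: the
function, user, and database names are hypothetical, values are not
escaped, and the actual restore script may build its statements
differently.

```python
def restore_user_statements(database, user, password, host="localhost"):
    """Build SQL to create and later drop a user limited to one database."""
    account = f"'{user}'@'{host}'"
    return {
        "setup": [
            f"CREATE USER {account} IDENTIFIED BY '{password}';",
            # Privileges are scoped to the chosen database only, so the
            # restore cannot touch tables in any other database.
            f"GRANT ALL PRIVILEGES ON `{database}`.* TO {account};",
        ],
        # Run after the dump has been loaded as the temporary user.
        "teardown": [f"DROP USER {account};"],
    }


statements = restore_user_statements("keystone", "restore_tmp", "changeme")
for sql in statements["setup"] + statements["teardown"]:
    print(sql)
```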
Some of the error handling was improved as well.
Change-Id: I805c605ed2b424640ad6a0a379b1c0b9c0004e94
The output of 'ceph pg ls-by-pool' changed format in Nautilus,
which caused the checkPGs.py script to fail in some scenarios.
This change addresses that format change and fixes Nautilus
compatibility in the script. Mimic compatibility is maintained.
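
One way to accept both formats, sketched under the assumption that the
older JSON output is a bare list of PG entries while the Nautilus output
wraps that list in an object under a "pg_stats" key. The function name is
hypothetical and checkPGs.py may handle this differently.

```python
import json


def extract_pg_stats(raw_json: str) -> list:
    """Return the list of PG entries from either output shape."""
    data = json.loads(raw_json)
    if isinstance(data, dict):
        # Assumed Nautilus shape: {"pg_stats": [...], ...}
        return data.get("pg_stats", [])
    # Assumed Mimic shape: the list itself.
    return data


mimic_output = '[{"pgid": "1.0"}, {"pgid": "1.1"}]'
nautilus_output = '{"pg_ready": true, "pg_stats": [{"pgid": "1.0"}, {"pgid": "1.1"}]}'
print(extract_pg_stats(mimic_output) == extract_pg_stats(nautilus_output))
```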
Change-Id: I11d8337b548f959d0a4b58b7e8f76720a0371e73
This patch set provides a way to specify clean up scripts for rally tests
to clean up orphaned resources in the event of rally test failures.
Change-Id: Ifc988002711d34186975988abb33ecd8a9a2fba4
Signed-off-by: Tin Lam <tin@irrational.io>
Jobs sometimes fail, and the default of 6 retries is far too brief to get
logs (which are purged after the final failure). As we need the jobs
to always succeed, a much higher default here seems prudent.
Change-Id: I7f20a3eb9a98669ae4af657d36a776830b82dfca
This fixes the logic that finds the osd id for a wal lvm and the logic
that finds the correct lvm device for an osd disk.
Change-Id: Id4ee1dbd5c82dcbe9893f81c3ad3b9e18d1f9509
This fixes the logic to use the osd device name instead of the whole disk
path while initializing the osd.
It also corrects the ceph osd ls command to use the correct keyring.
Change-Id: I90f0c3fd5d1e1b835326b1c690582990f7ca15cb
This waits for all the osd devices before initializing and also
adds a few more checks to determine whether a disk is already in use.
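
A minimal sketch of waiting for a set of devices, assuming that "ready"
simply means the device path exists. The function name, timeout, and
device path are hypothetical; the chart's script may use other criteria.

```python
import os
import time


def wait_for_devices(paths, timeout=60.0, interval=1.0,
                     exists=os.path.exists, sleep=time.sleep,
                     clock=time.monotonic):
    """Poll until every path exists or the timeout expires.

    Returns the list of paths still missing (empty on success).
    """
    deadline = clock() + timeout
    missing = [p for p in paths if not exists(p)]
    while missing and clock() < deadline:
        sleep(interval)
        missing = [p for p in missing if not exists(p)]
    return missing


# Hypothetical device path; a zero timeout makes this return immediately.
still_missing = wait_for_devices(["/dev/hypothetical-osd"], timeout=0)
if still_missing:
    print("devices not ready:", still_missing)
```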
Change-Id: I68e1d4c8c1ade39f856c69333585dfcba3ea35ab
This commit adds an audit user to the postgresql database which
will have only SELECT privileges on the postgresql database tables.
This is accomplished by setting up audit user creation parameters
in the Patroni bootstrap environment settings, according to (1).
(1) https://patroni.readthedocs.io/en/latest/ENVIRONMENT.html
Change-Id: Idf1cd90b5d093f12fa4a3c5c794d4b5bbc6c8831
In this PS we explicitly define the admin user rather than letting
patroni use the default username and password.
Change-Id: I9885314902c3a60e709f96e2850a719ff9586b3d
This patch introduces a new cluster status, "reboot", which is set
by the leader node so that the other nodes start mysql
without the "--wsrep-new-cluster" option.
Before this change, the following situation could occur:
All pods go down one by one with some offset;
The first and second nodes have the max seqno;
The script on the first node detects that there are no active
backends and starts its timeout loop;
The script on the second node detects that there are no active
backends and starts its timeout loop (with approx. 20 sec offset
from the first node);
The timeout loop finishes on the first node, which checks for the
highest seqno and lowest hostname and wins the ability to start the
cluster. Mysql is started with the "--wsrep-new-cluster" parameter.
Seqno is set to "-1" for this node after mysql startup;
A periodic job syncs values from the grastate file to the configmap;
The timeout loop finishes on the second node. It checks for the node
with the highest seqno and lowest hostname, and since seqno is already
"-1" for the first node, the second node decides that it should
lead the cluster startup and also executes mysql with the
"--wsrep-new-cluster" option, which leads to split brain.
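
The race above, and how a "reboot" status avoids it, can be sketched as
follows. The function names, hostnames, and seqno values are hypothetical;
the real start script reads this state from the configmap and is more
involved.

```python
def pick_leader(nodes):
    """Pick the node with the highest seqno, then the lowest hostname.

    `nodes` maps hostname -> seqno as last synced to shared state.
    """
    return min(nodes, key=lambda host: (-nodes[host], host))


def mysqld_extra_args(hostname, nodes, cluster_status):
    """Decide whether this node should bootstrap a new cluster."""
    if cluster_status == "reboot":
        # A leader has already claimed the bootstrap: join it instead.
        return []
    if pick_leader(nodes) == hostname:
        return ["--wsrep-new-cluster"]
    return []


before = {"mariadb-0": 100, "mariadb-1": 100, "mariadb-2": 90}
# After mariadb-0 bootstraps, its seqno is synced as -1.
after = {"mariadb-0": -1, "mariadb-1": 100, "mariadb-2": 90}

print(mysqld_extra_args("mariadb-0", before, "down"))    # first node bootstraps
print(mysqld_extra_args("mariadb-1", after, "down"))     # old behaviour: second bootstrap
print(mysqld_extra_args("mariadb-1", after, "reboot"))   # new behaviour: joins
```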
Change-Id: Ic63fd916289cb05411544cb33d5fdeed1352b380
The values.yaml in the LDAP chart contains a duplicate network_policy:
key in the manifests: section. This patch removes the duplicate.
Change-Id: I677acaf7d96d92fecb93c30782f1e760ab4bec84
Signed-off-by: Tin Lam <tin@irrational.io>
When DPDK is enabled, configuring CPU resource limits
through Kubernetes adversely affects packet throughput:
the DPDK PMD cores cannot reach 100% busy.
The cores instead need to be isolated in the host grub
configuration and then assigned through the PMD core mask.
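
For illustration, a PMD core mask is a hex bitmask with one bit set per
core. The sketch below derives it from a list of core ids; the function
name and the core numbers are hypothetical.

```python
def pmd_core_mask(cores):
    """Return a hex bitmask with bit N set for every core id N."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return hex(mask)


# Hypothetical example: cores 2 and 3 isolated on the host for PMD threads.
print(pmd_core_mask([2, 3]))
```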
Change-Id: Ia80880302b9c5c02fdb1c00cb62f6640860e898e
An audit user is added to MariaDB with only the SELECT permission
on the mysql database user table, for database user audit purposes.
Change-Id: I5d046dd263e0994fea66e69359931b7dba4a766c
This updates the overrides provided for deploying fluentd as a
daemonset to get kernel messages from the journal instead of
/var/log/kern.log directly, and also uses the journal to get
messages associated with logging to auth.log (syslog facility
10). This provides additional metadata and a cleaner interface for
gathering these logs via fluentd.
Change-Id: I8e832db276095771d6a869e998d7a69795dfee37
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This moves from using the docker profile to the default
runtime profile - which allows container engines other than
docker to work out of the box.
Change-Id: Ica5a48f8c43b90f07969b41e10dc472a772b5b43
Signed-off-by: Pete Birley <pete@port.direct>
Validate that the container bucket exists and, if so,
delete it and its objects that were orphaned by the
helm-tests of a failed deployment.
Change-Id: Ibaa6d0f6dd36b319c354b65e43dc6053418f4d1d
In the Ceph Cluster Dashboard, the OSDs In, OSDs Out, OSDs Down panel was
showing wrong values. This updates the expression from "count" to "sum"
to show the correct values.
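
The difference between the two aggregations can be illustrated with a toy
example, assuming each OSD reports a 0/1 value for the state in question
(an assumption; the message does not describe the metric). Counting
returns the number of reporting OSDs regardless of value, while summing
returns how many are actually in that state.

```python
# Hypothetical samples: 1 means the OSD is "in", 0 means it is not.
osd_in = {"osd.0": 1, "osd.1": 1, "osd.2": 0}

counted = len(osd_in)          # like "count": every series, whatever its value
summed = sum(osd_in.values())  # like "sum": only the OSDs reporting 1

print(counted, summed)
```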
Change-Id: I1959eeb445bf297c1ec696f3867315f05552b03e
This patch set adds a default Kubernetes egress network
policy to the postgresql database chart.
Change-Id: I6caa917faf23becc3a1c09b47f457b8b2db996e4
Signed-off-by: Tin Lam <tin@irrational.io>
This change adds a means of introducing new storage classes
and local persistent volumes.
Change-Id: I340c75f3d0a1678f3149f3cf62e4ab104823cc49
Co-Authored-By: Steven Fitzpatrick <steven.fitzpatrick@att.com>