3313 Commits

Author SHA1 Message Date
Parsons, Cliff (cp769u)
7bb5ff5502 Make ceph-client helm test more PG specific
This patchset makes the current ceph-client helm test more specific
about checking each of the PGs that are transitioning through inactive
states during the test. If any single PG spends more than 30 seconds in
any of these inactive states (peering, activating, creating, unknown,
etc), then the test will fail.

Also, once the three-minute PG checking period has expired, we will
no longer fail the helm test, as it is very possible that the autoscaler
could still be adjusting the PGs for several minutes after a deployment
is done.
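
A minimal sketch of the per-PG check described above, not the chart's
actual test script. It assumes the test can sample a mapping of PG id to
state string at known timestamps; the function name and the sample format
are hypothetical.

```python
# Inactive states named in the commit message. The message ends its list
# with "etc", so this set is an assumption, not the chart's full list.
INACTIVE_STATES = {"peering", "activating", "creating", "unknown"}
MAX_INACTIVE_SECONDS = 30  # per-PG limit stated in the commit message


def pgs_stuck_inactive(samples, limit=MAX_INACTIVE_SECONDS):
    """Return the PG ids that stayed inactive for more than `limit` seconds.

    `samples` is a time-ordered iterable of (timestamp_seconds, {pg_id: state}),
    where state is a "+"-joined string such as "active+clean".
    """
    inactive_since = {}  # pg_id -> timestamp of the first inactive sample
    stuck = set()
    for timestamp, pg_states in samples:
        for pg_id, state in pg_states.items():
            if INACTIVE_STATES & set(state.split("+")):
                started = inactive_since.setdefault(pg_id, timestamp)
                if timestamp - started > limit:
                    stuck.add(pg_id)
            else:
                # The PG left the inactive states, so its timer resets.
                inactive_since.pop(pg_id, None)
    return sorted(stuck)
```

A caller would fail the test if the returned list is non-empty during the
checking period, and stop checking once that period has expired.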

Change-Id: I7f3209b7b3399feb7bec7598e6e88d7680f825c4
2021-04-16 22:25:53 +00:00
Steven Fitzpatrick
38e6023351 Elasticsearch: Add configurable backoffLimit to templates job
This change allows us to control the backoffLimit for this job.

Change-Id: I9c3ccc0842a0e5c31b7838576648dae966b15a6e
2021-04-16 18:01:47 +00:00
Zuul
704d808514 Merge "Refactor Ceph OSD Init Scripts - First PS" 2021-04-15 22:25:29 +00:00
Zuul
daca15441b Merge "Removed hard-coded value for backendPort" 2021-04-15 22:24:47 +00:00
Lo, Chi (cl566n)
3b030aa40d Removed hard-coded value for backendPort
This change will retrieve the backend port from values.yaml
instead of a hard-coded value.

Change-Id: I27630d3ead2c8a517f4fe8577e8396776010f9a8
2021-04-15 13:33:22 -07:00
Gage Hugo
14636aa776 Remove releasenotes from irrelevant files for linter
This change removes the releasenotes directory from the
irrelevant-files list in the zuul linter since the linter actually
checks those files, so for issues with the releasenotes it may
be difficult to test fixes when charts become out of date.

Change-Id: I3c4f95a5bc5fb8d9a0ec8dbb8d2f9560f1e46f9a
2021-04-15 15:01:05 -05:00
Ritchie, Frank (fr801x)
207da4426a Update tls overrides
Updated tls overrides for proper gate functionality.

Change-Id: I59d9e0425b41a5121fc0a6d0d75b7f6e3d54bec6
2021-04-15 18:09:48 +00:00
Huy Tran
c60c138777 Enhancements to make stats cachedump configurable
Memcached stats cachedump is enabled by default. Changes in this
patchset provide an option to configure stats cachedump as desired
during deployment, i.e. the stats cachedump can be disabled to
prevent users from obtaining sensitive info via the cachedump data.

Change-Id: Ic6254f89b1478a414ac275436ddd659b16b75f98
2021-04-14 22:52:18 +00:00
Roy Tang
a671d40a52 Support override of ovs probes
Currently the ovs liveness and readiness probe commands are statically
defined in the templates. This change allows them to be changed
as needed, which helps with debugging and making quick adjustments.

Change-Id: I75b4b5a335b75a52f4efbd4ba4ed007106aba4fa
2021-04-14 16:03:19 -04:00
Pai, Radhika (rp592h)
dbb20c786d [fix] Update the ES curator config
The curator actions in the configmap get set to
null, which causes an error when rendering any actions downstream.
Adding the {} should resolve this issue.

Change-Id: I8c337ee1f089c13f75cb7a9997a7bf6f04246160
2021-04-14 14:35:00 -05:00
Zuul
413bd4f850 Merge "Adjust Prometheus http readiness probe path from /status to /-/ready" 2021-04-14 17:06:53 +00:00
DeJaeger, Darren (dd118r)
be2584fd7c Adjust Prometheus http readiness probe path from /status to /-/ready
Prometheus documentation shows that /-/ready can be used to check that
it is ready to service traffic (i.e. respond to queries) [0]. I've
witnessed cases where Prometheus's readiness probe is passing during
initial deployment using /status, which in turn triggers its helm test
to start. Said helm test then fails because /status is not a
reliable indicator that Prometheus is actually ready to serve traffic
and the helm test is performing actions that require it to be properly
up and ready.

[0]: https://prometheus.io/docs/prometheus/latest/management_api/

Change-Id: Iab22d0c986d680663fbe8e84d6c0d89b03dc6428
2021-04-13 13:17:49 -04:00
Zuul
561f398ad7 Merge "Elasticsearch: Make templates job more robust" 2021-04-13 16:04:57 +00:00
Parsons, Cliff (cp769u)
aaa85e3fc5 Refactor Ceph OSD Init Scripts - First PS
This is the first of multiple updates to ceph-osd where the OSD
init code will be refactored for better sustainability.

This patchset makes 2 changes:

1) Removes "ceph-disk" support, as ceph-disk has been removed from the
   ceph image since nautilus.
2) Separates the initialization code for the bluestore, filestore,
   and directory backend configuration options.

Change-Id: I116ce9cc8d3bac870adba8b84677ec652bbb0dd4
2021-04-12 19:36:32 +00:00
Steven Fitzpatrick
d3c6069be3 Elasticsearch: Make templates job more robust
This change primarily changes the type of the api_objects yaml structure
to a map, which allows for additional objects to be added by values
overrides (arrays/lists cannot be extended this way).
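
A minimal sketch of why a map helps here, assuming the usual values-merge
rule where maps are merged key by key and lists are replaced wholesale.
The merge function and the object names below are hypothetical and only
illustrate the rule; they are not the chart's code.

```python
def merge_values(base, override):
    """Merge `override` into `base`: maps merge recursively, and any other
    value (lists included) is replaced by the override."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value
    return merged


# With a list, an override drops the chart defaults.
list_defaults = {"api_objects": [{"name": "default_template"}]}
list_override = {"api_objects": [{"name": "extra_template"}]}
as_list = merge_values(list_defaults, list_override)

# With a map, an override adds to the chart defaults.
map_defaults = {"api_objects": {"default_template": {"method": "PUT"}}}
map_override = {"api_objects": {"extra_template": {"method": "PUT"}}}
as_map = merge_values(map_defaults, map_override)
```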

Also, in the previous change, some scripts in HTK were modified, while
others were copied over to the Elasticsearch chart. To simplify the chart's
structure, this change also moves the create_s3_bucket script to Elasticsearch,
and reverts the changes in HTK.

Those HTK scripts are no longer referenced by osh charts, and could be candidates
for removal if that chart needed to be pruned.

Change-Id: I7d8d7ef28223948437450dcb64bd03f2975ad54d
2021-04-12 18:40:11 +00:00
Gage Hugo
25c897fb89 Move shaker chart from osh-addons
This change moves the shaker chart from the osh-addons repo
to this one.

Change-Id: Ica2c7668a7ab047f8ed2361234b5810eedc9c1e2
2021-04-08 04:06:00 +00:00
Zuul
c6786de152 Merge "Enable TLS for Ceph RGW" 2021-04-06 20:53:03 +00:00
Ritchie, Frank (fr801x)
e954253a1a Enable TLS for Ceph RGW
This PS is to optionally enable tls for ceph-rgw.

Change-Id: I4797ef41612143f8065ac8fec20ddeae2c0218a3
2021-04-06 18:44:59 +00:00
Steven Fitzpatrick
6de864110e Elasticsearch S3 Update
This change updates how the Elasticsearch chart handles
S3 configuration and snapshot repository registration.

This allows for
  - Multiple snapshot destinations to be configured
  - Repositories to use a specific placement target
  - Management of multiple account credentials

Change-Id: I12de918adc5964a4ded46f6f6cd3fa94c7235112
2021-04-06 15:12:34 +00:00
Chris Wedgwood
20cf2db961 [htk] Jobs; put labels only in the template spec
This is an update to address a behavior change introduced with
0ae8f4d21ac2a091f1612e50f4786da5065d4398.

Job labels if empty/unspecified are taken from the template.  If (any)
labels are specified on the job we do not get this behavior.

Specifically if we *apply*:

    apiVersion: batch/v1
    kind: Job
    metadata:
      # no "labels:" here
      name: placement-db-init
      namespace: openstack
    spec:
      template:
        metadata:
          labels:
            application: placement
            component: db-init
            release_group: placement
        spec:
          containers:
          # do stuffs

then *query* we see:

    apiVersion: batch/v1
    kind: Job
    metadata:
      # k8s did this for us!
      labels:
        application: placement
        component: db-init
        job-name: placement-db-init
        release_group: placement
      name: placement-db-init
      namespace: openstack
    spec:
      template:
        metadata:
          labels:
            application: placement
            component: db-init
            release_group: placement
        spec:
          containers:
          # do stuffs

The aforementioned change causes objects we apply and query to look
like:

    apiVersion: batch/v1
    kind: Job
    metadata:
      # k8s did this for us!
      labels:
        application: placement
        # nothing else!
      name: placement-db-init
      namespace: openstack
    spec:
      template:
        metadata:
          labels:
            application: placement
            component: db-init
            release_group: placement
        spec:
          containers:
          # do stuffs

Current users rely on this behavior and deployment systems use job
labels for synchronization, those labels being only specified in the
template and propagating to the job.
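
A simplified model of the label behavior shown in the three manifests
above, written as a sketch. It only reproduces what this message
describes and is not taken from the Kubernetes source; the function name
is hypothetical.

```python
def effective_job_labels(job):
    """Return the labels a query would show for an applied Job manifest,
    following the behavior described in this commit message."""
    own_labels = job.get("metadata", {}).get("labels")
    if own_labels:
        # Any label set on the Job itself suppresses the propagation.
        return dict(own_labels)
    template = job["spec"]["template"]
    labels = dict(template.get("metadata", {}).get("labels", {}))
    # The queried object in the example also carries a job-name label.
    labels["job-name"] = job["metadata"]["name"]
    return labels
```

Applied to the first manifest, this yields the four labels shown in the
queried object; applied to a Job that sets only "application", it yields
that single label.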

This change preserves functionality added recently and restores the
previous behavior.

The explicit "application" label is no longer needed as the
helm-toolkit.snippets.kubernetes_metadata_labels macro provides it.

Change-Id: I1582d008217b8848103579b826fae065c538aaf0
2021-04-02 16:54:03 -05:00
Zuul
7351586a7d Merge "Allow Ceph RBD pool job to leave failed pods" 2021-03-30 08:04:06 +00:00
Zuul
3ef3ad1432 Merge "HTK: Override the expiry of Ingress TLS certificate" 2021-03-29 21:23:14 +00:00
Zuul
f1384caca6 Merge "[Update] NPD systemd-monitor lookback duration" 2021-03-29 19:54:10 +00:00
Zuul
4ed24de14b Merge "[ceph-osd] Update directory-based OSD deployment for image changes" 2021-03-29 19:48:43 +00:00
Parsons, Cliff (cp769u)
f20eff164f Allow Ceph RBD pool job to leave failed pods
This patchset will add the capability to configure the
Ceph RBD pool job to leave failed pods behind for debugging
purposes, if it is desired. Default is to not leave them
behind, which is the current behavior.

Change-Id: Ife63b73f89996d59b75ec617129818068b060d1c
2021-03-29 19:38:55 +00:00
Chinasubbareddy Mallavarapu
734b344bf6 [ceph-provisioners] Update ceph_mon config as per new ceph clients
New ceph clients expect the ceph_mon config in the format shown below, so
this PS updates the configmap accordingly.

mon_host = [v1:172.29.1.139:6789/0,v2:172.29.1.139:3300/0],
[v1:172.29.1.140:6789/0,v2:172.29.1.140:3300/0],
[v1:172.29.1.145:6789/0,v2:172.29.1.145:3300/0]
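
A minimal sketch of how a value in this format could be built from a list
of monitor IPs. The function name is hypothetical, the ports are the ones
shown in the example above, and joining the entries on a single line is
an assumption.

```python
def build_mon_host(mon_ips, v1_port=6789, v2_port=3300):
    """Format monitor IPs as bracketed v1/v2 address pairs, comma-separated."""
    return ",".join(
        f"[v1:{ip}:{v1_port}/0,v2:{ip}:{v2_port}/0]" for ip in mon_ips
    )


mon_host = build_mon_host(["172.29.1.139", "172.29.1.140"])
```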

Change-Id: I6b96bf5bd4fb29bf1e004fc2ce8514979da706ed
2021-03-29 15:02:08 +00:00
Stephen Taylor
131ea21512 [ceph-osd] Update directory-based OSD deployment for image changes
Directory-based OSDs are failing to deploy because 'python' has
been replaced with 'python3' in the image. This change updates the
python commands to use python3 instead.

There is also a dependency on forego, which has been removed from
the image. This change also modifies the deployment so that it
doesn't depend on forego.

Ownership of the OSD keyring file has also been changed so that it
is owned by the 'ceph' user, and the ceph-osd process now uses
--setuser and --setgroup to run as the same user.

Change-Id: If825df283bca0b9f54406084ac4b8f958a69eab7
2021-03-29 14:40:28 +00:00
Zuul
1f52a1c24c Merge "Set strict permission on mariadb data dir" 2021-03-26 22:20:32 +00:00
Zuul
0d8331d7ec Merge "fix(script): removes replacement overrides" 2021-03-26 18:00:33 +00:00
Radhika Pai
e9fce11161 [Update] NPD systemd-monitor lookback duration
This PS adds a lookback duration of 5m to the systemd-monitor so that it
does not look back indefinitely in the journal log and cause the alert to
stick around.

Change-Id: Ia32f043c0c7484d0bb92cfc4b68b506eae8e9d72
2021-03-26 15:24:10 +00:00
Gupta, Sangeet (sg774j)
f4ce1c8681 HTK: Override the expiry of Ingress TLS certificate
v1.2.0 of cert-manager now supports overriding the default value
of ingress certificate expiry via annotations. This PS adds the
required annotation.

Change-Id: Ic81e47f24d4e488eb4fc09688c36a6cea324e9e2
2021-03-25 22:18:57 +00:00
Huang, Sophie (sh879n)
6eec615b39 Set strict permission on mariadb data dir
For security reasons, strict access permission is given to
the mariadb data directory /var/lib/mysql

Change-Id: I9e55a7e564d66874a35a54a72817fa1237a162e9
2021-03-24 20:20:03 +00:00
Zuul
b3888df131 Merge "Elasticsearch Disable Curator in Gate & Chart Defaults" 2021-03-24 02:08:39 +00:00
Parsons, Cliff (cp769u)
167b9eb1a8 Fix ceph-client helm test
This patch resolves a helm test problem where the test was failing
if it found a PG state of "activating". It could also potentially
find a number of other states, like premerge or unknown, that
could also fail the test. Note that if these transient PG states are
found for more than 3 minutes, the helm test fails.

Change-Id: I071bcfedf7e4079e085c2f72d2fbab3adc0b027c
2021-03-22 22:06:27 +00:00
Steven Fitzpatrick
4fb159f7a3 Elasticsearch Disable Curator in Gate & Chart Defaults
Since chart v0.1.3 SLM policies have been supported, but we still
run curator in the gate, and its manifest toggles still default to
true

Change-Id: I5d8a29ae78fa4f93cb71bdf6c7d1ab3254c31325
2021-03-22 21:16:59 +00:00
Tin Lam
b72f750e87 fix(script): removes replacement overrides
This removes the functionality to perform envsubst in the feature
gate script to prevent users with specific env variables set from
running into unexpected errors. This feature will be revisited in the
future to be made more robust.

Signed-off-by: Tin Lam <tin@irrational.io>
Change-Id: I6dcfd4dad138573294a9222e4e7af80c9bff4ac0
2021-03-19 01:14:09 -05:00
Zuul
43226de6e3 Merge "Enable TLS between Prometheus and Grafana" 2021-03-18 15:28:34 +00:00
Zuul
f78cbde672 Merge "Enable TLS for Prometheus" 2021-03-18 07:00:03 +00:00
Lo, Chi (cl566n)
86112314ed Enable TLS between Prometheus and Grafana
This patchset enables the TLS path between Prometheus and Grafana.
Grafana pulls data from Prometheus. As such, Prometheus is the
server and Grafana is the client for the TLS handshake.

Change-Id: I50cb6f59472155415cff16a81ebaebd192064d65
2021-03-18 02:12:16 +00:00
Lo, Chi (cl566n)
1892fca645 Enable TLS for Prometheus
This patchset enables the TLS path for Prometheus when it acts as
a server. Note that TLS is not directly terminated at Prometheus.
TLS is terminated at the apache proxy, which in turn routes requests
to Prometheus.

Change-Id: I0db366b6237a34da2e9a31345d96ae8f63815fa2
2021-03-17 17:06:07 -07:00
Zuul
8c2bcb1429 Merge "Disable mariadb mysql history client logging" 2021-03-17 19:15:32 +00:00
Smith, David (ds3330)
96b751465a Upgrade Prometheus to v2.25 change/Remove deprecated flags
The flag storage.tsdb.retention is deprecated and generates warnings
on startup; storage.tsdb.retention.time is the new flag.

storage.tsdb.wal-compression is enabled by default in v2.20
and above, so the flag is no longer needed.

Change-Id: I66f861a354a3cdde69a712ca5fd8a1d1a1eca60a
2021-03-16 18:19:49 +00:00
Zuul
58d9a62e73 Merge "Pin a few Java configuration values to 8-13" 2021-03-16 05:50:45 +00:00
Ritchie, Frank (fr801x)
05cad716e5 Add support for rgw placement targets
This PS adds support for rgw placement targets:

https://docs.ceph.com/en/latest/radosgw/placement/#placement-targets

Change-Id: I6fc643994dcf2c15a04f07b8703968a76c009c18
2021-03-12 22:16:41 +00:00
Huang, Sophie (sh879n)
87429ebb86 Disable mariadb mysql history client logging
Environment variable MYSQL_HISTFILE is added to mariadb container
to disable storing client mysql history to ~/.mysql_history file.

Change-Id: Ie95bc1f830fbf34d30c73de07513299115d8e8c5
2021-03-12 20:50:15 +00:00
Stephen Taylor
69a7916b92 [ceph-client] Disable autoscaling before pools are created
When autoscaling is disabled after pools are created, there is an
opportunity for some autoscaling to take place before autoscaling
is disabled. This change checks to see if autoscaling needs to be
disabled before creating pools, then checks to see if it needs to
be enabled after creating pools. This ensures that autoscaling
won't happen when autoscaler is disabled and autoscaling won't
start prematurely as pools are being created when it is enabled.
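
A minimal sketch of the ordering described above. The cluster object, its
method names, and the recording stand-in are hypothetical; the real
change lives in the ceph-client pool scripts.

```python
class RecordingCluster:
    """Hypothetical stand-in for a Ceph cluster that records each call."""

    def __init__(self):
        self.events = []

    def set_autoscaling(self, enabled):
        self.events.append(("autoscale", enabled))

    def create_pool(self, name):
        self.events.append(("pool", name))


def create_pools(cluster, pool_names, autoscaling_enabled):
    """Disable autoscaling before pool creation when it should be off,
    and enable it only after all pools exist when it should be on."""
    if not autoscaling_enabled:
        cluster.set_autoscaling(False)
    for name in pool_names:
        cluster.create_pool(name)
    if autoscaling_enabled:
        cluster.set_autoscaling(True)
```

In both cases no pool is created while the autoscaler is in a state that
the deployment did not ask for.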

Change-Id: I8803b799b51735ecd3a4878d62be45ec50bbbe19
2021-03-12 15:03:51 +00:00
Kiran Kumar Surapathi (ks342f)
4b42f3f57f Fix Helm tests for the Ceph provisioners
We are adding the node selectors to helm tests for Ceph provisioners

Change-Id: I0fc9a78dcd27a92486dc724ce9294da96826eac9
2021-03-11 17:34:12 +00:00
Zuul
ff81e97301 Merge "Stop using fsGroup inside container securityContext" 2021-03-11 03:11:42 +00:00
Zuul
8a3151a7c6 Merge "Replace brace expansion with more standardized Posix approach" 2021-03-10 23:57:29 +00:00
Mohammed Naser
737f5610e3 Pin a few Java configuration values to 8-13
The newer versions of Elasticsearch use Java 15, which has dropped
some of those options. We can keep backwards compatibility by
pinning them to certain versions[1].

[1]: https://discuss.elastic.co/t/elasticsearch-wont-start-after-7-9-1-to-7-9-2-upgrade/249878/2

Change-Id: Iaa29bc202d9eb9c5eda3040b38596f0524a0c453
2021-03-10 17:23:36 -05:00