24 Commits

Author SHA1 Message Date
Pete Birley
6ea6a85198 Ceph: Update default to use OSH image
This PS updates the default image in the chart to the latest OSH image.

Change-Id: Ib8d2a72ad48049fe02560dc4405f0088890b6f64
Signed-off-by: Pete Birley <pete@port.direct>
2019-02-01 21:25:13 +00:00
Matthew Heler
c0d028e245 Uplift Ceph charts to the Mimic release
Change the release of Ceph from 12.2.3 (Luminous) to the latest 13.2.2
(Mimic). Additionally, use supported RHEL/CentOS images rather than
Ubuntu images, which are now considered deprecated by Red Hat.

- Uplift all Ceph images to the latest 13.2.2 ceph-container images.
- RadosGW by default will now use the Beast backend.
- RadosGW has relaxed settings enabled for S3 naming conventions.
- Increased RadosGW resource limits due to backend change.
- All Luminous specific tests now test for both Luminous/Mimic.
- Gate scripts will remove all non-required ceph packages. This is
required to not conflict with the pid/gid that the Red Hat container
uses.

Change-Id: I9c00f3baa6c427e6223596ade95c65c331e763fb
2019-01-05 14:38:38 +00:00
Chris Wedgwood
0c4e37391f 'NOP' cleanup for more consistent white-space use in charts
Where we have the style '{{ ...' we should use the style '... }}'.

Change-Id: Ic3e779e4681370d396f95d3804ca27db5b9d3642
2019-01-03 22:45:49 +00:00
Pete Birley
90700f5a76 Ceph: Add labels to secrets created by charts
This PS adds labels to secrets created by charts, which allows them
to be easily identified in deployed sites.

PS4: This PS resolves undefined variable "$envAll" issue

Change-Id: Icbe3584b0ac18b23e32489c4a04ad5aa7aad67e6
Signed-off-by: Pete Birley <pete@port.direct>
2018-12-06 04:15:29 +00:00
Matthew Heler
4ad893eb1a Additional Ceph tuning parameters for openstack-helm
osd_scrub_load_threshold set to 10.0 (default 0.5)
 - With the number of multi-core processors nowadays, it's fairly
   typical to see systems over a load of 1.0. We need to adjust the
   scrub load threshold so that scrubbing runs as scheduled even
   when a node is moderately/lightly under load.

filestore_max_sync_interval set to 10s (default 5s)
 - Larger default journal sizes (>1GB) will not be effectively used
   unless the max sync interval time is increased for Filestore. The
   benefit of this change is increased performance especially around
   sequential write workloads.

mon_osd_down_out_interval set to 1800s (default 600s)
 - OSD PODs can take longer than several minutes to boot up. Mark
   an OSD as 'out' in the CRUSH map only after 30 minutes (1800s) of
   being 'down'.

Change-Id: I62d6d0de436c270d3295671f8c7f74c89b3bd71e
2018-12-04 20:27:52 -06:00
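The three overrides described in the commit above can be sketched as a small Python snippet that renders them as ceph.conf-style sections. This is only an illustration: the names `TUNING_OVERRIDES` and `render_conf` are hypothetical, and the section each option is placed in is an assumption, not taken from the chart.

```python
# Hypothetical sketch: the overrides named in the commit message, rendered as
# ceph.conf-style text. The section placement ([osd] vs [global]) is an
# assumption made for illustration, not the chart's actual layout.
TUNING_OVERRIDES = {
    "osd": {
        "osd_scrub_load_threshold": 10.0,   # Ceph default: 0.5
        "filestore_max_sync_interval": 10,  # seconds; Ceph default: 5
    },
    "global": {
        "mon_osd_down_out_interval": 1800,  # seconds; Ceph default: 600
    },
}


def render_conf(sections):
    """Render {section: {option: value}} as INI-style text."""
    lines = []
    for section, options in sections.items():
        lines.append(f"[{section}]")
        for key, value in options.items():
            lines.append(f"{key} = {value}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_conf(TUNING_OVERRIDES))
```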
Matthew Heler
35cce6cb43 Switch Ceph to IPs when DNS is down
Add helper scripts that are called by a POD to switch
Ceph from DNS to IPs. This POD will loop every 5 minutes
to catch cases where the DNS might be unavailable.

When a POD's Service starts, switch ceph.conf to use IPs rather
than DNS.

Change-Id: I402199f55792ca9f5f28e436ff44d4a6ac9b7cf9
2018-12-03 10:51:37 -06:00
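The DNS-to-IP switch described in the commit above might look roughly like the following Python sketch. It is not the chart's helper script: the function name `mon_hosts_to_ips`, the `mon_host = a,b,c` line format without port suffixes, and the choice to keep unresolvable names unchanged are all assumptions made for illustration.

```python
import socket


def mon_hosts_to_ips(mon_host_line, resolve=socket.gethostbyname):
    """Replace each hostname in a 'mon_host = a,b,c' line with its IP.

    Assumes plain hostnames without ':port' suffixes. Names that fail to
    resolve are kept as-is so the line stays usable.
    """
    key, _, value = mon_host_line.partition("=")
    resolved = []
    for host in value.split(","):
        host = host.strip()
        if not host:
            continue
        try:
            resolved.append(resolve(host))
        except OSError:  # socket.gaierror is a subclass of OSError
            resolved.append(host)
    return f"{key.strip()} = {','.join(resolved)}"
```

Passing the resolver as a parameter keeps the function testable without a working DNS server; a periodic caller (the commit mentions a 5-minute loop) would rewrite the configuration file with the returned line.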
Zuul
5bf9c26bd8 Merge "Move default CEPH journal size from 5GB to 10GB" 2018-11-13 05:28:45 +00:00
Matthew Heler
55446e1f41 Move default CEPH journal size from 5GB to 10GB
Request from downstream to use 10GB journal sizes. Journals are
currently created manually, but there is upcoming work to have the
journals created by the Helm charts themselves. This value needs to be
put in as a default to ensure journals are sized appropriately.

Change-Id: Idaf46fac159ffc49063cee1628c63d5bd42b4bc6
2018-11-08 17:34:12 +00:00
Matthew Heler
e1c82f3465 Fix the checkPGs cronjob
Currently the cronjob is broken due to syntax and
permission issues.

Additionally move the cronjob from once a month to
every 15 minutes, and automatically disable the job
unless explicitly enabled.

Change-Id: Id72bdb286c805ccb0ea4e9fcf65fabca94a180dd
2018-11-06 19:39:23 -06:00
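For the schedule change in the commit above: in standard cron syntax, which Kubernetes CronJobs use, "every 15 minutes" is commonly written as `*/15 * * * *` and "once a month" as `0 0 1 * *`. The exact strings used by the chart are not shown here, so treat both as assumptions. The hypothetical helper below expands a `*/N` minute field to show when such a job would fire.

```python
def step_minutes(field):
    """Expand a '*/N' cron minute field into the minutes it fires on."""
    if not field.startswith("*/"):
        raise ValueError("only '*/N' step fields are handled in this sketch")
    step = int(field[2:])
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(0, 60, step))


if __name__ == "__main__":
    schedule = "*/15 * * * *"            # assumed schedule string
    minute_field = schedule.split()[0]
    print(step_minutes(minute_field))    # [0, 15, 30, 45]
```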
Steve Wilkerson
45da8c2b69 Ceph: Update log directory host mount path
This updates the ceph-mon and ceph-osd charts to use the release
name for the hostpath defined for mounting the /var/log/ceph
directories to. This gives us a mechanism for creating unique log
directories for multiple releases of the same chart without the
need for specifying an override for each deployment of that chart

Change-Id: Ie6e05b99c32f24440fbade02d59c7bb14d8aa4c8
2018-10-29 13:05:46 -05:00
Matthew Heler
6ef48d3706 Further performance tuning changes for Ceph
- Throttle down snap trimming to lessen its performance impact
(Setting just osd_snap_trim_priority isn't effective enough to throttle
down the impact)
osd_snap_trim_sleep: 0.1 (default 0)
osd_pg_max_concurrent_snap_trims: 1 (default 2)

- Align filestore_merge_threshold with upstream Ceph values
(A negative number disables this function, no change in behavior)
filestore_merge_threshold: -10 (formerly -50, default 10)

- Increase RGW pool thread size for more concurrent connections
rgw_thread_pool_size: 512 (default 100)

- Disable in-memory logs for the ms subsystem.
debug_ms: 0/0 (default 0/5)

- Formatting cleanups

Change-Id: I4aefcb6e774cb3e1252e52ca6003cec495556467
2018-10-26 15:10:50 +00:00
Chinasubbareddy M
a1b8f394b2 ceph: make log directory configurable
This makes the log directory configurable, so that another mon or
OSD running on the same host can point to a different directory.

Change-Id: I2db6dffd45599386f8082db8f893c799d139aba3
2018-10-25 14:34:14 +00:00
Zuul
f49461acc4 Merge "cronjob-checkPGs failure fix" 2018-10-23 20:21:46 +00:00
Zuul
4c4e947e17 Merge "Ceph: A script to check object replication across the hosts" 2018-10-23 18:25:43 +00:00
Matthew Heler
154fcd894f Use the correct socket file for the Ceph mon check.
Change-Id: If8c40c3c0501b78db88d3a7f33bf3838c0e60199
Closes-Bug: 1796313
2018-10-22 04:56:13 +00:00
Chinasubbareddy M
26991ad182 Ceph: A script to check object replication across the hosts
This script creates an object and checks whether the object is
replicated across different hosts.

Change-Id: Ic5056c1a07dc5d5b6a5d6fc24e3d9a75fa46458f
2018-10-21 15:38:26 +00:00
Steve Wilkerson
92717bdc72 Ceph: Remove fluentbit sidecars, mount hostpath for logs
This removes the fluentbit sidecars from the ceph-mon and ceph-osd
charts. Instead, we mount /var/log/ceph as a hostpath, and use the
fluentbit daemonset to target the mounted log files.

This also updates the fluentd configuration to better handle the
correct configuration type for flush_interval (time vs int), as
well as updates the fluentd elasticsearch output values to help
address the gate failures resulting from the Elasticsearch bulk
endpoints failing

Change-Id: If3f2ff6371f267ed72379de25ff463079ba4cddc
2018-10-17 11:05:03 -05:00
kranthi guttikonda
549bf29fd8 cronjob-checkPGs failure fix
Added role and rolebindings to fix permissions.
Added volume definitions for ceph-bin, ceph-etc
and ceph-client-adminkeyring, as well as
serviceaccount and node selectors.

Implements: Bug 1797589
Closes-Bug: #1797589
Change-Id: Ib0e77e088c6aa82e441aba72bebc4b258deb88c4
Signed-off-by: Kranthi Kiran Guttikonda <kranthi.guttikonda@b-yond.com>
2018-10-13 18:45:10 -04:00
Chinasubbareddy M
2f2cb7d567 Ceph: Add configmap hash as annotation
Add a configmap hash to the following daemonsets/deployments to
trigger rolling updates whenever the configmap is updated:

- ceph-mon
- ceph-mds
- ceph-mgr
- ceph-rgw

Change-Id: I4173cb12c18640c9b1a0e5a698d48f4735e250fb
2018-09-22 07:26:52 +00:00
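The idea behind the configmap-hash annotation in the commit above can be sketched in Python: hash the configmap contents and store the digest on the pod template, so that any change to the configmap changes the template and triggers a rolling update. The helper names and the `configmap-<name>-hash` annotation key are hypothetical, and SHA-256 over canonical JSON is an assumption, not necessarily what the chart templates do.

```python
import hashlib
import json


def configmap_hash(data):
    """Return a stable SHA-256 hex digest of a configmap's data dict."""
    # sort_keys makes the digest independent of key ordering
    canonical = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def annotate(pod_template, configmaps):
    """Add one hash annotation per configmap to a pod template dict."""
    metadata = pod_template.setdefault("metadata", {})
    annotations = metadata.setdefault("annotations", {})
    for name, data in configmaps.items():
        # hypothetical annotation key, chosen for illustration
        annotations[f"configmap-{name}-hash"] = configmap_hash(data)
    return pod_template
```

Because the annotation lives in the pod template, a changed digest is enough for the controller to treat the template as updated and roll the pods.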
Zuul
c10f9ce59e Merge "Modify Ceph default settings for improved performance" 2018-09-20 22:44:11 +00:00
Jean-Charles Lopez
c6cad19d11 Modify Ceph default settings for improved performance
Change-Id: Ia0d856e53f3bfdc1414264b468b576003dc23b6e
2018-09-13 07:47:42 -07:00
Pete Birley
bb3ff98d53 Add release uuid to pods and rc objects
This PS adds the ability to attach a release uuid to pods and rc
objects as desired. A follow-up PS will add the ability to add arbitrary
annotations to the same objects.

Change-Id: Iceedba457a03387f6fc44eb763a00fd57f9d84a5
Signed-off-by: Pete Birley <pete@port.direct>
2018-09-13 05:35:35 +00:00
Al Lau
d6cfd78c4d A script to check the failure domains of OSDs in PGs
The checkPGs script is implemented to check the Object Storage
Daemons (OSDs) in Placement Groups (PGs) of ceph pools to make
sure OSDs were not allocated from the same failure domain.  This
script is intended to run from any one of the ceph-mon pods.

Invoke the checkPGs script with --help to get the details on how
to run it.

A Kubernetes cron job is created to schedule the execution of
this script at a regular interval.  The execution frequency is
defined in the ceph-mon/values.yaml file.

Change-Id: I5d46bc824e88545cde1cc448ae714d7d3c243817
2018-09-06 06:06:28 -07:00
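The check that the commit above describes, that no placement group has two OSDs from the same failure domain, can be illustrated with a short Python sketch. This is not the checkPGs script itself: the function name and the two input mappings are hypothetical, and a real check would build them from the cluster's PG and CRUSH data.

```python
from collections import Counter


def pgs_sharing_failure_domain(pg_to_osds, osd_to_domain):
    """Return {pg: [domains]} for PGs with 2+ OSDs in one failure domain.

    pg_to_osds:    {"1.0": [0, 3, 5], ...}     (hypothetical input shape)
    osd_to_domain: {0: "host-a", 3: "host-b"}  (e.g. host or rack per OSD)
    """
    violations = {}
    for pg, osds in pg_to_osds.items():
        counts = Counter(osd_to_domain[osd] for osd in osds)
        shared = sorted(domain for domain, n in counts.items() if n > 1)
        if shared:
            violations[pg] = shared
    return violations


if __name__ == "__main__":
    pgs = {"1.0": [0, 1, 2], "1.1": [0, 3, 2]}
    domains = {0: "host-a", 1: "host-b", 2: "host-c", 3: "host-a"}
    print(pgs_sharing_failure_domain(pgs, domains))  # {'1.1': ['host-a']}
```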
Steve Wilkerson
25bc83b580 Ceph: Move Ceph charts to openstack-helm-infra
This continues the work of moving infrastructure-related services
out of openstack-helm, by moving the ceph charts to
openstack-helm-infra instead.

Change-Id: I306ccd9d494f72a7946a7850f96d5c22f36eb8a0
2018-08-28 15:03:35 -05:00