41 Commits

Author SHA1 Message Date
Pete Birley
6ea6a85198 Ceph: Update default to use OSH image
This PS updates the default image in the chart to the latest OSH image.

Change-Id: Ib8d2a72ad48049fe02560dc4405f0088890b6f64
Signed-off-by: Pete Birley <pete@port.direct>
2019-02-01 21:25:13 +00:00
Zuul
b30012a616 Merge "[CEPH] Fixes for the OSD defrag cronjob" 2019-01-31 16:05:14 +00:00
Matthew Heler
fc76091261 [CEPH] Fixes for the OSD defrag cronjob
Fix a naming issue with the cronjob's binary, and schedule the cron
job to run every 15 minutes for the gates. Additionally, check to
ensure we are only running on block devices. Also update the
script to work with ceph-volume created devices.

Change-Id: I8aedab0ac41c191ef39a08034fff3278027d7520
2019-01-31 06:13:05 -06:00
Matthew Heler
f48c365cd3 [CEPH] Clean up PG troubleshooting option specific to Luminous
Clean up the PG troubleshooting method that was needed for
Luminous images. Since we are now on Mimic, this function is no
longer needed.

Change-Id: Iccb148120410b956c25a1fed5655b3debba3412c
2019-01-29 18:57:23 +00:00
Zuul
f0f1b57b3c Merge "[CEPH] Journal automation and disk cleanup updates" 2019-01-28 06:05:45 +00:00
Matthew Heler
61b93c6b46 [CEPH] Journal automation and disk cleanup updates
Refactor the OSD Block initialization code that performs clean ups
to use all the commands that ceph-disk zap uses.

Extend the functionality when an OSD initializes to create journal
partitions automatically. For example if /dev/sdc3 is defined as a
journal disk, the chart will automatically create that partition.
The size of the journal partition is determined by the
osd_journal_size that is defined in ceph.conf.

Change the OSD_FORCE_ZAP option to OSD_FORCE_REPAIR to automatically
recreate/self-heal Filestore OSDs. This option will now call a
function to repair a journal disk and recreate partitions. One
caveat to this is that the device partitions must be defined (ex.
/dev/sdc1) for a journal. Otherwise the OSD is zapped and re-created
if the whole disk (ex. /dev/sdc) is defined as the journal disk.

Change-Id: Ied131b51605595dce65eb29c0b64cb6af979066e
2019-01-24 11:47:30 -06:00
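For illustration only, a minimal sketch of the journal-partition creation
described above, assuming a /dev/sdc3-style journal definition and an
osd_journal_size (in MB) in /etc/ceph/ceph.conf; names and paths are
assumptions, not the chart's actual code:

    #!/bin/bash
    set -ex

    JOURNAL_PART="/dev/sdc3"            # example journal partition from the chart values
    DISK="${JOURNAL_PART%[0-9]*}"       # -> /dev/sdc
    PART_NUM="${JOURNAL_PART##*[a-z]}"  # -> 3

    # osd_journal_size is expressed in MB in ceph.conf
    JOURNAL_SIZE_MB=$(awk -F' *= *' '/^ *osd_journal_size/ {print $2}' /etc/ceph/ceph.conf)

    # Only create the journal partition if it does not already exist
    if [ ! -b "${JOURNAL_PART}" ]; then
      sgdisk --new="${PART_NUM}:0:+${JOURNAL_SIZE_MB}M" \
             --change-name="${PART_NUM}:ceph journal" "${DISK}"
      partprobe "${DISK}"
    fi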
Matthew Heler
d966085321 [CEPH] Setup a cronjob to run OSD defrags for FileStore
Create a cron and associated script to run monthly OSD defrags.
When the script runs it will switch the OSD disk to the CFQ I/O
scheduler to ensure that this is a non-blocking operation for ceph.
While this cron job will run monthly, it will only execute on OSDs
that are HDD based with Filestore.

Change-Id: I06a4679e0cbb3e065974d610606d232cde77e0b2
2019-01-22 04:27:41 +00:00
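As a hedged sketch of the defrag flow described in the commit above (the
device and mount point are examples, not the chart's actual variables):

    #!/bin/bash
    set -e

    OSD_DEV="sdc"                          # block device backing the OSD (example)
    OSD_MOUNT="/var/lib/ceph/osd/ceph-0"   # Filestore mount point (example)
    SCHED="/sys/block/${OSD_DEV}/queue/scheduler"

    # Only defrag spinning disks; SSDs gain nothing from this.
    [ "$(cat /sys/block/${OSD_DEV}/queue/rotational)" = "1" ] || exit 0

    OLD_SCHED=$(sed -e 's/.*\[\(.*\)\].*/\1/' "${SCHED}")
    echo cfq > "${SCHED}"                  # CFQ honours ionice, keeping the defrag non-blocking
    ionice -c3 xfs_fsr -v "${OSD_MOUNT}"   # Filestore data partitions are XFS
    echo "${OLD_SCHED}" > "${SCHED}"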
Matthew Heler
b0da8d78d1 [CEPH] Fix a race condition with udev on OSD start
Under some conditions udev may not trigger correctly and create
the proper uuid symlinks required by Ceph. In order to work around
this we manually create the symlinks.

Change-Id: Icadce2c005864906bcfdae4d28117628c724cc1c
2019-01-18 15:03:27 -06:00
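A minimal sketch of the workaround described above (the disk path is an
example; the chart's actual implementation may differ):

    #!/bin/bash
    set -e

    DISK="/dev/sdc"   # example OSD disk
    mkdir -p /dev/disk/by-partuuid

    # Create any by-partuuid symlinks that udev failed to populate.
    for part in $(lsblk -nrpo NAME "${DISK}" | tail -n +2); do
      uuid=$(blkid -o value -s PARTUUID "${part}" || true)
      if [ -n "${uuid}" ] && [ ! -e "/dev/disk/by-partuuid/${uuid}" ]; then
        ln -s "${part}" "/dev/disk/by-partuuid/${uuid}"
      fi
    done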
kranthi guttikonda
6771440f4a Fix for ceph-osd regression
When the ceph-osd journal is a directory and the data store is
a block device, ceph-osd fails to deploy while
waiting for the journal file in
/var/lib/ceph/journal/journal.<id>

Added a check for the directory case before the bluestore check,
and removed the same check later in the script.

Closes-Bug: #1811154
Change-Id: Ibd4cf0be5ed90dfc4de5ffab554a91da1b62e5f4
Signed-off-by: Kranthi Guttikonda <kranthi.guttikonda@b-yond.com>
Signed-off-by: kranthi guttikonda <kranthi.guttikonda9@gmail.com>
2019-01-11 18:11:23 -05:00
Matthew Heler
e9c7aab6fd [CEPH] Directory OSD regression fix
Fix a regression with the Directory OSD logic.

Change-Id: I793cf0869bda5c640eb945cbb8190cd89b30c4d0
2019-01-08 13:45:32 -06:00
Matthew Heler
4a85c21996 [CEPH] OSD directory permission fixes
In the event the base image is changed, the uid of the ceph OSD
directory may not align with the uid of the ceph user of the image.
In this case we check permissions and set them correctly.

Change-Id: I3bef7f6323d1de7c62320ccd423c929349bedb42
2019-01-07 19:08:11 -06:00
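Roughly, the check amounts to something like the sketch below (the OSD path
is an example, not the chart's actual code):

    #!/bin/bash
    set -e

    OSD_PATH="/var/lib/ceph/osd/ceph-0"   # example OSD data directory
    CEPH_UID=$(id -u ceph)
    CEPH_GID=$(id -g ceph)

    # Re-own the directory if a previous image left a different uid/gid behind.
    if [ "$(stat -c %u "${OSD_PATH}")" != "${CEPH_UID}" ] || \
       [ "$(stat -c %g "${OSD_PATH}")" != "${CEPH_GID}" ]; then
      chown -R "${CEPH_UID}:${CEPH_GID}" "${OSD_PATH}"
    fi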
Matthew Heler
c0d028e245 Uplift Ceph charts to the Mimic release
Change the release of Ceph from 12.2.3 (Luminous) to the latest 13.2.2
(Mimic). Additionally, use supported RHEL/CentOS images rather than
Ubuntu images, which are now considered deprecated by Red Hat.

- Uplift all Ceph images to the latest 13.2.2 ceph-container images.
- RadosGW by default will now use the Beast backend.
- RadosGW has relaxed settings enabled for S3 naming conventions.
- Increased RadosGW resource limits due to backend change.
- All Luminous specific tests now test for both Luminous/Mimic.
- Gate scripts will remove all non-required ceph packages. This is
required to avoid conflicting with the pid/gid that the Red Hat
container uses.

Change-Id: I9c00f3baa6c427e6223596ade95c65c331e763fb
2019-01-05 14:38:38 +00:00
Chris Wedgwood
0c4e37391f 'NOP' cleanup for more consistent white-space use in charts
Where we have the style '{{ ...' we should use the style '... }}'.

Change-Id: Ic3e779e4681370d396f95d3804ca27db5b9d3642
2019-01-03 22:45:49 +00:00
Matthew Heler
e581a79807 [CEPH] Cleanup the ceph-osd helm-chart
- Split off duplicate code across multiple bash scripts into a common
file.
- Simplify the way journals are detected for block devices.
- Cleanup unused portions of the code.
- Standardize the syntax across all the code.
- Use sgdisk for zapping disks rather than ceph-disk.

Change-Id: I13e4a89cab3ee454dd36b5cdedfa2f341bf50b87
2018-12-28 13:09:21 -06:00
Zuul
5cca3e74d4 Merge "[CEPH] Fix race conditions with OSD POD initialization" 2018-12-24 22:48:53 +00:00
Matthew Heler
30b57ba671 [CEPH] Fix race conditions with OSD POD initialization
Under POD restart conditions there is a race condition with lsblk
causing the helm chart to zap a fully working OSD disk. We refactor
the code to remove this requirement.

Additionally, the new automatic journal partitioning code has a race
condition in which the same journal partition could be picked twice
for OSDs on the same node. To resolve this we share a common tmp
directory from the node to all of the OSD pods on that node.

Change-Id: I807074c4c5e54b953b5c0efa4c169763c5629062
2018-12-21 15:05:54 -06:00
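One way the shared node directory can serialise journal selection is with a
simple flock, sketched below; the directory path and lock name are
assumptions, not the chart's actual identifiers:

    #!/bin/bash
    set -e

    LOCK_DIR="/var/lib/openstack-helm/tmp"   # assumed shared hostPath mounted into every OSD pod
    mkdir -p "${LOCK_DIR}"

    (
      # Only one OSD pod on this node selects a journal partition at a time.
      flock -x 9
      # ... pick the next free journal partition and record the choice ...
    ) 9>"${LOCK_DIR}/journal.lock"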
Matthew Heler
e1a3819a0d [CEPH] Support a troubleshooting option to reset PG metadata
Ceph upstream bug: https://tracker.ceph.com/issues/21142 is
impacting the availability of our sites in pipeline. Add an option
to reset the past interval metadata time on an OSD's PG to solve for
this issue if it occurs.

Change-Id: I1fe0bee6ce8aa402c241f1ad457bbf532945a530
2018-12-18 23:26:18 -06:00
Matthew Heler
de69c68365 [Ceph] Update ceph helm tests
- Ensure the helm tests are logging all commands and variables

Change-Id: I4f4c553a3fbb4d77e9d1ab41c1c0c763c963cfd3
2018-12-15 13:47:43 -06:00
Zuul
b76acd6dd6 Merge "Ceph: Journal partition automation" 2018-12-14 18:37:15 +00:00
Zuul
62dce1852e Merge "Increase the cpu and memory resource limits for Ceph OSDs" 2018-12-14 01:52:57 +00:00
Renis Makadia
17df1c5df5 Ceph: Journal partition automation
- Use the whole-disk format (e.g. /dev/sdc).
- Don't specify a partition; let the ceph-osd utility create
and manage partitions.
- On an OSD disk failure, the journal partition for the failed OSD
should be deleted during the maintenance window. This will allow
the ceph-osd utility to reuse the space for a new partition.
- The disk partition count will continue to
increase as more OSDs fail.

Change-Id: I87522db8cabebe8cb103481cdb65fc52f2ce2b07
2018-12-13 16:37:15 +00:00
Pete Birley
c256cce537 Ceph: Allow multiple test pods to be present in clusters
This PS allows multiple ceph test pods to be present in clusters with
more than one ceph deployment.

Change-Id: I002a8b4681d97ed6ab95af23e1938870c28f5a83
Signed-off-by: Pete Birley <pete@port.direct>
2018-12-12 07:29:01 -06:00
Matthew Heler
2e67eeb955 Increase the cpu and memory resource limits for Ceph OSDs
The minimum requirements for a Ceph OSD have changed in the latest
Luminous release to accommodate Bluestore changes. We need to support
these changes as we look into upgrading Ceph to the latest Luminous
release and beyond.

Change-Id: I3eddffe73cfd188ff012db7c74702de6921711e7
2018-12-11 01:27:43 +00:00
Pete Birley
7608d2c9d7 Ceph: Update failure domain overrides to support dynamic config
This PS updates the ceph failure domain overrides to support
dynamic configuration based on host/label based overrides.

Also fixes typo identified in the following ps for directories:
 * https://review.openstack.org/#/c/623670/1

Change-Id: Ia449be23353083f9a77df2b592944571c907e277
Signed-off-by: Pete Birley <pete@port.direct>
2018-12-08 13:54:17 -06:00
Matthew Heler
d50bd2daad Fix detection of failure domain type
Small typo in the logic filtering of the failure domain type for
an OSD pod. This wasn't initially found since it didn't break any
expected behavior tests.

Change-Id: I2b895bbc83c6c71fffe1a0db357b120b3ffb7f56
2018-12-08 12:45:07 -06:00
Matthew Heler
4ad893eb1a Additional Ceph tunning parameters for openstack-helm
osd_scrub_load_threshold set to 10.0 (default 0.5)
 - With the number of multi-core processors nowadays, it's fairly
   typical to see systems over a load of 1.0. We need to adjust the
   scrub load threshold so that scrubbing runs as scheduled even
   when a node is moderately/lightly under load.

filestore_max_sync_interval set to 10s (default 5s)
 - Larger default journal sizes (>1GB) will not be effectively used
   unless the max sync interval time is increased for Filestore. The
   benefit of this change is increased performance especially around
   sequential write workloads.

mon_osd_down_out_interval set to 1800s (default 600s)
 - OSD PODs can take longer than several minutes to boot up. Mark
   an OSD as 'out' in the CRUSH map only after 15 minutes of being
   'down'.

Change-Id: I62d6d0de436c270d3295671f8c7f74c89b3bd71e
2018-12-04 20:27:52 -06:00
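For reference, a sketch of how these values would land in ceph.conf; the
section placement shown here is an assumption, and in the chart they come
from the conf overrides rather than being appended like this:

    {
      echo "[osd]"
      echo "osd_scrub_load_threshold = 10.0"
      echo "filestore_max_sync_interval = 10"
      echo "[mon]"
      echo "mon_osd_down_out_interval = 1800"
    } >> /etc/ceph/ceph.conf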
Matthew Heler
35cce6cb43 Switch Ceph to IPs when DNS is down
Add helper scripts that are called by a POD to switch
Ceph from DNS to IPs. This POD will loop every 5 minutes
to catch cases where the DNS might be unavailable.

On a POD's service start, switch ceph.conf to using IPs rather
than DNS.

Change-Id: I402199f55792ca9f5f28e436ff44d4a6ac9b7cf9
2018-12-03 10:51:37 -06:00
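A rough sketch of such a helper loop, with a made-up service name and IPs;
the chart's real script will differ in detail:

    #!/bin/bash

    MON_HOST_DNS="ceph-mon.ceph.svc.cluster.local"   # assumed mon service name
    MON_HOST_IPS="10.0.0.11,10.0.0.12,10.0.0.13"     # assumed, discovered while DNS was healthy

    while true; do
      # If the mon service name no longer resolves, fall back to the cached IPs.
      if ! getent hosts "${MON_HOST_DNS}" > /dev/null; then
        sed -i "s/^mon_host = .*/mon_host = ${MON_HOST_IPS}/" /etc/ceph/ceph.conf
      fi
      sleep 300   # re-check every 5 minutes, per the commit message
    done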
Renis Makadia
b1005b23b4 Helm tests for Ceph-OSD and Ceph-Client charts
Change-Id: If4a846f0593b8679558662205a8560aa3cbb18ae
2018-12-01 08:08:00 +00:00
Matthew Heler
6e8c289c13 Add failure domains, and device classes for custom CRUSH rules
Largely inspired and taken from Kranthi's PS.

- Add support for creating custom CRUSH rules based on failure
domains and device classes (ssd & hdd).
- Basic logic around the PG calculator to autodetect the number of
OSDs globally and per device class (required when using custom CRUSH
rules that specify device classes).

Change-Id: I13a6f5eb21494746c2b77e340e8d0dcb0d81a591
2018-11-27 09:37:30 -06:00
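The resulting rules are of the kind that can be created with the standard
ceph CLI, for example (rule names here are illustrative):

    # Replicated rules restricted to a device class, with host failure domain
    ceph osd crush rule create-replicated hdd_rule default host hdd
    ceph osd crush rule create-replicated ssd_rule default host ssd

    # Per-class OSD counts, the kind of input the PG calculator needs
    ceph osd crush class ls-osd hdd | wc -l
    ceph osd crush class ls-osd ssd | wc -l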
Matthew Heler
5ce9f2eb3b Enable Ceph charts to be rack aware for CRUSH
Add support for a rack level CRUSH map. Rack level CRUSH support is
enabled by using the "rack_replicated_rule" crush rule.

Change-Id: I4df224f2821872faa2eddec2120832e9a22f4a7c
2018-11-20 09:07:36 -06:00
Matthew Heler
55446e1f41 Move default CEPH journal size from 5GB to 10GB
Request from downstream to use 10GB journal sizes. Currently journals
are created manually, but there is upcoming work to have the
journals created by the Helm charts themselves. This value needs to be
put in as a default to ensure journals are sized appropriately.

Change-Id: Idaf46fac159ffc49063cee1628c63d5bd42b4bc6
2018-11-08 17:34:12 +00:00
Steve Wilkerson
45da8c2b69 Ceph: Update log directory host mount path
This updates the ceph-mon and ceph-osd charts to use the release
name in the hostPath defined for mounting the /var/log/ceph
directories. This gives us a mechanism for creating unique log
directories for multiple releases of the same chart without the
need to specify an override for each deployment of that chart.
Change-Id: Ie6e05b99c32f24440fbade02d59c7bb14d8aa4c8
2018-10-29 13:05:46 -05:00
Matthew Heler
6ef48d3706 Further performance tuning changes for Ceph
- Throttle down snap trimming to lessen its performance impact
(Setting just osd_snap_trim_priority isn't effective enough to throttle
down the impact)
osd_snap_trim_sleep: 0.1 (default 0)
osd_pg_max_concurrent_snap_trims: 1 (default 2)

- Align filestore_merge_threshold with upstream Ceph values
(A negative number disables this function, no change in behavior)
filestore_merge_threshold: -10 (formerly -50, default 10)

- Increase RGW pool thread size for more concurrent connections
rgw_thread_pool_size: 512 (default 100)

- Disable in-memory logs for the ms subsystem.
debug_ms: 0/0 (default 0/5)

- Formatting cleanups

Change-Id: I4aefcb6e774cb3e1252e52ca6003cec495556467
2018-10-26 15:10:50 +00:00
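As a hedged note, the runtime-changeable OSD values above can also be tried
on a live cluster with injectargs before being persisted via the chart:

    # OSD options from the list above; filestore_merge_threshold and
    # rgw_thread_pool_size still need to go through ceph.conf and a restart.
    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1 --osd_pg_max_concurrent_snap_trims 1 --debug_ms 0/0'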
Chinasubbareddy M
a1b8f394b2 ceph: make log directory configurable
This makes the log directory configurable so that another mon or
OSD running on the same host can point to a different directory.

Change-Id: I2db6dffd45599386f8082db8f893c799d139aba3
2018-10-25 14:34:14 +00:00
Matthew Heler
f8ac6c3f21 ceph co-location journal and permission fixes
Support co-located journals with the Ceph helm chart.
Ensure proper ownership is set on OSD/journal disks.

Change-Id: Ic954d75c8bd7532991dc9b3184ad6d74b97855d1
2018-10-25 08:21:31 +00:00
Steve Wilkerson
92717bdc72 Ceph: Remove fluentbit sidecars, mount hostpath for logs
This removes the fluentbit sidecars from the ceph-mon and ceph-osd
charts. Instead, we mount /var/log/ceph as a hostPath and use the
fluentbit daemonset to target the mounted log files.

This also updates the fluentd configuration to better handle the
correct configuration type for flush_interval (time vs int), as
well as updating the fluentd elasticsearch output values to help
address the gate failures resulting from the Elasticsearch bulk
endpoints failing.

Change-Id: If3f2ff6371f267ed72379de25ff463079ba4cddc
2018-10-17 11:05:03 -05:00
Matthew Heler
5efac315f7 Initialize OSDs with a crush weight of 0 to prevent automatic rebalancing.
Weight the OSDs based on reported disk size when the ceph-client chart runs.

Change-Id: I9f4080a9843f1a63564cf71154841b351382bfe2
2018-10-16 21:33:49 +00:00
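For illustration, the two steps look roughly like this with the ceph CLI
(the OSD id, device, and host are examples, not the charts' actual code):

    # OSD joins the CRUSH map with zero weight, so no data moves yet
    ceph osd crush add osd.7 0.0 host="$(hostname -s)"

    # Later, the ceph-client chart weights it by its reported size in TiB
    OSD_SIZE_TB=$(echo "scale=4; $(blockdev --getsize64 /dev/sdc) / (1024^4)" | bc)
    ceph osd crush reweight osd.7 "${OSD_SIZE_TB}"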
Zuul
c10f9ce59e Merge "Modify Ceph default settings for improved performance" 2018-09-20 22:44:11 +00:00
Jean-Charles Lopez
c6cad19d11 Modify Ceph default settings for improved performance
Change-Id: Ia0d856e53f3bfdc1414264b468b576003dc23b6e
2018-09-13 07:47:42 -07:00
Pete Birley
bb3ff98d53 Add release uuid to pods and rc objects
This PS adds the ability to attach a release uuid to pods and rc
objects as desired. A follow-up PS will add the ability to add arbitrary
annotations to the same objects.

Change-Id: Iceedba457a03387f6fc44eb763a00fd57f9d84a5
Signed-off-by: Pete Birley <pete@port.direct>
2018-09-13 05:35:35 +00:00
Steve Wilkerson
25bc83b580 Ceph: Move Ceph charts to openstack-helm-infra
This continues the work of moving infrastructure-related services
out of openstack-helm by moving the ceph charts to
openstack-helm-infra instead.

Change-Id: I306ccd9d494f72a7946a7850f96d5c22f36eb8a0
2018-08-28 15:03:35 -05:00