This PS updates the default image in the chart to the latest OSH image.
Change-Id: Ib8d2a72ad48049fe02560dc4405f0088890b6f64
Signed-off-by: Pete Birley <pete@port.direct>
Fix a naming issue with the cronjob's binary, and schedule the cron
job to run every 15 minutes for the gates. Additionally, check to
ensure we are only running on block devices. Also update the
script to work with ceph-volume created devices.
Change-Id: I8aedab0ac41c191ef39a08034fff3278027d7520
Clean up the PG troubleshooting method that was needed for
Luminous images. Since we are now on Mimic, this function is no
longer needed.
Change-Id: Iccb148120410b956c25a1fed5655b3debba3412c
Refactor the OSD block initialization code that performs cleanups
to use all the commands that ceph-disk zap uses.
Extend the functionality when an OSD initializes to create journal
partitions automatically. For example, if /dev/sdc3 is defined as a
journal device, the chart will automatically create that partition.
The size of the journal partition is determined by the
osd_journal_size that is defined in ceph.conf.
Change the OSD_FORCE_ZAP option to OSD_FORCE_REPAIR to automatically
recreate/self-heal Filestore OSDs. This option will now call a
function to repair a journal disk and recreate partitions. One
caveat is that the device partitions must be defined (ex.
/dev/sdc1) for a journal; otherwise the OSD is zapped and re-created
if the whole disk (ex. /dev/sdc) is defined as the journal disk.
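A minimal sketch of the journal partition creation described above,
assuming a hypothetical journal definition; the chart's actual logic
differs in detail:

    # Sketch only: create /dev/sdc3 as a journal partition sized from
    # osd_journal_size in ceph.conf (value is in MB).
    OSD_JOURNAL="/dev/sdc3"                                        # hypothetical journal definition
    JOURNAL_DISK="$(echo "${OSD_JOURNAL}" | sed 's/[0-9]*$//')"    # -> /dev/sdc
    JOURNAL_PART="$(echo "${OSD_JOURNAL}" | grep -o '[0-9]*$')"    # -> 3
    JOURNAL_SIZE_MB="$(ceph-conf --lookup osd_journal_size)"
    if [ ! -b "${OSD_JOURNAL}" ]; then
      sgdisk --new=${JOURNAL_PART}:0:+${JOURNAL_SIZE_MB}M \
             --change-name=${JOURNAL_PART}:'ceph journal' \
             "${JOURNAL_DISK}"
      partprobe "${JOURNAL_DISK}"
    fi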
Change-Id: Ied131b51605595dce65eb29c0b64cb6af979066e
Create a cron and associated script to run monthly OSD defrags.
When the script runs it will switch the OSD disk to the CFQ I/O
scheduler to ensure that this is a non-blocking operation for ceph.
While this cron job will run monthly, it will only execute on OSDs
that are HDD-based and use Filestore.
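A rough sketch of the defrag step, assuming a hypothetical HDD-backed
Filestore OSD; the chart's actual script also filters out SSDs and
Bluestore OSDs:

    # Sketch only: switch the OSD disk to CFQ, defragment the XFS
    # filestore mount, then restore the previous scheduler.
    OSD_DISK="sdc"                        # hypothetical OSD data disk
    OSD_MOUNT="/var/lib/ceph/osd/ceph-0"  # hypothetical Filestore mount
    OLD_SCHED="$(sed 's/.*\[\(.*\)\].*/\1/' /sys/block/${OSD_DISK}/queue/scheduler)"
    echo cfq > /sys/block/${OSD_DISK}/queue/scheduler
    xfs_fsr -v "${OSD_MOUNT}"             # online XFS defragmentation
    echo "${OLD_SCHED}" > /sys/block/${OSD_DISK}/queue/scheduler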
Change-Id: I06a4679e0cbb3e065974d610606d232cde77e0b2
Under some conditions udev may not trigger correctly and create
the proper uuid symlinks required by Ceph. In order to work around
this, we manually create the symlinks.
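A minimal sketch of the workaround, with a hypothetical partition path:

    # Sketch only: recreate the by-partuuid symlink if udev did not.
    DEV="/dev/sdc3"                       # hypothetical journal partition
    PARTUUID="$(blkid -o value -s PARTUUID "${DEV}")"
    if [ -n "${PARTUUID}" ] && [ ! -e "/dev/disk/by-partuuid/${PARTUUID}" ]; then
      mkdir -p /dev/disk/by-partuuid
      ln -sf "${DEV}" "/dev/disk/by-partuuid/${PARTUUID}"
    fi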
Change-Id: Icadce2c005864906bcfdae4d28117628c724cc1c
When the ceph-osd journal is a directory and the data store is
a block device, ceph-osd fails to deploy while
waiting for the journal file in
/var/lib/ceph/journal/journal.<id>.
Add a check for a directory-backed journal before the bluestore
check, and remove the duplicate check later in the script.
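A hedged sketch of the kind of guard involved; variable names and the
journal path are illustrative, not the chart's exact ones:

    # Sketch only: handle a directory-backed journal before any
    # bluestore/block-device handling.
    OSD_ID="0"                            # hypothetical OSD id
    OSD_JOURNAL="/var/lib/ceph/journal"   # hypothetical directory journal
    if [ -d "${OSD_JOURNAL}" ]; then
      # Create the journal file instead of waiting for a block device
      # symlink that will never appear.
      touch "${OSD_JOURNAL}/journal.${OSD_ID}"
    fi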
Closes-Bug: #1811154
Change-Id: Ibd4cf0be5ed90dfc4de5ffab554a91da1b62e5f4
Signed-off-by: Kranthi Guttikonda <kranthi.guttikonda@b-yond.com>
Signed-off-by: kranthi guttikonda <kranthi.guttikonda9@gmail.com>
In the event the base image is changed, the uid of the ceph OSD
directory may not align with the uid of the ceph user of the image.
In this case we check the ownership and set it correctly.
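A minimal sketch of such a check, assuming the conventional OSD data
path and the ceph user/group shipped in the image:

    # Sketch only: realign ownership of the OSD directory with the
    # image's ceph user.
    OSD_PATH="/var/lib/ceph/osd/ceph-0"   # hypothetical OSD data dir
    if [ "$(stat -c '%U' "${OSD_PATH}")" != "ceph" ]; then
      chown -R ceph:ceph "${OSD_PATH}"
    fi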
Change-Id: I3bef7f6323d1de7c62320ccd423c929349bedb42
Change the release of Ceph from 12.2.3 (Luminous) to the latest 13.2.2
(Mimic). Additionally, use supported RHEL/CentOS images rather than
Ubuntu images, which are now considered deprecated by Red Hat.
- Uplift all Ceph images to the latest 13.2.2 ceph-container images.
- RadosGW by default will now use the Beast backend.
- RadosGW has relaxed settings enabled for S3 naming conventions.
- Increased RadosGW resource limits due to backend change.
- All Luminous specific tests now test for both Luminous/Mimic.
- Gate scripts will remove all non-required Ceph packages. This is
required to avoid conflicts with the uid/gid that the Red Hat container
uses.
Change-Id: I9c00f3baa6c427e6223596ade95c65c331e763fb
- Split off duplicate code across multiple bash scripts into a common
file.
- Simplify the way journals are detected for block devices.
- Cleanup unused portions of the code.
- Standardize the syntax across all the code.
- Use sgdisk for zapping disks rather than ceph-disk.
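For reference, zapping with sgdisk rather than ceph-disk amounts to
something like the following (device path is an example):

    # Sketch only: destroy the GPT/MBR structures and wipe the start of
    # the disk so old signatures are not detected.
    DISK="/dev/sdc"
    sgdisk --zap-all "${DISK}"
    dd if=/dev/zero of="${DISK}" bs=1M count=10
    partprobe "${DISK}"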
Change-Id: I13e4a89cab3ee454dd36b5cdedfa2f341bf50b87
Under POD restart conditions there is a race condition with lsblk
causing the Helm chart to zap a fully working OSD disk. We refactor
the code to remove the reliance on lsblk.
Additionally, the new automatic journal partitioning code has a race
condition in which the same journal partition could be picked twice
for OSDs on the same node. To resolve this we share a common tmp
directory from the node to all of the OSD pods on that node.
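One way to serialize partition selection through a shared directory is
an advisory file lock; a sketch under that assumption (the lock path is
illustrative, not the chart's actual layout):

    # Sketch only: take a node-wide lock before picking a journal
    # partition so two OSD pods cannot choose the same one.
    LOCK_FILE="/var/lib/openstack-helm/ceph/tmp/journal.lock"   # hypothetical path
    mkdir -p "$(dirname "${LOCK_FILE}")"
    (
      flock -x 9
      echo "selecting a free journal partition under the lock"
    ) 9>"${LOCK_FILE}"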
Change-Id: I807074c4c5e54b953b5c0efa4c169763c5629062
Ceph upstream bug: https://tracker.ceph.com/issues/21142 is
impacting the availability of our sites in pipeline. Add an option
to reset the past interval metadata time on an OSD's PG to solve for
this issue if it occurs.
Change-Id: I1fe0bee6ce8aa402c241f1ad457bbf532945a530
- Use the whole-disk format (ex. /dev/sdc).
- Don't specify a partition; let the ceph-osd util create
and manage partitions.
- On an OSD disk failure, during the maintenance window, the
journal partition for the failed OSD should be deleted.
This will allow the ceph-osd util to reuse the space for a new partition.
- The disk partition count will continue to
increase as more OSDs fail.
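For the maintenance step above, freeing the failed OSD's journal
partition might look like this (device and partition number are
examples):

    # Sketch only: delete the failed OSD's journal partition so the
    # space can be reused for a new partition.
    JOURNAL_DISK="/dev/sdc"
    FAILED_PART="3"
    sgdisk --delete=${FAILED_PART} "${JOURNAL_DISK}"
    partprobe "${JOURNAL_DISK}"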
Change-Id: I87522db8cabebe8cb103481cdb65fc52f2ce2b07
This PS allows multiple Ceph test pods to be present in a cluster with
more than one Ceph deployment.
Change-Id: I002a8b4681d97ed6ab95af23e1938870c28f5a83
Signed-off-by: Pete Birley <pete@port.direct>
The minimum requirements for a Ceph OSD have changed in the latest
Luminous release to accommodate Bluestore changes. We need to support
these changes as we look into upgrading Ceph to the latest Luminous
and beyond releases.
Change-Id: I3eddffe73cfd188ff012db7c74702de6921711e7
This PS updates the ceph failure domain overrides to support
dynamic configuration based on host/label based overrides.
Also fixes a typo identified in the following PS for directories:
* https://review.openstack.org/#/c/623670/1
Change-Id: Ia449be23353083f9a77df2b592944571c907e277
Signed-off-by: Pete Birley <pete@port.direct>
Small typo in the logic filtering of the failure domain type for
an OSD pod. This wasn't initially found since it didn't break any
expected behavior tests.
Change-Id: I2b895bbc83c6c71fffe1a0db357b120b3ffb7f56
osd_scrub_load_threshold set to 10.0 (default 0.5)
- With the number of multi-core processors nowadays, it's fairly
typical to see systems over a load of 1.0. We need to adjust the
scrub load threshold so that scrubbing runs as scheduled even
when a node is moderately/lightly under load.
filestore_max_sync_interval set to 10s (default 5s)
- Larger default journal sizes (>1GB) will not be effectively used
unless the max sync interval time is increased for Filestore. The
benefit of this change is increased performance especially around
sequential write workloads.
mon_osd_down_out_interval set to 1800s (default 600s)
- OSD PODs can take longer than several minutes to boot up. Mark
an OSD as 'out' in the CRUSH map only after 30 minutes of being
'down'.
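The effective values can be checked on running daemons via the admin
socket; the daemon names below are examples:

    ceph daemon osd.0 config get osd_scrub_load_threshold
    ceph daemon osd.0 config get filestore_max_sync_interval
    ceph daemon mon.ceph-mon-0 config get mon_osd_down_out_interval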
Change-Id: I62d6d0de436c270d3295671f8c7f74c89b3bd71e
Add helper scripts that are called by a POD to switch
Ceph from DNS to IPs. This POD will loop every 5 minutes
to catch cases where the DNS might be unavailable.
On a POD's Service start, switch ceph.conf to using IPs rather
than DNS.
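A minimal sketch of the DNS-to-IP switch, assuming a mon service name
and a single mon_host line in ceph.conf (both illustrative):

    # Sketch only: resolve the mon service and rewrite mon_host.
    MON_DNS="ceph-mon.ceph.svc.cluster.local"   # hypothetical service name
    MON_IP="$(getent hosts "${MON_DNS}" | awk '{print $1; exit}')"
    if [ -n "${MON_IP}" ]; then
      sed -i "s/^mon_host.*/mon_host = ${MON_IP}/" /etc/ceph/ceph.conf
    fi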
Change-Id: I402199f55792ca9f5f28e436ff44d4a6ac9b7cf9
Largely inspired and taken from Kranthi's PS.
- Add support for creating custom CRUSH rules based on failure
domains and device classes (ssd & hdd)
- Basic logic around the PG calculator to autodetect the number of
OSDs globally and per device class (required when using custom crush
rules that specify device classes).
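For reference, the underlying Ceph commands for device-class rules look
like the following; rule and pool names are illustrative:

    # Create replicated rules limited to a device class, then point a
    # pool at one of them.
    ceph osd crush rule create-replicated hdd_rule default host hdd
    ceph osd crush rule create-replicated ssd_rule default host ssd
    ceph osd pool set rbd crush_rule ssd_rule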
Change-Id: I13a6f5eb21494746c2b77e340e8d0dcb0d81a591
Add support for a rack level CRUSH map. Rack level CRUSH support is
enabled by using the "rack_replicated_rule" crush rule.
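For reference, the rack hierarchy behind such a rule can be expressed
with standard CRUSH commands; bucket and host names are examples:

    ceph osd crush add-bucket rack1 rack
    ceph osd crush move rack1 root=default
    ceph osd crush move host1 rack=rack1
    ceph osd crush rule create-replicated rack_replicated_rule default rack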
Change-Id: I4df224f2821872faa2eddec2120832e9a22f4a7c
Request from downstream to use 10GB journal sizes. Currently journals
are created manually, but there is upcoming work to have the
journals created by the Helm charts themselves. This value needs to be
put in as a default to ensure journals are sized appropriately.
Change-Id: Idaf46fac159ffc49063cee1628c63d5bd42b4bc6
This updates the ceph-mon and ceph-osd charts to use the release
name in the hostPath defined for mounting the /var/log/ceph
directories. This gives us a mechanism for creating unique log
directories for multiple releases of the same chart without the
need to specify an override for each deployment of that chart.
Change-Id: Ie6e05b99c32f24440fbade02d59c7bb14d8aa4c8
- Throttle down snap trimming so as to lessen its performance impact
(Setting just osd_snap_trim_priority isn't effective enough to throttle
down the impact)
osd_snap_trim_sleep: 0.1 (default 0)
osd_pg_max_concurrent_snap_trims: 1 (default 2)
- Align filestore_merge_threshold with upstream Ceph values
(A negative number disables this function, no change in behavior)
filestore_merge_threshold: -10 (formerly -50, default 10)
- Increase RGW pool thread size for more concurrent connections
rgw_thread_pool_size: 512 (default 100)
- Disable in-memory logs for the ms subsystem.
debug_ms: 0/0 (default 0/5)
- Formatting cleanups
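The snap trim throttles above can also be applied to running OSDs at
runtime, for example:

    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1 --osd_pg_max_concurrent_snap_trims 1'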
Change-Id: I4aefcb6e774cb3e1252e52ca6003cec495556467
This makes the log directory configurable, in case another mon or
osd running on the same host needs to point to a different directory.
Change-Id: I2db6dffd45599386f8082db8f893c799d139aba3
This removes the fluentbit sidecars from the ceph-mon and ceph-osd
charts. Instead, we mount /var/log/ceph as a hostPath and use the
fluentbit daemonset to target the mounted log files.
This also updates the fluentd configuration to better handle the
correct configuration type for flush_interval (time vs int), as
well as updates the fluentd elasticsearch output values to help
address the gate failures resulting from the Elasticsearch bulk
endpoints failing.
Change-Id: If3f2ff6371f267ed72379de25ff463079ba4cddc
This PS adds the ability to attach a release uuid to pods and rc
objects as desired. A follow-up PS will add the ability to add arbitrary
annotations to the same objects.
Change-Id: Iceedba457a03387f6fc44eb763a00fd57f9d84a5
Signed-off-by: Pete Birley <pete@port.direct>
This continues the work of moving infrastructure-related services
out of openstack-helm by moving the ceph charts to
openstack-helm-infra instead.
Change-Id: I306ccd9d494f72a7946a7850f96d5c22f36eb8a0