Fix capacity calculations in the CephFS driver
The driver inflated total and available capacity due to an incorrect
calculation. The driver was also ignoring the configuration option
"reserved_share_percentage" that allows deployers to set aside space from
scheduling to prevent oversubscription.

While this bugfix may have an upgrade impact, some things must be clarified:

- Inflating the total and free space allows manila to schedule workloads
  that may run out of space. This may cause end user downtime and
  frustration, because shares are created easily (empty subvolumes on ceph
  occupy no space), but they could get throttled as they start to fill up.
- CephFS shares are always thinly provisioned, but the driver does not
  support oversubscription via manila. So, real free space is what
  determines capacity-based scheduler decisions. Users, however, expect
  share sizes to be honored, and manila will allow provisioning as long as
  there is free space on the cluster. This means that Ceph cluster
  administrators must manage oversubscription outside of manila to prevent
  misbehavior.

Depends-On: Ic96b65d2caab788afca8bfc45575f3c05dc88008
Change-Id: I6ab157d6d099fe910ec1d90193783b55053ce8f6
Closes-Bug: #1890833
Signed-off-by: Goutham Pacha Ravi <gouthampravi@gmail.com>
parent 5f433b59ba
commit 22d6fe98a3
@@ -314,6 +314,22 @@ using the section name, ``cephfsnfs1``.
 enabled_share_backends = generic1, cephfsnfs1
 
 
+Space considerations
+~~~~~~~~~~~~~~~~~~~~
+
+The CephFS driver reports total and free capacity available across the Ceph
+cluster to manila to allow provisioning. All CephFS shares are thinly
+provisioned, i.e., empty shares do not consume any significant space
+on the cluster. The CephFS driver does not allow controlling oversubscription
+via manila. So, as long as there is free space, provisioning will continue,
+and eventually this may cause your Ceph cluster to be over provisioned and
+you may run out of space if shares are being filled to capacity. It is advised
+that you use Ceph's monitoring tools to monitor space usage and add more
+storage when required in order to honor space requirements for provisioned
+manila shares. You may use the driver configuration option
+``reserved_share_percentage`` to prevent manila from filling up your Ceph
+cluster, and allow existing shares to grow.
+
 Creating shares
 ~~~~~~~~~~~~~~~
 
@@ -167,8 +167,8 @@ class CephFSDriver(driver.ExecuteMixin, driver.GaneshaMixin,
     def _update_share_stats(self):
         stats = self.volume_client.rados.get_cluster_stats()
 
-        total_capacity_gb = stats['kb'] * units.Mi
-        free_capacity_gb = stats['kb_avail'] * units.Mi
+        total_capacity_gb = round(stats['kb'] / units.Mi, 2)
+        free_capacity_gb = round(stats['kb_avail'] / units.Mi, 2)
 
         data = {
             'vendor_name': 'Ceph',
@@ -182,7 +182,8 @@ class CephFSDriver(driver.ExecuteMixin, driver.GaneshaMixin,
                     'total_capacity_gb': total_capacity_gb,
                     'free_capacity_gb': free_capacity_gb,
                     'qos': 'False',
-                    'reserved_percentage': 0,
+                    'reserved_percentage': self.configuration.safe_get(
+                        'reserved_share_percentage'),
                     'dedupe': [False],
                     'compression': [False],
                     'thin_provisioning': [False]
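As a rough illustration of the arithmetic in the driver hunk above, the sketch below converts the KiB counts returned by `get_cluster_stats()` into GiB and contrasts that with the pre-fix multiplication. It is not the driver's code: `KIB_PER_GIB` stands in for `oslo_utils.units.Mi`, and the helper names `kib_to_gib` and `inflated_gib` are hypothetical. The sample values are the ones used in the updated unit test mock.

```python
# Sketch only: mirrors the unit conversion changed by this commit.
# KIB_PER_GIB stands in for oslo_utils.units.Mi (2 ** 20); the helper
# names below are hypothetical and do not exist in the driver.
KIB_PER_GIB = 2 ** 20


def kib_to_gib(kib):
    """Convert a KiB count to GiB, rounded to two decimals (post-fix)."""
    return round(kib / KIB_PER_GIB, 2)


def inflated_gib(kib):
    """Pre-fix arithmetic: multiplies where it should divide."""
    return kib * KIB_PER_GIB


# Values taken from the updated get_cluster_stats() mock in the unit test.
stats = {"kb": 172953600, "kb_avail": 157123584}

total_capacity_gb = kib_to_gib(stats["kb"])
free_capacity_gb = kib_to_gib(stats["kb_avail"])

print(total_capacity_gb, free_capacity_gb)  # 164.94 149.84
print(inflated_gib(stats["kb"]))            # what the old code reported
```

Because the old code multiplied by 2**20 instead of dividing by it, the reported figures were too large by a factor of about 2**40, which is why the scheduler would accept almost any share size.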
@@ -75,8 +75,10 @@ class MockVolumeClientModule(object):
             self.get_used_bytes = mock.Mock(return_value=self.mock_used_bytes)
             self.rados = mock.Mock()
             self.rados.get_cluster_stats = mock.Mock(return_value={
-                "kb": 1000,
-                "kb_avail": 500
+                "kb": 172953600,
+                "kb_avail": 157123584,
+                "kb_used": 15830016,
+                "num_objects": 26,
             })
 
 
@@ -352,10 +354,15 @@ class CephFSDriverTestCase(test.TestCase):
 
     def test_update_share_stats(self):
         self._driver.get_configured_ip_versions = mock.Mock(return_value=[4])
-        self._driver._volume_client
+        self._driver.configuration.local_conf.set_override(
+            'reserved_share_percentage', 5)
+
         self._driver._update_share_stats()
         result = self._driver._stats
 
+        self.assertEqual(5, result['pools'][0]['reserved_percentage'])
+        self.assertEqual(164.94, result['pools'][0]['total_capacity_gb'])
+        self.assertEqual(149.84, result['pools'][0]['free_capacity_gb'])
         self.assertTrue(result['ipv4_support'])
         self.assertFalse(result['ipv6_support'])
         self.assertEqual("CEPHFS", result['storage_protocol'])
@@ -0,0 +1,22 @@
+---
+upgrade:
+  - |
+    This version includes a fix to the CephFS drivers to address `an issue
+    <https://launchpad.net/bugs/1890833>`_ with total and free space calculation
+    in the CephFS driver. When you update, you will notice that the space
+    calculations reflect reality in your Ceph clusters, and provisioning may
+    fail if the share sizes exceed the cluster's free space. CephFS shares are
+    always thin provisioned, and the driver does not support oversubscription
+    via Manila; so space can be claimed for new shares as long as there is free
+    space on the cluster. Use the "reserved_share_percentage" back end
+    configuration option to ensure there's always space left aside for
+    provisioned workloads to grow over time.
+fixes:
+  - |
+    The CephFS driver has now been fixed to report total and available space on
+    the storage system correctly. See `Launchpad bug#1890833
+    <https://launchpad.net/bugs/1890833>`_ for more details.
+  - |
+    The CephFS driver now honors the configuration option
+    "reserved_share_percentage", and it can be used to set aside
+    space for provisioned workloads to grow over time.
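To make the effect of "reserved_share_percentage" concrete, here is a minimal sketch of how a reservation could shrink the space available for new shares. It assumes the scheduler subtracts the reserved fraction of total capacity from the reported free capacity; that rule is an assumption for illustration and is not taken from this commit. The function name `schedulable_gb` is hypothetical.

```python
def schedulable_gb(total_gb, free_gb, reserved_percentage):
    """Free space left for new shares after holding back a reservation.

    Assumption (not from this commit): the reservation is computed as a
    percentage of total capacity and subtracted from free capacity.
    """
    reserved_gb = total_gb * reserved_percentage / 100.0
    return free_gb - reserved_gb


# Capacity figures asserted in the updated unit test, with a 5% reservation.
remaining = schedulable_gb(164.94, 149.84, 5)
print(f"{remaining:.2f} GiB available for new shares")

# A hypothetical 145 GiB request fits the raw free space, but not once
# the reservation is held back.
print(145 <= 149.84, 145 <= remaining)
```

Under that assumption, a 5% reservation on the test cluster holds back a little over 8 GiB that existing thin-provisioned shares can still grow into.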