Add auto eviction for rook-ceph app
When a buggy CephFS client is present, the 'ceph health detail' output
shows a message like the following:

    HEALTH_WARN 1 clients failing to respond to capability release; \
    1 MDSs report slow requests

When this happens, the CephFS client cannot read from or write to the
volume. Restoring communication requires forcing the client to
reconnect, and to force that reconnection Ceph must evict the client.

The mds_cap_revoke_eviction_timeout parameter sets how long, in seconds,
the MDS waits for a client to respond to a capability revoke request
before evicting it. Setting mds_session_blocklist_on_evict to false
keeps the evicted client off the blocklist, so that it can reconnect
after eviction.

Test Plan:
- PASS: Start a pod that reads and writes to a CephFS PVC in an
  infinite loop
- PASS: Verify that the MDS evicts the client automatically when the
  message appears in the 'ceph health detail' output

Closes-Bug: 2095024

Change-Id: I0b71d9b01d114d2fc27625ae6ac4ae5055f2d9db
Signed-off-by: Gustavo Ornaghi Antunes <gustavo.ornaghiantunes@windriver.com>
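The condition described in the commit message can be detected by scanning the health output for the capability-release warning. The following is a minimal sketch, not part of this change: the function name stuck_client_count and the regular expression are hypothetical, and the sample string is the one quoted in the commit message.

```python
import re

# Hypothetical pattern, modeled on the warning text quoted in the commit message.
CAP_RELEASE_PATTERN = re.compile(
    r"(\d+) clients? failing to respond to capability release"
)


def stuck_client_count(health_detail: str) -> int:
    """Return how many clients the health output reports as not releasing caps."""
    match = CAP_RELEASE_PATTERN.search(health_detail)
    return int(match.group(1)) if match else 0


# Sample text copied from the commit message (line continuation joined).
sample = (
    "HEALTH_WARN 1 clients failing to respond to capability release; "
    "1 MDSs report slow requests"
)

if __name__ == "__main__":
    print(stuck_client_count(sample))
    print(stuck_client_count("HEALTH_OK"))
```

A check like this could be used while reproducing the test plan: the count should drop back to zero once the MDS has evicted the client and the client has reconnected.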
parent 3d6aacdb9b
commit 8aaecf0af2
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2024 Wind River Systems, Inc.
+# Copyright (c) 2024-2025 Wind River Systems, Inc.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
@@ -28,6 +28,10 @@ configOverride: |
 
   operatorNamespace: rook-ceph
   cephClusterSpec:
+    cephConfig:
+      mds:
+        mds_cap_revoke_eviction_timeout: "30"
+        mds_session_blocklist_on_evict: "false"
     dataDirHostPath: /var/lib/ceph/data
     cephVersion:
       image: quay.io/ceph/ceph:v18.2.2
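
To make the effect of the added cephConfig section concrete, the sketch below flattens it into the corresponding 'ceph config set' invocations. It assumes that Rook applies each cephConfig entry to the Ceph central config store, so that each entry is roughly equivalent to one such command. The helper name to_config_commands is hypothetical and not part of this change.

```python
from typing import Dict, List

# Values copied from the cephConfig section added in this change.
CEPH_CONFIG: Dict[str, Dict[str, str]] = {
    "mds": {
        "mds_cap_revoke_eviction_timeout": "30",
        "mds_session_blocklist_on_evict": "false",
    },
}


def to_config_commands(ceph_config: Dict[str, Dict[str, str]]) -> List[str]:
    """Build one 'ceph config set <who> <option> <value>' string per entry."""
    commands = []
    for who, options in ceph_config.items():
        for option, value in options.items():
            commands.append(f"ceph config set {who} {option} {value}")
    return commands


if __name__ == "__main__":
    # Print the commands only; nothing is executed against a cluster.
    for command in to_config_commands(CEPH_CONFIG):
        print(command)
```

The sketch only builds strings, so it is safe to run anywhere. Comparing its output with 'ceph config get mds <option>' on a deployed cluster is one possible way to confirm that the override took effect.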