#!/usr/bin/python -u
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import random
import time
import uuid
import unittest
from swift.common.internal_client import InternalClient, UnexpectedResponse
from swift.common.manager import Manager
from swift.common.utils import Timestamp
from test.probe.common import ReplProbeTest, ENABLED_POLICIES
from test.probe.brain import BrainSplitter
from swiftclient import client


class TestObjectExpirer(ReplProbeTest):
def setUp(self):
self.expirer = Manager(['object-expirer'])
self.expirer.start()
err = self.expirer.stop()
if err:
raise unittest.SkipTest('Unable to verify object-expirer service')
conf_files = []
for server in self.expirer.servers:
conf_files.extend(server.conf_files())
conf_file = conf_files[0]
self.client = InternalClient(conf_file, 'probe-test', 3)
super(TestObjectExpirer, self).setUp()
self.container_name = 'container-%s' % uuid.uuid4()
self.object_name = 'object-%s' % uuid.uuid4()
self.brain = BrainSplitter(self.url, self.token, self.container_name,
                                   self.object_name)

    def _check_obj_in_container_listing(self):
for obj in self.client.iter_objects(self.account,
self.container_name):
if self.object_name == obj['name']:
return True
        return False

    @unittest.skipIf(len(ENABLED_POLICIES) < 2, "Need more than one policy")
def test_expirer_object_split_brain(self):
old_policy = random.choice(ENABLED_POLICIES)
wrong_policy = random.choice([p for p in ENABLED_POLICIES
if p != old_policy])
# create an expiring object and a container with the wrong policy
self.brain.stop_primary_half()
self.brain.put_container(int(old_policy))
self.brain.put_object(headers={'X-Delete-After': 2})
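        # (X-Delete-After is shorthand: swift converts it into an absolute
        # X-Delete-At of roughly now + the given number of seconds when the
        # object is stored.)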
# get the object timestamp
metadata = self.client.get_object_metadata(
self.account, self.container_name, self.object_name,
headers={'X-Backend-Storage-Policy-Index': int(old_policy)})
create_timestamp = Timestamp(metadata['x-timestamp'])
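        # (utils.Timestamp normalizes to a fixed-width string - e.g.
        # Timestamp(1402464677.04188).normal == '1402464677.04188' - so the
        # equality and ordering assertions below compare stably.)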
self.brain.start_primary_half()
# get the expiring object updates in their queue, while we have all
# the servers up
Manager(['object-updater']).once()
self.brain.stop_handoff_half()
self.brain.put_container(int(wrong_policy))
# don't start handoff servers, only wrong policy is available
# make sure auto-created containers get in the account listing
Manager(['container-updater']).once()
# this guy should no-op since it's unable to expire the object
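        # (with only the wrong-policy nodes up, the conditional DELETE can't
        # find the object data, and the expirer treats that as a retryable
        # failure, leaving the queue entry in place)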
self.expirer.once()
self.brain.start_handoff_half()
self.get_to_final_state()
        # validate object is expired
metadata = self.client.get_object_metadata(
self.account, self.container_name, self.object_name,
acceptable_statuses=(4,),
headers={'X-Backend-Storage-Policy-Index': int(old_policy)})
self.assertIn('x-backend-timestamp', metadata)
self.assertEqual(Timestamp(metadata['x-backend-timestamp']),
create_timestamp)
# but it is still in the listing
self.assertTrue(self._check_obj_in_container_listing(),
msg='Did not find listing for %s' % self.object_name)
# clear proxy cache
client.post_container(self.url, self.token, self.container_name, {})
# run the expirer again after replication
self.expirer.once()
# object is not in the listing
self.assertFalse(self._check_obj_in_container_listing(),
msg='Found listing for %s' % self.object_name)
# and validate object is tombstoned
found_in_policy = None
for policy in ENABLED_POLICIES:
metadata = self.client.get_object_metadata(
self.account, self.container_name, self.object_name,
acceptable_statuses=(4,),
headers={'X-Backend-Storage-Policy-Index': int(policy)})
if 'x-backend-timestamp' in metadata:
if found_in_policy:
self.fail('found object in %s and also %s' %
(found_in_policy, policy))
found_in_policy = policy
self.assertIn('x-backend-timestamp', metadata)
self.assertGreater(Timestamp(metadata['x-backend-timestamp']),
create_timestamp)
def test_expirer_doesnt_make_async_pendings(self):
# The object expirer cleans up its own queue. The inner loop
# basically looks like this:
#
# for obj in stuff_to_delete:
# delete_the_object(obj)
# remove_the_queue_entry(obj)
#
# By default, upon receipt of a DELETE request for an expiring
# object, the object servers will create async_pending records to
# clean the expirer queue. Since the expirer cleans its own queue,
# this is unnecessary. The expirer can make requests in such a way
        # that the object server does not write out any async pendings; this
# test asserts that this is the case.
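        #
        # As a sketch (not executed by this test), each queue entry is
        # processed with a conditional DELETE along these lines:
        #
        #     path = self.client.make_path(account, container, obj)
        #     self.client.make_request(
        #         'DELETE', path, {'X-If-Delete-At': str(delete_at)}, (2, 4))
        #
        # where account/container/obj and delete_at come from the queue
        # entry, and the object server only commits a tombstone when the
        # header matches the stored x-delete-at.
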
# Make an expiring object in each policy
for policy in ENABLED_POLICIES:
container_name = "expirer-test-%d" % policy.idx
container_headers = {'X-Storage-Policy': policy.name}
client.put_container(self.url, self.token, container_name,
headers=container_headers)
now = time.time()
delete_at = int(now + 2.0)
client.put_object(
self.url, self.token, container_name, "some-object",
headers={'X-Delete-At': str(delete_at),
'X-Timestamp': Timestamp(now).normal},
                contents='dontcare')

        time.sleep(2.0)
# make sure auto-created expirer-queue containers get in the account
# listing so the expirer can find them
Manager(['container-updater']).once()
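        # (the queue lives in a hidden account, '.expiring_objects' by
        # default, whose container names bucket the delete-at timestamps;
        # the updater run above is what makes those auto-created containers
        # visible in the account listing)
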
# Make sure there's no async_pendings anywhere. Probe tests only run
# on single-node installs anyway, so this set should be small enough
# that an exhaustive check doesn't take too long.
all_obj_nodes = self.get_all_object_nodes()
pendings_before = self.gather_async_pendings(all_obj_nodes)
# expire the objects
Manager(['object-expirer']).once()
pendings_after = self.gather_async_pendings(all_obj_nodes)
self.assertEqual(pendings_after, pendings_before)
def test_expirer_object_should_not_be_expired(self):
# The object-expirer verifies via the x-if-delete-at header that an
# object is actually eligible for deletion. If an object either has no
# x-delete-at metadata or has an x-delete-at value different from the
# expirer's x-if-delete-at value, the expirer's DELETE fails with
# 412 Precondition Failed.
# However, if some copies of the object are on handoff nodes, the
# expirer can still put a tombstone (with a timestamp equal to
# x-delete-at) on the primary nodes, and consistency is then resolved
# in favor of the newer timestamp (in particular, the case of an
# object overwritten without x-delete-at). This test asserts that, at
# least, an overwritten object whose timestamp is later than the
# original expiration date is safe from being expired.
def put_object(headers):
# use internal client to PUT objects so that X-Timestamp in headers
# is effective
headers['Content-Length'] = '0'
path = self.client.make_path(
self.account, self.container_name, self.object_name)
try:
self.client.make_request('PUT', path, headers, (2,))
except UnexpectedResponse as e:
self.fail(
'Expected 201 for PUT object but got %s' % e.resp.status)
obj_brain = BrainSplitter(self.url, self.token, self.container_name,
self.object_name, 'object', self.policy)
# T(obj_created) < T(obj_deleted with x-delete-at) < T(obj_recreated)
# < T(expirer_executed)
# The recreated obj should survive in any split-brain case
obj_brain.put_container()
# T(obj_deleted with x-delete-at)
# the object server accepts the request only if X-Delete-At is later
# than 'now', so here T(obj_created) < T(obj_deleted with x-delete-at)
now = time.time()
delete_at = int(now + 2.0)
recreate_at = delete_at + 1.0
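# timeline so far: now < delete_at (now + 2s) < recreate_at
# (delete_at + 1s), so the recreated object's X-Timestamp is newer
# than any tombstone the expirer may write at delete_at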
put_object(headers={'X-Delete-At': str(delete_at),
'X-Timestamp': Timestamp(now).normal})
# stop half of the primary object servers to create a situation where
# the object-expirer can later put a tombstone on those primary nodes
obj_brain.stop_primary_half()
# increment the X-Timestamp explicitly
# (will be T(obj_deleted with x-delete-at) < T(obj_recreated))
put_object(headers={'X-Object-Meta-Expired': 'False',
'X-Timestamp': Timestamp(recreate_at).normal})
# make sure auto-created containers get in the account listing
Manager(['container-updater']).once()
# sanity, the newer object is still there
try:
metadata = self.client.get_object_metadata(
self.account, self.container_name, self.object_name)
except UnexpectedResponse as e:
self.fail(
'Expected 200 for HEAD object but got %s' % e.resp.status)
self.assertIn('x-object-meta-expired', metadata)
# some object servers recovered
obj_brain.start_primary_half()
# sleep until after recreate_at
while time.time() <= recreate_at:
time.sleep(0.1)
# now run the expirer, after the object has been recreated
self.expirer.once()
# verify that original object was deleted by expirer
obj_brain.stop_handoff_half()
try:
metadata = self.client.get_object_metadata(
self.account, self.container_name, self.object_name,
acceptable_statuses=(4,))
except UnexpectedResponse as e:
self.fail(
'Expected 404 for HEAD object but got %s' % e.resp.status)
obj_brain.start_handoff_half()
# and the inconsistent object state is resolved by the replicator
Manager(['object-replicator']).once()
# check that the recreated object can still be retrieved
try:
metadata = self.client.get_object_metadata(
self.account, self.container_name, self.object_name)
except UnexpectedResponse as e:
self.fail(
'Expected 200 for HEAD object but got %s' % e.resp.status)
self.assertIn('x-object-meta-expired', metadata)
def _test_expirer_delete_outdated_object_version(self, object_exists):
# This test simulates a case where the expirer tries to delete
# an outdated version of an object.
# One case is where the expirer gets a 404, whereas the newest version
# of the object is offline.
# Another case is where the expirer gets a 412, since the old version
# of the object mismatches the expiration time sent by the expirer.
# In either case, the expirer should retry deleting the object
# later, for as long as the reclaim age has not passed.
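# (On a 404 or 412 the expirer is expected to leave its queue entry in
# place, which is what lets the second expirer pass below succeed once
# replication has caught up.)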
obj_brain = BrainSplitter(self.url, self.token, self.container_name,
self.object_name, 'object', self.policy)
obj_brain.put_container()
if object_exists:
obj_brain.put_object()
# currently, the object either doesn't exist, or does not have
# an expiration
# stop primary servers and put a newer version of the object, this
# time with an expiration. only the handoff servers will have
# the new version
obj_brain.stop_primary_half()
now = time.time()
delete_at = int(now + 2.0)
obj_brain.put_object({'X-Delete-At': str(delete_at)})
# make sure auto-created containers get in the account listing
Manager(['container-updater']).once()
# update object record in the container listing
Manager(['container-replicator']).once()
# take handoff servers down, and bring up the outdated primary servers
obj_brain.start_primary_half()
obj_brain.stop_handoff_half()
# wait until object expiration time
while time.time() <= delete_at:
time.sleep(0.1)
# run expirer against the outdated servers. it should fail since
# the outdated version does not match the expiration time
self.expirer.once()
# bring all servers up, and run replicator to update servers
obj_brain.start_handoff_half()
Manager(['object-replicator']).once()
# verify the deletion has failed by checking the container listing
self.assertTrue(self._check_obj_in_container_listing(),
msg='Did not find listing for %s' % self.object_name)
# run expirer again, delete should now succeed
self.expirer.once()
# verify the deletion by checking the container listing
self.assertFalse(self._check_obj_in_container_listing(),
msg='Found listing for %s' % self.object_name)
def test_expirer_delete_returns_outdated_404(self):
self._test_expirer_delete_outdated_object_version(object_exists=False)
def test_expirer_delete_returns_outdated_412(self):
self._test_expirer_delete_outdated_object_version(object_exists=True)
def test_slo_async_delete(self):
if not self.cluster_info.get('slo', {}).get('allow_async_delete'):
raise unittest.SkipTest('allow_async_delete not enabled')
segment_container = self.container_name + '_segments'
client.put_container(self.url, self.token, self.container_name, {})
client.put_container(self.url, self.token, segment_container, {})
client.put_object(self.url, self.token,
segment_container, 'segment_1', b'1234')
client.put_object(self.url, self.token,
segment_container, 'segment_2', b'5678')
client.put_object(
self.url, self.token, self.container_name, 'slo', json.dumps([
{'path': segment_container + '/segment_1'},
{'data': 'Cg=='},
{'path': segment_container + '/segment_2'},
]), query_string='multipart-manifest=put')
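# ('data': 'Cg==' is base64 for a newline, so the assembled SLO should
# read as segment_1, a newline, then segment_2)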
_, body = client.get_object(self.url, self.token,
self.container_name, 'slo')
self.assertEqual(body, b'1234\n5678')
client.delete_object(
self.url, self.token, self.container_name, 'slo',
query_string='multipart-manifest=delete&async=true')
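# with async=true the manifest is expected to be removed immediately,
# while its segments are only queued for later deletion by the
# expirer; the assertions below cover both halves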
# Object's deleted
_, objects = client.get_container(self.url, self.token,
self.container_name)
self.assertEqual(objects, [])
with self.assertRaises(client.ClientException) as caught:
client.get_object(self.url, self.token, self.container_name, 'slo')
self.assertEqual(404, caught.exception.http_status)
# But segments are still around and accessible
_, objects = client.get_container(self.url, self.token,
segment_container)
self.assertEqual([o['name'] for o in objects],
['segment_1', 'segment_2'])
_, body = client.get_object(self.url, self.token,
segment_container, 'segment_1')
self.assertEqual(body, b'1234')
_, body = client.get_object(self.url, self.token,
segment_container, 'segment_2')
self.assertEqual(body, b'5678')
# make sure auto-created expirer-queue containers get in the account
# listing so the expirer can find them
Manager(['container-updater']).once()
self.expirer.once()
# Now the expirer has cleaned up the segments
_, objects = client.get_container(self.url, self.token,
segment_container)
self.assertEqual(objects, [])
with self.assertRaises(client.ClientException) as caught:
client.get_object(self.url, self.token,
segment_container, 'segment_1')
self.assertEqual(404, caught.exception.http_status)
with self.assertRaises(client.ClientException) as caught:
client.get_object(self.url, self.token,
segment_container, 'segment_2')
self.assertEqual(404, caught.exception.http_status)
if __name__ == "__main__":
unittest.main()