New replication_one_per_device option (True by default)
that restricts incoming REPLICATION requests to
one per device, replication_concurrency permitting.
There is also a replication_lock_timeout option (15 by default)
to control how long a request will wait to obtain
a replication device lock before giving up.
This should be very useful in that you can be
assured any concurrent REPLICATION requests are
each writing to distinct devices. If you have 100
devices on a server, you can set
replication_concurrency to 100 and be confident
that, even if 100 replication requests were
executing concurrently, they'd each be writing to
separate devices. Before, all 100 could end up
writing to the same device, bringing it to a
horrible crawl.
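As a rough illustration of the mechanism only (the names and
structure below are assumptions, not the actual implementation), the
per-device lock with a timeout amounts to something like this:

    import threading
    from contextlib import contextmanager

    # Hypothetical sketch: one lock per device, acquired with a timeout,
    # so at most one REPLICATION request writes to a device at a time.
    REPLICATION_LOCK_TIMEOUT = 15      # seconds, as described above
    device_locks = {}                  # device name -> threading.Lock()

    @contextmanager
    def replication_lock(device):
        lock = device_locks.setdefault(device, threading.Lock())
        if not lock.acquire(timeout=REPLICATION_LOCK_TIMEOUT):
            raise Exception('timed out waiting for replication lock '
                            'on %s' % device)
        try:
            yield
        finally:
            lock.release()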
NOTE: This is only for ssync replication. The
current default rsync replication still has the
potentially horrible behavior.
Change-Id: I36e99a3d7e100699c76db6d3a4846514537ff685
For this commit, ssync is just a direct replacement for how
we use rsync. Assuming we switch over to ssync completely
someday and drop rsync, we will then be able to improve the
algorithms even further (removing local objects as we
successfully transfer each one rather than waiting for whole
partitions, using an index.db with hash-trees, etc., etc.)
For easier review, this commit can be thought of in distinct
parts:
1) New global_conf_callback functionality for allowing
services to perform setup code before workers, etc. are
launched. (This is then used by ssync in the object
server to create a cross-worker semaphore to restrict
concurrent incoming replication; see the sketch after
this list.)
2) A bit of shifting of items up from the object server and
replicator to diskfile or DEFAULT conf sections for
better sharing of the same settings: conn_timeout,
node_timeout, client_timeout, network_chunk_size, and
disk_chunk_size.
3) Modifications to the object server and replicator to
optionally use ssync in place of rsync. This is done in
a generic enough way that switching to FutureSync should
be easy someday.
4) The biggest part, and (at least for now) a completely
optional part, is the new ssync_sender and
ssync_receiver files. Nice and isolated for easier
testing and visibility into test coverage, etc.
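As promised in part 1 above, here is a minimal sketch of the
global_conf_callback idea (the hook's exact signature is assumed, not
confirmed by this text):

    import multiprocessing

    # Hypothetical sketch: called once before the WSGI workers are
    # forked, so the semaphore is inherited by, and shared across, all
    # worker processes on this object server.
    def global_conf_callback(preloaded_app_conf, global_conf):
        replication_concurrency = int(
            preloaded_app_conf.get('replication_concurrency') or 4)
        if replication_concurrency > 0:
            # BoundedSemaphore caps concurrent incoming REPLICATION
            # requests across every worker; wrapped in a list so the
            # non-string value can ride along in global_conf.
            global_conf['replication_semaphore'] = [
                multiprocessing.BoundedSemaphore(replication_concurrency)]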
All the usual logging, statsd, recon, etc. instrumentation
is still there when using ssync, just as it is when using
rsync.
Beyond the essential error and exceptional condition
logging, I have not added any additional instrumentation at
this time. Unless there is something someone finds super
pressing to have added to the logging, I think such
additions would be better as separate change reviews.
FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION
CLUSTERS. Some of us will be using it in a limited fashion to
look for any subtle issues, tuning, etc., but generally ssync is
an experimental feature. In its current implementation it is
probably going to be a bit slower than rsync, but if all
goes according to plan it will end up much faster.
There are no comparisons yet between ssync and rsync other
than some raw virtual machine testing I've done to show it
should compete well enough once we can put it in use in the
real world.
If you Tweet, Google+, or whatever, be sure to indicate it's
experimental. It'd be best to keep it out of deployment
guides, howtos, etc. until we all figure out if we like it,
find it to be stable, etc.
Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6
If you're setting one of these up, you're probably going to use it for
development, in which case you want everything but the kitchen sink
turned on so you can just start hacking away.
Change-Id: I98d178ff545cbf8d853c102e9fce76fb9f6773ac
Refactor on-disk knowledge out of the object server by pushing the
async update pickle creation to the new DiskFileManager class (name is
not the best, so suggestions welcome), along with the REPLICATE
method logic. We also move the mount checking and thread pool storage
to the new ondisk.Devices object, which then also becomes the new home
of the audit_location_generator method.
For the object server, a new setup() method is now called at the end
of the controller's construction, and the _diskfile() method has been
renamed to get_diskfile(), to allow implementation specific behavior.
We then hide the need for the REST API layer to know how and where
quarantining needs to be performed. There are now two places it is
checked internally: on open(), where we verify the content-length,
name, and x-timestamp metadata; and in the reader on close(), where the
etag metadata is checked if the entire file was read.
We add a reader class to allow implementations to isolate the WSGI
handling code for that specific environment (it is used nowhere else
in the REST APIs). This simplifies the caller's code to just use a
"with" statement once the file is open, avoiding multiple points where
close needs to be called.
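For example, a GET-style caller under the new API might look roughly
like this (a usage sketch only; the names are taken from the table
below and the exact signatures may differ):

    # Hypothetical usage sketch of the refactored DiskFile API.
    def get_object(diskfile_manager, device, partition,
                   account, container, obj):
        disk_file = diskfile_manager.get_diskfile(
            device, partition, account, container, obj)
        with disk_file.open():          # verifies content-length, name and
            reader = disk_file.reader() # x-timestamp, quarantining if bad
        return reader                   # WSGI iterator; checks the etag on
                                        # close() if the whole file was read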
For a full historical comparison, including the usage patterns, see:
https://gist.github.com/portante/5488238
(In the table below, the left column is master as of 2b639f5, "Merge
'Fix 500 from account-quota middleware'"; the right column is this
commit.)
--------------------------------+------------------------------------
DiskFileManager(conf)
Methods:
.pickle_async_update()
.get_diskfile()
.get_hashes()
Attributes:
.devices
.logger
.disk_chunk_size
.keep_cache_size
.bytes_per_sync
DiskFile(a,c,o,keep_data_fp=) DiskFile(a,c,o)
Methods: Methods:
*.__iter__()
.close(verify_file=)
.is_deleted()
.is_expired()
.quarantine()
.get_data_file_size()
.open()
.read_metadata()
.create() .create()
.write_metadata()
.delete() .delete()
Attributes: Attributes:
.quarantined_dir
.keep_cache
.metadata
*DiskFileReader()
Methods:
.__iter__()
.close()
Attributes:
+.was_quarantined
DiskWriter() DiskFileWriter()
Methods: Methods:
.write() .write()
.put() .put()
* Note that in the left column the DiskFile class implements all the
  methods necessary for a WSGI app iterator; in the right column, the
  DiskReader() object returned by the DiskFileOpened.reader() method
  implements all the methods necessary for a WSGI app iterator.
+ Note that if the auditor is refactored to not use the DiskFile
  class (see https://review.openstack.org/44787), then we don't need
  the was_quarantined attribute.
A reference "in-memory" object server implementation of a backend
DiskFile class in swift/obj/mem_server.py and
swift/obj/mem_diskfile.py.
One can also reference
https://github.com/portante/gluster-swift/commits/diskfile for the
proposed integration with the gluster-swift code based on these
changes.
Change-Id: I44e153fdb405a5743e9c05349008f94136764916
Signed-off-by: Peter Portante <peter.portante@redhat.com>
This reverts commit 7760f41c3ce436cb23b4b8425db3749a3da33d32
Change-Id: I95e57a2563784a8cd5e995cc826afeac0eadbe62
Signed-off-by: Peter Portante <peter.portante@redhat.com>
The SAIO is purposely cut into two parts, so that you don't have to switch
back and forth between root and your unprivileged user. Add some "note" box
callouts to highlight this changeover.
Change-Id: I8b1a8f0539eac60d4121bdd4dab01df75ecca207
This creates a pool for each memcache server so that connections will not
grow without bound. This also adds a proxy config
"max_memcache_connections" which can control how many connections are
available in the pool.
A side effect of the change is that we had to change the memcache calls
that used noreply, and instead wait for the result of the request.
Leaving noreply in place could cause a race condition (specifically in
account auto-create), due to one request calling `memcache.del(key)` and
then `memcache.get(key)` with a different pooled connection. If the
delete didn't complete fast enough, the get would return the old value
before it was deleted, and thus believe that the account was not
autocreated.
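A rough sketch of the pooling idea (this is not the actual memcache
client change; the class shape below is only illustrative):

    import socket
    from eventlet.pools import Pool

    class MemcacheConnPool(Pool):
        # One pool per memcache server, capped at max_memcache_connections.
        def __init__(self, server, max_size):
            Pool.__init__(self, max_size=max_size)
            self.server = server

        def create(self):
            host, port = self.server.rsplit(':', 1)
            sock = socket.create_connection((host, int(port)))
            return sock, sock.makefile('rwb')

    # Callers borrow and return connections instead of opening new ones:
    #     sock, fp = pool.get()
    #     try: ... send the command and wait for the reply (no noreply) ...
    #     finally: pool.put((sock, fp))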
DocImpact
Change-Id: I350720b7bba29e1453894d3d4105ac1ea232595b
If you don't, then newer versions of xattr won't install, and since
our xattr requirement is simply ">= 0.4" in requirements.txt, this
affects anyone setting up a new SAIO.
This happened with xattr 0.7, which was released on 2013-07-19.
Change-Id: Iaf335fa25a2908953d1fd218158ebedf5d01cc27
Place all the methods related to on-disk layout and / or configuration
into a new common module that can be shared by the various modules
using the same on-disk layout.
Change-Id: I27ffd4665d5115ffdde649c48a4d18e12017e6a9
Signed-off-by: Peter Portante <peter.portante@redhat.com>
If handoffs_first is True, then the object replicator will give
priority to partitions that are not supposed to be on the node.
If handoff_delete is set to a number (n), then it will delete a handoff
partition once at least n replicas were successfully replicated.
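Roughly, the behavior of the two options can be pictured like this (a
simplified sketch; the job/handoff bookkeeping names are made up):

    import random

    # Hypothetical sketch: job['handoff'] marks partitions that do not
    # belong on this node; successes counts replicas copied elsewhere.
    def order_jobs(jobs, handoffs_first=False):
        random.shuffle(jobs)
        if handoffs_first:
            jobs.sort(key=lambda job: not job['handoff'])  # handoffs first
        return jobs

    def can_delete_handoff(successes, replica_count, handoff_delete=None):
        if handoff_delete is None:
            return successes == replica_count  # old behavior: every replica
        return successes >= handoff_delete     # new: at least n replicas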
Also fixed a couple of things in the object replicator unit tests and
added some more tests.
DocImpact
Change-Id: Icb9968953cf467be2a52046fb16f4b84eb5604e4
The main purpose of this patch is to lay the groundwork for allowing
the container and account servers to optionally use pluggable backend
implementations. The backend.py files will eventually be the modules
where the backend APIs are defined via docstrings of this reference
implementation. The swift/common/db.py module will remain an internal
module used by the reference implementation.
We have a raft of changes to docstrings staged for later, but this
patch takes care to relocate ContainerBroker and AccountBroker into
their new home intact.
Change-Id: Ibab5c7605860ab768c8aa5a3161a705705689b04
These are headers that will be stripped unless the WSGI environment
contains a true value for 'swift_owner'. The exact definition of a
swift_owner is up to the auth system in use, but usually indicates
administrative responsibilities.
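Conceptually, the check amounts to something like this (a sketch; the
real code and the exact header list live elsewhere):

    # Hypothetical sketch: hide sensitive headers unless the auth system
    # has marked this request's environment as belonging to a swift_owner.
    def filter_headers(headers, environ, sensitive_headers):
        if environ.get('swift_owner', False):
            return headers
        return dict((k, v) for k, v in headers.items()
                    if k.lower() not in sensitive_headers)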
DocImpact
Change-Id: I972772fbbd235414e00130ca663428e8750cabca
Making it possible to override the default set of regexes
used to search for device block errors in the log file. Also making
the log file naming pattern configurable; both are set in the
drive-audit.conf file.
Updating the "Detecting Failed Drives" section of the admin guide as well.
Change-Id: I7bd3acffed196da3e09db4c9dcbb48a20bdd1cf0
Change the default value of wsgi workers from 1 to auto. The new default
value for workers in the proxy, container, account & object wsgi servers will
spawn as many workers as you have cpu cores.
This will not be ideal for some configurations, but it's much more likely to
produce a successful out of the box deployment.
Inspect the number of cpu cores using python's multiprocessing when available.
Multiprocessing was added in python 2.6, but I know I've compiled python
without it before by accident. The cpu_count method seems to be pretty system
agnostic, but it can raise NotImplementedError or sometimes return 0.
Add a new utility method 'config_auto_int_value' to pull an integer out of the
config which has a dynamic default.
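Roughly, the intended behavior (a sketch of the helper described
above, not its exact implementation):

    def config_auto_int_value(value, default):
        # 'auto' (or an unset value) falls back to a dynamic default
        if value is None or str(value).lower() == 'auto':
            return default
        return int(value)

    try:
        from multiprocessing import cpu_count
        auto_workers = cpu_count() or 1        # cpu_count() may return 0
    except (ImportError, NotImplementedError):
        auto_workers = 1                       # multiprocessing unavailable

    conf = {'workers': 'auto'}                 # e.g. from the [DEFAULT] section
    workers = config_auto_int_value(conf.get('workers'), auto_workers)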
* drive by s/container/proxy/ in proxy-server.conf.5
* fix misplaced max_clients in *-server.conf-sample
* update doc/development_saio to force workers = 1
DocImpact
Change-Id: Ifa563d22952c902ab8cbe1d339ba385413c54e95
This reverts commit 68cb91097b75a92237bd90caffcd405c3e83cb53
Just so this does not get forgotten in the tree...
We are using daemon mode and chunked is not supported in this mode.
In the past couple of years, the XFS team has greatly improved inode use in
xfs. With more recent kernels, there is no performance penalty for
using the default inode size, and a smaller inode size gives us
improvements in other areas where disk access is involved.
DocImpact
Change-Id: Ie9da53a6e8bf43d1d02881befbb52595462c9f2e
As reported in the documentation bug, the apache deployment guide's
reference to apache2 mod_wsgi not supporting client chunked encoding
has become outdated. It now supports this feature, using an optional
parameter.
Updated the paragraph in question to reflect this.
Patchset 2 mentions the WSGIChunkedRequest variable and adds it
to the sample configs (on by default). Feedback welcome.
fixes bug 1194935
Change-Id: I07c5c8506ac34e1e0e08fa6d961babde2f9b7367
Making this smaller (10 instead of 18) can make some of the tests run
faster and makes rebuilding of the rings faster.
Change-Id: Ibe46011d8e6a6482d39b3a20ac9c091d9fbc6ef7
The proxy can now be configured to prefer local object servers for PUT
requests, where "local" is governed by the "write_affinity" setting. The
"write_affinity_node_count" setting controls how many local object
servers to try before giving up and going on to remote ones.
I chose to simply re-order the object servers instead of filtering out
nonlocal ones so that, if all of the local ones are down, clients can
still get successful responses (just slower).
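In other words, the re-ordering is roughly (a simplified sketch with
an assumed is_local predicate):

    # Hypothetical sketch: move up to write_affinity_node_count "local"
    # nodes to the front without dropping the remote ones.
    def sort_nodes_for_put(nodes, is_local, write_affinity_node_count):
        local, remote = [], []
        for node in nodes:
            if is_local(node) and len(local) < write_affinity_node_count:
                local.append(node)
            else:
                remote.append(node)
        return local + remote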
The goal is to trade availability for throughput. By writing to local
object servers across fast LAN links, clients get better throughput
than if the object servers were far away over slow WAN links. The
downside, of course, is that data availability (not durability) may
suffer when drives fail.
The default configuration has no write affinity in it, so the default
behavior is unchanged.
Added some words about these settings to the admin guide.
DocImpact
Change-Id: I09a0bd00524544ff627a3bccdcdc48f40720a86e
Lucid won't EOL until May of 2014, but I stopped trusting that PPA a long time
ago. Besides, with the requirements for dnspython and mock where they're at, you
almost can't install swift from source on any stock distro and expect tests to
pass with system packages - so we're looking at PyPI for dependencies regardless.
While I'm in there:
* more explanation of <your-user-name> and a helpful find/sed for configs
* group the "setup ~/.bashrc" stuff with the "setup ~/bin" stuff
* some updates/fixes from my experience installing on CentOS
* remove region warnings from remakerings
Change-Id: Ie2e6b06959ab699d853e07e5b7e8cda7036a44fe
Improving points:
1. Remove "yum install swift" in Fedora; use installing from source for
both Ubuntu and Fedora.
2. Explain that any user can be used, including root or your own guest
account, and note the points a developer has to be careful about.
Change-Id: Id6d683441bd790a21734624e29eb7c98bb40de85
Fixes: bug #1126389
Two types of parallelism are added:
- concurrency to speed up what a single process does
- a way to run multiple daemons to work on different parts of the work
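One common way to realize both knobs looks like this (a sketch under
assumed option names, not necessarily how this patch implements it):

    from eventlet import GreenPool

    # concurrency:         greenthreads within one daemon
    # processes / process: split the work across daemons (0-based index)
    def run(work_items, handle, concurrency=1, processes=0, process=0):
        pool = GreenPool(concurrency)
        for index, item in enumerate(work_items):
            if processes > 1 and index % processes != process:
                continue               # another daemon owns this item
            pool.spawn_n(handle, item)
        pool.waitall()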
DocImpact
Change-Id: I48997f68eb2fd8de19a5ee8b9fcdf76dde2ba0ab
Without a (per-disk) threadpool, requests to a slow disk would affect
all clients by blocking the entire eventlet reactor on
read/write/etc. The slower the disk, the worse the performance. On an
object server, you frequently have at least one slow disk due to
auditing and replication activity sucking up all the available IO. By
kicking those blocking calls out to a separate OS thread, we let the
eventlet reactor make progress in other greenthreads, and by having a
per-disk pool, we ensure that one slow disk can't suck up all the
resources of an entire object server.
There were a few blocking calls that were done with eventlet.tpool,
but that's a fixed-size global threadpool, so I moved them to the
per-disk threadpools. If the object server is configured not to use
per-disk threadpools (i.e. threads_per_disk = 0, which is the
default), those call sites will still ultimately end up using
eventlet.tpool.execute. You won't end up blocking a whole object
server while waiting for a huge fsync.
If you decide not to use threadpools, the only extra overhead should
be a few extra Python function calls here and there. This is
accomplished by setting threads_per_disk = 0 in the config.
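The dispatch described above boils down to something like this (a
sketch; the per-disk pool object's API is assumed):

    from eventlet import tpool

    # Hypothetical sketch: with threads_per_disk = 0 (the default),
    # blocking calls still go through eventlet's global tpool; otherwise
    # each disk's own pool of OS threads runs them, so one slow disk
    # can't starve the rest.
    def run_in_thread(per_disk_pool, func, *args, **kwargs):
        if per_disk_pool is None:          # threads_per_disk = 0
            return tpool.execute(func, *args, **kwargs)
        return per_disk_pool.run_in_thread(func, *args, **kwargs)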
blueprint concurrent-disk-io
Change-Id: I490f8753d926fdcee3a0c65c5aaf715bc2b7c290
- Add proxy-logging to multinode. We have had it since Folsom and people
still forget it, resulting in missing logs.
- Use the correct name, so it's easy to hit with '*' in vi at least.
Admittedly trivial changes, which I meant to hold until Leah's major
doc improvement lands, but I'm tired of keeping stuff like this in
my working repo.
Change-Id: I44f80c51d6d7329a9b696e67fcb8a895db63e497