For this commit, ssync is just a direct replacement for how we use rsync. Assuming we switch over to ssync completely someday and drop rsync, we will then be able to improve the algorithms even further (removing local objects as we successfully transfer each one rather than waiting for whole partitions, using an index.db with hash-trees, etc.).

For easier review, this commit can be thought of in distinct parts:

1) New global_conf_callback functionality for allowing services to perform setup code before workers, etc. are launched. (This is then used by ssync in the object server to create a cross-worker semaphore to restrict concurrent incoming replication.) A sketch of this idea follows this list.

2) A bit of shifting of items up from the object server and replicator to diskfile or DEFAULT conf sections for better sharing of the same settings: conn_timeout, node_timeout, client_timeout, network_chunk_size, disk_chunk_size.

3) Modifications to the object server and replicator to optionally use ssync in place of rsync. This is done in a generic enough way that switching to FutureSync should be easy someday.

4) The biggest, and (at least for now) completely optional, part is the new ssync_sender and ssync_receiver files. Nice and isolated for easier testing and visibility into test coverage, etc.

All the usual logging, statsd, recon, etc. instrumentation is still there when using ssync, just as it is when using rsync. Beyond the essential error and exceptional-condition logging, I have not added any additional instrumentation at this time. Unless there is something someone finds super pressing to have added to the logging, I think such additions would be better as separate change reviews.

FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION CLUSTERS. Some of us will be using it in a limited fashion to look for any subtle issues, tuning, etc., but generally ssync is an experimental feature. In its current implementation it is probably going to be a bit slower than rsync, but if all goes according to plan it will end up much faster. There are no comparisons yet between ssync and rsync other than some raw virtual-machine testing I've done to show it should compete well enough once we can put it in use in the real world.

If you Tweet, Google+, or whatever, be sure to indicate it's experimental. It'd be best to keep it out of deployment guides, howtos, etc. until we all figure out whether we like it and find it to be stable.

Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6
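To illustrate item 1, here is a minimal sketch of the kind of global_conf_callback described above. The callback name, signature, and config keys shown are assumptions for illustration, not necessarily the exact interface this commit adds:

    import multiprocessing

    def global_conf_callback(preloaded_app_conf, global_conf):
        # Hypothetical hook run once in the parent process, before any
        # WSGI workers are forked, so every worker inherits the same
        # shared objects stashed in global_conf.
        replication_concurrency = int(
            preloaded_app_conf.get('replication_concurrency', 4))
        if replication_concurrency > 0:
            # A multiprocessing semaphore is visible to all forked
            # workers; the object server could use it to limit
            # concurrent incoming replication (the ssync case above).
            global_conf['replication_semaphore'] = [
                multiprocessing.BoundedSemaphore(replication_concurrency)]

A worker handling an incoming ssync request could then try a non-blocking acquire() on the shared semaphore and return 503 when no slot is free.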
Swift
A distributed object storage system designed to scale from a single machine to thousands of servers. Swift is optimized for multi-tenancy and high concurrency. Swift is ideal for backups, web and mobile content, and any other unstructured data that can grow without bound.
Swift provides a simple, REST-based API fully documented at http://docs.openstack.org/.
Swift was originally developed as the basis for Rackspace's Cloud Files and was open-sourced in 2010 as part of the OpenStack project. It has since grown to include contributions from many companies and has spawned a thriving ecosystem of 3rd party tools. Swift's contributors are listed in the AUTHORS file.
Docs
To build documentation, install sphinx (pip install sphinx), run python setup.py build_sphinx, and then browse to /doc/build/html/index.html.
These docs are auto-generated after every commit and available online at
http://docs.openstack.org/developer/swift/.
For Developers
The best place to get started is the "SAIO - Swift All In One". This document will walk you through setting up a development cluster of Swift in a VM. The SAIO environment is ideal for running small-scale tests against swift and trying out new features and bug fixes.
You can run unit tests with .unittests and functional tests with .functests.
Code Organization
- bin/: Executable scripts that are the processes run by the deployer
- doc/: Documentation
- etc/: Sample config files
- swift/: Core code
  - account/: account server
  - common/: code shared by different modules
    - middleware/: "standard", officially-supported middleware
    - ring/: code implementing Swift's ring
  - container/: container server
  - obj/: object server
  - proxy/: proxy server
- test/: Unit and functional tests
Data Flow
Swift is a WSGI application and uses eventlet's WSGI server. After the processes are running, the entry point for new requests is the Application class in swift/proxy/server.py. From there, a controller is chosen, and the request is processed. The proxy may choose to forward the request to a back-end server. For example, the entry point for requests to the object server is the ObjectController class in swift/obj/server.py.
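The following is not Swift's actual server code, just a minimal sketch of the general eventlet WSGI pattern the paragraph above describes; the handler and port are placeholders:

    import eventlet
    from eventlet import wsgi

    def application(environ, start_response):
        # Placeholder WSGI app; in Swift this role is played by the
        # Application class in swift/proxy/server.py, which dispatches
        # each request to a controller based on its path and method.
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'Hello from a WSGI app\n']

    if __name__ == '__main__':
        # eventlet's WSGI server handles each request in a green thread.
        wsgi.server(eventlet.listen(('127.0.0.1', 8080)), application)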
For Deployers
Deployer docs are also available at http://docs.openstack.org/developer/swift/. A good starting point is at http://docs.openstack.org/developer/swift/deployment_guide.html.

You can run functional tests against a swift cluster with .functests. These functional tests require /etc/swift/test.conf to run. A sample config file can be found in this source tree in test/sample.conf.
For Client Apps
For client applications, official Python language bindings are provided at http://github.com/openstack/python-swiftclient. Complete API documentation is available at http://docs.openstack.org/api/openstack-object-storage/1.0/content/.
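As a quick illustration, a minimal use of python-swiftclient might look like the following; the auth URL, credentials, and container/object names are placeholders, and the exact call signatures should be checked against the python-swiftclient documentation:

    from swiftclient import client

    # Placeholder credentials for a v1-style auth endpoint (e.g. tempauth).
    conn = client.Connection(
        authurl='http://127.0.0.1:8080/auth/v1.0',
        user='test:tester',
        key='testing')

    conn.put_container('photos')
    conn.put_object('photos', 'hello.txt', contents=b'hello swift')
    headers, body = conn.get_object('photos', 'hello.txt')
    print(body)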
For more information, come hang out in #openstack-swift on freenode.
Thanks,
The Swift Development Team