Containerize OpenStack based upon SPC and fig
This specification proposes using fig (soon to be renamed Compose) to provide
single-node, multi-container orchestration. With this mechanism, a very simple
Ansible playbook could easily deploy a single node into a specific role type,
such as a controller node, a compute node, or a storage node. This
specification further proposes using super-privileged containers to address
the upgrade and rollback use cases of an OpenStack deployment.

Change-Id: I56ff1fdf8b19b47be97778b55ea947ebb43995c1

specs/containerize-openstack.rst (new file)

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

======================
Containerize OpenStack
======================

When upgrading or downgrading OpenStack, it is possible to use package-based
or image-based management. Containerizing OpenStack is meant to optimize
image-based management of OpenStack, and it solves a manageability and
availability problem with the current state-of-the-art deployment systems in
OpenStack.

Problem description
===================

Current state-of-the-art deployment systems use either image-based or
package-based upgrades.

Image-based upgrades are used by TripleO. When TripleO updates a system, it
creates an image of the entire disk and deploys that, rather than just the
parts that compose the OpenStack deployment. This results in a significant
loss of availability. Furthermore, running VMs are shut down during the
imaging process. However, image-based systems offer atomicity, because all
related software for a service is updated in one atomic action by reimaging
the system.

Other systems use package-based upgrades. Package-based upgrades suffer from
a non-atomic nature. An update may modify one or more packages, the update
process could fail for any number of reasons, and there is no way to back
out the changes already applied. Typically in an OpenStack deployment it is
desirable to update a service that does one thing, including its
dependencies, as an atomic unit. Package-based upgrades do not offer
atomicity.

To solve this problem, containers can be used to provide an image-based
update approach that offers atomic upgrade of a running system with minimal
interruption in service. A rough prototype of compute upgrade [1] shows
approximately a 10-second window of unavailability during a software update.
The prototype keeps virtual machines running without interruption.

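As an illustrative sketch only (the registry, image tags, and container name
below are assumptions, not part of the prototype), an atomic upgrade of a
containerized service could look like this::

    # Fetch the new image first, so the outage window is limited to the
    # stop/start of the container itself.
    docker pull registry.example.com/nova-compute:gen42

    # Replace the running container. If the new container fails to start,
    # the previous image is still present locally for rollback.
    docker stop nova_compute
    docker rm nova_compute
    docker run -d --name nova_compute \
        registry.example.com/nova-compute:gen42

    # Rollback is the same operation using the previous generation tag.
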
Use cases
---------

1. Upgrade or roll back OpenStack deployments atomically. The end user wants
   to change the running software versions in her system to deploy a new
   upstream release without interrupting service for significant periods.
2. Upgrade OpenStack by component. The end user wants to upgrade her system
   in fine-grained chunks to limit damage from a failed upgrade.
3. Roll back OpenStack by component. The end user experienced a failed
   upgrade and wishes to roll back to the last known good working version.


Proposed change
===============

An OpenStack deployment based on containers is represented in a tree
structure, with each node representing a container set and each leaf
representing a container.

The full properties of a container set:

* A container set is composed of one or more container subsets or one or more
  individual containers
* A container set provides a single logical service
* A container set is managed as a unit during startup, shutdown, and version
  changes
* Each container set is launched together as one unit
* A container set with subsets is launched as one unit, including all subsets
* A container set is not atomically managed
* A container set provides appropriate hooks for high availability monitoring

The full properties of a container:

* A container is atomically upgraded or rolled back
* A container includes a monotonically increasing generation number to
  identify the container's age in comparison with other containers
* A container has a single responsibility
* A container may be super-privileged when it needs significant access to the
  host, including:

  * the network namespace of the host
  * the UTS namespace of the host
  * the IPC namespace of the host
  * filesystem sharing of the host for persistent storage

* A container may lack any privileges when it does not require significant
  access to the host.
* A container should include a check function for evaluating its own health.
* A container will include proper PID 1 handling for reaping exited child
  processes.

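As a hedged sketch of how the generation number might be consumed (the label
name ``kolla.generation``, the container name, and the registry below are
assumptions, not settled design), an upgrade tool could compare the running
container's generation against a candidate image before acting::

    # Read the generation label from the running container and from the
    # candidate image; proceed only when the candidate is strictly newer.
    current=$(docker inspect --format \
        '{{ index .Config.Labels "kolla.generation" }}' nova_compute)
    candidate=$(docker inspect --format \
        '{{ index .Config.Labels "kolla.generation" }}' \
        registry.example.com/nova-compute:latest)
    if [ "$candidate" -gt "$current" ]; then
        echo "candidate is newer (gen $candidate > gen $current); upgrading"
    fi
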
The top-level container sets are composed of:

* database control
* messaging control
* high availability control
* OpenStack control
* OpenStack compute operation
* OpenStack storage operation

The various container sets are composed in more detail as follows:

* Database control

  * galera
  * mariadb
  * mongodb

* Messaging control

  * rabbitmq

* High availability control

  * HAProxy

* OpenStack control

  * keystone
  * glance-controller

    * glance-api
    * glance-registry

  * nova-controller

    * nova-api
    * nova-conductor
    * nova-scheduler

  * neutron-controller

    * neutron-server
    * neutron-agents

      * metadata

  * ceilometer-controller

    * ceilometer-alarm
    * ceilometer-api
    * ceilometer-base
    * ceilometer-central
    * ceilometer-collector
    * ceilometer-notification

  * heat-controller

    * heat-api
    * heat-engine

* OpenStack compute operation

  * nova-compute
  * nova-libvirt
  * neutron-agents-linux-bridge
  * neutron-agents-ovs
  * dhcp
  * l3

* OpenStack storage operation

  * Cinder
  * Swift

    * swift-account
    * swift-base
    * swift-container
    * swift-object
    * swift-proxy-server

In order to achieve the desired results, we plan to permit super-privileged
containers. A super-privileged container is defined as any container launched
with the --privileged=true flag to docker that:

* bind-mounts specific security-crucial host operating system directories
  with -v. This includes nearly all directories in the filesystem except for
  leaf directories with no other host operating system use.
* shares any namespace with the --ipc=host, --pid=host, or --net=host flags

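As a hedged illustration of such a launch (the image name, the specific bind
mounts, and the container name are assumptions for the sketch, not prescribed
values)::

    # Launch a super-privileged compute container: host network, host PID
    # and IPC namespaces, and bind-mounted host directories for libvirt
    # and instance state.
    docker run -d --name nova_compute \
        --privileged=true --net=host --pid=host --ipc=host \
        -v /run:/run \
        -v /var/lib/nova:/var/lib/nova \
        -v /var/lib/libvirt:/var/lib/libvirt \
        registry.example.com/nova-compute:latest
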
We will use the docker flag --restart=always to provide some measure of
high availability for the individual containers and to ensure they operate
correctly as currently designed.

A host tool will run and monitor each container's built-in check script via
docker exec on a pre-configured timer to validate that the container is
operational. If the container does not pass its healthcheck, it should be
restarted.

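A minimal sketch of such a monitor, assuming a container named keystone that
ships a /check.sh health script (both names hypothetical)::

    # Poll the container's built-in check on a fixed timer; restart the
    # container when the check fails.
    while true; do
        if ! docker exec keystone /check.sh; then
            docker restart keystone
        fi
        sleep 30
    done
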
Integration of metadata with fig or a similar single-node Docker orchestration
tool will be implemented. Even though fig executes on a single node, the
containers will be designed to run multi-node, and the deploy tool should take
some form of information to allow it to operate multi-node. The deploy tool
should take a set of key/value pairs as inputs and convert them into the
environment passed to Docker. These key/value pairs could come from a file or
from environment variables. We will not offer integration with multi-node
scheduling or orchestration tools, but instead expect our consumers to manage
each bare metal machine using our fig (or similar) tool integration.

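A minimal sketch of this conversion, assuming a hypothetical deploy wrapper
that writes a fig.yml from its key/value inputs (the service name, image
name, and variable names are illustrative only)::

    # Convert key/value inputs into environment entries in a fig.yml,
    # then launch the containers on this node.
    cat > fig.yml <<EOF
    keystone:
      image: registry.example.com/keystone:latest
      net: "host"
      environment:
        - DB_HOST=${DB_HOST}
        - RABBIT_HOST=${RABBIT_HOST}
    EOF
    fig up -d
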
Any contributions from the community of the required metadata to run these
containers using a multi-node orchestration tool will be warmly received, but
such metadata generally won't be maintained by the core team.

The technique for launching the deploy script is not handled by Kolla. This
is a problem for a higher-level deployment tool, such as TripleO or Fuel, to
tackle.

Logs from the individual containers will be retrievable in some consistent
way.

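One possible consistent mechanism, offered here as an assumption rather than
a settled design, is the standard docker logs interface (container name
hypothetical)::

    # Retrieve recent log output from any container by name.
    docker logs --tail=100 keystone
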
Security impact
---------------

Container usage in super-privileged mode may impact security. For example,
when using --net=host mode and bind-mounting /run, which is necessary for a
compute node, it is possible that a compute breakout could corrupt the host
operating system.

To mitigate security concerns, solutions such as SELinux and AppArmor should
be used where appropriate to contain the security privileges of the
containers.

Performance Impact
------------------

The upgrade or downgrade process changes from a multi-hour outage to a
10-second outage across the system.

Implementation
==============

Assignee(s)
-----------

Primary assignee:

  kolla maintainers

Work Items
----------

1. Container Sets
2. Containers
3. A minimal proof-of-concept single-node fig deployment integration
4. A minimal proof-of-concept fig healthchecking integration

Testing
=======

Functional tests will be implemented in the OpenStack check/gating system to
automatically verify that each container passes its functional tests, which
are stored in the project's repositories.

Documentation Impact
====================

The documentation impact is unclear, as this project is a proof of concept
with no clear delivery consumer.


References
==========

* [1] https://github.com/sdake/compute-upgrade