Merge "Delete the unnecessary space"
commit 2fde030fd4
@@ -9,13 +9,13 @@ Multi-node Ansible
==================

This blueprint specifies an approach to automate the deployment of OpenStack
using Ansible and Docker best practices. The overriding principles used in
this specification are simplicity, flexibility and optimized deployment speed.

Problem description
===================

Kolla can be deployed multi-node currently. To do so, the environment
variables must be hand edited to define the hosts to connect to for various
services.

@@ -42,10 +42,10 @@ Proposed change
===============

The docker-compose tool is single node and does nearly the same job as Ansible
would in this specification. As a result, we recommend deprecating
docker-compose as the default deployment system for Kolla.

To replace it, we recommend Ansible as a technology choice. Ansible is easy
to learn, easy to use, and offers a base set of functionality to solve
deployment as outlined in our four use cases.

@@ -53,36 +53,36 @@ We recommend three models of configuration.

The first model is based upon internally configuring the container and having
the container take responsibility for all container configuration including
database setup, database synchronization, and keystone registration. This
model uses docker-compose and docker as dependencies. Existing containers will
be maintained but new container content will use either of the two remaining
models. James Slagle (TripleO PTL on behalf of our downstream TripleO
community) was very clear that he would prefer to see this model stay available
and maintained. As TripleO enters the world of Big Tent, they don't intend to
deploy all of the services, and as such it doesn't make sense to maintain this
legacy operational mode for new container content except on demand of our
downstreams, hopefully with their assistance. This model is called
CONFIG_INSIDE.

The second model and third model configure the containers outside of the
container. These models depend on Ansible and Docker. In the future, the
OpenStack Puppet, OpenStack Chef and TripleO communities may decide to switch
to one of these two models in which case these communities may maintain tooling
to integrate with Kolla. The major difference between these two models is that
one offers immutability and single source of truth (CONFIG_OUTSIDE_COPY_ONCE),
while the third model trades these two properties to allow an Operator to
directly modify configuration files on a system and have the configuration be
live in the container (CONFIG_OUTSIDE_COPY_ALWAYS). Because
CONFIG_OUTSIDE_COPY_ALWAYS requires direct Operator intervention on a node, and
we prefer as a community Operators interact with the tools provided by Kolla,
CONFIG_OUTSIDE_COPY_ONCE will be the default.
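
The three models and the chosen default can be summarized in a short sketch.
This is only an illustration of the decision described above, not Kolla code:
the ``ConfigStrategy`` enum and the ``should_copy_config`` helper are
hypothetical names, and the idea that a copy decision is made each time a
container starts is an assumption.

```python
from enum import Enum


class ConfigStrategy(Enum):
    """The three configuration models named in this specification."""
    CONFIG_INSIDE = "CONFIG_INSIDE"
    CONFIG_OUTSIDE_COPY_ONCE = "CONFIG_OUTSIDE_COPY_ONCE"
    CONFIG_OUTSIDE_COPY_ALWAYS = "CONFIG_OUTSIDE_COPY_ALWAYS"


# The specification makes CONFIG_OUTSIDE_COPY_ONCE the default.
DEFAULT_STRATEGY = ConfigStrategy.CONFIG_OUTSIDE_COPY_ONCE


def should_copy_config(strategy, already_copied):
    """Hypothetical helper: should external config be copied at start?"""
    if strategy is ConfigStrategy.CONFIG_INSIDE:
        # The container configures itself; nothing is copied from outside.
        return False
    if strategy is ConfigStrategy.CONFIG_OUTSIDE_COPY_ONCE:
        # Immutable after the first copy: a single source of truth.
        return not already_copied
    # CONFIG_OUTSIDE_COPY_ALWAYS: Operator edits on the host are picked up.
    return True
```

With these definitions, ``should_copy_config(DEFAULT_STRATEGY, True)`` is
false, which is the immutability property the default model is chosen for.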

We do not have to further enhance two sets of container configuration, but
instead can focus our development effort on the default Ansible configuration
methods. If a defect is found in one of the containers based upon the
CONFIG_INSIDE model, the community will repair it.

Finally, we will implement a complete Ansible deployment system. The details
of the implementation are covered in a later section in this specification.
We estimate this will be approximately 1000 LOC defining about 100 Ansible
tasks. We further estimate the total code base when complete will be under
6 KLOC.
@@ -97,7 +97,7 @@ best practices while introducing completely customizable configuration.

The CONFIG_OUTSIDE_COPY_ALWAYS model of configuration offers the Operator
greater flexibility in managing their deployment, at greater risk of damaging
their deployment. It trades one set of best practices for another,
specifically the Kolla container best practices for flexibility.

Security impact

@@ -9,8 +9,8 @@ Containerize OpenStack
======================

When upgrading or downgrading OpenStack, it is possible to use package based
management or image-based management. Containerizing OpenStack is meant to
optimize image-based management of OpenStack. Containerizing OpenStack
solves a manageability and availability problem with the current state of the
art deployment systems in OpenStack.

@@ -20,34 +20,34 @@ Problem description
Current state of the art deployment systems use either image based or package
based upgrade.

Image based upgrades are utilized by TripleO. When TripleO updates a system,
it creates an image of the entire disk and deploys that rather than just the
parts that compose the OpenStack deployment. This results in significant
loss of availability. Further, running VMs are shut down in the imaging
process. However, image based systems offer atomicity, because all related
software for a service is updated in one atomic action by reimaging the system.

Other systems use package based upgrade. Package based upgrades suffer from
a non-atomic nature. An update may update 1 or more RPM packages. The update
process could fail for any number of reasons, and there is no way to back
out the existing changes. Typically in an OpenStack deployment it is
desirable to update a service that does one thing including its dependencies
as an atomic unit. Package based upgrades do not offer atomicity.

To solve this problem, containers can be used to provide an image-based update
approach which offers atomic upgrade of a running system with minimal
interruption in service. A rough prototype of compute upgrade [1] shows
approximately a 10 second window of unavailability during a software update.
The prototype keeps virtual machines running without interruption.

Use cases
---------
1. Upgrade or roll back OpenStack deployments atomically. End-user wants to
   change the running software versions in her system to deploy a new upstream
   release without interrupting service for significant periods.
2. Upgrade OpenStack by component. End-user wants to upgrade her system
   in fine-grained chunks to limit damage from a failed upgrade.
3. Roll back OpenStack by component. End-user experienced a failed
   upgrade and wishes to roll back to the last known good working version.
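
Use cases 2 and 3 can be illustrated with a small sketch of a per-component
upgrade that falls back to the last known good version. It is not taken from
the prototype in [1]: the function name, the dictionaries mapping component
names to image tags, and the ``healthy`` callback are all hypothetical and
stand in for whatever the deployment tool actually uses.

```python
def upgrade_component(deployed, last_good, component, new_tag, healthy):
    """Switch one component to new_tag, rolling back if the check fails.

    deployed:  dict of component name -> image tag currently running
    last_good: dict of component name -> last known good image tag
    healthy:   callable(component, tag) -> bool, a stand-in health check
    """
    # The whole image is swapped in one step; there is no partial state
    # comparable to a half-applied set of package updates.
    deployed[component] = new_tag
    if healthy(component, new_tag):
        last_good[component] = new_tag
        return True
    # Failed upgrade: only this component returns to its known good tag.
    deployed[component] = last_good[component]
    return False
```

Because each call touches a single entry, a failed upgrade of one component
leaves every other component at the version it was already running.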
@@ -180,16 +180,16 @@ The various container sets are composed in more detail as follows:
* swift-proxy-server

In order to achieve the desired results, we plan to permit super-privileged
containers. A super-privileged container is defined as any container launched
with the --privileged=true flag to docker that:

* bind-mounts specific security-crucial host operating system directories
  with -v. This includes nearly all directories in the filesystem except for
  leaf directories with no other host operating system use.
* shares any namespace with the --ipc=host, --pid=host, or --net=host flags
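
The definition above can be expressed as a small check over the arguments
passed to ``docker run``. This is a sketch under two stated assumptions: either
bullet is treated as sufficient once ``--privileged`` is set, and any ``-v``
bind mount counts, since the sketch does not try to decide which host
directories are security-crucial. The function name is hypothetical.

```python
NAMESPACE_FLAGS = {"--ipc=host", "--pid=host", "--net=host"}


def is_super_privileged(docker_args):
    """Classify a list of `docker run` arguments per the definition above."""
    args = list(docker_args)
    privileged = "--privileged" in args or "--privileged=true" in args
    # Simplification: every -v/--volume mount is treated as security-crucial.
    bind_mounts = any(a == "-v" or a.startswith("--volume") for a in args)
    shares_namespace = any(a in NAMESPACE_FLAGS for a in args)
    return privileged and (bind_mounts or shares_namespace)
```

For example, ``["--privileged=true", "--net=host", "image"]`` is classified as
super-privileged, while ``["--net=host", "image"]`` is not.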

We will not use the Docker EXPOSE operation since all containers will use
--net=host. One motive for using --net=host is it is inherently simpler.
A different motive for not using EXPOSE is the 20 microsecond penalty
applied to every packet forwarded and returned by docker-proxy.
If EXPOSE functionality is desired, it can be added back by
@@ -207,12 +207,12 @@ If the container does not pass its healthcheck operation, it should be
restarted.

Integration of metadata with fig or a similar single node Docker orchestration
tool will be implemented. Even though fig executes on a single node, the
containers will be designed to run multi-node and the deploy tool should take
some form of information to allow it to operate multi-node. The deploy tool
should take a set of key/value pairs as inputs and convert them into inputs
into the environment passed to Docker. These key/value pairs could be a file
or environment variables. We will not offer integration with multi-node
scheduling or orchestration tools, but instead expect our consumers to manage
each bare metal machine using our fig or similar in nature tool integration.
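
As a rough illustration of the key/value handling described above, the sketch
below reads ``KEY=VALUE`` pairs from file contents, merges in prefixed
environment variables, and emits ``-e`` arguments for ``docker run``. The file
format, the ``KOLLA_`` prefix, and the rule that the environment overrides the
file are assumptions made for the example, not part of this specification.

```python
import os


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def docker_env_args(file_pairs, environ=None, prefix="KOLLA_"):
    """Merge file pairs with prefixed environment variables into -e flags."""
    environ = os.environ if environ is None else environ
    merged = dict(file_pairs)
    for key, value in environ.items():
        if key.startswith(prefix):
            merged[key] = value  # assumption: the environment wins
    args = []
    for key in sorted(merged):
        args += ["-e", f"{key}={merged[key]}"]
    return args
```

The resulting list can be spliced into a ``docker run`` command line, so the
same pairs work on every node without editing the container images.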
@@ -220,7 +220,7 @@ Any contributions from the community of the required metadata to run these
containers using a multi-node orchestration tool will be warmly received but
generally won't be maintained by the core team.

The technique for launching the deploy script is not handled by Kolla. This
is a problem for a higher level deployment tool such as TripleO or Fuel to
tackle.

@@ -229,7 +229,7 @@ Logs from the individual containers will be retrievable in some consistent way.
Security impact
---------------

Container usage with super-privileged mode may impact security. For
example, when using --net=host mode and bind-mounting /run, which is necessary
for a compute node, it is possible that a compute breakout could corrupt the
host operating system.

@@ -6,7 +6,7 @@ https://blueprints.launchpad.net/kolla/+spec/kolla-kubernetes

Kubernetes was evaluated by the Kolla team in the first two months of the
project and it was found to be problematic because it did not support net=host,
pid=host, and --privileged features in docker. Since then, it has developed
these features [1].

The objective is to manage the lifecycle of containerized OpenStack services by
@@ -51,7 +51,7 @@ Orchestration
-------------

OpenStack on Kubernetes will be orchestrated by outside tools in order to create
a production ready OpenStack environment. The kolla-kubernetes repo is where
any deployment tool can join the community and be a part of orchestrating a
kolla-kubernetes deployment.

@@ -60,10 +60,10 @@ Service Config Management

Config generation will be completely decoupled from the deployment. The
containers only expect a config file to land in a specific directory in
the container in order to run. With this decoupled model, any tool could be
used to generate config files. The kolla-kubernetes community will evaluate
any config generation tool, but will likely use Ansible for config generation
in order to reuse existing work from the community. This solution uses
customized Ansible and jinja2 templates to generate the config. Also, there will
be a maintained set of defaults and a global yaml file that can override the
defaults.
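
The defaults-plus-override scheme can be sketched in a few lines. To keep the
example free of third-party dependencies it uses Python's standard
``string.Template`` as a stand-in for jinja2 and plain dictionaries as a
stand-in for the parsed yaml files; the function name and the variable names
in the usage note are hypothetical.

```python
from string import Template


def generate_config(template_text, defaults, overrides):
    """Render a config file from defaults overridden by operator values."""
    values = dict(defaults)
    values.update(overrides)  # the global file takes precedence over defaults
    # substitute() raises KeyError if the template needs a missing value,
    # so an incomplete set of defaults is caught at generation time.
    return Template(template_text).substitute(values)
```

Rendering ``connection = mysql://$db_user@$db_host/glance`` with the default
``db_host`` of ``localhost`` and an override of ``10.0.0.5`` yields a file
that points at ``10.0.0.5``, and the container only needs that file to land in
the expected directory.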
@@ -82,7 +82,7 @@ will be a Kubernetes Job, which will run the task until completion then
terminate the pods [7].

Each service will have a bootstrap task so that when the operator upgrades,
the bootstrap tasks are reused to upgrade the database. This will allow
deployment and upgrades to follow the same pipeline.

The Kolla containers will communicate with the Kubernetes API server to in order
@@ -96,14 +96,14 @@ require some orchestration and the bootstrap pod will need to be setup to
never restart or be replicated.

2) Use a sidecar container in the pod to handle the database sync with proper
health checking to make sure the services are coming up healthy. The big
difference between kolla's old docker-compose solution and Kubernetes is that
docker-compose would only restart the containers. Kubernetes will completely
reschedule them, which means removing the pod and restarting it. The reason
this would fix the race condition failure kolla saw from docker-compose is
that glance would be rescheduled on failure, allowing keystone to get a
chance to sync with the database and become active instead of constantly being
piled with glance requests. There can also be health checks around this to help
determine order.

If kolla-kubernetes used this sidecar approach, it would regain the use of
@@ -116,12 +116,12 @@ Dependencies
- Docker >= 1.10.0
- Jinja2 >= 2.8.0

Kubernetes does not support dependencies between pods. The operator will launch
all the services and use Kubernetes health checks to bring the deployment to an
operational state.

With orchestration around Kubernetes, the operator can determine what tasks are
run and when the tasks are run. This way, dependencies are handled at the
orchestration level, but they are not required because proper health checking
will bring up the cluster in a healthy state.

@@ -133,7 +133,7 @@ desired state for the pods and the deployment will move the cluster to the
desired state when a change is detected.

Kolla-kubernetes will provide Jobs that will provide the operator with the
flexibility needed to undergo a stepwise upgrade. In future releases,
kolla-kubernetes will look to Kubernetes to provide a means for operators to
plug these jobs into a Deployment.

@@ -141,22 +141,22 @@ Reconfigure
-----------

The operator generates a new config and loads it into the Kubernetes configmap
by changing the configmap version in the service yaml file. Then, the operator
will trigger a rolling upgrade, which will scale down old pods and bring up new
ones that will run with the updated configuration files.

There's an open issue upstream in Kubernetes where the plan is to add support
around detecting if a pod has a change in the configmap [6]. Depending on what
the solution is, kolla-kubernetes may or may not use it. The rolling
upgrade feature will provide kolla-kubernetes with an elegant way to handle
restarting the services.

HA Architecture
---------------

Kubernetes uses health checks to bring up the services. Therefore,
kolla-kubernetes will use the same checks when monitoring if a service is
healthy. When a service fails, the replication controller will be responsible
for bringing up a new container in its place [8][9].

However, Kubernetes does not cover all the HA corner cases, for instance,
@@ -178,14 +178,14 @@ guarantee a pod will always be scheduled to a host, it makes node based
persistent storage unlikely, unless the community uses labels for every pod.

Persistent storage in kolla-kubernetes will come from volumes backed by
different storage offerings. Kolla-kubernetes will provide a default solution
using Ceph RBD that the community will use for multinode deployments. From
there, kolla-kubernetes can add any additional persistent storage options as
well as support options for the operator to reference an existing storage
solution.

To deploy Ceph, the community will use the Ansible playbooks from Kolla to
deploy a containerized Ceph at least for the 1.0 release. After Kubernetes
deployment matures, the community can evaluate building its own Ceph deployment
solution.

@@ -198,9 +198,9 @@ Service Roles
At the broadest level, OpenStack can be split up into two main roles, Controller
and Compute. With Kubernetes, the role definition layer changes.
Kolla-kubernetes will still need to define Compute nodes, but not Controller
nodes. Compute nodes hold the libvirt container and the running VMs. That
service cannot migrate because the VMs associated with it exist on the node.
However, the Controller role is more flexible. The Kubernetes layer provides IP
persistence so that APIs will remain active and abstracted from the operator's
view [15]. kolla-kubernetes can direct Controller services away from the Compute
node using labels, while managing Compute services more strictly.
@@ -244,7 +244,7 @@ To reuse Kolla's containers, kolla-kubernetes will use elastic search, heka, and
kibana as the default logging mechanism.

The community will implement centralized logging by using a 'side car' container
in the Kubernetes pod [17]. The logging service will trace the logs from the
shared volume of the running service and send the data to elastic search. This
solution is ideal because volumes are shared among the containers in a pod.

@@ -4,15 +4,15 @@ Logging with Heka

https://blueprints.launchpad.net/kolla/+spec/heka

Kolla currently uses Rsyslog for logging. Change Request ``252968`` [1]
suggests using ELK (Elasticsearch, Logstash, Kibana) as a way to index all the
logs, and visualize them.

This spec suggests using Heka [2] instead of Logstash, while still using
Elasticsearch for indexing and Kibana for visualization. It also discusses
the removal of Rsyslog along the way.

What is Heka? Heka is an open-source stream processing software created and
maintained by Mozilla.

Using Heka will provide a lightweight and scalable log processing solution
@@ -22,7 +22,7 @@ Problem description
===================

Change Request ``252968`` [1] adds an Ansible role named "elk" that enables
deploying ELK (Elasticsearch, Logstash, Kibana) on nodes with that role. This
spec builds on that work, proposing a scalable log processing architecture
based on the Heka [2] stream processing software.

@@ -34,7 +34,7 @@ OpenStack nodes rather than using a centralized log processing engine that
represents a bottleneck and a single-point-of-failure.

We also know from experience that Heka provides all the necessary flexibility
for processing other types of data streams than log messages. For example, we
already use Heka together with Elasticsearch for logs, but also with collectd
and InfluxDB for statistics and metrics.

@@ -53,16 +53,16 @@ in a dedicated container, referred to as the Heka container in the rest of this
document.

Each Heka instance reads and processes the logs local to the node it runs on,
and sends these logs to Elasticsearch for indexing. Elasticsearch may be
distributed on multiple nodes for resiliency and scalability, but that part is
outside the scope of this specification.

Heka, written in Go, is fast and has a small footprint, making it possible to
run it on every node of the cluster. In contrast, Logstash runs in a JVM and
is known [3] to be too heavy to run on every node.

Another important aspect is flow control and avoiding the loss of log messages
in case of overload. Heka’s filter and output plugins, and the Elasticsearch
output plugin in particular, support the use of a disk based message queue.
This message queue allows plugins to reprocess messages from the queue when
downstream servers (Elasticsearch) are down or cannot keep up with the data
@@ -74,20 +74,20 @@ which introduces some complexity and other points-of-failures.
Remove Rsyslog
--------------

Kolla currently uses Rsyslog. The Kolla services are configured to write their
logs to Syslog. Rsyslog gets the logs from the ``/var/lib/kolla/dev/log`` Unix
socket and dispatches them to log files on the local file system. Because
Rsyslog runs in a Docker container, the log files are stored in a Docker volume
(named ``rsyslog``).

With Rsyslog already running on each cluster node, the question of using two
log processing daemons, namely ``rsyslogd`` and ``hekad``, has been raised on
the mailing list. The spec evaluates the possibility of using ``hekad`` only,
based on some prototyping work we have conducted [4].

Note: Kolla doesn't currently collect logs from RabbitMQ, HAProxy and
Keepalived. For RabbitMQ the problem is related to RabbitMQ not having the
capability to write its logs to Syslog. HAProxy and Keepalived do have that
capability, but the ``/var/lib/kolla/dev/log`` Unix socket file is currently
not mounted into the HAProxy and Keepalived containers.

@@ -96,21 +96,21 @@ Use Heka's ``DockerLogInput`` plugin

To remove Rsyslog and only use Heka one option would be to make the Kolla
services write their logs to ``stdout`` (or ``stderr``) and rely on Heka's
``DockerLogInput`` plugin [5] for reading the logs. Our experiments have
revealed a number of problems with this option:

* The ``DockerLogInput`` plugin doesn't currently work for containers that have
  a ``tty`` allocated. And Kolla currently allocates a tty for all containers
  (for good reasons).

* When ``DockerLogInput`` is used there is no way to differentiate log messages
  for containers producing multiple log streams. ``neutron-agents`` is an
  example of such a container. (Sam Yaple has raised that issue multiple
  times.)

* If Heka is stopped and restarted later then log messages will be lost, as the
  ``DockerLogInput`` plugin doesn't currently have a mechanism for tracking its
  positions in the log streams. This is in contrast to the ``LogstreamerInput``
  plugin [6] which does include that mechanism.

For these reasons we think that relying on the ``DockerLogInput`` plugin may
@@ -119,7 +119,7 @@ not be a practical option.
For the record, our experiments have also shown that OpenStack container
logs written to ``stdout`` are visible to neither Heka nor ``docker logs``.
This problem is not reproducible when ``stderr`` is used rather than
``stdout``. The cause of this problem is currently unknown. And it looks like
other people have come across that issue [7].

Use local log files
@@ -129,7 +129,7 @@ Another option consists of configuring all the Kolla services to log into local
files, and using Heka's ``LogstreamerInput`` plugin [5].

This option involves using a Docker named volume, mounted both into the service
containers (in ``rw`` mode) and into the Heka container (in ``ro`` mode). The
services write logs into files placed in that volume, and Heka reads logs from
the files found in that volume.

@ -138,28 +138,28 @@ And it relies on Heka's ``LogstreamerInput`` plugin, which, based on our
|
||||
experience, is efficient and robust.
|
||||
|
||||
Keeping file logs locally on the nodes has been established as a requirement by
|
||||
the Kolla developers. With this option, and the Docker volume used, meeting
|
||||
the Kolla developers. With this option, and the Docker volume used, meeting
|
||||
that requirement necessitates no additional mechanism.
|
||||
|
||||
For this option to be applicable the services must have the capability of
|
||||
logging into files. Most of the Kolla services have this capability. The
|
||||
logging into files. Most of the Kolla services have this capability. The
|
||||
exceptions are HAProxy and Keepalived, for which a different mechanism should
|
||||
be used (described further down in the document). Note that this will make it
|
||||
be used (described further down in the document). Note that this will make it
|
||||
possible to collect logs from RabbitMQ, which does not support logging to
|
||||
Syslog but does support logging to a file.
|
||||
|
||||
Also, this option requires that the services have the permission to create
|
||||
files into the Docker volume, and that Heka has the permission to read these
|
||||
files. This means that the Docker named volume will have to have appropriate
|
||||
owner, group and permission bits. With the Heka container running under
|
||||
files. This means that the Docker named volume will have to have appropriate
|
||||
owner, group and permission bits. With the Heka container running under
|
||||
a specific user (see below) this will mean using an ``extend_start.sh`` script
|
||||
including ``sudo chown`` and possibly ``sudo chmod`` commands. Our prototype
|
||||
including ``sudo chown`` and possibly ``sudo chmod`` commands. Our prototype
|
||||
[4] already includes this.
|
||||
|
||||
As mentioned already the ``LogstreamerInput`` plugin includes a mechanism for
|
||||
tracking positions in log streams. This works with journal files stored on the
|
||||
file system (in ``/var/cache/hekad``). A specific volume, private to Heka,
|
||||
will be used for these journal files. In this way no logs will be lost if the
|
||||
tracking positions in log streams. This works with journal files stored on the
|
||||
file system (in ``/var/cache/hekad``). A specific volume, private to Heka,
|
||||
will be used for these journal files. In this way no logs will be lost if the
|
||||
Heka container is removed and a new one is created.
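
The position-tracking idea can be illustrated with a minimal Python sketch. It is not Heka's actual journal format or code: the function name ``read_new_lines`` and the JSON journal layout are hypothetical, chosen only to show why keeping the journal on a persistent volume lets a fresh reader resume where the previous one stopped.

```python
import json
import os


def read_new_lines(log_path, journal_path):
    """Return the complete lines appended to log_path since the last call.

    The byte offset is stored in journal_path (a hypothetical JSON file), so
    the state survives even if the reading process is destroyed and recreated.
    """
    offset = 0
    if os.path.exists(journal_path):
        with open(journal_path) as journal:
            offset = json.load(journal)["offset"]

    with open(log_path, "rb") as log_file:
        log_file.seek(offset)
        data = log_file.read()

    # Consume only complete lines; a partially written line is left for later.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()

    with open(journal_path, "w") as journal:
        json.dump({"offset": offset + end}, journal)
    return lines
```

Because the only state is the journal file, calling the function from a brand new process (or container) that mounts the same journal location yields just the lines not yet seen.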
|
||||
|
||||
Handling HAProxy and Keepalived
|
||||
@ -174,7 +174,7 @@ This works by using Heka's ``UdpInput`` plugin with its ``net`` option set
|
||||
to ``unixgram``.
|
||||
|
||||
This also requires that a Unix socket is created by Heka, and that socket is
|
||||
mounted into the HAProxy and Keepalived containers. For that we will use the
|
||||
mounted into the HAProxy and Keepalived containers. For that we will use the
|
||||
same technique as the one currently used in Kolla with Rsyslog, that is
|
||||
mounting ``/var/lib/kolla/dev`` into the Heka container and mounting
|
||||
``/var/lib/kolla/dev/log`` into the service containers.
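
As a rough illustration of the ``unixgram`` transport, the following Python sketch binds a Unix datagram socket and sends one message to it. It only demonstrates the socket type involved; it is not Heka configuration, and the helper names ``open_log_socket`` and ``send_log`` are hypothetical.

```python
import os
import socket


def open_log_socket(path):
    """Bind a Unix datagram socket at path, as a log receiver would."""
    if os.path.exists(path):
        os.unlink(path)
    receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    receiver.bind(path)
    return receiver


def send_log(path, message):
    """Send one datagram to the socket at path, as a service would."""
    sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sender.sendto(message.encode("utf-8"), path)
    finally:
        sender.close()
```

In the setup described above, the receiver side would live in the Heka container and the socket file would be visible to the service containers through the shared mount.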
|
||||
@ -182,7 +182,7 @@ mounting ``/var/lib/kolla/dev`` into the Heka container and mounting
|
||||
Our prototype already includes some code demonstrating this. See [4].
|
||||
|
||||
Also, to be able to store a copy of the HAProxy and Keepalived logs locally on
|
||||
the node, we will use Heka's ``FileOutput`` plugin. We will possibly create
|
||||
the node, we will use Heka's ``FileOutput`` plugin. We will possibly create
|
||||
two instances of that plugin, one for HAProxy and one for Keepalived, with
|
||||
specific filters (``message_matcher``).
|
||||
|
||||
@ -190,29 +190,29 @@ Read Python Tracebacks
|
||||
----------------------
|
||||
|
||||
In case of exceptions the OpenStack services log Python Tracebacks as multiple
|
||||
log messages. If no special care is taken then the Python Tracebacks will be
|
||||
log messages. If no special care is taken then the Python Tracebacks will be
|
||||
indexed as separate documents in Elasticsearch, and displayed as distinct log
|
||||
entries in Kibana, making them hard to read. To address that issue we will use
|
||||
entries in Kibana, making them hard to read. To address that issue we will use
|
||||
a custom Heka decoder, which will be responsible for coalescing the log lines
|
||||
making up a Python Traceback into one message. Our prototype includes that
|
||||
making up a Python Traceback into one message. Our prototype includes that
|
||||
decoder [4].
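
The coalescing step can be sketched in Python as follows. This is not the prototype's decoder. It assumes a simplified line layout of ``<date> <time> <pid> <LEVEL> <logger> <text>``, and assumes that a traceback ends at the first non-indented line after the ``Traceback`` header. The names ``LOG_LINE`` and ``coalesce`` are hypothetical.

```python
import re

# Assumed, simplified log line layout (not taken from the spec).
LOG_LINE = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<pid>\d+) (?P<level>[A-Z]+) "
    r"(?P<logger>\S+) (?P<text>.*)$"
)
TRACEBACK_START = "Traceback (most recent call last):"


def coalesce(lines):
    """Merge the log lines that make up a traceback into a single message."""
    messages = []
    pending = None  # the traceback message currently being assembled
    for raw in lines:
        match = LOG_LINE.match(raw.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        text = match.group("text")
        if pending is not None:
            pending["text"] += "\n" + text
            if not text.startswith(" "):
                # The non-indented exception line closes the traceback.
                messages.append(pending)
                pending = None
            continue
        record = match.groupdict()
        if text.startswith(TRACEBACK_START):
            pending = record
        else:
            messages.append(record)
    if pending is not None:
        messages.append(pending)  # flush a traceback cut off at end of input
    return messages
```

With this grouping, a traceback would reach Elasticsearch as one document instead of one document per line.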
|
||||
|
||||
Collect system logs
|
||||
-------------------
|
||||
|
||||
In addition to container logs we think it is important to collect system logs
|
||||
as well. For that we propose to mount the host's ``/var/log`` directory into
|
||||
as well. For that we propose to mount the host's ``/var/log`` directory into
|
||||
the Heka container, and configure Heka to get logs from standard log files
|
||||
located in that directory (e.g. ``kern.log``, ``auth.log``, ``messages``). The
|
||||
located in that directory (e.g. ``kern.log``, ``auth.log``, ``messages``). The
|
||||
list of system log files will be determined at development time.
|
||||
|
||||
Log rotation
|
||||
------------
|
||||
|
||||
Log rotation is an important aspect of the logging system. Currently Kolla
|
||||
doesn't rotate logs. Logs just accumulate in the ``rsyslog`` Docker volume.
|
||||
Log rotation is an important aspect of the logging system. Currently Kolla
|
||||
doesn't rotate logs. Logs just accumulate in the ``rsyslog`` Docker volume.
|
||||
The work on Heka proposed in this spec isn't directly related to log rotation,
|
||||
but we are suggesting to address this issue for Mitaka. This will mean
|
||||
but we are suggesting to address this issue for Mitaka. This will mean
|
||||
creating a new container that uses ``logrotate`` to manage the log files
|
||||
created by the Kolla containers.
|
||||
|
||||
@ -220,33 +220,33 @@ Create an ``heka`` user
|
||||
-----------------------
|
||||
|
||||
For security reasons an ``heka`` user will be created in the Heka container and
|
||||
the ``hekad`` daemon will run under that user. The ``heka`` user will be added
|
||||
the ``hekad`` daemon will run under that user. The ``heka`` user will be added
|
||||
to the ``kolla`` group, to make sure that Heka can read the log files created
|
||||
by the services.
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
Heka is a mature product maintained and used in production by Mozilla. So we
|
||||
trust Heka as being secure. We also trust the Heka developers as being serious
|
||||
Heka is a mature product maintained and used in production by Mozilla. So we
|
||||
trust Heka as being secure. We also trust the Heka developers as being serious
|
||||
should security vulnerabilities be found in the Heka code.
|
||||
|
||||
As described above we are proposing to use a Docker volume between the service
|
||||
containers and the Heka container. The group of the volume directory and the
|
||||
log files will be ``kolla``. And the owner of the log files will be the user
|
||||
that executes the service producing logs. But the ``gid`` of the ``kolla``
|
||||
containers and the Heka container. The group of the volume directory and the
|
||||
log files will be ``kolla``. And the owner of the log files will be the user
|
||||
that executes the service producing logs. But the ``gid`` of the ``kolla``
|
||||
group and the ``uid``'s of the users executing the services may correspond
|
||||
to a different group and different users on the host system. This means
|
||||
that the permissions may not be right on the host system. This problem is
|
||||
to a different group and different users on the host system. This means
|
||||
that the permissions may not be right on the host system. This problem is
|
||||
not specific to this specification, and it already exists in Kolla (for
|
||||
the mariadb data volume for example).
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
The ``hekad`` daemon will run in a container on each cluster node. But the
|
||||
``rsyslogd`` will be removed. And we have assessed that Heka is lightweight
|
||||
enough to run on every node. Also, a possible option would be to constrain the
|
||||
The ``hekad`` daemon will run in a container on each cluster node. But the
|
||||
``rsyslogd`` will be removed. And we have assessed that Heka is lightweight
|
||||
enough to run on every node. Also, a possible option would be to constrain the
|
||||
Heka container to only use a defined amount of resources.
|
||||
|
||||
Alternatives
|
||||
@ -256,12 +256,12 @@ An alternative to this proposal involves using Logstash in a centralized
|
||||
way as done in [1].
|
||||
|
||||
Another alternative would be to execute Logstash on each cluster node, as this
|
||||
spec proposes with Heka. But this would mean running a JVM on each cluster
|
||||
spec proposes with Heka. But this would mean running a JVM on each cluster
|
||||
node, and using Redis as a centralized queue.
|
||||
|
||||
Also, as described above, we initially considered relying on services writing
|
||||
their logs to ``stdout`` and use Heka's ``DockerLogInput`` plugin. But our
|
||||
prototyping work has demonstrated the limits of that approach. See the
|
||||
their logs to ``stdout`` and use Heka's ``DockerLogInput`` plugin. But our
|
||||
prototyping work has demonstrated the limits of that approach. See the
|
||||
``DockerLogInput`` section above for more information.
|
||||
|
||||
Implementation
|
||||
|
@ -8,8 +8,8 @@
|
||||
This template should be in ReSTructured text. The filename in the git
|
||||
repository should match the launchpad URL, for example a URL of
|
||||
https://blueprints.launchpad.net/kolla/+spec/awesome-thing should be named
|
||||
awesome-thing.rst . Please do not delete any of the sections in this
|
||||
template. If you have nothing to say for a whole section, just write: None
|
||||
awesome-thing.rst . Please do not delete any of the sections in this
|
||||
template. If you have nothing to say for a whole section, just write: None
|
||||
For help with syntax, see http://sphinx-doc.org/rest.html
|
||||
To test out your formatting, see http://www.tele3.cz/jbar/rest/rest.html
|
||||
|
||||
|