Introduce cpu-policy for container deployment

This spec proposes to introduce the concept of dedicated and
shared cpusets in zun.

Change-Id: Ide2fbb527134bc81048534f2dc7cb0b983b308cc
This commit is contained in:
Sudipta Biswas 2017-01-30 23:58:15 +05:30
parent 9448e462e7
commit 4c1e1e8398

175
specs/cpuset-container.rst Normal file

@ -0,0 +1,175 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
==========================
Supporting CPU sets in ZUN
==========================
Related Launchpad Blueprint:
https://blueprints.launchpad.net/zun/+spec/cpuset-container
ZUN presently does not have a way to allow users to specify dedicated
resources for workloads that require higher performance. Such workloads
can be classified as Network Function Virtualization (NFV) based, AI
based or HPC based. This spec takes a first step towards supporting
such workloads with dedicated resources. The first of such resources
can be the cpusets or cores on a given physical host.
Problem description
===================
Exposing cpusets to the cloud users cannot be done in its raw form.
This is because, exposing such parameters to the end user breaks
the cloudy model of doing things.
Exposing cpusets can be broadly thought of as combination of user policies
and host capabilities.
The user policies are applied against compute host capabilities and if it
matches, the user is allowed to perform the CRUD operations on a container.
Proposed change
===============
1. Piggy back on the work done for host capabilities.
More details of this work would be covered on a seperate blueprint:
https://blueprints.launchpad.net/zun/+spec/expose-host-capabilities
2. Hydrate the schema with information obtained via calling driver specific
methods that obtain the details of a host inventory. For cpusets, lscpu -p
can be used to obtain the required information. Implement a periodic task
that inventories the host at regular intervals.
3. Define 2 user cpu-policies called "dedicated" and "shared". The first
policy signifies that the user wants to use dedicated cpusets for their
workloads. The shared mode is very similar to the default behavior. If unless
specified, the behavior will be defaulted to "shared".
4. Write driver methods to provision containers with dedicated cpusets.
The logic of 'what' cpusets should be picked up for a given requests lies
in the control of the zun code and is not exposed to the user.
5. The cpu-policy parameter is specified in conjuction with the vcpus field
for container creation. The number of vcpus shall determine the number of
cpusets requested for dedicated usage.
6. If this feature is being used with the zun scheduler, then the scheduler
needs to be aware of the host capabilities to choose the right host.
For example:
zun run -i -t --name test --cpu 4 --cpu-policy dedicated
We would try to support scheduling using both of these policies on the same
host.
How it works internally?
Once the user specifies the number of cpus, we would try to select a numa node
that has the same or more number of cpusets unpinned that can satisfy the request.
Once the cpusets are determined by the scheduler and it's corresponding numa node,
a driver method should be called for the actual provisoning of the request on the
compute node. Corresponding updates would be made to the inventory table.
In case of the docker driver - this can be achieved by a docker run equivalent:
docker run -d ubuntu --cpusets-cpu="1,3" --cpuset-mems="1,3"
The cpuset-mems would allow the memory access for the cpusets to stay localized.
If the container is in paused/stopped state, the DB will still continue to block
the pinset information for the container instead of releasing it.
Design Principles
-----------------
1. Build a host capability model that can be leveraged by the zun scheduler.
2. Create abstract user policies for the cloud user instead of raw
values.
Alternatives
------------
None
Data model impact
-----------------
- Introduce a new field in the container object called 'cpu_policy'.
- Add a new numa.py object to store the inventory information.
- Add numa topology obtained to the compute_node object.
REST API impact
---------------
The existing container CRUD APIs should allow a new parameter
for cpu policy.
Security impact
---------------
None
Notifications impact
--------------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
None
Other deployer impact
---------------------
None
Developer impact
----------------
None
Implementation
==============
Assignee(s)
-----------
Primary assignee:
Sudipta Biswas (sbiswas7@in.ibm.com)
Other contributors:
Hongbin Lu, Pradeep Singh
Work Items
----------
1. Create the new schema.
2. Add cpu_policy field in the REST APIs and zun client.
3. Write logic to hydrate the inventory tables.
4. Implement a periodic task that inventories the host.
5. Write logic to check the cpusets of a given host.
6. Implement unit/integration test.
Dependencies
============
Testing
=======
Each patch will have unit tests.
Documentation Impact
====================
A set of documentation for this new feature will be required.