Initial cyborg api and db design proposal

This spec proposes the initial design for the
Cyborg API. The Cyborg API should support the basic
operations concerning accelerators, and does not
necessarily have to be a user-facing API at this early
stage. The API should support functionalities such as
provision, attach, detach, list and update.

This spec also contains the proposal for a simple DB for
Cyborg. Note that although we propose a DB schema for Cyborg,
in implementation it should be aligned with resource provider
db schema as much as possible.

APIImpact

Change-Id: I98c74df91f4548ecef42d2e3f96facf9023a346a
Signed-off-by: zhipengh <huangzhipeng@huawei.com>
zhipengh 2017-03-15 16:24:39 +08:00 committed by Zhipeng Huang
parent 2b01cb135a
commit b8669f18e6
3 changed files with 802 additions and 6 deletions


@ -1,6 +0,0 @@
============
Cyborg Specs
============
This folder contains all the spec files.


@ -0,0 +1,410 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
===================
Cyborg API proposal
===================
https://blueprints.launchpad.net/openstack-cyborg/+spec/cyborg-api
This spec proposes to provide the initial API design for Cyborg.
Problem description
===================
Cyborg as a common management framework for dedicated devices (hardware/
software accelerators, high-speed storage, etc) needs RESTful API to expose
the basic functionalities.
Use Cases
---------
* As a user I want to be able to spawn VM with dedicated hardware, so
that I can utilize provided hardware.
* As a compute service I need to know how requested resource should be
attached to the VM.
* As a scheduler service I'd like to know on which resource provider
requested resource can be found.
Proposed change
===============
In general we want to develop the APIs that support basic life cycle management
for Cyborg.
Life Cycle Management Phases
----------------------------
For Cyborg, the life cycle management (LCM) phases include the typical create,
retrieve, update and delete operations. Note that deprovisioning mainly refers
to the detach (delete) operation, which deactivates an acceleration capability
but preserves the resource itself for future use. From a functional point of
view, the LCM for Cyborg includes provision, attach, update, list and detach.
The Cyborg API has no notion of deprovisioning in the sense of decommissioning
or disconnecting an entire accelerator device from the bus.
Difference between Provision and Attach/Detach
----------------------------------------------
Note that while the APIs support provisioning via CRUD operations, attach/detach
are considered different:
* Provision operations (create) involve the api->conductor->agent->driver
  workflow, whereas attach/detach (update/delete) could be taken care of at
  the driver layer without involving that workflow. This is similar to the
  difference between creating a volume and attaching/detaching a volume in
  Cinder.
* Attach/detach in the Cyborg API will mainly involve DB status modification.
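The contrast between the two paths can be sketched in Python as follows. This
is an illustrative model only: the `Accelerator` class, the `provision`,
`attach` and `detach` functions and the status strings are assumptions made
for this sketch and are not taken from the Cyborg code base.

```python
from dataclasses import dataclass, field
from typing import List

# Layers named in this spec for the provisioning workflow.
PROVISION_PATH = ("api", "conductor", "agent", "driver")


@dataclass
class Accelerator:
    """Hypothetical in-memory record used only for this sketch."""
    uuid: str
    status: str = "detached"
    visited: List[str] = field(default_factory=list)


def provision(uuid: str) -> Accelerator:
    """Create a record, passing through every layer of the workflow."""
    acc = Accelerator(uuid=uuid)
    acc.visited.extend(PROVISION_PATH)
    return acc


def attach(acc: Accelerator) -> Accelerator:
    """Attach touches only the driver layer and flips the stored status."""
    acc.visited.append("driver")
    acc.status = "attached"
    return acc


def detach(acc: Accelerator) -> Accelerator:
    """Detach deactivates the capability but keeps the record itself."""
    acc.visited.append("driver")
    acc.status = "detached"
    return acc
```

In this model a detached accelerator still exists after `detach`, which
mirrors the statement that detaching preserves the resource for future use.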
Difference between Attach/Detach To VM and Host
-----------------------------------------------
Moreover, there are also differences between attaching an accelerator to a VM
and attaching it to a host, similar to Cinder.
* When the attachment happens to a VM, we expect Nova to call
  the virt driver to perform the action for the instance. In this case Nova
  needs to support the acc-attach and acc-detach actions.
* When the attachment happens to a host, we expect Cyborg to
  take care of the action itself via the Cyborg driver. Although there is
  currently a generic driver to accomplish the job, we should consider an
  os-brick-like standalone library for accelerator attach/detach operations.
Alternatives
------------
* For attaching an accelerator to a VM, we could let Cyborg perform the action
  itself. However, this risks tight coupling with Nova, from which Cyborg
  would need to get instance-related information.
* For attaching an accelerator to a host, we could consider using Ironic
  drivers. However, this might not suit the standalone accelerator rack
  scenarios where accelerators are not attached to a server at all.
Data model impact
-----------------
A new table in the API database will be created::

    CREATE TABLE accelerators (
        accelerator_id INT NOT NULL,
        device_type STRING NOT NULL,
        acc_type STRING NOT NULL,
        acc_capability STRING NOT NULL,
        vendor_id STRING,
        product_id STRING,
        remotable INT
    );

Note that there is an ongoing discussion on new nested resource
provider data structures that will impact the Cyborg DB
implementation. The code implementation should be aligned
with the resource provider DB requirements as much as possible.
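A quick way to try out the proposed table is an in-memory SQLite database.
The sketch below is only an approximation under stated assumptions: it uses
`TEXT` and `INTEGER` in place of the spec's `STRING` and `INT`, and the
inserted row (including the `ASIC` value) is made up for illustration.

```python
import sqlite3

# TEXT/INTEGER stand in for the spec's STRING/INT (assumption of this sketch).
SCHEMA = """
CREATE TABLE accelerators (
    accelerator_id INTEGER NOT NULL,
    device_type TEXT NOT NULL,
    acc_type TEXT NOT NULL,
    acc_capability TEXT NOT NULL,
    vendor_id TEXT,
    product_id TEXT,
    remotable INTEGER
)
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the table in memory and insert one made-up row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO accelerators VALUES (?, ?, ?, ?, ?, ?, ?)",
        (1, "CRYPTO", "ASIC", "ipsec", None, None, 0),
    )
    conn.commit()
    return conn


conn = build_demo_db()
# List the non-remotable accelerators.
rows = conn.execute(
    "SELECT accelerator_id, device_type FROM accelerators WHERE remotable = 0"
).fetchall()
```

The `NOT NULL` columns reject incomplete rows, while `vendor_id` and
`product_id` may stay empty, as in the proposed schema.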
REST API impact
---------------
The API changes add resource endpoints to:
* `GET` a list of all the accelerators
* `GET` a single accelerator for a given id
* `POST` create a new accelerator resource
* `PUT` an update to an existing accelerator spec
* `PUT` attach an accelerator to a VM or a host
* `DELETE` detach an existing accelerator for a given id
The following new REST API calls will be created:
'GET /accelerators'
*************************
Return a list of accelerators managed by Cyborg.
Example message body of the response to the GET operation::

    200 OK
    Content-Type: application/json
    {
      "accelerator": [
        {
          "uuid": "8e45a2ea-5364-4b0d-a252-bf8becaa606e",
          "acc_specs": {
            "remote": 0,
            "num": 1,
            "device_type": "CRYPTO",
            "acc_capability": {
              "num": 2,
              "ipsec": {
                "aes": {
                  "3des": 50,
                  "num": 1
                }
              }
            }
          }
        },
        {
          "uuid": "eaaf1c04-ced2-40e4-89a2-87edded06d64",
          "acc_specs": {
            "remote": 0,
            "num": 1,
            "device_type": "CRYPTO",
            "acc_capability": {
              "num": 2,
              "ipsec": {
                "aes": {
                  "3des": 40,
                  "num": 1
                }
              }
            }
          }
        }
      ]
    }

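As a rough illustration of how a client might consume this response, the
sketch below parses a syntactically valid copy of the example body with the
standard `json` module. The `summarize` helper is hypothetical and exists
only for this example.

```python
import json

# Compact, syntactically valid copy of the example response body.
RESPONSE_BODY = """
{
  "accelerator": [
    {"uuid": "8e45a2ea-5364-4b0d-a252-bf8becaa606e",
     "acc_specs": {"remote": 0, "num": 1, "device_type": "CRYPTO",
                   "acc_capability": {"num": 2,
                                      "ipsec": {"aes": {"3des": 50, "num": 1}}}}},
    {"uuid": "eaaf1c04-ced2-40e4-89a2-87edded06d64",
     "acc_specs": {"remote": 0, "num": 1, "device_type": "CRYPTO",
                   "acc_capability": {"num": 2,
                                      "ipsec": {"aes": {"3des": 40, "num": 1}}}}}
  ]
}
"""


def summarize(body: str) -> list:
    """Return (uuid, device_type, 3des value) for each listed accelerator."""
    payload = json.loads(body)
    summary = []
    for acc in payload["accelerator"]:
        specs = acc["acc_specs"]
        three_des = specs["acc_capability"]["ipsec"]["aes"]["3des"]
        summary.append((acc["uuid"], specs["device_type"], three_des))
    return summary
```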
'GET /accelerators/{uuid}'
**************************
Retrieve the information of the accelerator identified by `{uuid}`.
Example GET request and response::

    GET /accelerators/8e45a2ea-5364-4b0d-a252-bf8becaa606e
    200 OK
    Content-Type: application/json
    {
      "uuid": "8e45a2ea-5364-4b0d-a252-bf8becaa606e",
      "acc_specs": {
        "remote": 0,
        "num": 1,
        "device_type": "CRYPTO",
        "acc_capability": {
          "num": 2,
          "ipsec": {
            "aes": {
              "3des": 50,
              "num": 1
            }
          }
        }
      }
    }

If the accelerator does not exist a `404 Not Found` must be
returned.
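The lookup behaviour described here can be sketched as a small function over
an in-memory dictionary. The `STORE` dictionary and `get_accelerator`
function are assumptions of this sketch; a real implementation would query
the Cyborg DB instead.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store keyed by accelerator UUID.
STORE: Dict[str, dict] = {
    "8e45a2ea-5364-4b0d-a252-bf8becaa606e": {
        "remote": 0,
        "num": 1,
        "device_type": "CRYPTO",
    },
}


def get_accelerator(store: Dict[str, dict],
                    uuid: str) -> Tuple[int, Optional[dict]]:
    """Return (status_code, body) for GET /accelerators/{uuid}."""
    specs = store.get(uuid)
    if specs is None:
        # Unknown UUID: the spec requires 404 Not Found.
        return 404, None
    return 200, {"uuid": uuid, "acc_specs": specs}
```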
'POST /accelerators/{uuid}'
***************************
Create a new accelerator.
Example POST Request::
Content-type: application/json
{
"name": "IPSec Card",
"uuid": "8e45a2ea-5364-4b0d-a252-bf8becaa606e"
}
The body of the request must match the following JSONSchema document::

    {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "uuid": {
          "type": "string",
          "format": "uuid"
        }
      },
      "required": [
        "name"
      ],
      "additionalProperties": false
    }

The response body is empty. The headers include a location header
pointing to the created accelerator resource::
201 Created
Location: /accelerators/8e45a2ea-5364-4b0d-a252-bf8becaa606e
A `409 Conflict` response code will be returned if another accelerator
exists with the provided name.
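The create semantics above can be sketched with the standard library alone.
This is not the Cyborg implementation: the validation is a hand-written
stand-in for the JSON schema, and both the `400` response for an invalid
body and the generation of a UUID when none is supplied are assumptions of
this sketch rather than statements of the spec.

```python
import uuid as uuidlib
from typing import Dict, Tuple

ALLOWED_KEYS = {"name", "uuid"}


def is_valid_body(body: object) -> bool:
    """Hand-written check mirroring the JSON schema for the request."""
    if not isinstance(body, dict):
        return False
    if set(body) - ALLOWED_KEYS:               # no additional properties
        return False
    if not isinstance(body.get("name"), str):  # "name" is required
        return False
    if "uuid" in body:
        if not isinstance(body["uuid"], str):
            return False
        try:
            uuidlib.UUID(body["uuid"])
        except ValueError:
            return False
    return True


def create_accelerator(store: Dict[str, dict],
                       body: object) -> Tuple[int, Dict[str, str]]:
    """Return (status_code, headers) for the create request."""
    if not is_valid_body(body):
        return 400, {}                         # assumption, see lead-in
    if any(acc["name"] == body["name"] for acc in store.values()):
        return 409, {}                         # name already in use
    acc_uuid = body.get("uuid") or str(uuidlib.uuid4())
    store[acc_uuid] = {"name": body["name"]}
    return 201, {"Location": "/accelerators/" + acc_uuid}
```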
'PUT /accelerators/{uuid}/{acc_spec}'
*************************************
Update the spec for the accelerator identified by `{uuid}`.
Example::

    PUT /accelerators/8e45a2ea-5364-4b0d-a252-bf8becaa606e
    Content-type: application/json
    {
      "acc_specs": {
        "remote": 0,
        "num": 1,
        "device_type": "CRYPTO",
        "acc_capability": {
          "num": 2,
          "ipsec": {
            "aes": {
              "3des": 50,
              "num": 1
            }
          }
        }
      }
    }

The returned HTTP response code will be one of the following:
* `200 OK` if the spec is successfully updated
* `404 Not Found` if the accelerator identified by `{uuid}` was
not found
* `400 Bad Request` for bad or invalid syntax
* `409 Conflict` if another process updated the same spec.
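One possible way to produce these four response codes is optimistic
concurrency control with a per-record generation counter. The spec does not
say how concurrent updates are detected, so the `generation` field and the
`expected_generation` parameter below are assumptions of this sketch.

```python
from typing import Dict, Tuple


def update_spec(store: Dict[str, dict], uuid: str, body: object,
                expected_generation: int) -> Tuple[int, dict]:
    """Return (status_code, body) for the spec update request."""
    record = store.get(uuid)
    if record is None:
        return 404, {}
    if not isinstance(body, dict) or not isinstance(body.get("acc_specs"), dict):
        return 400, {}
    if record["generation"] != expected_generation:
        # Someone else updated the spec since the caller last read it.
        return 409, {}
    record["acc_specs"] = body["acc_specs"]
    record["generation"] += 1
    return 200, {"uuid": uuid, "acc_specs": record["acc_specs"]}
```

A caller would read the record, remember its generation and send that value
back with the update; a stale value then results in `409 Conflict`.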
'PUT /accelerators/{uuid}'
**************************
Attach the accelerator identified by `{uuid}`.
Example::
PUT /accelerators/8e45a2ea-5364-4b0d-a252-bf8becaa606e
Content-type: application/json
{
"name": "IPSec Card",
"uuid": "8e45a2ea-5364-4b0d-a252-bf8becaa606e"
}
The returned HTTP response code will be one of the following:
* `200 OK` if the accelerator is successfully attached
* `404 Not Found` if the accelerator identified by `{uuid}` was
not found
* `400 Bad Request` for bad or invalid syntax
* `409 Conflict` if another process has already attached the same accelerator.
'DELETE /accelerators/{uuid}'
*****************************
Detach the accelerator identified by `{uuid}`.
The request body and the response body are both empty.
The returned HTTP response code will be one of the following:
* `204 No Content` if the request was successful and the accelerator was detached.
* `404 Not Found` if the accelerator identified by `{uuid}` was
not found.
* `409 Conflict` if allocation records exist for any of the accelerator
  resources that would be detached as a result of detaching the accelerator.
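Since attach and detach mainly modify DB status, their response codes can be
sketched as simple state checks on a record. The `attached` flag and the
`allocations` list are hypothetical field names chosen for this sketch; the
spec does not define how this state is stored.

```python
from typing import Dict


def attach(store: Dict[str, dict], uuid: str) -> int:
    """Status code for attaching the accelerator identified by uuid."""
    record = store.get(uuid)
    if record is None:
        return 404
    if record["attached"]:
        return 409  # another caller already attached it
    record["attached"] = True
    return 200


def detach(store: Dict[str, dict], uuid: str) -> int:
    """Status code for detaching the accelerator identified by uuid."""
    record = store.get(uuid)
    if record is None:
        return 404
    if record["allocations"]:
        return 409  # allocation records still reference this accelerator
    record["attached"] = False
    return 204
```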
Security impact
---------------
None
Notifications impact
--------------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
None
Other deployer impact
---------------------
None
Developer impact
----------------
Developers can use this REST API after it has been implemented.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
zhipengh <huangzhipeng@huawei.com>
Work Items
----------
* Implement the APIs specified in this spec
* Propose the new accelerator attach/detach API to Nova
* Implement the DB specified in this spec
Dependencies
============
None.
Testing
=======
* Unit tests will be added to Cyborg API.
Documentation Impact
====================
None
References
==========
None
History
=======
.. list-table:: Revisions
:header-rows: 1
* - Release
- Description
* - Pike
- Introduced

specs/template.rst Normal file

@ -0,0 +1,392 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
==========================================
Example Spec - The title of your blueprint
==========================================
Include the URL of your launchpad blueprint:
https://blueprints.launchpad.net/openstack-cyborg/+spec/example
Introduction paragraph -- why are we doing anything? A single paragraph of
prose that operators can understand. The title and this first paragraph
should be used as the subject line and body of the commit message
respectively.
Some notes about the cyborg-spec and blueprint process:
* Not all blueprints need a spec. For more information see
http://docs.openstack.org/developer/cyborg/blueprints.html#specs
* The aim of this document is first to define the problem we need to solve,
and second agree the overall approach to solve that problem.
* This is not intended to be extensive documentation for a new feature.
For example, there is no need to specify the exact configuration changes,
nor the exact details of any DB model changes. But you should still define
that such changes are required, and be clear on how that will affect
upgrades.
* You should aim to get your spec approved before writing your code.
While you are free to write prototypes and code before getting your spec
approved, it's possible that the outcome of the spec review process leads
you towards a fundamentally different solution than you first envisaged.
* But, API changes are held to a much higher level of scrutiny.
As soon as an API change merges, we must assume it could be in production
somewhere, and as such, we then need to support that API change forever.
To avoid getting that wrong, we do want lots of details about API changes
upfront.
Some notes about using this template:
* Your spec should be in reStructuredText, like this template.
* Please wrap text at 79 columns.
* The filename in the git repository should match the launchpad URL, for
example a URL of: https://blueprints.launchpad.net/openstack-cyborg/+spec/awesome-thing
should be named awesome-thing.rst
* Please do not delete any of the sections in this template. If you have
nothing to say for a whole section, just write: None
* For help with syntax, see http://sphinx-doc.org/rest.html
* To test out your formatting, build the docs using tox and see the generated
HTML file in doc/build/html/specs/<path_of_your_file>
* If you would like to provide a diagram with your spec, ASCII diagrams are
required. http://asciiflow.com/ is a very nice tool to assist with making
ASCII diagrams. The reason for this is that the tool used to review specs is
based purely on plain text. Plain text will allow review to proceed without
having to look at additional files which cannot be viewed in Gerrit. It
will also allow inline feedback on the diagram itself.
* If your specification proposes any changes to the Cyborg REST API such
as changing parameters which can be returned or accepted, or even
the semantics of what happens when a client calls into the API, then
you should add the APIImpact flag to the commit message. Specifications with
the APIImpact flag can be found with the following query:
https://review.openstack.org/#/q/status:open+project:openstack/cyborg+message:apiimpact,n,z
Problem description
===================
A detailed description of the problem. What problem is this blueprint
addressing?
Use Cases
---------
What use cases does this address? What impact on actors does this change have?
Ensure you are clear about the actors in each use case: Developer, End User,
Deployer etc.
Proposed change
===============
Here is where you cover the change you propose to make in detail. How do you
propose to solve this problem?
If this is one part of a larger effort make it clear where this piece ends. In
other words, what's the scope of this effort?
At this point, if you would like to just get feedback on if the problem and
proposed change fit in Cyborg, you can stop here and post this for review to get
preliminary feedback. If so please say:
Posting to get preliminary feedback on the scope of this spec.
Alternatives
------------
What other ways could we do this thing? Why aren't we using those? This doesn't
have to be a full literature review, but it should demonstrate that thought has
been put into why the proposed solution is an appropriate one.
Data model impact
-----------------
Changes which require modifications to the data model often have a wider impact
on the system. The community often has strong opinions on how the data model
should be evolved, from both a functional and performance perspective. It is
therefore important to capture and gain agreement as early as possible on any
proposed changes to the data model.
Questions which need to be addressed by this section include:
* What new data objects and/or database schema changes is this going to
require?
* What database migrations will accompany this change?
* How will the initial set of new data objects be generated, for example if you
need to take into account existing instances, or modify other existing data
describe how that will work.
REST API impact
---------------
Each API method which is either added or changed should have the following:
* Specification for the method
* A description of what the method does suitable for use in
user documentation
* Method type (POST/PUT/GET/DELETE)
* Normal http response code(s)
* Expected error http response code(s)
* A description for each possible error code should be included
describing semantic errors which can cause it such as
inconsistent parameters supplied to the method, or when an
instance is not in an appropriate state for the request to
succeed. Errors caused by syntactic problems covered by the JSON
schema definition do not need to be included.
* URL for the resource
* URL should not include underscores, and use hyphens instead.
* Parameters which can be passed via the url
* JSON schema definition for the request body data if allowed
* Field names should use snake_case style, not CamelCase or MixedCase
style.
* JSON schema definition for the response body data if any
* Field names should use snake_case style, not CamelCase or MixedCase
style.
* Example use case including typical API samples for both data supplied
by the caller and the response
* Discuss any policy changes, and discuss what things a deployer needs to
think about when defining their policy.
Note that the schema should be defined as restrictively as
possible. Parameters which are required should be marked as such and
only under exceptional circumstances should additional parameters
which are not defined in the schema be permitted (e.g.
additionalProperties should be False).
Reuse of existing predefined parameter types such as regexps for
passwords and user defined names is highly encouraged.
Security impact
---------------
Describe any potential security impact on the system. Some of the items to
consider include:
* Does this change touch sensitive data such as tokens, keys, or user data?
* Does this change alter the API in a way that may impact security, such as
a new way to access sensitive information or a new way to login?
* Does this change involve cryptography or hashing?
* Does this change require the use of sudo or any elevated privileges?
* Does this change involve using or parsing user-provided data? This could
be directly at the API level or indirectly such as changes to a cache layer.
* Can this change enable a resource exhaustion attack, such as allowing a
single API interaction to consume significant server resources? Some examples
of this include launching subprocesses for each connection, or entity
expansion attacks in XML.
For more detailed guidance, please see the OpenStack Security Guidelines as
a reference (https://wiki.openstack.org/wiki/Security/Guidelines). These
guidelines are a work in progress and are designed to help you identify
security best practices. For further information, feel free to reach out
to the OpenStack Security Group at openstack-security@lists.openstack.org.
Notifications impact
--------------------
Please specify any changes to notifications. Be that an extra notification,
changes to an existing notification, or removing a notification.
Other end user impact
---------------------
Aside from the API, are there other ways a user will interact with this
feature?
* Does this change have an impact on python-cyborgclient? What does the user
interface there look like?
Performance Impact
------------------
Describe any potential performance impact on the system, for example
how often will new code be called, and is there a major change to the calling
pattern of existing code.
Examples of things to consider here include:
* A periodic task might look like a small addition but if it calls conductor or
another service the load is multiplied by the number of nodes in the system.
* Scheduler filters get called once per host for every instance being created,
so any latency they introduce is linear with the size of the system.
* A small change in a utility function or a commonly used decorator can have a
large impact on performance.
* Calls which result in database queries (whether direct or via conductor)
can have a profound impact on performance when called in critical sections of
the code.
* Will the change include any locking, and if so what considerations are there
on holding the lock?
Other deployer impact
---------------------
Discuss things that will affect how you deploy and configure OpenStack
that have not already been mentioned, such as:
* What config options are being added? Should they be more generic than
proposed (for example a flag that other hypervisor drivers might want to
implement as well)? Are the default values ones which will work well in
real deployments?
* Is this a change that takes immediate effect after it's merged, or is it
something that has to be explicitly enabled?
* If this change is a new binary, how would it be deployed?
* Please state anything that those doing continuous deployment, or those
upgrading from the previous release, need to be aware of. Also describe
any plans to deprecate configuration values or features. For example, if we
change the directory name that instances are stored in, how do we handle
instance directories created before the change landed? Do we move them? Do
we have a special case in the code? Do we assume that the operator will
recreate all the instances in their cloud?
Developer impact
----------------
Discuss things that will affect other developers working on OpenStack,
such as:
* If the blueprint proposes a change to the driver API, discussion of how
other hypervisors would implement the feature is required.
Implementation
==============
Assignee(s)
-----------
Who is leading the writing of the code? Or is this a blueprint where you're
throwing it out there to see who picks it up?
If more than one person is working on the implementation, please designate the
primary author and contact.
Primary assignee:
<launchpad-id or None>
Other contributors:
<launchpad-id or None>
Work Items
----------
Work items or tasks -- break the feature up into the things that need to be
done to implement it. Those parts might end up being done by different people,
but we're mostly trying to understand the timeline for implementation.
Dependencies
============
* Include specific references to specs and/or blueprints in cyborg, or in other
projects, that this one either depends on or is related to.
* If this requires functionality of another project that is not currently used
by Cyborg, document that fact.
* Does this feature require any new library dependencies or code otherwise not
included in OpenStack? Or does it depend on a specific version of a library?
Testing
=======
Please discuss the important scenarios needed to test here, as well as
specific edge cases we should be ensuring work correctly. For each
scenario please specify if this requires specialized hardware, a full
OpenStack environment, or can be simulated inside the Cyborg tree.
Please discuss how the change will be tested. We especially want to know what
tempest tests will be added. It is assumed that unit test coverage will be
added so that doesn't need to be mentioned explicitly, but discussion of why
you think unit tests are sufficient and we don't need to add more tempest
tests would need to be included.
Is this untestable in gate given current limitations (specific hardware /
software configurations available)? If so, are there mitigation plans (3rd
party testing, gate enhancements, etc.)?
Documentation Impact
====================
Which audiences are affected most by this change, and which documentation
titles on docs.openstack.org should be updated because of this change? Don't
repeat details discussed above, but reference them here in the context of
documentation for multiple audiences. For example, the Operations Guide targets
cloud operators, and the End User Guide would need to be updated if the change
offers a new feature available through the CLI or dashboard. If a config option
changes or is deprecated, note here that the documentation needs to be updated
to reflect this specification's change.
References
==========
Please add any useful references here. You are not required to have any
reference. Moreover, this specification should still make sense when your
references are unavailable. Examples of what you could include are:
* Links to mailing list or IRC discussions
* Links to notes from a summit session
* Links to relevant research, if appropriate
* Related specifications as appropriate (e.g. if it's an EC2 thing, link the
EC2 docs)
* Anything else you feel it is worthwhile to refer to
History
=======
Optional section intended to be used each time the spec is updated, to describe
a new design, API change or database schema update. It helps readers understand
how the spec has changed over time.
.. list-table:: Revisions
:header-rows: 1
* - Release Name
- Description
* - Pike
- Introduced