522 Commits

Author SHA1 Message Date
Zuul
7ecc9c08d9 Merge "Add obj_make_compatible()" 2020-03-02 16:35:36 +00:00
Zuul
3290719ca7 Merge "add support for multi node deployments to fake driver" 2020-03-02 10:57:33 +00:00
zhangbailin
9831730208 Add obj_make_compatible()
This adds a method to CyborgObject that allows it to convert itself
to an older version, within a compatibility window. So, if an object
had a revision that added or changed the formatting of an attribute,
the obj_make_compatible() method can fix up a primitive representation
before it is sent to a client expecting the older version.
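
A minimal sketch of the idea, not the actual CyborgObject code. The class name, the version numbers, and the assumption that "description" arrived in 1.1 are hypothetical and only chosen to match the blueprint name:

```python
# Stand-in for a versioned object; this is NOT the real CyborgObject.
class DeviceProfileSketch:
    VERSION = "1.1"  # hypothetical: assume 1.1 added "description"

    def __init__(self, name, description):
        self.name = name
        self.description = description

    def obj_to_primitive(self, target_version=None):
        """Serialize, optionally downgrading for an older client."""
        primitive = {"name": self.name, "description": self.description}
        if target_version is not None:
            self.obj_make_compatible(primitive, target_version)
        return primitive

    def obj_make_compatible(self, primitive, target_version):
        # Compare versions as integer tuples so that "1.10" > "1.9".
        target = tuple(int(part) for part in target_version.split("."))
        if target < (1, 1):
            # A 1.0 client does not know about "description"; drop it.
            primitive.pop("description", None)


profile = DeviceProfileSketch("fpga-profile", "example description")
new_primitive = profile.obj_to_primitive()
old_primitive = profile.obj_to_primitive(target_version="1.0")
```

The primitive is fixed up in place just before it leaves the service, so the object itself keeps its newest form.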

Partial-Implements: blueprint add-description-field-to-device-profiles

Change-Id: I196629059bc32165f161fe9c071a339d63d71c10
2020-02-28 14:20:21 +00:00
zhangbailin
3c7e0868e6 Delete sandbox directory
This directory does not need to exist in the Cyborg project.

Change-Id: I845256168f60cd27522a7ae821bcaa50e1e218f5
2020-02-27 04:28:01 +00:00
Sean Mooney
eb12f68421 add support for multi node deployments to fake driver
This change alters the fake driver to include the hostname
in the deployable list so that each host in a multi node deployment
will have a unique placement RP name.
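
A rough sketch of why the hostname matters. The function name and the "hostname_device" naming scheme are assumptions for illustration, not the fake driver's real format:

```python
def deployable_name(hostname, device_name):
    """Build a deployable name that is unique per host (hypothetical scheme)."""
    return f"{hostname}_{device_name}"


# Two nodes exposing the same fake device no longer collide.
names = {deployable_name(host, "FakeDevice") for host in ("node1", "node2")}
```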

Change-Id: Ib0e202cac8af5ef7c5028c22dc0654911eb730f5
2020-02-24 23:16:13 +00:00
Dan Smith
1e1b2693aa Fix exceptions defined with improper _msg_fmt
Many exceptions are defined in such a way that they will not render properly
when stringified. This is because instead of _msg_fmt, they used msg_fmt
or message in the class definition.

This fixes those and adds a test which I used to find all the offenders.
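
The failure mode and the kind of test described can be sketched as follows. The base class and the find_offenders() helper are hypothetical stand-ins, not Cyborg's actual exception module or test:

```python
class SketchException(Exception):
    """Base class that renders its message from _msg_fmt."""
    _msg_fmt = "An unknown exception occurred."

    def __init__(self, message=None, **kwargs):
        if message is None:
            message = self._msg_fmt % kwargs
        super().__init__(message)


class GoodNotFound(SketchException):
    _msg_fmt = "Resource %(name)s could not be found."


class BadNotFound(SketchException):
    # Wrong attribute name: the base class never reads msg_fmt.
    msg_fmt = "Resource %(name)s could not be found."


def find_offenders(base):
    """Return names of subclasses that define msg_fmt or message themselves."""
    offenders = []
    for cls in base.__subclasses__():
        if "msg_fmt" in vars(cls) or "message" in vars(cls):
            offenders.append(cls.__name__)
        offenders.extend(find_offenders(cls))
    return sorted(offenders)


good_text = str(GoodNotFound(name="gpu0"))
bad_text = str(BadNotFound(name="gpu0"))  # falls back to the generic text
offenders = find_offenders(SketchException)
```

The badly defined class does not fail loudly; it silently renders the base message, which is why a test that walks the subclasses is a practical way to find every offender.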

Closes task: #38817

Change-Id: I085ef5b0197b76b7b53639610f62b615fb538983
2020-02-19 11:54:57 -08:00
Dan Smith
d279c22d1e Avoid creating a root provider when parent is not found
Before this change, when the agent called report_data() on the conductor,
if the parent provider was not found by hostname, we would log an error,
and then continue to create the "child" provider with no parent. We should
never do this if we are supposed to have a parent. Cleanup from this
situation is also messy.

This makes us raise PlacementResourceProviderNotFound() in that case,
which aborts the report and thus does not create the provider incorrectly.
It also makes the agent catch that exception and moves the log message
to the agent where the actual problem is (i.e. likely misconfiguration).

The exception used here is actually defined incorrectly, having a message
class variable instead of _msg_fmt, which caused it to not render properly.
This fixes that along the way and adds tests for the new conductor and
agent behaviors.
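
The new control flow can be sketched roughly like this. Only the exception name and report_data() come from the message above; the classes, the message format, and the in-memory provider lookup are hypothetical:

```python
import logging

LOG = logging.getLogger(__name__)


class PlacementResourceProviderNotFound(Exception):
    # Hypothetical format string; the real one lives in Cyborg's exceptions.
    _msg_fmt = "Placement resource provider not found: %(resource_provider)s."

    def __init__(self, **kwargs):
        super().__init__(self._msg_fmt % kwargs)


class ConductorSketch:
    def __init__(self, known_providers):
        self.known_providers = known_providers  # hostname -> provider UUID
        self.created = []

    def report_data(self, hostname, child_name):
        parent_uuid = self.known_providers.get(hostname)
        if parent_uuid is None:
            # Abort instead of creating a parentless "child" provider.
            raise PlacementResourceProviderNotFound(resource_provider=hostname)
        self.created.append((child_name, parent_uuid))


class AgentSketch:
    def __init__(self, conductor):
        self.conductor = conductor

    def report(self, hostname, child_name):
        try:
            self.conductor.report_data(hostname, child_name)
        except PlacementResourceProviderNotFound as exc:
            # Log where the likely misconfiguration actually is.
            LOG.error("Report aborted, check the configured hostname: %s", exc)
            return False
        return True


conductor = ConductorSketch({"node1": "uuid-1"})
agent = AgentSketch(conductor)
ok = agent.report("node1", "fpga0")
failed = agent.report("misspelled-host", "fpga0")
```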

Closes task: 38813
Closes task: 38814

Change-Id: Ied8ee91592eb0b4675f9c155e30a6c3a7df9b597
2020-02-19 11:40:42 -08:00
Zuul
e0ba01891f Merge "Update gpu driver" 2020-02-17 02:59:14 +00:00
Zuul
4d0a3d19f4 Merge "Solve py37 timeout" 2020-02-16 11:42:32 +00:00
chenke
08af601271 Solve py37 timeout
The py37 job has been timing out consistently of late.
Please see [1] [2] [3].
At first, the timeouts were attributed
to patch [4].
Feng Shaohe's patch [5] therefore reverted that change,
after which the py37 timeout disappeared.

In fact, that only hid the problem.
With this setting removed, the py37 job
actually runs on Python 3.6
(the default version in community CI); please see [6]
for the detailed reasons.

This patch therefore exposes the hidden py37 timeout again.
The likely cause is the test method
test_apply_patch_fpga_arq_monitor_job, which I identified
by troubleshooting the tox -epy37 log.
With this method commented out, tox -epy37 runs
normally and no longer times out.

If you want to test this, make sure you have a local
Python 3.7 environment (not 3.6), run rm -rf .tox/,
and then run tox -epy37.

For now, the best option is to comment out this method and
restore the py37 job at the same time.

If someone finds the underlying cause and a solution, this method
can be restored; please refer to [7].

What goes wrong in this method?
Deep in its call chain, it uses ThreadWork from
the thread pool, which under Python 3.7 blocks
the execution of the unit tests. For details, please see
[8] [9].

Reference:

[1]. https://review.opendev.org/#/c/702578/
[2]. https://review.opendev.org/#/c/703049/
[3]. https://review.opendev.org/#/c/703253/
[4]. https://review.opendev.org/#/c/696397/
[5]. https://review.opendev.org/#/c/706911/
[6]. http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2020-02-12.log.html#t2020-02-12T16:46:18
[7]. deed9c822e
[8]. https://review.opendev.org/#/c/707045/5//COMMIT_MSG
[9]. c61dd8c376/cyborg/objects/extarq/ext_arq_job.py (L41)

Change-Id: I09db889fe665c6246ec9503af92c909e7d0da24f
2020-02-14 19:49:34 +08:00
Zuul
c61dd8c376 Merge "Remove useless attributes list in Deployable" 2020-02-13 07:46:46 +00:00
Zuul
ffb848cee1 Merge "Improve UT for cyborg/db deployable" 2020-02-13 07:29:00 +00:00
Zuul
2ccc478986 Merge "Remove the invalid specs from doc/source" 2020-02-13 07:05:03 +00:00
Zuul
deed9c822e Merge "Fix warning in logs that '' is not a valid UUID." 2020-02-12 14:28:31 +00:00
Zuul
b8c74d68a8 Merge "Send a separate bind event to Nova for each ARQ in an instance." 2020-02-12 14:25:59 +00:00
Zuul
990e030983 Merge "Some bug fixes in async bind path." 2020-02-12 14:12:00 +00:00
Zuul
a6b5e8f0e3 Merge "bugs fix for compatibility issues between Py2 and Py3" 2020-02-12 14:12:00 +00:00
Zuul
1718526769 Merge "Improve UT for cyborg/db attach handle" 2020-02-12 13:15:30 +00:00
Zuul
41ef7a0dd8 Merge "Use ResourceNotFound replace ControlpathIDNotFound" 2020-02-12 12:55:55 +00:00
Zuul
b6c6c15cd7 Merge "Improve UT for cyborg/db ExtArq" 2020-02-12 12:55:51 +00:00
chenke
12c448cba3 Use ResourceNotFound replace ControlpathIDNotFound
This is part of a series of exception cleanups.

ResourceNotFound is sufficient to cover this NotFound case,
so ControlpathIDNotFound is replaced with it.

More UTs for the control path (get, list, create, delete)
will be added in the future.

Change-Id: I740eb28184b434583b58f10d2bf3e5e4621c43d4
Story: 2007045
Task: 38318
2020-02-12 09:38:43 +00:00
chenke
cb1b3ee651 Improve UT for cyborg/db ExtArq
This patch adds UTs for ExtArq:
1. get
2. update
3. create
4. list
5. delete

Change-Id: I8f0d15d8c34f1eb77366d6021e465fcebd1be406
Story: 2007091
Task: 38133
2020-02-12 09:38:31 +00:00
chenke
e014259ac5 Remove useless attributes list in Deployable
Attributes and deployables are stored in separate
tables, so we should remove the attributes list from the
deployable object.

Change-Id: I1be185a6bce2ae90eca244b21b207a22e5a92044
Story: 2007182
Task: 38303
2020-02-12 09:38:17 +00:00
chenke
e7c6783858 Improve UT for cyborg/db deployable
This patch adds UTs for deployable:
1. get
2. update
3. create
4. list
5. delete

Change-Id: I39b3f02e898b67e4d4eb686b5a6cf9065c6280de
Story: 2007091
Task: 38141
2020-02-12 09:38:06 +00:00
chenke
078014c053 Improve UT for cyborg/db device
This patch adds UTs for device:
1. get
2. update
3. create
4. list
5. delete

Change-Id: Id66ba6f1442f87a0f8fb9644e45e147cc77a4f5e
Story: 2007091
Task: 38121
2020-02-12 09:37:53 +00:00
chenke
e2e1e3f156 Improve UT for cyborg/db attach handle
This patch adds UTs for attach handle:
1. get
2. update
3. create
4. list
5. delete
6. allocate

Change-Id: I5e683c99d1e08ed6a166a110a87b665cdbc5bde3
Story: 2007091
Task: 38161
2020-02-12 09:37:40 +00:00
zhangbailin
aa2aa69e34 Remove the invalid specs from doc/source
The specs directory in Cyborg is out of date, and the Cyborg
specifications are published at https://specs.openstack.org/openstack/cyborg-specs/,
so remove this directory from Cyborg to reduce maintenance costs.

Change-Id: Iebcbf2ebd6da3bc51e85c62f18c547909026c2f0
2020-02-12 15:31:04 +08:00
Sundar Nadathur
c87c232129 Fix warning in logs that '' is not a valid UUID.
Change-Id: I269554030908c1084f61a3d401524139a3735f28
2020-02-11 16:27:39 -08:00
Sundar Nadathur
d3648dccef Send a separate bind event to Nova for each ARQ in an instance.
This is based on discussion with Nova community. See:
https://review.opendev.org/#/c/692707/6/nova/objects/external_event.py@36

Each event has a unique tag, i.e. the ARQ UUID, and the
bind status for that ARQ.

Each ARQ has its own state. However, the bind status sent to Nova
should be 'completed' or 'failed'. The logic to do that conversion
should not be in nova_client.py, to keep it free of ARQ state details.
So it has been added in get_arq_bind_status() in ext_arq_job.py.
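
A minimal sketch of that split of responsibilities. The ARQ state names, the event dictionary keys, and the rule that every state other than the bound one maps to 'failed' are assumptions for illustration; only get_arq_bind_status(), the per-ARQ tag, and the two status values come from the message above:

```python
# Hypothetical ARQ state names; the real set is defined in Cyborg.
ARQ_BOUND = "Bound"
ARQ_BIND_FAILED = "BindFailed"


def get_arq_bind_status(arq_state):
    """Collapse an ARQ state into the two values Nova understands."""
    return "completed" if arq_state == ARQ_BOUND else "failed"


def build_bind_events(instance_uuid, arqs):
    """Build one event per ARQ, tagged with that ARQ's UUID."""
    return [
        {
            "server_uuid": instance_uuid,
            "tag": arq["uuid"],
            "status": get_arq_bind_status(arq["state"]),
        }
        for arq in arqs
    ]


events = build_bind_events(
    "instance-1",
    [
        {"uuid": "arq-a", "state": ARQ_BOUND},
        {"uuid": "arq-b", "state": ARQ_BIND_FAILED},
    ],
)
```

The client code that sends the events only ever sees 'completed' or 'failed', so it stays free of ARQ state details.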

Change-Id: Iddbf9a77196fc42ac82ad1f6d88a4b0732852463
2020-02-11 16:22:34 -08:00
Sundar Nadathur
107cc7ea81 Some bug fixes in async bind path.
Change-Id: I5d575046e2be38f3bdd5d3f9c9495db121d8a05d
2020-02-10 23:36:11 -08:00
Shaohe Feng
e4dfc6f4bd bugs fix for compatibility issues between Py2 and Py3
Change-Id: I745eb4e28871fa0b554852831e5f4105ea677c27
2020-02-10 23:36:11 -08:00
Shaohe Feng
5b6f26abb8 Guess for the root cause of timeout
Change-Id: I877794c738f3c6ec09e9f83476b1f91096447afa
2020-02-10 23:36:10 -08:00
zhangbailin
acbc64f3be Enhance the db layer to verify filters
Currently, if we pass filters=None, as in the call
dbapi.device_profile_list_by_filters(self.context, filters=None),
a TypeError is raised.

Main error info:
Traceback (most recent call last):
  File "/home/my_work/code/cyborg/cyborg/db/sqlalchemy/api.py", line 558, in device_profile_list_by_filters
    filters, exact_match_filter_names)
  File "/home/my_work/code/cyborg/cyborg/db/sqlalchemy/api.py", line 223, in _exact_filter
    if key not in filters:
TypeError: argument of type 'NoneType' is not iterable

This patch will add initial validation of the filters.
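
One way such a guard could look, as a sketch only. It works on plain dictionaries rather than a SQLAlchemy query, and the function body is hypothetical; only the `key not in filters` check and the parameter names mirror the traceback:

```python
def exact_filter(records, filters, exact_match_filter_names):
    """Filter records by exact match, tolerating filters=None."""
    if filters is None:
        # Treat "no filters given" as an empty filter set.
        filters = {}
    wanted = {}
    for key in exact_match_filter_names:
        if key not in filters:
            continue
        wanted[key] = filters[key]
    return [
        record for record in records
        if all(record.get(key) == value for key, value in wanted.items())
    ]


records = [{"name": "dp1"}, {"name": "dp2"}]
unfiltered = exact_filter(records, None, ["name"])
filtered = exact_filter(records, {"name": "dp2"}, ["name"])
```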

Change-Id: Icf711dc3621fb8d2e5b022ab1d1ce02b0885b055
2020-02-09 03:15:44 +00:00
Zuul
70bc4b89a4 Merge "Improve UT for cyborg/db device profile" 2020-02-08 17:41:41 +00:00
Zuul
a00ea86a69 Merge "Define fake db objects for UT" 2020-02-08 17:41:35 +00:00
Zuul
19ffa3b790 Merge "testcase for FPGAExtARQ" 2020-02-03 18:05:24 +00:00
Zuul
4d1c78110a Merge "Document the alembic CLI better in README" 2020-01-21 22:51:11 +00:00
Zuul
244cdae066 Merge "Use ResourceNotFound replace RP and Image NotFound" 2020-01-21 08:03:16 +00:00
zhangbailin
a83f43117a Document the alembic CLI better in README
When we use ``cyborg-dbsync revision --message --autogenerate``
(as noted in the README) to upgrade the db script, it raises an error,
such as:

root@ubuntu:# cyborg-dbsync revision --message --autogenerate
usage: cyborg-dbsync revision [-h] [-m MESSAGE] [--autogenerate]
cyborg-dbsync revision: error: argument -m/--message: expected one argument

So add a brief message after the "--message" parameter to make the
command run correctly.
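
The error can be reproduced with plain argparse. This parser is a hypothetical stand-in that only mirrors the usage line above, not the real cyborg-dbsync command definition, and the sample message text is made up:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="cyborg-dbsync revision")
    parser.add_argument("-m", "--message")  # requires exactly one value
    parser.add_argument("--autogenerate", action="store_true")
    return parser


def try_parse(argv):
    """Return parsed args, or None if argparse rejects the command line."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints the usage error to stderr and exits.
        return None


# "--autogenerate" looks like a flag, so "--message" is left without a value.
broken = try_parse(["--message", "--autogenerate"])
fixed = try_parse(["--message", "add description column", "--autogenerate"])
```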

Change-Id: I50cb8f105b309b1f7c7653bca2897d31841e0e46
2020-01-21 00:35:47 +00:00
Yumeng Bao
d443b2770b Update gpu driver
The current GPU driver is out of date: it cannot generate the attribute info
("resource_class"/"rc" and "traits") of the GPU device. As a result,
cyborg-conductor cannot report the device to placement, and nova cannot
schedule the device during nova boot.
This patch adds attribute_list generation to the GPU driver so that
the device info can be reported to placement and users can boot an instance
with a GPU device.

Change-Id: Ifd44d388d8982f90a3e1c5dd35116cae68e80627
2020-01-18 23:53:05 -08:00
chenke
6b28ca2722 Improve UT for cyborg/db device profile
This patch adds UTs for device profile:
1. get
2. update
3. create
4. list
5. delete

Change-Id: Ia6f415b74150cdda94d71dc8b5e760f7758a526e
Story: 2007091
Task: 38120
2020-01-19 15:31:20 +08:00
Zuul
12cacd9b73 Merge "Remove useless get_test_accelerator method and fix uuid error" 2020-01-19 03:26:09 +00:00
Zuul
f0a5d84fcb Merge "Set default value in get fpga trait" 2020-01-19 02:45:51 +00:00
chenke
d8cbe092de Define fake db objects for UT
This is part of a series of unit test improvements for the db layer.

Change-Id: I96a2f8292c7ada1508af32d5b56d746b36abe054
Story: 2007091
Task: 38140
2020-01-19 09:31:52 +08:00
Shaohe Feng
33e74c0b8d testcase for FPGAExtARQ
This is one of a series of ExtARQ bind UT patches.

Change-Id: I5e7dcae051a45cfcd9f27d463f593a79960ce346
2020-01-17 11:10:52 +00:00
chenke
4bf582a849 Use ResourceNotFound replace RP and Image NotFound
This is part of a series of exception cleanups.

ResourceNotFound is sufficient to cover these
NotFound cases.

This patch also adds the cursive package as a dependency; it is used
in [1].

[1]. 6740c3c0c5/cyborg/image/glance.py (L30)

Change-Id: I9e80dcfed54147c942f90c696e483fa6db842dde
Story: 2007045
Task: 37968
2020-01-17 09:08:22 +00:00
chenke
298ab6cc86 Remove useless get_test_accelerator method and fix uuid error
The get_test_accelerator() method is not used in V2,
so we should remove it.

The uuid in cyborg/tests/unit/db/utils.py is one character short.
A UUID string should be 36 characters long, not 35.
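
A quick way to catch this kind of test-data mistake, as a sketch; the helper name is hypothetical:

```python
import uuid


def is_well_formed_uuid(value):
    """Check for the canonical 36-character hyphenated UUID form."""
    if len(value) != 36:
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


full = str(uuid.uuid4())
truncated = full[:-1]  # 35 characters, like the broken test value
```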

Change-Id: I791ef5ee95d6dd9ff9271dc01c72075631a3efaa
2020-01-16 20:22:29 +08:00
Zuul
a39f816b55 Merge "Set ignore_basepython_conflict (fixes confusing pep8 message)" 2020-01-16 00:59:35 +00:00
Zuul
9b3d1db3e8 Merge "Use ResourceNotFound replace DeployableNotFound" 2020-01-15 09:31:18 +00:00
Zuul
becdd8c52b Merge "Use ResourceNotFound replace AttachHandleNotFound" 2020-01-15 09:28:09 +00:00