3767 Commits

Author SHA1 Message Date
Renat Akhmerov
c9e08a8839 Fix "join" when the last indirect inbound task failed
* See bug description for the example that didn't work. It was
  caused by a simple mistake in a Python expression of the form
  "my_set = my_set or set()" that didn't work as expected, i.e.
  it created a new set even if my_set was already an empty set.
  So, the proper expression is
  "my_set = set() if my_set is None else my_set"

Change-Id: I2a787921449fecf3301013a770ffe712e9606baf
Closes-Bug: #1803677
2018-11-16 15:35:18 +07:00
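
The pitfall described in commit c9e08a8839 can be reproduced in a few lines. This is a minimal sketch, not the Mistral code itself: the function names and the `shared` variable are hypothetical and only illustrate why `or` is the wrong test when an empty set is a legitimate argument.

```python
def add_with_or(item, my_set=None):
    # Buggy pattern: an empty set is falsy, so a caller-supplied empty
    # set is silently replaced by a brand-new one.
    my_set = my_set or set()
    my_set.add(item)
    return my_set


def add_with_none_check(item, my_set=None):
    # Only create a new set when the argument was really omitted.
    my_set = set() if my_set is None else my_set
    my_set.add(item)
    return my_set


shared = set()

add_with_or("task1", shared)
print(shared)  # set(): the caller's set was never updated

add_with_none_check("task1", shared)
print(shared)  # {'task1'}
```

The same caveat applies to any falsy-but-valid value (empty list, empty dict, 0), which is why an explicit `is None` check is the safer idiom.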
Zuul
a0c8da92dd Merge "Fix race condition in refreshing "join" task state" 2018-11-15 20:30:33 +00:00
Renat Akhmerov
90ddf442ee Clone cached action definitions
* Once in a while we get DetachedInstanceError for action definitions
  and it happens when they are fetched from cache. We must always
  clone persistent objects before caching them.

Change-Id: I1d0cffea6775eb258dcefc0dbb8a6ee18effe597
Closes-Bug: #1803528
2018-11-15 18:39:21 +07:00
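
The rule stated in commit 90ddf442ee ("always clone persistent objects before caching them") can be sketched as follows. Everything here is an assumption made for illustration: `ActionDefinition`, `load_from_db` and `get_action_definition` are hypothetical names, and `copy.deepcopy` merely stands in for whatever cloning mechanism the real ORM model provides.

```python
import copy


class ActionDefinition:
    """Hypothetical stand-in for a persistent (ORM-mapped) object."""

    def __init__(self, name, spec):
        self.name = name
        self.spec = spec


_cache = {}


def load_from_db(name):
    # Placeholder for a real DB query returning a session-bound object.
    return ActionDefinition(name, {"base": "std.echo"})


def get_action_definition(name):
    if name not in _cache:
        persistent = load_from_db(name)
        # Cache an independent copy, never the session-bound instance,
        # so cached objects do not depend on a DB session that may
        # already be closed when they are read again.
        _cache[name] = copy.deepcopy(persistent)
    return _cache[name]


first = get_action_definition("my_action")
second = get_action_definition("my_action")
print(first is second, first.name)  # True my_action
```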
Renat Akhmerov
05ce6f893d Fix race condition in refreshing "join" task state
* Previously we used periodic jobs to refresh the state of "join" tasks
  and there was a guarantee that only one such job could run at a
  time, so there was no need for locking. Now we allow more
  than one such job to run in parallel processes (and threads), so
  we have to lock the task execution, then check the task state again
  and update it if needed.

Change-Id: Icaad486d9c3f830db0314dedb44664940cca0014
Closes-Bug: #1803483
2018-11-15 11:32:50 +07:00
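
The "lock, then check the state again" pattern from commit 05ce6f893d can be sketched with plain threads. This is an illustration under stated assumptions: `JoinTask` and `refresh_join_state` are hypothetical, and a `threading.Lock` stands in for the lock that a real deployment would take on the task execution in the database.

```python
import threading

WAITING, RUNNING = "WAITING", "RUNNING"


class JoinTask:
    """Hypothetical in-memory stand-in for a "join" task execution."""

    def __init__(self):
        self.state = WAITING
        self.lock = threading.Lock()
        self.transitions = 0


def refresh_join_state(task, is_ready):
    # Cheap unlocked check first; most calls end here.
    if task.state != WAITING:
        return
    with task.lock:
        # Re-check under the lock: another thread may have won the race.
        if task.state == WAITING and is_ready():
            task.state = RUNNING
            task.transitions += 1


task = JoinTask()
threads = [
    threading.Thread(target=refresh_join_state, args=(task, lambda: True))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(task.state, task.transitions)  # RUNNING 1
```

Without the second check inside the lock, several concurrent refreshers could each perform the state transition.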
Zuul
0d85973f52 Merge "Increment versioning with pbr instruction" 2018-11-13 10:48:44 +00:00
Zuul
8059723309 Merge "Update min tox version to 2.0" 2018-11-13 10:05:15 +00:00
Zuul
2c9a0e79fa Merge "Remove setup.py check from pep8 job" 2018-11-13 10:05:15 +00:00
Zuul
4b83cef3af Merge "Divide yaml input to save it into definitions separately." 2018-11-12 10:45:05 +00:00
Sean McGinnis
a82c4918a3 Remove setup.py check from pep8 job
Using the "python setup.py check -r -s" method of checking the package
has been deprecated. The new recommendation is to build the sdist and
wheel, then run "twine check" against the output.

Luckily, there is already a job that covers this and only runs when the
README, setup.py, or setup.cfg files change, making running this in the
pep8 job redundant. This is covered by the test-release-openstack-python3
job that is defined in the publish-to-pypi-python3 template.

More details can be found in this mailing list post:

http://lists.openstack.org/pipermail/openstack-dev/2018-October/136136.html

Change-Id: Iab0d2a2086cfecb8cc609c11de67ebbfc9d4d7d5
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
2018-11-12 08:56:03 +00:00
Zuul
ce4dc70be5 Merge "Refactor action execution checker without using scheduler" 2018-11-10 16:23:21 +00:00
Zuul
c48efec6db Merge "Add batch size for integrity checker" 2018-11-10 15:09:42 +00:00
Zuul
3e5c102ee5 Merge "Simplify workflow and join completion logic" 2018-11-10 11:36:31 +00:00
Zuul
2737586560 Merge "Fix how action result is assigned to task 'state_info' field" 2018-11-09 14:12:52 +00:00
Zuul
4d8d8821ef Merge "Allow None for 'params' when starting a workflow execution" 2018-11-09 11:59:49 +00:00
Renat Akhmerov
43a8bddc24 Fix how action result is assigned to task 'state_info' field
Closes-Bug: #1802477

Change-Id: Ia8848b3bb0417f66422c4995b64be7a803dde1e7
2018-11-09 16:14:25 +07:00
Oleg Ovcharuk
c712e369ed Divide yaml input to save it into definitions separately.
In case of creating/updating multiple workflows from one yaml,
we should not save the whole input to each workflow.

Closes-Bug: #1792975
Change-Id: I724c041ab3441805fcfa2cfc4a50afd774998cc7
Signed-off-by: Oleg Ovcharuk <vgvoleg@gmail.com>
2018-11-09 07:27:44 +00:00
Renat Akhmerov
2d74e6ebac Refactor action execution checker without using scheduler
* Removed the use of the scheduler from the action execution heartbeat
  checker in favor of regular threads.
* Added the new config option "batch_size" under the [action_heartbeat]
  group to limit the number of action executions processed during
  one iteration of the checker.
* Added a test checking that an action execution is automatically
  failed by the heartbeat checker.

Closes-Bug: #1802065
Change-Id: I18c0c2c3159b9294c8af96c93c65a6edfc1de1a1
2018-11-09 14:17:28 +07:00
Renat Akhmerov
3b4136ff1e Add batch size for integrity checker
* Added the new property 'execution_integrity_check_batch_size'
  under the [engine] group to limit the number of task executions
  that the integrity checker may process during one iteration.

Closes-Bug: #1801876
Change-Id: I3c5074c45c476ebff109617cb15d56c54575dd4f
2018-11-09 14:17:27 +07:00
Renat Akhmerov
80a1bed67b Simplify workflow and join completion logic
* The action_queue module is replaced with the more generic
  post_tx_queue module that allows registering operations that must
  run after the main DB transaction associated with processing a
  workflow event, such as the completion of an action.
* Instead of calling the workflow completion check from all places
  where a task may possibly complete, Mistral now registers a post
  transactional operation right inside the task completion logic.
  The operation runs after the main DB transaction (to guarantee
  at least one consistent DB read). It reduces clutter significantly.
* The workflow completion check is now registered only if the just
  completed task may lead to workflow completion, i.e. if it's the
  last one in a workflow branch.
* Join now checks delayed calls to reduce the number of join
  completion checks created with the scheduler and also uses the post
  transactional queue for that.

Closes-Bug: #1801872
Change-Id: I90741d4121c48c42606dfa850cfe824557b095d0
2018-11-09 14:17:20 +07:00
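
The idea of a post-transactional queue from commit 80a1bed67b can be sketched as below. This is not the actual post_tx_queue module: `register_operation`, `transaction` and the `log` list are hypothetical names, and the "transaction" is simulated with log entries rather than a real database.

```python
import threading
from contextlib import contextmanager

_local = threading.local()


def register_operation(func):
    """Queue func to run after the current transaction commits."""
    _local.queue.append(func)


@contextmanager
def transaction(log):
    queue = _local.queue = []
    log.append("begin")
    try:
        yield
    except Exception:
        log.append("rollback")
        queue.clear()  # drop operations registered by a failed transaction
        raise
    else:
        log.append("commit")
    finally:
        _local.queue = None
    # Reached only after a successful commit: run the deferred operations
    # outside the main transaction.
    for operation in queue:
        operation()


log = []
with transaction(log):
    log.append("complete task")
    register_operation(lambda: log.append("check workflow completion"))

print(log)
# ['begin', 'complete task', 'commit', 'check workflow completion']
```

Because the deferred check runs only after the commit, it observes the state written by the main transaction rather than a partially updated one.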
Renat Akhmerov
b413aa087e Allow None for 'params' when starting a workflow execution
Change-Id: Ic28352d9acbe9e3f53a9d33a4ff0a5f99261f53f
Closes-Bug: #1793651
2018-11-08 21:01:21 +07:00
akhiljain23
b1dd0613c4 Update min tox version to 2.0
The commands used by constraints need at least tox 2.0.  Update to
reflect reality, which should help with local running of constraints
targets.

Change-Id: I0bb160bd02b876ed94a3804c88087289f9c3acc2
2018-11-07 07:07:20 +00:00
Renat Akhmerov
3d7acd3957 Improve workflow completion logic by removing periodic jobs
* The workflow completion algorithm used periodic scheduled jobs to
  poll the DB and determine when a workflow is finished. The problem
  with this approach is that if Mistral runs the next iteration
  of such a job too soon, these jobs create a heavy load on the
  system. If it runs too late, a workflow may stay in the
  RUNNING state for too long after all its tasks are completed.
  The current implementation tries to predict the delay before
  the next job should run, based on the number of incomplete tasks.
  This approach was initially taken because we switched to a
  non-blocking transactional model (previously we locked the entire
  workflow execution graph in order to change the state of anything)
  and in this architecture, when we have parallel branches, i.e.
  parallel DB transactions, we can't make a consistent read from
  the DB in any of these transactions to make a reliable decision
  about whether the workflow is completed or not. Using periodic
  jobs was a solution. However, this approach has proven
  unreliable because such a prediction of the delay before the
  next job iteration doesn't work well across the variety of use
  cases that we have.
  This patch removes periodic jobs in favor of the
  "two transactions" approach: in the first transaction we
  handle the action completion event (and task completion if it
  causes it) and in the second transaction, if a task is completed,
  we check if the workflow is completed. This approach guarantees
  that at least one of the "second" transactions in parallel
  branches will make the needed consistent read from the DB (i.e. will
  see the actual state of all needed objects) to make the right
  decision.

Closes-Bug: #1799382
Change-Id: I2333507503b3b8226c184beb0bd783e1dcfa397f
2018-11-07 04:00:04 +00:00
Thomas Herve
ec3d14112c Fix senlin fake client creation
The new openstacksdk mechanism forces a keystone request to find info
about endpoints. We don't need this for the fake client, so skip the
__init__ of the class.

Change-Id: I5b0d89ac57c14f982a6afa638f088d365e0e4ab8
2018-11-06 11:52:32 +01:00
Renat Akhmerov
c39842b849 Fix usage of cachetools in lookup_utils
* In the latest version of cachetools lib (3.0.0) the previously
  deprecated argument "missing" of cache classes has been removed.
* Disabled test_generator failing due to the changes in the
  senlin client until it's fixed by https://review.openstack.org/614211

Change-Id: Iac42f592834734a6fddb743e947860b3bb7e1aba
2018-11-06 15:36:43 +07:00
Zuul
eb402e40ff Merge "Improve join by removing periodic jobs" 2018-10-23 15:13:01 +00:00
Renat Akhmerov
1a4c599a4d Improve join by removing periodic jobs
* This patch removes the DB polling that was needed to
  determine if a "join" task is ready to run. Instead of running
  a periodic scheduled job, each task completion now runs the
  algorithm that finds all potentially affected join tasks
  and schedules just one job (instead of a periodic job) to check
  their readiness.
  This solves the problem of cascading system overload when a
  workflow has many very large joins (many joins with many
  dependencies each). Previously, in such a case Mistral created
  too many periodic jobs that kept the workflow from progressing
  well, i.e. most CPU was used by the scheduler to run those periodic
  jobs, which very rarely switched "join" tasks to the RUNNING state.

Change-Id: I5ebc44c7a3f95c868d653689dc5cea689c788cd0
Closes-Bug: #1799356
2018-10-23 14:01:39 +07:00
Zuul
a2a477b5ea Merge "Mistral install guide" 2018-10-23 05:16:34 +00:00
visnyei
0aa73edbc1 Mistral install guide
First attempt at creating the mistral install guide

Change-Id: I30142b46e36270b573b9ec10201907811040d94b
Signed-off-by: visnyei <andrea.visnyei@nokia.com>
2018-10-19 10:45:29 +02:00
Dougal Matthews
0b38cd8028 Reduce the concurrency in the 500 wb join Rally task
This should reduce the number of deadlocks in MySQL, making the job more stable.

Change-Id: I06fed65a0321e4381d46a93693f9f3d622a73b8b
2018-10-17 09:36:29 +01:00
Zuul
70c269e7d1 Merge "Fix next link in get resource list rest API" 2018-10-16 06:06:53 +00:00
Zuul
cdd1539ada Merge "An execution hangs in the RUNNING state after rerun" 2018-10-15 22:42:44 +00:00
Zuul
937a0a1633 Merge "Update version.version_string to actually be a string" 2018-10-15 07:18:53 +00:00
Zuul
e6a8f74a6a Merge "Update OnClauseSPec task name criteria" 2018-10-15 07:18:52 +00:00
Zuul
ef95bae717 Merge "make user_info_endpoint_url independent of auth_url" 2018-10-15 07:14:46 +00:00
Vitalii Solodilov
041f3bd35c An execution hangs in the RUNNING state after rerun
When we rerun an execution we must create the "_check_and_complete"
delayed calls for all parent workflows. The problem was that we created
the delayed call only for the rerun execution and its parent.

The recursive rerun was extracted into a separate function because we
need to execute some additional operations, for example, creating a
delayed call for every rerun execution.

Change-Id: I530094e916daf25bb9c672c445afa980ad4311ae
Closes-Bug: #1792451
Signed-off-by: Vitalii Solodilov <mcdkr@yandex.ru>
2018-10-14 10:29:16 +00:00
Renat Akhmerov
991734a294 Add sqlalchemy.exc.OperationalError to the retry decorator
* Currently Mistral retries a DB transaction only in case of a DB
  deadlock (which often happens on MySQL) or a connection error. Both
  make sense to retry because the issue may be temporary. This
  patch also adds sqlalchemy.exc.OperationalError to the list of
  retriable exceptions since some of the errors wrapped in this
  exception may also be temporary, such as the "Too many connections"
  error thrown by MySQL. Some errors may not make sense to retry
  (like an SQL error) but this shouldn't be a problem because
  most of them will occur during development/testing and
  will be fixed before going into production. Even if one occurs
  in real production, the worst that will happen is retrying
  a DB transaction up to the maximum number of attempts,
  currently hardcoded to 50.

Change-Id: Ie2fe988cdb8e4ca88c3e51f510d87320d3fca9a6
Closes-Bug: #1796242
2018-10-14 10:27:45 +00:00
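
The retry behaviour described in commit 991734a294 can be sketched with a small decorator. This is not Mistral's actual retry decorator: the exception classes below are hypothetical stand-ins (so the example runs without SQLAlchemy or oslo.db), and `retry_on_db_error` and `flaky_transaction` are names invented for illustration. Only the limit of 50 attempts is taken from the commit message.

```python
import functools
import time


# Hypothetical stand-ins for the real DB exception classes.
class DBDeadlock(Exception):
    pass


class DBConnectionError(Exception):
    pass


class OperationalError(Exception):
    pass


RETRIABLE = (DBDeadlock, DBConnectionError, OperationalError)


def retry_on_db_error(max_attempts=50, delay=0.0):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except RETRIABLE:
                    # Give up and re-raise once the attempts are exhausted.
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on_db_error(max_attempts=5)
def flaky_transaction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OperationalError("Too many connections")
    return "committed"


print(flaky_transaction(), calls["count"])  # committed 3
```

A non-temporary error wrapped in the same exception type would simply be retried until the attempt limit is reached and then re-raised, which matches the worst case the commit message describes.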
Eyal
ae23de737d make user_info_endpoint_url independent of auth_url
The client should be able to create a token using "auth_url" (e.g. "https://keycloak:7443/auth").
The server should be able to validate the token using "user_info_endpoint_url" (e.g. "https://cbnd:9443/something/custom").
The change should also be backward compatible.

Change-Id: I437fde40345af52483cc764e5dc6a1f55f1b3e88
2018-10-14 09:21:52 +03:00
Sean McGinnis
b902b96392 Increment versioning with pbr instruction
With moving away from required milestone releases, the version numbers
calculated by PBR on the master branch will not work for those testing
upgrades from the last stable release. More details can be found in the
mailing list post here:

    http://lists.openstack.org/pipermail/openstack-dev/2018-October/135706.html

This is an empty commit that will cause PBR to increment its calculated
version to get around this.

PBR will see the following which will cause it to increment the version:

Sem-Ver: feature

Please merge this patch as soon as possible to support those testing
upgrades.

Change-Id: Ibd0fc3036014a4cfaa76d48c707ccb04cfc869a2
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
2018-10-12 13:10:47 -05:00
Bob Haddleton
e98614cd60 Update OnClauseSPec task name criteria
The OnClauseSpec required task names to match \w+ (i.e. [a-zA-Z0-9_]+),
which is not enforced by the DSL, so it was possible to have
valid task names that could not be referenced in an on-clause.

YAML enforces some restrictions on characters in task names (#, !, |)
but other than that any JSON-schema valid string should be a valid task
name.

Change-Id: I3f1056cad7c67e160a082c2a0de2e3bfd476bc63
Closes-Bug: 1797439
2018-10-12 11:58:22 -05:00
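
The restriction described in commit e98614cd60 is easy to see with Python's `re` module. This is an illustrative sketch only: the `\w+` pattern comes from the commit message, while the sample task names and the use of `fullmatch` are assumptions, not the actual OnClauseSpec validation code.

```python
import re

# The old restriction on task names in an on-clause.
OLD_TASK_NAME = re.compile(r"\w+")

for name in ["task_1", "send-email", "deploy.vm"]:
    print(name, bool(OLD_TASK_NAME.fullmatch(name)))
# task_1 True
# send-email False
# deploy.vm False
```

Names such as "send-email" are perfectly valid YAML keys, so a workflow could define them as tasks while the old pattern made them impossible to reference.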
hardikj
08ba20c8d2 Fix next link in get resource list rest API
Fix the issue where the `next` link in the REST API response, when
used with pagination, gets corrupted and becomes unusable. After this
fix, the next link should be readily usable.

Change-Id: Idf45a59e0b07d8306cc82391679fe30a9cd2f0c1
Closes-Bug: #1793344
2018-10-12 16:04:00 +05:30
Dougal Matthews
9be7e928d6 Remove remaining references to the rpc_backend
A number of configuration options provided by oslo.messaging were
deprecated in Ocata and have now been removed. See
https://docs.openstack.org/releasenotes/oslo.messaging/unreleased.html#upgrade-notes
for more details.

* Because of the removal of a number of options from
  [oslo_messaging_rabbit] some code related to them and the
  corresponding tests for the Kombu RPC now don't make sense
  and so they've been removed by this patch.
* Style/formatting changes in the Kombu RPC tests.

Change-Id: I37c71dbe4bb270367f5434b0b8c2557e29a9b1df
2018-10-12 11:23:50 +07:00
Bob Haddleton
488b40834f Update version.version_string to actually be a string
pbr provides VersionInfo.version_string() as a method to determine
the version.  mistral.version.version_string can hide this and
provide a uniform string interface to config and launch.

This fixes a problem with mistral-server --version generating
an exception because the version passed to argparse was a method
instead of a string.

Change-Id: Ie468685e4360bfaec5d82b02f8cf1a27a93bcd94
Closes-Bug: 1796921
2018-10-09 11:30:10 -05:00
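
The difference that commit 488b40834f addresses, a method versus the string it returns, can be sketched as follows. The `VersionInfo` class here is a hypothetical stand-in for pbr's class of the same name, and the version number and program name are made up for the example.

```python
import argparse


class VersionInfo:
    """Hypothetical stand-in for pbr's VersionInfo."""

    def version_string(self):
        return "8.0.0"


info = VersionInfo()

as_method = info.version_string    # a bound method, not a string
as_string = info.version_string()  # the actual version string

# argparse's "version" action expects a string to print.
parser = argparse.ArgumentParser(prog="demo-server")
parser.add_argument("--version", action="version", version=as_string)

print(type(as_method).__name__, type(as_string).__name__)  # method str
```

Exposing the already-evaluated string from one module gives every consumer (config, launcher, CLI) the same plain `str` value.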
Nguyen Van Trung
64622cff7b Don't quote {posargs} in tox.ini
Quotes around {posargs} cause the entire string to be combined into one
arg that gets passed to stestr. This prevents passing multiple args
(e.g. '--concurrency=16 some-regex')

Change-Id: I25ebe667cce8a2a35f9b119b76bbed7851e458f5
2018-10-09 10:16:17 +07:00
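
The effect of quoting described in commit 64622cff7b can be approximated with `shlex.split` from the standard library. This is only an approximation under stated assumptions: tox performs its own command parsing, and the `stestr run` command line below is an illustrative guess at the shape of the tox.ini command, not a copy of it.

```python
import shlex

# What a user might pass after "--" on the tox command line.
posargs = "--concurrency=16 some-regex"

quoted = shlex.split('stestr run "{}"'.format(posargs))
unquoted = shlex.split("stestr run {}".format(posargs))

print(quoted)    # ['stestr', 'run', '--concurrency=16 some-regex']
print(unquoted)  # ['stestr', 'run', '--concurrency=16', 'some-regex']
```

With the quotes in place the test runner receives one argument containing a space instead of an option followed by a regex.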
Zuul
b3fbb03b81 Merge "Add entry point to allow for oslo.policy CLI usage" 2018-10-08 12:33:22 +00:00
Zuul
fc9876ec7b Merge "Make task execution logging more readable and informative" 2018-10-08 09:28:45 +00:00
Dougal Matthews
f0b49a82b4 Add a release note for Ic98e2db02abd8483591756d73e06784cc2e9cbe3
Change-Id: I484b47f9e0d94c37079961d18061e2e9e121af92
2018-10-01 10:56:08 +01:00
Renat Akhmerov
c802ad2851 Make task execution logging more readable and informative
* Made the debug log statement for tasks more readable
* Minor style changes

Change-Id: I841c15230fe2bc1e605a985bb1b9cd6131ac795c
2018-10-01 11:37:48 +07:00
Thomas Herve
5c005a7926 Cleanup transport along RPC clients
This fixes an odd failure condition in the API server related to
cron-triggers and SIGHUP. The parent API server creates an RPC connection
when creating workflows from cron. If a SIGHUP signal arrives afterwards,
the child inherits the connection, but it's non-functional.

Change-Id: Ic98e2db02abd8483591756d73e06784cc2e9cbe3
Closes-Bug: #1789680
2018-09-27 11:45:42 +02:00
Lance Bragstad
5e3cdec918 Add entry point to allow for oslo.policy CLI usage
The oslo.policy library exposes entry points so that users can
generate sample policy files and templates. The entry points do
expect some things to be done by the service in order to work,
though.

This commit adds an entry point for oslo.policy so that it can
consume an enforcer that has been initialized with Mistral's
policies. The library will use this to generate useful things
for users like templates and sample policy files.

Change-Id: Ib442fbb79b5c237d634586c3169cf8c7f595da1c
Closes-Bug: 1793346
2018-09-19 16:25:33 +00:00
Zuul
8720a2711b Merge "Fix how Mistral calculates workflow output" 2018-09-13 10:16:12 +00:00