Add an article about profiling into the docs
Change-Id: I79befa91fcaf393ec19498372f1e6c6953151867
This commit is contained in:
parent
0a3880c22c
commit
cc825ebcd3
@ -7,5 +7,6 @@ Contributor Documentation
|
||||
|
||||
coding_guidelines
|
||||
debugging_and_testing
|
||||
profiling
|
||||
troubleshooting
|
||||
devstack
|
||||
|
298
doc/source/developer/contributor/profiling.rst
Normal file
298
doc/source/developer/contributor/profiling.rst
Normal file
@ -0,0 +1,298 @@
|
||||
Profiling Mistral
|
||||
=================
|
||||
|
||||
What Is Profiling?
|
||||
------------------
|
||||
Profiling is a procedure for gathering runtime statistics about certain code
|
||||
snippets like:
|
||||
|
||||
- The maximum run time
|
||||
- The minimum run time
|
||||
- The average run time
|
||||
- The number of runs
|
||||
|
||||
Such info is a key to understanding performance bottlenecks residing in
|
||||
a system. Having these metrics, we can focus on places in code that slow
|
||||
down the system most and come up with optimisations to improve them.
|
||||
|
||||
A typical code snippet eligible for gathering this kind of information is a
|
||||
function or a method since, most popular engineering techniques encourage
|
||||
developers to decompose code into functions/methods representing well defined
|
||||
parts of program logic. However, any arbitrary piece of code may be a target
|
||||
for measuring.
|
||||
|
||||
'osprofiler' Project
|
||||
--------------------
|
||||
|
||||
`osprofiler <https://osprofiler.readthedocs.io/en/latest/>`_ is a project
|
||||
created within the OpenStack ecosystem to do profiling. The paragraphs below
|
||||
explain how Mistral uses 'osprofiler' for profiling. The central concept of
|
||||
'osprofiler' is a profile trace. A developer can mark code snippets with
|
||||
profiler traces and 'osprofiler' will be tracking them. In general,
|
||||
'osprofiler' allows cross-service profiling, that is, tracking a chain of
|
||||
calls that belong to different RESTful services but related with the same
|
||||
user request. However, this guide doesn't cover this more complex use case
|
||||
and focus on profiling within just one service, Mistral.
|
||||
|
||||
Profiler Traces
|
||||
---------------
|
||||
|
||||
The most common way to create a profiler trace in the code is adding a
|
||||
special ''@trace"
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from osprofiler import profiler
|
||||
|
||||
|
||||
class DefaultEngine(base.Engine):
|
||||
...
|
||||
|
||||
@profiler.trace('engine-on-action-complete', hide_args=True)
|
||||
def on_action_complete(self, action_ex_id, result, wf_action=False,
|
||||
async_=False):
|
||||
with db_api.transaction():
|
||||
if wf_action:
|
||||
action_ex = db_api.get_workflow_execution(action_ex_id)
|
||||
|
||||
if result is None:
|
||||
result = ml_actions.Result(data=action_ex.output)
|
||||
else:
|
||||
action_ex = db_api.get_action_execution(action_ex_id)
|
||||
|
||||
action_handler.on_action_complete(action_ex, result)
|
||||
|
||||
return action_ex.get_clone()
|
||||
|
||||
In this example, we applied a special decorator to a method that adds a
|
||||
profiling trace. The most important argument of the decorator is a trace
|
||||
name. Its value is 'engine-on-action-complete' in our case. The second
|
||||
argument 'hide_args' defines whether 'osprofiler' needs to pass method
|
||||
argument values down to other layers. More specifically, there's a notion
|
||||
metrics collector in 'osprofiler' that accumulates info about traces
|
||||
in any desirable form, it depends on a particular implementation. This
|
||||
topic though is out of the scope of this document. For our purposes, it's
|
||||
better to set this argument to **True** which will not lead to loosing
|
||||
performance on processing additional data (argument values of all method
|
||||
calls).
|
||||
|
||||
Another way of adding a profiling trace is the following:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
try:
|
||||
profiler.start("engine-on-action-complete")
|
||||
|
||||
action_handler.on_action_complete(action_ex, result)
|
||||
finally:
|
||||
profiler.stop()
|
||||
|
||||
|
||||
Here we don't decorate the entire method, we only want to profile just one
|
||||
line of code. But like in the previous example, we added a profiling trace.
|
||||
The obvious advantage of using the decorator is that it can live in code
|
||||
permanently because it doesn't pollute it too much and we can use them any
|
||||
time we want to profile the system.
|
||||
|
||||
Even simpler and more concise way to achieve the same is use a special
|
||||
context manager from 'osprofiler':
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
with profiler.Trace('engine-on-action-complete'):
|
||||
action_handler.on_action_complete(action_ex, result)
|
||||
|
||||
Configuring Mistral for Profiling
|
||||
---------------------------------
|
||||
|
||||
To start a profiling session, one needs to make the steps below.
|
||||
|
||||
Mistral Configuration File
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Make these change in the config file:
|
||||
|
||||
.. code-block:: cfg
|
||||
|
||||
[DEFAULT]
|
||||
log_config_append = wf_trace_logging.conf
|
||||
|
||||
[profiler]
|
||||
enabled = True
|
||||
hmac_keys = secret_word
|
||||
|
||||
Defining the 'log_config_append' property allows to have all the logging
|
||||
configuration in a separate file. In the example above, it's called
|
||||
'wf_trace_logging.conf' but it can have a different name, if needed.
|
||||
'[profiler]' group directly refers to the 'osprofiler' project and is
|
||||
brought by it. The property 'enabled' is self-explaining, but the other one
|
||||
is not. The value of the property 'hmac_keys' basically needs to be known
|
||||
by someone who wants to start a profiling session. This value needs to be
|
||||
passed as part of the user request. It will be shown a bit later.
|
||||
|
||||
Logging Configuration File
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The content of the logging configuration file conforms the documentation for
|
||||
the standard 'logging' Python module. Find more details at
|
||||
https://docs.python.org/3/library/logging.config.html#configuration-file-format
|
||||
|
||||
This particular example of the logging file configures three different loggers
|
||||
and their corresponding counterparts like handlers. For the purpose of this
|
||||
document though we only need to pay attention how 'profiler_trace' logger is
|
||||
configure. Every entity starting with 'profiler' is related to profiling
|
||||
configuration. The reason why other loggers are also included here is to show
|
||||
how different loggers can coexist within one configuration file and how they
|
||||
can reuse same entities.
|
||||
|
||||
|
||||
.. code-block:: cfg
|
||||
|
||||
[loggers]
|
||||
keys=workflow_trace,profiler_trace,root
|
||||
|
||||
[handlers]
|
||||
keys=consoleHandler, wfTraceFileHandler, profilerFileHandler, fileHandler
|
||||
|
||||
[formatters]
|
||||
keys=wfFormatter, profilerFormatter, simpleFormatter, verboseFormatter
|
||||
|
||||
[logger_workflow_trace]
|
||||
level=INFO
|
||||
handlers=consoleHandler, wfTraceFileHandler
|
||||
qualname=workflow_trace
|
||||
propagate=0
|
||||
|
||||
[logger_profiler_trace]
|
||||
level=INFO
|
||||
handlers=profilerFileHandler
|
||||
qualname=profiler_trace
|
||||
|
||||
[logger_root]
|
||||
level=DEBUG
|
||||
handlers=fileHandler
|
||||
|
||||
[handler_fileHandler]
|
||||
class=FileHandler
|
||||
level=DEBUG
|
||||
formatter=verboseFormatter
|
||||
args=("/tmp/mistral.log",)
|
||||
|
||||
[handler_consoleHandler]
|
||||
class=StreamHandler
|
||||
level=INFO
|
||||
formatter=simpleFormatter
|
||||
args=(sys.stdout,)
|
||||
|
||||
[handler_wfTraceFileHandler]
|
||||
class=FileHandler
|
||||
level=INFO
|
||||
formatter=wfFormatter
|
||||
args=("/tmp/mistral_wf_trace.log",)
|
||||
|
||||
[handler_profilerFileHandler]
|
||||
class=FileHandler
|
||||
level=INFO
|
||||
formatter=profilerFormatter
|
||||
args=("/tmp/mistral_osprofile.log",)
|
||||
|
||||
[formatter_verboseFormatter]
|
||||
format=%(asctime)s %(thread)s %(levelname)s %(module)s [-] %(message)s
|
||||
datefmt=
|
||||
|
||||
[formatter_simpleFormatter]
|
||||
format=%(asctime)s - %(message)s
|
||||
datefmt=%y-%m-%d %H:%M:%S
|
||||
|
||||
[formatter_wfFormatter]
|
||||
format=%(asctime)s WF [-] %(message)s
|
||||
datefmt=
|
||||
|
||||
[formatter_profilerFormatter]
|
||||
format=%(message)s
|
||||
datefmt=%H:%M:%S
|
||||
|
||||
|
||||
Triggering Profiling Sessions
|
||||
-----------------------------
|
||||
|
||||
Once Mistral is configured like explained above, in order to start a
|
||||
profiling session we need to make a user request to Mistral that we
|
||||
want to analyse but adding one property to it. The name of the property
|
||||
is 'profile' and it needs to be set to the value of the 'hmac_keys'
|
||||
property from the main configuration file.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
$ mistral execution-create my_slow_workflow --profile secret_word
|
||||
|
||||
Profiling Session Result
|
||||
------------------------
|
||||
|
||||
When started in a profiling mode like just shown, Mistral will be writing
|
||||
info about the profiling traces into the configured file. In our case it is
|
||||
'/tmp/mistral_osprofile.log'.
|
||||
|
||||
.. code-block:: cfg
|
||||
|
||||
2020-02-27T08:04:25.789433 f12e75d5-5d59-4cbc-b74d-357f19290dd7 f12e75d5-5d59-4cbc-b74d-357f19290dd7 b9b29981-0916-4635-af18-d6c92f991f46 engine-start-workflow-start
|
||||
2020-02-27T08:04:25.790232 f12e75d5-5d59-4cbc-b74d-357f19290dd7 b9b29981-0916-4635-af18-d6c92f991f46 3cdd41b5-318a-4926-a38e-63344b6aef7a workflow-handler-start-workflow-start
|
||||
2020-02-27T08:04:25.812879 f12e75d5-5d59-4cbc-b74d-357f19290dd7 3cdd41b5-318a-4926-a38e-63344b6aef7a 603f1fab-be78-438d-af13-d94ed3b7e416 workflow-start-start
|
||||
2020-02-27T08:04:25.954502 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 b1d0a77a-52f5-4415-a6c4-f16b3591a47d workflow-set-state-start
|
||||
2020-02-27T08:04:25.961298 0.006782 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 b1d0a77a-52f5-4415-a6c4-f16b3591a47d workflow-set-state-stop
|
||||
2020-02-27T08:04:25.961769 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 27b58351-aebe-4e37-9cec-91fdbef5c68b wf-controller-get-controller-start
|
||||
2020-02-27T08:04:25.962041 0.000267 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 27b58351-aebe-4e37-9cec-91fdbef5c68b wf-controller-get-controller-stop
|
||||
2020-02-27T08:04:25.962311 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 605ebfc2-a2bb-4fe1-8159-fc16f6741f5f workflow-controller-continue-workflow-start
|
||||
2020-02-27T08:04:26.023134 0.060832 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 605ebfc2-a2bb-4fe1-8159-fc16f6741f5f workflow-controller-continue-workflow-stop
|
||||
2020-02-27T08:04:26.023600 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 3a5a384a-9598-4844-a740-981f92e604af dispatcher-dispatch-commands-start
|
||||
2020-02-27T08:04:26.023918 f12e75d5-5d59-4cbc-b74d-357f19290dd7 3a5a384a-9598-4844-a740-981f92e604af d84a13e4-4763-4321-ab08-8cbd19656f2f task-handler-run-task-start
|
||||
2020-02-27T08:04:26.024179 f12e75d5-5d59-4cbc-b74d-357f19290dd7 d84a13e4-4763-4321-ab08-8cbd19656f2f 7878e4f8-aaaa-4b9b-b15a-35848b5cdd61 task-handler-build-task-from-command-start
|
||||
2020-02-27T08:04:26.024422 0.000243 f12e75d5-5d59-4cbc-b74d-357f19290dd7 d84a13e4-4763-4321-ab08-8cbd19656f2f 7878e4f8-aaaa-4b9b-b15a-35848b5cdd61 task-handler-build-task-from-command-stop
|
||||
|
||||
So any time Mistral runs code marked as a profiling trace it prints two
|
||||
entries into the file: right before the code snippet starts and right
|
||||
after its completion. Notice also that for the corresponding "-stop" entry
|
||||
(the suffix going after the trace name) Mistral prints an additional number
|
||||
in the second column. This is a duration of the code snippet.
|
||||
|
||||
This content of this file itself is probably not so useful (although, it
|
||||
might be for some purpose) but based on it we can build the following
|
||||
report:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
Total time | Max time | Avg time | Occurrences | Trace name
|
||||
-------------------------------------------------------------------------------------------
|
||||
2948.326 8.612 1.218 2420 engine-on-action-complete
|
||||
2859.172 8.516 1.181 2420 action-handler-on-action-complete
|
||||
2812.726 8.482 1.162 2420 task-handler-on-action-complete
|
||||
2767.836 8.412 1.144 2420 regular-task-on-action-complete
|
||||
2766.199 8.411 1.143 2420 task-complete
|
||||
2702.764 8.351 0.460 5878 task-run
|
||||
2506.531 8.354 0.850 2948 dispatcher-dispatch-commands
|
||||
2503.398 8.353 0.437 5735 task-handler-run-task
|
||||
2488.940 8.350 0.434 5735 task-run-new
|
||||
1669.179 54.737 0.881 1894 default-executor-run-action
|
||||
1201.582 3.687 0.497 2420 regular-task-get-action-input
|
||||
1126.351 2.093 0.476 2366 ad-hoc-action-validate-input
|
||||
1125.129 2.092 0.238 4732 ad-hoc-action-prepare-input
|
||||
687.619 7.594 0.651 1056 task-handler-refresh-task-state
|
||||
387.622 3.872 0.300 1291 workflow-handler-check-and-fix-integrity
|
||||
234.231 4.068 0.392 597 workflow-handler-check-and-complete
|
||||
224.026 4.042 0.375 597 workflow-check-and-complete
|
||||
210.184 6.694 1.470 143 task-run-existing
|
||||
160.118 8.343 0.304 526 workflow-action-schedule
|
||||
141.398 4.546 0.268 528 workflow-handler-start-workflow
|
||||
109.641 4.361 0.208 528 workflow-start
|
||||
78.683 2.004 0.077 1024 direct-wf-controller-get-join-logical-state
|
||||
|
||||
...
|
||||
|
||||
To generate this report, run:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
$ python tools/rank_profiled_methods.py /tmp/mistral_osprofile.log report.txt
|
||||
|
||||
And this report is somewhat really useful when it comes to analysing
|
||||
performance bottlenecks. All times are shown in seconds.
|
Loading…
Reference in New Issue
Block a user