Add an article about profiling into the docs
Change-Id: I79befa91fcaf393ec19498372f1e6c6953151867
This commit is contained in:
parent
0a3880c22c
commit
cc825ebcd3
@ -7,5 +7,6 @@ Contributor Documentation
coding_guidelines
debugging_and_testing
profiling
troubleshooting
devstack
298
doc/source/developer/contributor/profiling.rst
Normal file
@ -0,0 +1,298 @@
Profiling Mistral
=================

What Is Profiling?
------------------
Profiling is a procedure for gathering runtime statistics about certain code
snippets, such as:

- The maximum run time
- The minimum run time
- The average run time
- The number of runs

Such information is key to understanding the performance bottlenecks in
a system. With these metrics, we can focus on the places in the code that slow
the system down the most and come up with optimisations to improve them.

A typical code snippet eligible for gathering this kind of information is a
function or a method, since most popular engineering techniques encourage
developers to decompose code into functions/methods representing well-defined
parts of program logic. However, any arbitrary piece of code may be a target
for measuring.
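
As a rough illustration of the statistics listed above, here is a minimal sketch that gathers them for a single function using only the Python standard library. It is not how Mistral or 'osprofiler' collects data; the names ``timed``, ``stats`` and ``_durations`` are hypothetical and exist only for this example.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: trace name -> list of durations in seconds.
_durations = defaultdict(list)


def timed(name):
    """Record the wall-clock duration of every call under the given name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if the call raises, so failed runs are counted too.
                _durations[name].append(time.perf_counter() - started)
        return wrapper
    return decorator


def stats(name):
    """Return the maximum, minimum, average and number of runs for a name."""
    runs = _durations[name]
    return {
        'max': max(runs),
        'min': min(runs),
        'avg': sum(runs) / len(runs),
        'count': len(runs),
    }


@timed('square')
def square(x):
    return x * x


for i in range(5):
    square(i)

print(stats('square'))
```

The decorator form mirrors the observation above that a function or a method is the most natural unit to measure.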

'osprofiler' Project
--------------------

`osprofiler <https://osprofiler.readthedocs.io/en/latest/>`_ is a project
created within the OpenStack ecosystem to do profiling. The paragraphs below
explain how Mistral uses 'osprofiler' for profiling. The central concept of
'osprofiler' is a profiler trace. A developer can mark code snippets with
profiler traces and 'osprofiler' will track them. In general,
'osprofiler' allows cross-service profiling, that is, tracking a chain of
calls that belong to different RESTful services but are related to the same
user request. However, this guide doesn't cover this more complex use case
and focuses on profiling within just one service, Mistral.

Profiler Traces
---------------

The most common way to create a profiler trace in the code is adding a
special '@trace' decorator:

.. code-block:: python

    from osprofiler import profiler


    class DefaultEngine(base.Engine):
        ...

        @profiler.trace('engine-on-action-complete', hide_args=True)
        def on_action_complete(self, action_ex_id, result, wf_action=False,
                               async_=False):
            with db_api.transaction():
                if wf_action:
                    action_ex = db_api.get_workflow_execution(action_ex_id)

                    if result is None:
                        result = ml_actions.Result(data=action_ex.output)
                else:
                    action_ex = db_api.get_action_execution(action_ex_id)

                action_handler.on_action_complete(action_ex, result)

                return action_ex.get_clone()

In this example, we applied a special decorator to a method, which adds a
profiling trace. The most important argument of the decorator is the trace
name. Its value is 'engine-on-action-complete' in our case. The second
argument, 'hide_args', defines whether 'osprofiler' needs to pass method
argument values down to other layers. More specifically, 'osprofiler' has a
notion of a metrics collector that accumulates information about traces in
whatever form a particular implementation chooses. That topic, though, is out
of the scope of this document. For our purposes, it's better to set this
argument to **True** so that no performance is lost on processing additional
data (argument values of all method calls).

Another way of adding a profiling trace is the following:

.. code-block:: python

    try:
        profiler.start("engine-on-action-complete")

        action_handler.on_action_complete(action_ex, result)
    finally:
        profiler.stop()

Here we don't decorate the entire method because we only want to profile one
line of code. But like in the previous example, we added a profiling trace.
The obvious advantage of using the decorator is that it can live in the code
permanently, because it doesn't pollute the code too much, and we can use it
any time we want to profile the system.

An even simpler and more concise way to achieve the same is to use a special
context manager from 'osprofiler':

.. code-block:: python

    with profiler.Trace('engine-on-action-complete'):
        action_handler.on_action_complete(action_ex, result)
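
To see why the context manager form is equivalent to the explicit start/stop pair wrapped in try/finally, consider the sketch below. It does not use 'osprofiler' at all: ``start``, ``stop``, ``trace`` and ``events`` are hypothetical stand-ins written for this illustration, and the real library may be implemented differently.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a real trace collector.
events = []


def start(name):
    events.append(name + '-start')


def stop(name):
    events.append(name + '-stop')


@contextmanager
def trace(name):
    """Wrap a block of code in a start/stop pair."""
    start(name)
    try:
        yield
    finally:
        # The stop entry is recorded even if the wrapped code raises.
        stop(name)


with trace('outer'):
    with trace('inner'):
        pass

print(events)
```

Nesting the blocks produces nested start/stop entries, which is the same shape as the entries Mistral writes during a profiling session.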

Configuring Mistral for Profiling
---------------------------------

To start a profiling session, follow the steps below.

Mistral Configuration File
^^^^^^^^^^^^^^^^^^^^^^^^^^

Make these changes in the config file:

.. code-block:: cfg

    [DEFAULT]
    log_config_append = wf_trace_logging.conf

    [profiler]
    enabled = True
    hmac_keys = secret_word

Defining the 'log_config_append' property allows keeping all the logging
configuration in a separate file. In the example above, it's called
'wf_trace_logging.conf' but it can have a different name, if needed.
The '[profiler]' group directly refers to the 'osprofiler' project and is
brought in by it. The property 'enabled' is self-explanatory, but the other
one is not. The value of the property 'hmac_keys' needs to be known by anyone
who wants to start a profiling session, because it must be passed as part of
the user request. This is shown a bit later.

Logging Configuration File
^^^^^^^^^^^^^^^^^^^^^^^^^^

The content of the logging configuration file conforms to the format
documented for the standard 'logging' Python module. Find more details at
https://docs.python.org/3/library/logging.config.html#configuration-file-format

This particular example of the logging file configures three different loggers
and their corresponding counterparts, such as handlers. For the purposes of
this document, though, we only need to pay attention to how the
'profiler_trace' logger is configured. Every entity starting with 'profiler'
is related to profiling configuration. The other loggers are included here to
show how different loggers can coexist within one configuration file and how
they can reuse the same entities.

.. code-block:: cfg

    [loggers]
    keys=workflow_trace,profiler_trace,root

    [handlers]
    keys=consoleHandler, wfTraceFileHandler, profilerFileHandler, fileHandler

    [formatters]
    keys=wfFormatter, profilerFormatter, simpleFormatter, verboseFormatter

    [logger_workflow_trace]
    level=INFO
    handlers=consoleHandler, wfTraceFileHandler
    qualname=workflow_trace
    propagate=0

    [logger_profiler_trace]
    level=INFO
    handlers=profilerFileHandler
    qualname=profiler_trace

    [logger_root]
    level=DEBUG
    handlers=fileHandler

    [handler_fileHandler]
    class=FileHandler
    level=DEBUG
    formatter=verboseFormatter
    args=("/tmp/mistral.log",)

    [handler_consoleHandler]
    class=StreamHandler
    level=INFO
    formatter=simpleFormatter
    args=(sys.stdout,)

    [handler_wfTraceFileHandler]
    class=FileHandler
    level=INFO
    formatter=wfFormatter
    args=("/tmp/mistral_wf_trace.log",)

    [handler_profilerFileHandler]
    class=FileHandler
    level=INFO
    formatter=profilerFormatter
    args=("/tmp/mistral_osprofile.log",)

    [formatter_verboseFormatter]
    format=%(asctime)s %(thread)s %(levelname)s %(module)s [-] %(message)s
    datefmt=

    [formatter_simpleFormatter]
    format=%(asctime)s - %(message)s
    datefmt=%y-%m-%d %H:%M:%S

    [formatter_wfFormatter]
    format=%(asctime)s WF [-] %(message)s
    datefmt=

    [formatter_profilerFormatter]
    format=%(message)s
    datefmt=%H:%M:%S
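
To check how a file in this format behaves outside Mistral, it can be loaded with the standard ``logging.config.fileConfig`` function. The sketch below uses a reduced configuration that keeps only the 'profiler_trace' logger and a root logger. The temporary paths, the console handler settings and the sample message are assumptions made for this example, not part of the Mistral setup.

```python
import logging
import logging.config
import os
import tempfile

# Write everything into a throwaway directory instead of /tmp/mistral_*.log.
workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, 'osprofile.log')
conf_path = os.path.join(workdir, 'logging.conf')

CONFIG = """
[loggers]
keys=profiler_trace,root

[handlers]
keys=profilerFileHandler,consoleHandler

[formatters]
keys=profilerFormatter

[logger_profiler_trace]
level=INFO
handlers=profilerFileHandler
qualname=profiler_trace

[logger_root]
level=WARNING
handlers=consoleHandler

[handler_profilerFileHandler]
class=FileHandler
level=INFO
formatter=profilerFormatter
args=({log_path!r},)

[handler_consoleHandler]
class=StreamHandler
level=WARNING
formatter=profilerFormatter
args=(sys.stderr,)

[formatter_profilerFormatter]
format=%(message)s
"""

with open(conf_path, 'w') as f:
    f.write(CONFIG.format(log_path=log_path))

logging.config.fileConfig(conf_path, disable_existing_loggers=False)

# Anything logged under the 'profiler_trace' name goes to the profiler file.
logging.getLogger('profiler_trace').info('engine-on-action-complete-start')

with open(log_path) as f:
    print(f.read())
```

Because the formatter is just ``%(message)s``, the file contains the message text and nothing else, which keeps the profiler log easy to parse.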

Triggering Profiling Sessions
-----------------------------

Once Mistral is configured as explained above, to start a profiling session
we need to make the user request to Mistral that we want to analyse, adding
one property to it. The name of the property is 'profile' and it needs to be
set to the value of the 'hmac_keys' property from the main configuration file.

.. code-block:: bash

    $ mistral execution-create my_slow_workflow --profile secret_word

Profiling Session Result
------------------------

When started in profiling mode as just shown, Mistral writes information
about the profiling traces into the configured file. In our case it is
'/tmp/mistral_osprofile.log'.

.. code-block:: cfg

    2020-02-27T08:04:25.789433 f12e75d5-5d59-4cbc-b74d-357f19290dd7 f12e75d5-5d59-4cbc-b74d-357f19290dd7 b9b29981-0916-4635-af18-d6c92f991f46 engine-start-workflow-start
    2020-02-27T08:04:25.790232 f12e75d5-5d59-4cbc-b74d-357f19290dd7 b9b29981-0916-4635-af18-d6c92f991f46 3cdd41b5-318a-4926-a38e-63344b6aef7a workflow-handler-start-workflow-start
    2020-02-27T08:04:25.812879 f12e75d5-5d59-4cbc-b74d-357f19290dd7 3cdd41b5-318a-4926-a38e-63344b6aef7a 603f1fab-be78-438d-af13-d94ed3b7e416 workflow-start-start
    2020-02-27T08:04:25.954502 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 b1d0a77a-52f5-4415-a6c4-f16b3591a47d workflow-set-state-start
    2020-02-27T08:04:25.961298 0.006782 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 b1d0a77a-52f5-4415-a6c4-f16b3591a47d workflow-set-state-stop
    2020-02-27T08:04:25.961769 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 27b58351-aebe-4e37-9cec-91fdbef5c68b wf-controller-get-controller-start
    2020-02-27T08:04:25.962041 0.000267 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 27b58351-aebe-4e37-9cec-91fdbef5c68b wf-controller-get-controller-stop
    2020-02-27T08:04:25.962311 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 605ebfc2-a2bb-4fe1-8159-fc16f6741f5f workflow-controller-continue-workflow-start
    2020-02-27T08:04:26.023134 0.060832 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 605ebfc2-a2bb-4fe1-8159-fc16f6741f5f workflow-controller-continue-workflow-stop
    2020-02-27T08:04:26.023600 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 3a5a384a-9598-4844-a740-981f92e604af dispatcher-dispatch-commands-start
    2020-02-27T08:04:26.023918 f12e75d5-5d59-4cbc-b74d-357f19290dd7 3a5a384a-9598-4844-a740-981f92e604af d84a13e4-4763-4321-ab08-8cbd19656f2f task-handler-run-task-start
    2020-02-27T08:04:26.024179 f12e75d5-5d59-4cbc-b74d-357f19290dd7 d84a13e4-4763-4321-ab08-8cbd19656f2f 7878e4f8-aaaa-4b9b-b15a-35848b5cdd61 task-handler-build-task-from-command-start
    2020-02-27T08:04:26.024422 0.000243 f12e75d5-5d59-4cbc-b74d-357f19290dd7 d84a13e4-4763-4321-ab08-8cbd19656f2f 7878e4f8-aaaa-4b9b-b15a-35848b5cdd61 task-handler-build-task-from-command-stop

So any time Mistral runs code marked as a profiling trace, it prints two
entries into the file: one right before the code snippet starts and one right
after its completion. Notice also that for the corresponding "-stop" entry
(the suffix going after the trace name) Mistral prints an additional number
in the second column. This is the duration of the code snippet.
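
The two entry shapes described above can be told apart by their number of columns. The following is a minimal parsing sketch based only on the sample lines shown in this guide. The ``Entry`` type and ``parse_line`` function are hypothetical, and since the guide does not name the three ID columns they are kept as an opaque tuple.

```python
from collections import namedtuple

# 'ids' holds the three UUID columns as they appear in the log.
Entry = namedtuple('Entry', 'timestamp duration ids name event')

START = ('2020-02-27T08:04:25.954502 '
         'f12e75d5-5d59-4cbc-b74d-357f19290dd7 '
         '603f1fab-be78-438d-af13-d94ed3b7e416 '
         'b1d0a77a-52f5-4415-a6c4-f16b3591a47d '
         'workflow-set-state-start')
STOP = ('2020-02-27T08:04:25.961298 0.006782 '
        'f12e75d5-5d59-4cbc-b74d-357f19290dd7 '
        '603f1fab-be78-438d-af13-d94ed3b7e416 '
        'b1d0a77a-52f5-4415-a6c4-f16b3591a47d '
        'workflow-set-state-stop')


def parse_line(line):
    """Split one log entry into its columns."""
    fields = line.split()
    if len(fields) == 6:
        # "-stop" entries carry the duration in the second column.
        timestamp, duration, *ids, label = fields
        duration = float(duration)
    elif len(fields) == 5:
        timestamp, *ids, label = fields
        duration = None
    else:
        raise ValueError('unexpected entry: %r' % line)
    # The text after the last dash is the "start" or "stop" suffix.
    name, _, event = label.rpartition('-')
    return Entry(timestamp, duration, tuple(ids), name, event)


print(parse_line(START))
print(parse_line(STOP))
```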

The content of this file by itself is probably not so useful (although it
might be for some purposes), but based on it we can build the following
report:

.. code-block:: bash

    Total time | Max time | Avg time | Occurrences | Trace name
    -------------------------------------------------------------------------------------------
    2948.326     8.612      1.218      2420          engine-on-action-complete
    2859.172     8.516      1.181      2420          action-handler-on-action-complete
    2812.726     8.482      1.162      2420          task-handler-on-action-complete
    2767.836     8.412      1.144      2420          regular-task-on-action-complete
    2766.199     8.411      1.143      2420          task-complete
    2702.764     8.351      0.460      5878          task-run
    2506.531     8.354      0.850      2948          dispatcher-dispatch-commands
    2503.398     8.353      0.437      5735          task-handler-run-task
    2488.940     8.350      0.434      5735          task-run-new
    1669.179     54.737     0.881      1894          default-executor-run-action
    1201.582     3.687      0.497      2420          regular-task-get-action-input
    1126.351     2.093      0.476      2366          ad-hoc-action-validate-input
    1125.129     2.092      0.238      4732          ad-hoc-action-prepare-input
    687.619      7.594      0.651      1056          task-handler-refresh-task-state
    387.622      3.872      0.300      1291          workflow-handler-check-and-fix-integrity
    234.231      4.068      0.392      597           workflow-handler-check-and-complete
    224.026      4.042      0.375      597           workflow-check-and-complete
    210.184      6.694      1.470      143           task-run-existing
    160.118      8.343      0.304      526           workflow-action-schedule
    141.398      4.546      0.268      528           workflow-handler-start-workflow
    109.641      4.361      0.208      528           workflow-start
    78.683       2.004      0.077      1024          direct-wf-controller-get-join-logical-state

    ...

To generate this report, run:

.. code-block:: bash

    $ python tools/rank_profiled_methods.py /tmp/mistral_osprofile.log report.txt

This report is really useful when it comes to analysing
performance bottlenecks. All times are shown in seconds.
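
For readers curious how such a ranking can be derived from the log, below is a simplified sketch of the aggregation. It is not the code of 'tools/rank_profiled_methods.py', whose implementation is not shown in this guide. The ``rank`` function and ``SAMPLE_LOG`` are hypothetical, and the sketch assumes that only "-stop" entries (six columns, duration in the second one) matter.

```python
from collections import defaultdict

# A few entries copied from the sample log in this guide.
SAMPLE_LOG = """\
2020-02-27T08:04:25.954502 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 b1d0a77a-52f5-4415-a6c4-f16b3591a47d workflow-set-state-start
2020-02-27T08:04:25.961298 0.006782 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 b1d0a77a-52f5-4415-a6c4-f16b3591a47d workflow-set-state-stop
2020-02-27T08:04:25.962041 0.000267 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 27b58351-aebe-4e37-9cec-91fdbef5c68b wf-controller-get-controller-stop
2020-02-27T08:04:26.023134 0.060832 f12e75d5-5d59-4cbc-b74d-357f19290dd7 603f1fab-be78-438d-af13-d94ed3b7e416 605ebfc2-a2bb-4fe1-8159-fc16f6741f5f workflow-controller-continue-workflow-stop
"""


def rank(lines):
    """Return (total, max, avg, occurrences, name) rows, slowest first."""
    durations = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip "-start" entries: they have no duration column.
        if len(fields) != 6 or not fields[5].endswith('-stop'):
            continue
        name = fields[5][:-len('-stop')]
        durations[name].append(float(fields[1]))
    rows = [
        (sum(runs), max(runs), sum(runs) / len(runs), len(runs), name)
        for name, runs in durations.items()
    ]
    # Tuples sort by their first element, the total time.
    return sorted(rows, reverse=True)


for total, longest, average, count, name in rank(SAMPLE_LOG.splitlines()):
    print('%10.6f %10.6f %10.6f %6d  %s'
          % (total, longest, average, count, name))
```

Sorting by total time puts the traces that cost the most overall at the top, which matches the ordering of the report above.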