Per the current release's tested runtime, we test Python versions 3.8
through 3.11, so update the Python classifiers in setup.cfg to match.
Change-Id: I303912894d12be87355f83a1a53be071db94cf84
These translation sections are no longer needed; Babel can generate
translation files without them.
Change-Id: Ib60671941371aa22fbdeeb9d42fc619f60aa15e5
The current fake driver does not properly clean up the fake RPC exchange
between tests.
This means that if a test invokes code that makes an RPC request using
the fake driver without consuming the RPC message, another test may
receive this request, causing it to fail.
This issue was found while working on a Cinder patch and was worked
around there with Change-Id
I52ee4b345b0a4b262e330a9a89552cd216eafdbe.
This patch fixes the source of the problem by clearing the exchange
class dictionary in the FakeExchangeManager during the FakeDriver
cleanup.
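A minimal, self-contained sketch of the failure mode and the fix; the
class names mirror the fake driver, but the bodies are simplified
illustrations rather than the actual implementation:
```python
class FakeExchangeManager:
    # Class-level state: shared across all instances, so it survives
    # from one test case to the next unless explicitly cleared.
    _exchanges = {}

    def get_exchange(self, name):
        return self._exchanges.setdefault(name, [])


class FakeDriver:
    def __init__(self):
        self._manager = FakeExchangeManager()

    def send(self, exchange, message):
        self._manager.get_exchange(exchange).append(message)

    def cleanup(self):
        # The fix: clear the shared class dictionary so an unconsumed
        # RPC request from one test cannot leak into the next test.
        FakeExchangeManager._exchanges.clear()
```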
Change-Id: If82c2175cf7242b80509d180cdf92323c0f4c43b
Add a new flag, rabbit_transient_quorum_queue, to enable the use of
quorum queues for transient queues (reply_ and _fanout_).
This greatly helps OpenStack services avoid failing (and helps them
recover) when a RabbitMQ node has an issue.
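A self-contained oslo.config sketch of the new flag; the registration
here only makes the example runnable, since the real option is
registered by the rabbit driver itself. Deployments would simply set it
under [oslo_messaging_rabbit] in the service's configuration file:
```python
from oslo_config import cfg

opts = [
    cfg.BoolOpt('rabbit_transient_quorum_queue', default=False,
                help='Use quorum queues for transient (reply_/fanout) '
                     'queues instead of classic queues.'),
]

conf = cfg.ConfigOpts()
conf.register_opts(opts, group='oslo_messaging_rabbit')
conf([])  # parse an empty command line so values can be read
conf.set_override('rabbit_transient_quorum_queue', True,
                  group='oslo_messaging_rabbit')
assert conf.oslo_messaging_rabbit.rabbit_transient_quorum_queue
```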
Related-bug: #2031497
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: Icee5ee6938ca7c9651f281fb835708fc88b8464f
When RabbitMQ is failing for a specific quorum queue, the only thing to
do is to delete the queue (per the RabbitMQ documentation, see [1]).
So, to avoid the RPC service staying broken until an operator
eventually fixes it manually, catch any INTERNAL ERROR (code 541) and
trigger the deletion of the failed queues under those conditions.
On the next queue declare (triggered by the various retries), the queue
will be created again and the service will recover by itself.
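An illustrative sketch of the recovery logic (not the actual driver
code), assuming py-amqp maps reply code 541 to
amqp.exceptions.InternalError:
```python
import amqp.exceptions


def consume_with_recovery(channel, queue_name):
    try:
        channel.basic_consume(queue=queue_name,
                              callback=lambda message: None)
    except amqp.exceptions.InternalError:
        # INTERNAL ERROR (541): per the RabbitMQ documentation the
        # quorum queue is unrecoverable and must be deleted.
        channel.queue_delete(queue=queue_name)
        # The next queue_declare (issued by the normal retry paths)
        # recreates the queue and the service recovers by itself.
```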
Closes-Bug: #2028384
Related-bug: #2031497
[1] https://www.rabbitmq.com/quorum-queues.html#availability
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: Ib8dba833542973091a4e0bf23bb593aca89c5905
When an operator relies on RabbitMQ policies, there is no point in
setting the queue TTL in the config.
Moreover, using policies is much simpler, as you don't need to
delete and recreate the queues to apply a new parameter (see [1]).
So, allowing the transient queue TTL to be set to 0 permits creating
the queue without the x-expires argument, so that only the policy
applies.
[1] https://www.rabbitmq.com/parameters.html#policies
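A hedged illustration of the resulting declare arguments, assuming the
existing rabbit_transient_queues_ttl option controls the TTL:
```python
def transient_queue_arguments(rabbit_transient_queues_ttl):
    # With a non-zero TTL the transient queue is declared with an
    # x-expires argument; a TTL of 0 omits the argument entirely so
    # that only the RabbitMQ policy applies.
    arguments = {}
    if rabbit_transient_queues_ttl > 0:
        # The option is in seconds; x-expires takes milliseconds.
        arguments['x-expires'] = rabbit_transient_queues_ttl * 1000
    return arguments
```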
Related-bug: #2031497
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: I34bad0f6d8ace475c48839adc68a023dd0c380de
We encountered bug 2037312 in unit tests when attempting to get this
change rolled out. Heat apparently will attempt to set is_admin using
policy logic if it is not passed in for a new context; this breaks
because the context we requested doesn't have all the information
needed to exercise the policy logic.
is_admin is just a bool and is not sensitive; the easiest route forward
is to add it to the safe list.
Closes-bug: 2037312
Change-Id: I78b08edfcb8115cddd7de9c6c788c0a57c8218a8
Add file to the reno documentation build to show release notes for
stable/2023.2.
Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/2023.2.
Sem-Ver: feature
Change-Id: I8e9c35ebe41e0283309d64db97a4d9ffebcf9626
Publishing a fully hydrated context object in a notification would give
someone with access to that notification the ability to impersonate the
original actor through inclusion of sensitive fields.
Now, instead, we pare down the context object to the bare minimum before
passing it for serialization in notification workflows.
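A minimal sketch of the idea; the key list below is illustrative, not
the actual safe list used by oslo.messaging:
```python
SAFE_CONTEXT_KEYS = ('request_id', 'project_id', 'user_id', 'is_admin')


def sanitize_context_for_notification(context_dict):
    # Anything outside the safe list (auth tokens, service catalogs,
    # ...) is dropped so that a reader of the notification cannot
    # impersonate the original caller.
    return {key: value for key, value in context_dict.items()
            if key in SAFE_CONTEXT_KEYS}
```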
Related-bug: 2030976
Change-Id: Ic94323658c89df1c1ff32f511ca23502317d0f00
Kombu recommends running heartbeat_check every second, but we use a
lock around the kombu connection. To avoid holding this lock most of
the time only to do nothing except wait for events to drain, we run
heartbeat_check and retrieve the server heartbeat packet only twice as
often as the minimum required for heartbeats to work:
heartbeat_timeout / heartbeat_rate / 2.0
Because of this, we are not sending the heartbeat frames at the correct
intervals. For example:
If heartbeat_timeout=60 and heartbeat_rate=2, the AMQP protocol expects
a frame to be sent every 30 seconds.
With the current heartbeat_check implementation, heartbeat_check will be
called every:
heartbeat_timeout / heartbeat_rate / 2.0 = 60 / 2 / 2.0 = 15
Which will result in the following frame flow:
T+0 --> do nothing (60/2 > 0)
T+15 --> do nothing (60/2 > 15)
T+30 --> do nothing (60/2 > 30)
T+45 --> send a frame (60/2 < 45)
...
With heartbeat_rate=3, the heartbeat_check will be executed more often:
heartbeat_timeout / heartbeat_rate / 2.0 = 60 / 3 / 2.0 = 10
Frame flow:
T+0 --> do nothing (60/3 > 0)
T+10 --> do nothing (60/3 > 10)
T+20 --> do nothing (60/3 > 20)
T+30 --> send a frame (60/3 < 30)
...
Now we send the heartbeat frames at the correct intervals.
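The frame flows above can be reproduced with a few lines of Python:
```python
heartbeat_timeout = 60
heartbeat_rate = 2  # set to 3 to reproduce the second flow above

# AMQP expects a heartbeat frame at least every
# heartbeat_timeout / heartbeat_rate seconds: 60 / 2 = 30s.
frame_deadline = heartbeat_timeout / heartbeat_rate

# Old wake-up period: 60 / 2 / 2.0 = 15s, but a frame is only sent
# once the elapsed time exceeds the deadline, i.e. at T+45.
wakeup_period = heartbeat_timeout / heartbeat_rate / 2.0

for t in range(0, 60, int(wakeup_period)):
    action = 'send a frame' if t > frame_deadline else 'do nothing'
    print(f'T+{t:<2} --> {action}')
```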
Closes-bug: #2008734
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: Ie646d254faf5e45ba46948212f4c9baf1ba7a1a8
Previously the two values were the same; this caused us to always
exceed the timeout limit ACK_REQUEUE_EVERY_SECONDS_MAX, which resulted
in various code paths never being traversed due to premature timeout
exceptions.
Also apply min/max values to kombu_reconnect_delay so it doesn't
exceed ACK_REQUEUE_EVERY_SECONDS_MAX and break things again.
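A hedged sketch of the clamping; the constant names come from this
message, but the boundary values below are purely illustrative:
```python
ACK_REQUEUE_EVERY_SECONDS_MIN = 0.25  # illustrative value
ACK_REQUEUE_EVERY_SECONDS_MAX = 5.0   # illustrative value


def clamp_kombu_reconnect_delay(kombu_reconnect_delay):
    # Keep the reconnect delay strictly within the requeue window so
    # it can no longer exceed ACK_REQUEUE_EVERY_SECONDS_MAX and
    # trigger premature timeout exceptions again.
    return min(max(kombu_reconnect_delay, ACK_REQUEUE_EVERY_SECONDS_MIN),
               ACK_REQUEUE_EVERY_SECONDS_MAX)
```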
Closes-Bug: #1993149
Change-Id: I103d2aa79b4bd2c331810583aeca53e22ee27a49
When enabling heartbeat_in_pthread, we were restoring the "threading"
Python module from the eventlet-patched version to the original one in
RabbitDriver, but we forgot to do the same in AMQPDriverBase
(RabbitDriver is a subclass of AMQPDriverBase).
We also need to use the original "queue" module so that queues do not
end up using greenthreads either.
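A short illustration of the technique: eventlet.patcher.original()
returns the unpatched stdlib module, so threads and queues created from
it do not use greenthreads. The helper below is a sketch, not the
driver's actual code:
```python
import eventlet.patcher

# The real stdlib modules, even when eventlet has monkey-patched the
# interpreter-wide ones.
threading = eventlet.patcher.original('threading')
queue = eventlet.patcher.original('queue')


def start_heartbeat(heartbeat_fn):
    # Runs on a genuine OS thread rather than a greenthread.
    thread = threading.Thread(target=heartbeat_fn, daemon=True)
    thread.start()
    return thread
```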
Related-bug: #1961402
Related-bug: #1934937
Closes-bug: #2009138
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: I34ea0d1381e934297df2f793e0d2594ef8254f00
Add file to the reno documentation build to show release notes for
stable/2023.1.
Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/2023.1.
Sem-Ver: feature
Change-Id: I80f227a59c36693c83bb94890536745610ba2393
In [1] there was a typo in variable names. To prevent further
awkwardness around variable naming, we fix the typo and publish a
release note for those already using the variables in their deployments.
[1] https://review.opendev.org/c/openstack/oslo.messaging/+/831058
Change-Id: Icc438397c11521f3e5e9721f85aba9095e0831c2
We currently do not support overriding the class being
instantiated in the RPC helper functions; this adds that
support so that projects that define their own classes
inheriting from oslo.messaging can use the helpers.
For example, neutron utilizes code from neutron-lib that
has its own RPCClient implementation inheriting from
oslo.messaging; for it to use, for example,
the get_rpc_client helper, it needs support for overriding
the class being returned. The alternative would be to
modify the internal _manual_load variable, which seems
counter-productive to extending the API provided to
consumers.
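A hedged usage sketch of the override; the client_cls keyword name is
an assumption about the final signature, not confirmed by this message:
```python
import oslo_messaging


class MyRPCClient(oslo_messaging.RPCClient):
    """A project-specific client, e.g. neutron-lib's implementation."""


def make_client(transport, target):
    # Pass the subclass to the helper instead of instantiating it
    # directly; client_cls is a hypothetical keyword for illustration.
    return oslo_messaging.get_rpc_client(transport, target,
                                         client_cls=MyRPCClient)
```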
Change-Id: Ie22f2ee47a4ca3f28a71272ee1ffdb88aaeb7758
'skip_basepython_conflicts' has been the cause of a couple of bugs in
tox 4 and there is talk of it going away. Remove it and fix up a few
other issues in the tox.ini file.
Change-Id: Ic19c896af2ab0cf3570c43e8ceb8cba64fb45cdd
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
We already expose functions to handle the instantiation
of classes such as RPCServer and RPCTransport, but the
same was never done for RPCClient, so the API is
inconsistent in its enforcement.
This adds a get_rpc_client function that should be used
instead of instantiating the RPCClient class directly,
for consistency.
This also allows more logic to be handled inside the function
in the future, for example if an async client implementation
is added, as the investigation in [1] has shown.
[1] https://review.opendev.org/c/openstack/oslo.messaging/+/858936
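A minimal usage sketch of the new helper, mirroring the existing
get_rpc_transport/get_rpc_server helpers; in a real service the config
file provides transport_url:
```python
from oslo_config import cfg
import oslo_messaging

conf = cfg.CONF
conf([])  # parse an empty command line so the config can be read
transport = oslo_messaging.get_rpc_transport(conf)
target = oslo_messaging.Target(topic='my_topic', version='1.0')
# Preferred over instantiating oslo_messaging.RPCClient directly.
client = oslo_messaging.get_rpc_client(transport, target)
```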
Change-Id: Ia4d1f0497b9e2728bde02f4ff05fdc175ddffe66
A recent oslo.messaging patch [1], not yet merged, which aims to update
the test runtime for Antelope, led us to the following error:
```
qdrouterd: Python: ModuleNotFoundError: No module named 'qpid_dispatch'
```
Neither Debian nor Ubuntu in their latest releases has any binary
built for the qpid backend, not even third-party ones. Only Qpid
Proton, the client library, is available.
To solve this issue, these changes propose deprecating the AMQP1
driver, which is the one based on qpid and proton, and removing the
related functional tests.
The AMQP1 driver doesn't seem to be widely used.
[1] https://review.opendev.org/c/openstack/oslo.messaging/+/856643
Closes-Bug: 1992587
Change-Id: Id2ca9cd9ee8b8dbdd14dcd00ebd8188d20ea18dc
Add file to the reno documentation build to show release notes for
stable/zed.
Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/zed.
Sem-Ver: feature
Change-Id: Ic1020b39172981abcc9fc3d66fc6ec58f440a456
As was reported in the related bug some time ago, setting that
option to True for nova-compute can break it, as it is a non-WSGI
service.
We also noticed the same problems with randomly stuck non-WSGI services
such as neutron agents, and the same issue can probably happen with
any other non-WSGI service.
To avoid that, this patch changes the default value of that config
option to False.
Together with [1], it effectively reverts the change done in [2] some
time ago.
[1] https://review.opendev.org/c/openstack/oslo.messaging/+/800621
[2] https://review.opendev.org/c/openstack/oslo.messaging/+/747395
Related-Bug: #1934937
Closes-Bug: #1961402
Change-Id: I85f5b9d1b5d15ad61a9fcd6e25925b7eeb8bf6e7
In impl_kafka, _produce_message is run in a tpool.execute
context, but it was also calling logging functions.
This could cause subsequent calls to logging functions to
deadlock.
This patch moves the logging calls out of the tpool.execute scope.
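An illustrative sketch of the pattern, with names simplified from
impl_kafka and assuming eventlet's tpool runs the blocking producer
call:
```python
import logging

from eventlet import tpool

LOG = logging.getLogger(__name__)


def produce_message(producer, topic, message):
    # Log before entering tpool.execute: calling the logging machinery
    # from inside a tpool worker thread can deadlock later logging
    # calls made from greenthreads.
    LOG.debug('sending message to topic %s', topic)
    tpool.execute(producer.send, topic, message)
```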
Change-Id: I81167eea0a6b1a43a88baa3bc383af684f4b1345
Closes-bug: #1981093
This change updates the max version of hacking
to 4.1.0 to allow pre-commit to work with the
flake8 3.8.3 release, and corrects one new error that was
raised as a result.
Change-Id: I3a0242208f411b430db0e7429e2c773f45b3d301
In the Zed cycle testing runtime, we are targeting dropping
Python 3.6/3.7 support; projects have started adding Python 3.8 as the
minimum, for example nova:
- 56b5aed08c/setup.cfg (L13)
Change-Id: Id23d3845db716d26175d71280dbedf93736d19de