The patch for https://review.openstack.org/#/c/436958/ fixed a
threading problem by moving the ack back to the polling
thread. However, the RPC server expects to catch any failures of the
ACK and abort the request. This patch adds the ACK error handling
back to the polling thread.
This patch is based heavily on the original work done by Mehdi
Abaakouk (sileht).
Change-Id: I708c3d6676b974d8daac6817c15f596cdf35817b
Closes-Bug: #1695746
The _get_connection_info() method attempts to gather debug information
from the connection, and will reach into the amqp channel to get the
local (client's) TCP port number via the 'sock' property.
If _get_connection_info() is called from autoretry's on_error handler,
the 'sock' property notices that the transport is not set and attempts
to re-connect. amqp has deprecated this reconnect behavior, and in
any case the client's socket is irrelevant since the connection may
not be valid at this point.
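A minimal sketch of the safer lookup, assuming a helper of our own (not
the actual oslo.messaging code): only read the client port when the
transport is still attached, and never trigger a reconnect.

    def _safe_client_port(connection):
        # Avoid the deprecated implicit reconnect: only touch 'sock' if the
        # underlying transport object is still present.
        transport = getattr(connection, 'transport', None)
        sock = getattr(transport, 'sock', None) if transport else None
        return sock.getsockname()[1] if sock is not None else None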
Closes-Bug: #1745166
Change-Id: I3c42f8463605927f6f94d6c3a7f05e584476abc1
Python 3.7 does not allow the use of 'async' as
a parameter name or object attribute; update
occurrences to use a different name.
This is in line with PEP 492, where await and async
are keywords.
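Illustrative only: the same kind of rename applied to a hypothetical
function, since "def wait(async=False)" no longer parses on 3.7.

    def wait(asynchronous=False):
        # 'async' became a reserved keyword in Python 3.7 (PEP 492), so the
        # parameter is simply renamed; call sites change accordingly.
        return 'queued' if asynchronous else 'completed'

    print(wait(asynchronous=True))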
Change-Id: I73efcafab1e0832a0ada95f6c12cb6a659dfcf27
With PEP 479, the behaviour of StopIteration is changing. Raising it to
stop a generator is considered incorrect, and from Python 3.7 it will
cause a RuntimeError. The PEP recommends using the return statement.
More details: https://www.python.org/dev/peps/pep-0479/#examples-of-breakage
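A small self-contained example of the recommended fix:

    def take(items, limit):
        # Under PEP 479, raising StopIteration here becomes a RuntimeError
        # on Python 3.7; a plain return ends the generator the same way.
        for count, item in enumerate(items):
            if count == limit:
                return  # previously: raise StopIteration
            yield item

    print(list(take([1, 2, 3, 4], 2)))  # [1, 2]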
Change-Id: Ib27581fccbbf14c082fb919d8b6edea1ac83e3c0
The fake_rabbit configuration option has been deprecated since the
release of 1.5.0 in late 2014. Finally remove it, and its test.
Change-Id: I014c2012cca0f289de0d95b9bb35bbde7f61d2ee
The call monitoring feature was introduced in commit
b34ab8b1cc9f4d513a2927c102dbbe82031d9c2a for RabbitMQ. This patch
enables the feature on the AMQP 1.0 driver - currently the only other
driver that supports RPC.
Change-Id: Ic787696852690b59779fb4716aec1e78c48bbe6a
In some specific cases, the middleware would fail to rebuild
the original exception; see bug [1] below. Adding this output
may help locate the root cause quickly.
[1]: https://bugs.launchpad.net/cinder/+bug/1728826
Change-Id: Ia9304bda4e515812b146885f830e70f28a285f2d
This adds an optional call_monitor_timeout parameter to the RPC client,
which, if specified, enables heartbeating of long-running calls by
the server. This enables the user to increase the regular timeout to
a much larger value, allowing calls to take a very long time, but
with heartbeating to indicate that they are still running on the server
side. If the server stops heartbeating, then the call_monitor_timeout
takes over and we fail with the usual MessagingTimeout instead of waiting
for the longer overall timeout to expire.
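A hedged usage sketch (keyword placement may differ between releases; the
topic and method names below are made up): a long overall timeout
combined with a much shorter heartbeat deadline.

    import oslo_messaging
    from oslo_config import cfg

    transport = oslo_messaging.get_rpc_transport(cfg.CONF)
    target = oslo_messaging.Target(topic='long_running', version='1.0')
    client = oslo_messaging.RPCClient(transport, target,
                                      timeout=3600,             # overall cap
                                      call_monitor_timeout=60)  # heartbeat deadline
    # result = client.call({}, 'rebuild_everything')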
Change-Id: I60334aaf019f177a984583528b71d00859d31f84
This adds a heartbeat() method to RpcIncomingMessage to be used by a
subsequent patch implementing active-call heartbeating. This is
unimplemented in all drivers for the moment.
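A hedged sketch of the shape of the hook (not the verbatim base class): a
default no-op that concrete driver messages may later override.

    import abc

    class RpcIncomingMessage(metaclass=abc.ABCMeta):
        @abc.abstractmethod
        def reply(self, reply=None, failure=None):
            """Send a reply to the caller."""

        def heartbeat(self):
            """Send a call heartbeat; no-op until a driver implements it."""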
Change-Id: If8ab0dc16e3bef69d5a826c31c0fe35e403ac6a1
This reverts commit c881baed29db49c5710795496cb07907e35ceaba.
Even though batch_size is set, the RabbitMQ prefetch count is not changed
when the connection is first created; it only takes effect when the
connection is reset (i.e. on RabbitMQ restart). So this patch cannot fix
the original issue
https://bugs.launchpad.net/ceilometer/+bug/1551667
Moreover, it introduces another bug: messages from the sample queues are
not consumed properly. So revert it.
Change-Id: Ia8ebee8e2a670e46b6a68859bc30e717bd56ed7e
Signed-off-by: Wonil Choi <wonil22.choi@samsung.com>
Closes-bug: 1759755
It is recommended that all users of the Pika driver transition to
using the Rabbit driver instead. Typically this is done by changing
the prefix of the transport_url configuration option from "pika://..."
to "rabbit://...". There are no changes required to the RabbitMQ
server configuration.
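An illustrative check of the switch, assuming a default oslo.config
setup; only the URL scheme changes.

    import oslo_messaging
    from oslo_config import cfg

    conf = cfg.ConfigOpts()
    url = oslo_messaging.TransportURL.parse(
        conf, 'rabbit://guest:guest@localhost:5672/')
    print(url.transport)  # 'rabbit' -- previously 'pika'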
Change-Id: I52ea5ccb7e7c247abd95e2d8d50dac4c4ad11246
Closes-Bug: #1744741
This merely provides a restart() method that passes through to the existing
restart() method on the StopWatch used internally by DecayingTimer so
that we can reset it.
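A hedged sketch of the pass-through (the real DecayingTimer wraps an
oslo.utils StopWatch; other methods omitted):

    from oslo_utils import timeutils

    class DecayingTimer(object):
        def __init__(self, duration=None):
            self._watch = timeutils.StopWatch(duration=duration)

        def start(self):
            self._watch.start()
            return self

        def restart(self):
            # Simply delegate so the decaying deadline can be reused.
            self._watch.restart()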
Change-Id: Ie6b607ec588db94e2c768bd22ae736a05ab484c1
This patch changes the default driver behavior to synchronously
commit messages following consumer poll. A configuration option
enables asynchronous (auto) commit if desired.
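A hedged sketch of the resulting consume loop with kafka-python (the
topic, group and server names here are illustrative, not the driver's
own):

    from kafka import KafkaConsumer

    consumer = KafkaConsumer('notifications',
                             group_id='example-group',
                             bootstrap_servers='localhost:9092',
                             enable_auto_commit=False)
    records = consumer.poll(timeout_ms=1000)
    for _tp, messages in records.items():
        for message in messages:
            pass  # hand each message to the dispatcher
    consumer.commit()  # synchronous commit after the poll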
Depends-On: I5b4f01c928373cac530aa6877a34c684577bc64e
Change-Id: I92a3dc95c5d424aa722138195fef5a855a66b31d
Emulate vhost support by adding the virtual host name to the
topic created on the kafka server. Also, update connection
management for producer/consumer.
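A hedged illustration of the emulation (the exact separator and ordering
used by the driver may differ):

    def target_to_topic(topic, vhost=None, sep='.'):
        # Fold the virtual host into the topic so each vhost gets its own
        # 'subnet' of topics on the Kafka server.
        return sep.join(part for part in (topic, vhost) if part)

    print(target_to_topic('notifications.info', 'cloud1'))
    # notifications.info.cloud1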
This patch:
* updates target-to-topic generation
* adds consumer and producer connection classes
* removes the connection pool
* updates the driver test
Change-Id: Idd164444c04e9f465a43ee909af840a41bb090c0
This patch addresses a number of issues that prevented the functional
tests from running. The functional tests now execute and can complete
successfully. At times, the test will fail (noticeably in CI), indicating
an underlying issue with consumer interaction with the kafka server.
It would be beneficial to merge this patch as it provides repeatability
and visibility for driver-kafka server integration to facilitate
additional debugging and testing.
This patch:
* removes use of the deprecated get_transport
* overrides consumer_group for each test
* changes to synchronous send
* updates to the kafka 1.0.0 server
Depends-On: Ib552152e841a9fc0bffdcb7c3f7bc75613d0ed62
Change-Id: I7009a3b96ee250c177c10f5121eb73d908747a52
heartbeat_check in kombu.connection does not re-raise exceptions as
exceptions.OperationalError, and the socket timeout during the heartbeat
check is a real issue seen in the field when a node goes down; the
heartbeat thread just retries again and again without success.
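A hedged sketch of the intended handling: turn the socket timeout into a
recoverable error so the normal reconnect path takes over instead of the
heartbeat thread retrying forever.

    import socket

    import kombu.exceptions

    def run_heartbeat_check(connection, rate=2):
        try:
            connection.heartbeat_check(rate=rate)
        except socket.timeout as exc:
            raise kombu.exceptions.OperationalError(str(exc))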
Change-Id: I26dbdb18a7e64946db2cba676764ff2d428c7897
Closes-Bug: #1657444
For some reason there are two timeouts. In the batch scenario,
all the time wasted waiting on the initial 'get' is never accounted
for, so the batch timeout is always longer than declared.
Change-Id: I6132c770cccdf0ffad9f178f7463288cf954d672
Probably the most common format for documenting arguments is reST field
lists [1]. This change updates some docstrings to comply with the field
lists syntax.
[1] http://sphinx-doc.org/domains.html#info-field-lists
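An illustrative docstring in the target style:

    def call(method, timeout=None):
        """Invoke a method and wait for a reply.

        :param method: name of the method to invoke
        :type method: str
        :param timeout: seconds to wait for a reply, or None to wait forever
        :type timeout: float or None
        :returns: the value returned by the remote method
        """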
Change-Id: Ifa8c0db3efc03eac3b034ef642aaa8fce514a66e
If using rabbitmq as the rpc backend, oslo.messaging generates a large
amount of redundant timeout debug logs (several logs per second on
multiple openstack services, such as nova, heat, cinder), in the format
'Timed out waiting for RPC response: Timeout while waiting on RPC
response - topic: "<unknown>", RPC method: "<unknown>" info: "<unknown>'.
This is because each socket timeout exception is raised to multiple
levels of error recovery callback functions and then logged repeatedly.
However, the accompanying value of the socket.timeout exception is
currently always "timed out". Besides, oslo.messaging has implemented a
retry mechanism to recover from socket timeout failures. Therefore, IMO
those logs
should be suppressed, even if at debug level, to save disk space and
make debugging more convenient.
Change-Id: Iafc360f8d18871cff93e7fd721d793ecdef5f4a1
Closes-Bug: #1714558
Publishing a message using the kombu connection autoretry method may
allow exceptions from the py-amqp library to be raised up to the
application. This does not conform to the documented oslo.messaging
API.
Enhance the try/except block to capture any exception and translate it
into a MessageDeliveryFailure.
There are a few cases where exceptions will be raised during autoretry
publishing: recoverable connection or channel errors, and
non-recoverable connection or channel errors.
autoretry will only retry if the error is recoverable. Non-recoverable
errors are re-raised immediately regardless of the retry count.
In the case of a recoverable error it seems unlikely that retrying
either the connection or the channel yet again is going to get us
anywhere, so we simply clean up the channel state, log an
error, and fail the operation.
In the case of a non-recoverable error we are out of luck (think
authentication failure) - further retrying will not help. Best we can
do is clean up state and log the heck out of it.
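A hedged sketch of the translation (the surrounding cleanup and logging
are omitted): anything that escapes autoretry is surfaced as the
documented exception type.

    import oslo_messaging

    def publish_with_translation(publish, *args, **kwargs):
        try:
            return publish(*args, **kwargs)
        except Exception as exc:
            # Recoverable or not, further retrying will not help here;
            # fail with the API's documented exception.
            raise oslo_messaging.MessageDeliveryFailure(
                'Unable to deliver message: %s' % exc)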
Change-Id: I2f65d2ee19a8c3e9a323b30404abbf0cbb45a216
Closes-Bug: #1705351
Closes-Bug: #1707160
The FakeExchangeManager uses instance-level storage for the FakeExchanges
mapping [1]. When a client-server pair is created, each keeps its own
instance of FakeDriver -> FakeExchangeManager -> FakeExchange, each of
which has its own (instance-level) copy of e.g. _server_queues [2], making
it impossible for them to communicate.
This patch makes the _exchanges mapping a class-level attribute in order
to keep the registered exchanges shared between all Manager instances,
allowing client and server communication (within a single process).
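A minimal before/after sketch of the change (simplified from the driver):

    class FakeExchange(object):
        def __init__(self, name):
            self.name = name

    class FakeExchangeManager(object):
        # Was: self._exchanges = {} in __init__, i.e. per-instance state.
        # Now a class-level mapping shared by every manager instance, so a
        # separately created client and server see the same exchanges.
        _exchanges = {}

        def __init__(self, default_exchange):
            self._default_exchange = default_exchange

        def get_exchange(self, name):
            return self._exchanges.setdefault(name, FakeExchange(name))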
The test_server unit tests had to be refactored to explicitly pass an
exchange name when building a target. This is required for an exchange
name change to have any effect at test run time, as opposed to passing
the exchange name through the URL. This issue was revealed by this patch.
[1] https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_fake.py#L145,#L148
[2] https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_fake.py#L88,#L92
Change-Id: I8dff66f4cafeb1f4c57dbfbfaba5d49e50f55fee
Closes-Bug: #1714055
Adds the 'pseudo_vhost' option which, when enabled, incorporates the
virtual host into the address semantics. This creates a 'subnet' like
address space for each virtual host. Use this when the messaging bus
does not provide virtual hosting support. It is enabled by default
since, to date, none of the supported AMQP 1.0 message buses natively
supports virtual hosting.
It also updates SSL support: SSL can either use the connection
hostname or the vhost name when validating a server's
certificate. This is controlled by the 'ssl_verify_vhost' option.
This option is disabled by default as it requires both vhost and SNI
support from the server. By default SSL will use the DNS name from
the TransportURL.
Change-Id: I49bb99d1b19e8e7e6fded76198da92ca5f7d65ab
Closes-Bug: #1700835
Partial-Bug: #1706987
We recently moved the ack/requeue of messages into the main/polling
thread of the rabbitmq driver, which broke the blocking executor.
The blocking executor is not covered by any tests and is now deprecated.
This change works around the issue until we completely remove the
blocking executor.
Change-Id: Id479100f6ff364cf67a199e9b70f9f0c7bf7e1a9
Closes-bug: #1694728
The Pika driver was intended to be a more stable and performant
replacement for the default rabbit driver. However, due to the lack of
both maintainers and compelling evidence that pika is superior to the
existing rabbit driver in either performance or stability, it has been
deprecated for removal.
See:
http://lists.openstack.org/pipermail/openstack-dev/2017-May/116679.html
Change-Id: I98e0123edd3248be665325833283689fc3a897f7
In https://review.openstack.org/#/c/436958, we fixed a thread safety
issue, but we made the ack/requeue of messages asynchronous. In the
nominal case it works, but if a network/rabbit connection issue occurs,
this can result in an RPC call being handled twice. By chance we
double-check already processed message ids and drop duplicates, but if
the message goes to another node, that mitigation won't work.
This restores the previous behavior, ensuring we run the application
callback of rpc.call/rpc.cast only when the message has been
successfully acked.
Change-Id: I62b9e09513e3ebfebc64a941d4b21b6c053b511d