1327 Commits

Author SHA1 Message Date
Jenkins
def1b841cb Merge "Move to debug a too verbose log" 2015-12-12 07:37:30 +00:00
Mehdi Abaakouk
17ccb2306d Move to debug a too verbose log
When a client is gone (died or restarted), some replies cannot be sent
because the exchange of this client will never come back. We log one
message per reply every 0.25 seconds during 60 seconds, when the only
useful log is the one where we decide to drop these replies.

This change moves the less important message to debug level.

Change-Id: I508787c0db4dcec2c0027b89eb4e65c4f98022b9
Related-bug: #1524418
2015-12-11 10:59:54 +01:00
Davanum Srinivas
46daf85814 Cleanup parameter docstrings
Change-Id: I301fdd51446bf0c0a6dd0d05b26da0556db8367d
2015-12-11 11:04:13 +03:00
Jenkins
0a0e6d0d50 Merge "notif: Check the driver features in dispatcher" 2015-12-11 04:17:21 +00:00
Jenkins
213176657d Merge "batch notification listener" 2015-12-11 04:16:57 +00:00
Jenkins
4b6144a3db Merge "creates a dispatcher abstraction" 2015-12-10 11:15:48 +00:00
Jenkins
8e792a5f70 Merge "Revert "default of kombu_missing_consumer_retry_timeout"" 2015-12-10 06:49:39 +00:00
Jenkins
28e9004dc1 Merge "Don't trigger error_callback for known exc" 2015-12-10 06:44:32 +00:00
Mehdi Abaakouk
3ee86964fa Revert "default of kombu_missing_consumer_retry_timeout"
This reverts commit 8c03a6db6c0396099e7425834998da5478a1df7c.

Closes-bug: #1524418
Change-Id: I35538a6c15d6402272e4513bc1beaa537b0dd7b9
2015-12-09 19:38:14 +01:00
Mehdi Abaakouk
e72599435c Don't trigger error_callback for known exc
When AMQPDestinationNotFound is raised, we must not
call the error_callback method. The exception is logged
only if needed in the upper layer (amqpdriver.py).

Related-bug: #1524418

Change-Id: Ic1ddec2d13172532dbaa572d04a4c22c97ac4fe7
2015-12-09 18:53:38 +01:00
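
To illustrate commit e72599435c above, here is a minimal sketch of letting a known exception bypass the error callback. The names `DestinationNotFound` and `run_with_error_callback` are hypothetical stand-ins chosen for this example; this is not the actual oslo.messaging code.

```python
class DestinationNotFound(Exception):
    """Hypothetical stand-in for AMQPDestinationNotFound."""


def run_with_error_callback(method, error_callback):
    """Run method(); report unexpected errors, let known ones pass through."""
    try:
        return method()
    except DestinationNotFound:
        # Known exception: do not call error_callback. The caller
        # (the upper layer) decides whether it is worth logging.
        raise
    except Exception as exc:
        # Unexpected exception: notify the callback, then re-raise.
        error_callback(exc)
        raise
```

Both kinds of exception still propagate to the caller; only the callback invocation differs.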
Jenkins
7f08805bc9 Merge "kombu: remove compat of folsom reply format" 2015-12-09 17:22:00 +00:00
Jenkins
fdfc98c6a2 Merge "Follow the plan about the single reply message" 2015-12-09 17:21:50 +00:00
Jenkins
bdd6353c91 Merge "Fix notifier options registration" 2015-12-09 15:49:02 +00:00
Jenkins
251df0ec1b Merge "Fix reconnection when heartbeat is missed" 2015-12-09 15:48:58 +00:00
Mehdi Abaakouk
185693a6ed Improves comment
Change-Id: Idc8002e6d622435aac48304857985c0f82be3e32
2015-12-09 11:23:52 +01:00
Jenkins
68af439724 Merge "Fix multiline strings with missing spaces" 2015-12-09 08:36:56 +00:00
Jenkins
8717f6e38a Merge "Remove unnecessary quote" 2015-12-09 08:36:53 +00:00
Mehdi Abaakouk
148e8380ce Fix reconnection when heartbeat is missed
When a heartbeat is missed, we call ensure_connection(),
which runs a dummy method to trigger the reconnection
code in kombu. But that code is triggered only if the
channel is None.

In the heartbeat thread we didn't reset the channel
before reconnecting, so the dummy method didn't do anything.

This change sets the channel to None to ensure the connection
is reestablished before the dummy method is run.

It also replaces the dummy method with a check of the kombu connection
object, so we are sure the connection is reestablished.

Change-Id: I39f8cd23c5a5498e6f4c1aa3236ed27f3b5d7c9a
Closes-bug: #1493890
2015-12-09 06:45:36 +00:00
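
The problem fixed by commit 148e8380ce above can be sketched with a toy model. `FakeConnection` and both functions are hypothetical and only mimic the behaviour the message describes (reconnection is skipped while a stale channel is still set); they are not kombu's or oslo.messaging's API.

```python
class FakeConnection:
    """Hypothetical stand-in for a connection that owns a channel."""

    def __init__(self):
        self.connected = False
        self.channel = None
        self.connect_calls = 0

    def connect(self):
        self.connect_calls += 1
        self.connected = True
        self.channel = object()

    def drop(self):
        # Simulate a missed heartbeat: the link is dead but the
        # stale channel object is still set.
        self.connected = False


def ensure_connection_old(conn):
    # Old behaviour: the reconnection path runs only if channel is None.
    if conn.channel is None:
        conn.connect()


def ensure_connection_new(conn):
    # New behaviour: reset the channel first so reconnection always runs,
    # then check the connection object itself.
    conn.channel = None
    ensure_connection_old(conn)
    if not conn.connected:
        raise RuntimeError("connection was not reestablished")
```

With a dropped connection, the old function returns without reconnecting, while the new one reconnects and verifies the result.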
Mehdi Abaakouk
050024f798 Fix notifier options registration
Change-Id: I37082f6f349e89af6b74e6ec5e5c416902299263
2015-12-08 16:01:49 +01:00
Mehdi Abaakouk
185f94c013 notif: Check the driver features in dispatcher
The transport/driver features check is done in the get listener
methods, so when these methods are not used the check is not
done.

This change moves it into the dispatcher layer to ensure the
requirements are always checked.

This slightly changes when the check occurs. Before, it happened
during the listener object initialisation. Now it happens
when the listener server starts.

Change-Id: I4d81a4e8496f04d62e48317829d5dd8b942d501c
2015-12-08 09:14:20 +01:00
Mehdi Abaakouk
4dd644ac20 batch notification listener
Gnocchi performs better if measurements are written in batches.
When Ceilometer is used with Gnocchi, this is not possible.

This change introduces a new notification listener that allows that.

On the driver side, a default batch implementation is provided.
It just calls the legacy poll method many times.

Drivers can override it to provide a better implementation.
For example, kafka handles batches natively and takes advantage of this.

Change-Id: I16184da24b8661aff7f4fba6196ecf33165f1a77
2015-12-08 09:14:20 +01:00
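
A default batch implementation of the kind described in commit 4dd644ac20 above could look roughly like this. The function name, its parameters, and the assumption that `poll(timeout)` returns a message or `None` on timeout are all hypothetical; the real driver interface may differ.

```python
import time


def poll_batch(poll, batch_size, batch_timeout):
    """Collect up to batch_size messages by calling poll() repeatedly.

    poll is assumed to take a timeout in seconds and to return one
    message, or None if nothing arrived in time.
    """
    deadline = time.monotonic() + batch_timeout
    messages = []
    while len(messages) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # the batch window is over
        message = poll(remaining)
        if message is None:
            break  # poll timed out without a message
        messages.append(message)
    return messages
```

A driver with native batching would replace this loop with a single call to its backend.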
OpenStack Proposal Bot
a1fb6b9776 Updated from global requirements
Change-Id: Ie3e254b5b37a1d74eeb24ce1ae179ca9b4e84707
2015-12-08 02:32:50 +00:00
Mehdi Abaakouk
bdf287e847 creates a dispatcher abstraction
This change creates a dispatcher abstraction
to document the interface of a dispatcher.

It also allows attributes with default values in the future.

Change-Id: I9a7e5e03f89635a3790b3851f492a1a7aab58feb
2015-12-07 16:43:34 +01:00
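
As a rough illustration of commit bdf287e847 above, an abstract base class can both document an interface and carry attributes with default values. The class names, the `dispatch` method, and the `batch_size` attribute below are hypothetical and are not taken from the oslo.messaging source.

```python
import abc


class DispatcherBase(abc.ABC):
    """Documents what every dispatcher must provide."""

    # Hypothetical attribute with a default value; subclasses may override it.
    batch_size = 1

    @abc.abstractmethod
    def dispatch(self, incoming):
        """Handle one incoming message and return the result."""


class EchoDispatcher(DispatcherBase):
    """Smallest possible concrete dispatcher, used only for illustration."""

    def dispatch(self, incoming):
        return "handled %s" % incoming
```

Instantiating `DispatcherBase` directly raises `TypeError`, which makes a missing method visible early.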
Stanisław Pitucha
2a4f915891 Remove unnecessary quote
Change-Id: I6ec2297495c1a7ce409ea0de9a92a9720b6e2dca
2015-12-07 15:12:19 +11:00
Stanisław Pitucha
5561a6fd0f Fix multiline strings with missing spaces
Change-Id: Ide9999f6bb80f0f87500270a4fc024462bce0dbf
2015-12-07 15:10:21 +11:00
Oleksii Zamiatin
52ccff7cbc Properly skip zmq tests without ZeroMQ being installed
With this change, import_zmq() no longer raises ImportError,
so that tests can be skipped.
The alarm about zmq unavailability is moved to the driver's init.

Change-Id: I6e6acc39f42c979333510064d9e845228400d233
Closes-Bug: #1522920
2015-12-04 23:07:09 +02:00
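
The approach in commit 52ccff7cbc above can be sketched as follows. `import_optional`, `ZmqDriverSketch`, and `ZmqTestCase` are hypothetical names for this example; only `importlib.import_module` and `unittest.skipIf` are real standard-library calls.

```python
import importlib
import unittest


def import_optional(name):
    """Return the named module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


zmq = import_optional("zmq")


class ZmqDriverSketch:
    """Hypothetical driver: complains at init time if zmq is unavailable."""

    def __init__(self, module=zmq):
        if module is None:
            raise ImportError("ZeroMQ is not available")
        self.module = module


class ZmqTestCase(unittest.TestCase):
    @unittest.skipIf(zmq is None, "ZeroMQ is not installed")
    def test_module_loaded(self):
        self.assertTrue(hasattr(zmq, "Context"))
```

Importing the test module never fails this way; the tests are reported as skipped instead.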
Mehdi Abaakouk
c1d0412e2d kombu: remove compat of folsom reply format
This change removes the codepath where _reply_q is not
present in the message dict.

This kind of message was deprecated in grizzly and cannot
be emitted since havana.

70891c271e

Change-Id: I20558d9fae8f56970c967aa0def77cfb2a1ca3ec
2015-12-04 15:25:03 +01:00
Mehdi Abaakouk
6ad70713a3 Follow the plan about the single reply message
This change removes the "send_single_reply" option as planned in the bp:

http://specs.openstack.org/openstack/oslo-specs/specs/liberty/oslo.messaging-remove-double-reply.html

Change-Id: Ib88de71cb2008a49a25f302d5e47ed587154d402
2015-12-04 15:25:03 +01:00
Jenkins
ee240fbb8d Merge "default of kombu_missing_consumer_retry_timeout" 2015-12-04 13:26:17 +00:00
Jenkins
a23f8707f8 Merge "rename kombu_reconnect_timeout option" 2015-12-04 13:10:00 +00:00
Jenkins
6e811ec2e5 Merge "Skip Cyrus SASL tests if proton does not support Cyrus SASL" 2015-12-04 13:03:14 +00:00
Mehdi Abaakouk
8c03a6db6c default of kombu_missing_consumer_retry_timeout
This changes the default of kombu_missing_consumer_retry_timeout.

The initial value of 60 seconds was chosen because the default
rpc_response_timeout is 60. That means the client doesn't wait for
its reply after rpc_response_timeout is reached, so we don't need
to retry sending its reply for longer than rpc_response_timeout.

But the real intent of kombu_missing_consumer_retry_timeout is
to mitigate the side effects when the rabbitmq server(s) die, fail over, or restart.

So the question is rather how long we expect the server(s) to take to
come back and all the oslo.messaging applications to reconnect.

In that case 60 seconds looks a bit high.

Also, these 60 seconds have a sad side effect when we can't send the reply
because the rpc client is really gone (like a nova-compute restart):
the rabbitmq connection used to send the reply is held for 60 seconds.

I propose 5 seconds because, in case of failover or restart, I expect
everything to be back to normal in less than 5 seconds.

Change-Id: I2ec174e440eb91e950d9453a9de8b97ed5888968
2015-12-04 08:00:31 +01:00
Mehdi Abaakouk
18d1708711 rename kombu_reconnect_timeout option
This change renames kombu_reconnect_timeout to
kombu_missing_consumer_retry_timeout and improves its documentation.

Change-Id: I961cf96108db2f392b13d159f516baac9ff4e989
2015-12-04 08:00:31 +01:00
Jenkins
d1e2fb3be6 Merge "Don't hold the connection when reply fail" 2015-12-03 21:06:21 +00:00
Kenneth Giusti
822b803fb0 Skip Cyrus SASL tests if proton does not support Cyrus SASL
Change-Id: I265d17a2c92b97777a5a97683b95427825872d3a
Closes-Bug: #1508523
2015-12-03 14:29:13 -05:00
Davanum Srinivas
74a0ec8b1c setUp/tearDown decorator for set/clear override
Problem with recursion shows up only in full runs
of Nova for example. So split the code that sets
up the decorator and add a method to cleanup
the decorated set_override during teardown.

Also add a decorator for clear_override similar to
the one for set_override.

Added more tests for all the above.

Change-Id: Ib16af2e770e96d971aef7f5c5d48ffd781477cfe
2015-12-03 11:13:35 +00:00
Jenkins
36fc947b15 Merge "doc: explain rpc call/cast expection" 2015-12-03 07:16:03 +00:00
Davanum Srinivas
b6ad95e1ca Support older notifications set_override keys
Neutron and Ceilometer use set_override to set
the older deprecated key. We should support them
using the ConfFixture

Closes-Bug: #1521776
Change-Id: I2bd77284f80bc4525f062f313b1ec74f2b54b395
2015-12-02 14:05:02 +00:00
Mehdi Abaakouk
daddb82788 Don't hold the connection when reply fail
This change moves the reply retry code to the upper layer
to be able to release the connection while we wait between
two retries.

In the worst scenario, a client waits for more than 30 replies
and dies or restarts; the server tries to send these 30 replies to
this client and can wait up to 60s per reply. During this time,
replies for other clients are just stuck.

This change fixes that.

Related-bug: #1477914
Closes-bug: #1521958

Change-Id: I0d3c16ea6d2c1da143de4924b3be41d1cea159bd
2015-12-02 12:59:59 +01:00
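
The idea behind commit daddb82788 above, releasing the connection between retries instead of holding it, can be sketched like this. The pool is modelled as a plain list, and `DestinationNotFound`, `acquire`, and `send_reply_with_retry` are hypothetical names; the timeout and interval defaults are illustrative only.

```python
import time
from contextlib import contextmanager


class DestinationNotFound(Exception):
    """Hypothetical error raised when the client's exchange is missing."""


@contextmanager
def acquire(pool):
    """Take a connection from the pool and always give it back."""
    conn = pool.pop()
    try:
        yield conn
    finally:
        pool.append(conn)


def send_reply_with_retry(pool, send, reply, timeout=60.0, interval=0.25,
                          sleep=time.sleep, clock=time.monotonic):
    """Retry send(conn, reply) until it works or the timeout expires."""
    deadline = clock() + timeout
    while True:
        try:
            with acquire(pool) as conn:
                send(conn, reply)
            return True
        except DestinationNotFound:
            if clock() >= deadline:
                return False  # give up and drop the reply
        # The connection is back in the pool here, so other replies
        # can use it while this one waits for its next attempt.
        sleep(interval)
```

Because the wait happens outside the `with` block, a reply to a dead client no longer blocks replies to live clients.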
Jenkins
ba42571b5a Merge "ignore .eggs directory" 2015-12-02 05:09:50 +00:00
Mehdi Abaakouk
cc97ba2e17 doc: explain rpc call/cast expection
This change adds some doc about remote method execution expectation
when rpc call/cast is used.

Change-Id: Idb26413fc9a6747ebcd6fd32b82f63ea97bfae16
2015-12-01 15:50:50 +01:00
Komei Shimamura
67c63031f5 Add a driver for Apache Kafka
Adding a driver for Apache Kafka connection, supporting
notification via Kafka. This driver is experimental
until it has functional and integration tests.

Change-Id: I7a5d8e3259b21d5e29ed3b795d04952e1d13ad08
Implements: blueprint adding-kafka-support
2015-12-01 14:20:33 +00:00
Davanum Srinivas
33c1010c32 Option group for notifications
In change Ief6f95ea906bfd95b3218a930c9db5d8a764beb9, we 
decoupled RPC and Notifications a bit. We should take another
step and separate out the options for notifications into 
its own group.

Change-Id: Ib51e2839f9035d0cc0e3f459939d9f9003a8c810
2015-11-30 19:30:05 +00:00
Jenkins
f4f40ea9a5 Merge "Move ConnectionPool and ConnectionContext outside amqp.py" 2015-11-30 18:33:06 +00:00
Jenkins
19196fd20f Merge "Use round robin failover strategy for Kombu driver" 2015-11-30 14:44:07 +00:00
Davanum Srinivas
357dcb75ab Move ConnectionPool and ConnectionContext outside amqp.py
ConnectionPool and ConnectionContext can be used by other
drivers (like Kafka) and hence should be outside of amqp.py.
* Moving ConnectionPool to pool.py
* Moving ConnectionContext to common.py
* Moving a couple of global variables to common.py

No other logic changes, just refactoring

Change-Id: I85154509a361690426772ef116590d38a965ca8d
2015-11-30 11:53:48 +00:00
Dmitry Mescheryakov
6ae46796a6 Use round robin failover strategy for Kombu driver
The shuffle strategy we use right now leads to increased reconnection time
and provides no benefit. Sometimes it might lead to RPC operation
timeouts because the strategy provides no guarantee on how long the
reconnection process will take. See the referenced bug for details.

On the other hand, the round-robin strategy provides the lowest achievable
reconnection time. It also guarantees that if K of N RabbitMQ
hosts are alive, it will take at most N - K + 1 attempts to
successfully reconnect to the RabbitMQ cluster.

With the shuffle strategy, during failover clients connect to random hosts
and so the load is distributed evenly between alive RabbitMQs.
But since we shuffle the list of hosts before providing it to Kombu, load
will be distributed evenly with the round-robin strategy as well.

DocImpact
A new configuration option kombu_failover_strategy for Kombu driver is
added. It determines how the next RabbitMQ node is chosen in case the
one we are currently connected to becomes unavailable. It takes effect
only if more than one RabbitMQ node is provided in config. Available
options are:

 * round-robin: each RabbitMQ host in the list is tried in cycle until
   oslo.messaging successfully connects. Since oslo.messaging
   shuffles list of RabbitMQ hosts, the order of hosts in the cycle
   will be random and will not depend on order provided in config.

 * shuffle: oslo.messaging selects a random host from the list and
   tries to connect to it. If connection fails, oslo.messaging repeats
   attempt to connect to another random host. Oslo.messaging stops
   once it successfully connects to a host. Note that in each
   iteration a host to connect is selected independently of previous
   iterations, i.e. it might happen that oslo.messaging will try to
   connect to the same host several times in a row.

The option's default value is round-robin. Before the option was
introduced, the default strategy was shuffle. For the reasoning,
see the main body of the commit message and the referenced bug.

Closes-Bug: #1519851
Change-Id: I9a510c86bd5a6ce8b707734385af1a83de82804e
2015-11-30 14:08:20 +03:00
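
The N - K + 1 bound from commit 6ae46796a6 above can be checked with a small simulation. Both functions are hypothetical models of the two strategies as the message describes them; they do not call Kombu and are not the driver's implementation.

```python
import random


def round_robin_attempts(hosts, alive, rng):
    """Count attempts until an alive host is hit, walking a shuffled list."""
    order = list(hosts)
    rng.shuffle(order)  # shuffle once, then try each host in turn
    for attempts, host in enumerate(order, start=1):
        if host in alive:
            return attempts
    return None  # no host is alive


def shuffle_attempts(hosts, alive, rng, max_attempts=1000):
    """Count attempts when every try picks an independent random host."""
    for attempts in range(1, max_attempts + 1):
        if rng.choice(hosts) in alive:
            return attempts
    return None  # gave up; the strategy itself has no upper bound
```

With N = 5 hosts and K = 2 alive, the round-robin model never needs more than 4 attempts, because at most N - K dead hosts can come first in the order. The shuffle model can pick the same dead host repeatedly, so its attempt count is unbounded in principle.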
Jenkins
118da5ffaa Merge "Updated from global requirements" 2015-11-29 12:53:30 +00:00
Jenkins
c57ff6173c Merge "Revert "serializer: remove deprecated RequestContextSerializer"" 2015-11-29 04:36:48 +00:00
Davanum Srinivas (dims)
6cd1dcebc0 Revert "serializer: remove deprecated RequestContextSerializer"
This reverts commit fb2037bcb492137ee7de5488c30ef8941b914e13.

Change-Id: I9b32708340c232369940738ade14cb6cbb02b331
2015-11-29 02:21:46 +00:00