43 Commits

Author SHA1 Message Date
Steven Hardy
3a7baa8fa6 Convert ServiceNetMap evals to hiera interpolation
Since https://review.openstack.org/#/c/514707/ added the net_ip_map
to hieradata, we can look up the per-network bind IPs via hiera
interpolation instead of heat map_replace.

In some cases the ServiceNetMap lookup is used for other things,
but anywhere we make use of the "magic" translation via NetIpMap
is changed the same way.

This will enable more of the configuration data to be exposed per
role vs per node in a future patch (to simplify our ansible
workflow).

Co-authored-by: Bogdan Dobrelya <bdobreli@redhat.com>
Change-Id: Ie3da9fedbfce87e85f74d8780e7ad1ceadda79c8
2018-03-10 08:18:30 +00:00
marios
dec003def8 Convert tags to when statements for Q major upgrade workflow
This converts "tags: stepN" to "when: step|int == N" for the direct
execution as an ansible playbook, with a loop variable 'step'.
The tasks all include the explicit cast |int.
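
The conversion described above can be sketched in Python. This is only an illustration, not the code used by the patch: the helper name tags_to_when and the task dictionaries are hypothetical, and it assumes tags of the exact form "stepN". An existing condition is kept and combined with the new one as a list, which Ansible evaluates as a logical AND.

```python
import re

# Assumption: step tags look exactly like "step0", "step1", ...
STEP_TAG = re.compile(r"^step(\d+)$")


def tags_to_when(task):
    """Return a copy of task with 'stepN' tags rewritten as when conditions."""
    converted = dict(task)
    tags = converted.pop("tags", None)
    if tags is None:
        return converted

    tag_list = [tags] if isinstance(tags, str) else list(tags)
    conditions = []
    remaining = []
    for tag in tag_list:
        match = STEP_TAG.match(tag)
        if match:
            # The explicit |int cast mirrors the wording of the commit message.
            conditions.append("step|int == %d" % int(match.group(1)))
        else:
            remaining.append(tag)

    if remaining:
        converted["tags"] = remaining
    if conditions:
        existing = converted.get("when")
        if existing is None:
            merged = conditions
        elif isinstance(existing, list):
            merged = existing + conditions
        else:
            merged = [existing] + conditions
        # Keep a single condition as a plain string, several as a list.
        converted["when"] = merged[0] if len(merged) == 1 else merged
    return converted
```

Merging into a single 'when' key rather than emitting a second one is in line with the duplicate 'when:' check mentioned below.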

This also adds a set_fact task for handling of the package removal
with the UpgradeRemovePackages parameter (no change to the interface)

The yaml-validate also now checks for duplicate 'when:' statements

Q upgrade spec @ Ibde21e6efae3a7d311bee526d63c5692c4e27b28
Related Blueprint: major-upgrade-workflow
[0]: 394a92f761/tripleo_common/utils/config.py (L141)
Change-Id: I6adc5619a28099f4e241351b63377f1e96933810
2018-01-08 13:57:47 +02:00
Carlos Camacho
927495fe3d Change template names to queens
The new master branch should point now to queens instead of pike.

So, HOT templates should specify that they might contain features
for queens release [1]

[1]: https://docs.openstack.org/heat/latest/template_guide/hot_spec.html#queens

Change-Id: I7654d1c59db0c4508a9d7045f452612d22493004
2017-11-23 10:15:32 +01:00
Zuul
3f42de004c Merge "RabbitMQ should use net_ticktime"
2017-11-12 05:52:42 +00:00
Emilien Macchi
24c756616c Switch RabbitFDLimit to a Puppet integer
Type changed in:
20d159dc6f

We need to update it otherwise we get a Puppet error.

Change-Id: If03b7363295f1f529b7acf4a008ff63da8fef173
Closes-Bug: #1723665
2017-10-14 14:52:48 -07:00
John Eckersberg
962ce364f8 RabbitMQ should use net_ticktime
We no longer need to force low-level TCP timeouts for dead client
detection, but should continue tuning the timeout for dead peer
detection between cluster nodes.  Using the erlang net_ticktime option
is preferable here.

Closes-Bug: 1717006
Change-Id: Ibd29c03bd69818d79396c379a2d638c018a04b82
2017-10-06 09:02:35 -04:00
John Eckersberg
833e3baeb3 rabbitmq: set cluster_partition_handling to 'ignore'
The pause_minority strategy tends to cause more problems than it
solves.  If a partition is brief enough that no nodes are fenced, the
pausing and unpausing of minority nodes (especially during a partial
partition) frequently causes rabbitmq to crash in odd ways consistent
with race conditions.

By ignoring partitions, we will tolerate brief partitions better.
Longer partitions will be handled via fencing, which does not suffer
from race conditions when pausing/unpausing nodes.

Change-Id: Icb05c6b95a207c4ef818fb90fa9a2c041a5e85cf
2017-10-06 08:49:56 -04:00
Juan Antonio Osorio Robles
1b4df60ac7 Rabbitmq: Enable Erlang distribution TLS
This will be used for the replication traffic as specified in the
dependent commit.

bp tls-via-certmonger
Change-Id: Ia53b9edaa6c6cdd48bcdde64969ae6c16f57ae41
Depends-On: I265c89cb8898a6da78a606664a22c50f5e57a847
2017-08-29 12:01:36 +00:00
Juan Antonio Osorio Robles
4bea8cf918 Use integers for rabbitmq ports
They should be integers as specified in the parameter definition
of the class. Else it'll fail.

Change-Id: I06b6e46c0722516e28e8bff4d481fb4b7a08bd61
Closes-Bug: #1713659
2017-08-29 08:28:36 +00:00
John Eckersberg
ef582bfc6a Increase default RabbitMQ/Erlang TCP timeout from 5 to 15 seconds
This should be greater than the default value of
corosync_token_timeout, which is 10 seconds.  That way, if an entire
cluster node is unavailable, appropriate fencing measures can occur.

With the current settings, it is possible for brief network
interruptions, greater than 5 seconds, but less than 10 seconds, to
occur.  This can cause the RabbitMQ cluster to fail in subtle ways,
with no corrective action taken by pacemaker.

Change-Id: I735d43616c5c623c4398d924713012f595b2e5f9
2017-07-19 11:28:39 -04:00
Giulio Fidente
baf6eee501 Adds network/cidr mapping into a new service property
Makes it possible to resolve network subnets within a service
template; the data is transported into a new property ServiceData
wired into every service which hopefully is generic enough to
be extended in the future and transport more data.

Data can be consumed in service templates to set config values
which need to know the subnet on which a daemon operates (for
example the Ceph Public vs Cluster network).

Change-Id: I28e21c46f1ef609517175f7e7ee19e28d1c0cba2
2017-07-14 13:44:04 +02:00
Carlos Camacho
0a0e2ee629 Update the template_version alias for all the templates to pike.
Master is now the development branch for pike, so this
changes the release alias name.

Change-Id: I938e4a983e361aefcaa0bd9a4226c296c5823127
2017-05-19 09:58:07 +02:00
Saravanan KR
a096ddab34 Add role specific information to the service template
When a service is enabled on multiple roles, the parameters for the
service will be global. This change enables an option to provide
role specific parameter to services and other templates.

Two new parameters, RoleName and RoleParameters, are added to the
service template. RoleName provides the name of the role on which the
current instance of the service is being applied. RoleParameters
provides the list of parameters which are configured specifically for
the role in the environment file, like below:

  parameters_default:
      # Default value applied to all roles
      NovaReservedHostMemory: 2048
      ComputeDpdkParameters:
          # Applied only to ComputeDpdk role
          NovaReservedHostMemory: 4096

In the above sample, the cluster contains 2 roles: Compute and ComputeDpdk.
The values of ComputeDpdkParameters will be passed on to the templates
as RoleParameters while creating the stack for the ComputeDpdk role. A
parameter which supports role-specific configuration should be looked
up first in the RoleParameters list; if it is not found there, the
default (for all roles) should be used.
Implements: blueprint tripleo-derive-parameters
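
The lookup order described above (role-specific value first, then the default for all roles) can be sketched in Python. This is an illustration only: the helper names role_parameters_for and resolve_parameter are hypothetical, and the "<RoleName>Parameters" key convention is inferred from the ComputeDpdkParameters sample rather than taken from the templates themselves.

```python
# Mirrors the parameters_default sample from the commit message.
parameters_default = {
    "NovaReservedHostMemory": 2048,
    "ComputeDpdkParameters": {
        "NovaReservedHostMemory": 4096,
    },
}


def role_parameters_for(role_name, parameters):
    """Return the role-specific overrides, or an empty dict if none exist.

    Assumption: overrides live under the key '<RoleName>Parameters'.
    """
    return parameters.get(role_name + "Parameters", {})


def resolve_parameter(name, role_name, parameters):
    """Prefer the role-specific value, fall back to the default for all roles."""
    role_parameters = role_parameters_for(role_name, parameters)
    if name in role_parameters:
        return role_parameters[name]
    return parameters[name]


compute_dpdk_memory = resolve_parameter(
    "NovaReservedHostMemory", "ComputeDpdk", parameters_default)
compute_memory = resolve_parameter(
    "NovaReservedHostMemory", "Compute", parameters_default)
```

With the sample data, the ComputeDpdk role resolves to 4096 while the Compute role falls back to 2048.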

Change-Id: I72376a803ec6b2ed93903cc0c95a6ffce718b6dc
2017-05-15 10:06:46 +05:30
Michele Baldessari
90fc4b2e27 Change the default for rabbitmq back to ha-mode: all
In change Ib62001c03e1e08f58cf0c6e0ba07a8879a584084 we switched the
rabbitmq queues HA mode from ha-all to ha-exactly. While this gives us a
nice performance boost with rabbitmq, it makes rabbit less resilient to
network glitches as we painfully found out via
https://bugzilla.redhat.com/show_bug.cgi?id=1441635.

This is the THT part of the change that changes the default to
ha-mode: all.

Closes-Bug: #1686337
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Co-Authored-By: John Eckersberg <jeckersb@redhat.com>

Change-Id: I7afcf2b3c8deb13fc2134e4cae9c06a44e775384
Depends-On: I9a90e71094b8d8d58b5be0a45a2979701b0ac21c
2017-04-26 15:16:36 +02:00
Juan Antonio Osorio Robles
69c213e3e3 Rabbitmq: Use conditional instead of nested stack for TLS-specific bits
Usually a nested stack is used that contains the TLS-everywhere bits
(config_settings and metadata_settings). Nested stacks are very
resource intensive. So, instead of using nested stacks, this patch
uses a conditional and outputs the necessary config_settings and
metadata_settings this way in an attempt to save resources.

Change-Id: Ic25f84a81aefef91b3ab8db2bc864853ee82c8aa
2017-03-27 13:33:12 +03:00
Juan Antonio Osorio Robles
1992282b88 Pass hieradata for internal TLS for RabbitMQ
As with other services, this passes the necessary hieradata to enable
TLS for RabbitMQ. This will mean (once we set it via puppet-tripleo)
that there will only be TLS connections, as the ssl_only option is being
used.

bp tls-via-certmonger

Change-Id: I960bf747cd5e3040f99b28e2fc5873ca3a7472b5
Depends-On: Ic2a7f877745a0a490ddc9315123bd1180b03c514
2017-03-09 11:08:41 +00:00
Jenkins
f0bed4c7e7 Merge "Force epmd listening to a specific address"
2017-01-20 08:39:37 +00:00
Steven Hardy
04ed7e511d Add neutron service support for composable upgrades
Change-Id: I9c6116ddb4475b798876635cbb701214759fa33b
Partially-Implements: blueprint overcloud-upgrades-per-service
2017-01-13 14:10:55 +00:00
James Slagle
3bd90e2ab8 Set rabbitmq package_provider to yum
When deploying with EnablePackageInstall:True, the rabbitmq puppet
module defaults to the rpm package provider, which then tries to "rpm -i
undef" since we are setting rabbitmq::package_source to undef. Instead
of using the rpm provider at all, we should just use the yum provider to
install whatever rabbitmq rpm's are found in enabled repos.

Change-Id: I29365e675bfde676fde7a54dfc6c660c3970f50a
Partially-implements: blueprint split-stack-software-configuration
2017-01-04 14:22:07 -05:00
Michele Baldessari
437f4df0ea Force epmd listening to a specific address
With this change we export ERL_EPMD_ADDRESS set to the
address rabbitmq is listening to. We need to explicitly
export it so that epmd can pick it up and bind to the address.

Closes-Bug: #1645898

Change-Id: Iacb2ee262da419f61ec3511f42b395f69f5d14da
2016-12-31 23:01:30 +01:00
Steven Hardy
3c6ec654b4 Bump template version for all templates to "ocata"
Heat now supports release name aliases, so we can replace
the inconsistent mix of date related versions with one consistent
version that aligns with the supported version of heat for this
t-h-t branch.

This should also help new users who sometimes copy/paste old templates
and discover intrinsic functions in the t-h-t docs don't work because
their template version is too old.

Change-Id: Ib415e7290fea27447460baa280291492df197e54
2016-12-23 11:43:39 +00:00
Juan Antonio Osorio Robles
de923539c8 Set rabbitmq's port and IP via the config file and not the env file
The RabbitMQ's puppet manifest configures the node's IP and port through
environment variables. While this would usually be fine, it doesn't
allow us to use TLS-only, since it will always try to start a TCP
listener. So, by setting these values through the config file, when
setting ssl_only for rabbitmq, they will effectively be discarded and
thus allow us to use an SSL listener on the same port.

Change-Id: I33d051a8c740baf69b99517378e1f9b0f3cc1681
2016-12-14 14:06:21 +02:00
Jenkins
f0348b0d7a Merge "Revert "Use FQDN for rabbitmq's nodename env variable""
2016-12-02 18:07:32 +00:00
Ben Nemec
0f1022e8ee Revert "Use FQDN for rabbitmq's nodename env variable"
This seems to have broken the updates job, causing it to fail
with the following error:

Can't set long node name!\nPlease check your configuration\n

Related-Bug: 1646873

This reverts commit 3e9fcfd09320ace07bc1bd4cb57feb98cd057332.

Change-Id: I72ba891cd9cd8c4f1bc204144f46aaabbdfd3647
2016-12-02 15:45:21 +00:00
Jenkins
cefb448de5 Merge "Use FQDN for rabbitmq's nodename env variable"
2016-12-02 09:41:28 +00:00
Steven Hardy
dbece39f54 Initial support for composable upgrades with Heat+Ansible
This shows how we could wire in the upgrade steps using Ansible
as was previously proposed e.g in https://review.openstack.org/#/c/321416/
but it's more closely integrated with the new composable services
architecture.

It's also very similar to the approach taken by SpinalStack where
ansible snippets per-service were combined then run in a series of
steps using Ansible tags.

This patch just enables upgrade of keystone - we'll add support for
other services in subsequent patches.

Partially-Implements: blueprint overcloud-upgrades-per-service
Change-Id: I39f5426cb9da0b40bec4a7a3a4a353f69319bdf9
2016-12-01 13:40:50 +00:00
Juan Antonio Osorio Robles
3e9fcfd093 Use FQDN for rabbitmq's nodename env variable
Change-Id: Iee1afeced0b210a46b273aafc0d40e99d6ee6d4e
2016-12-01 11:18:23 +02:00
Michele Baldessari
c6ddaafe54 Remove double tcp_listen_options entries for rabbit
After a brand new deployment we have the following in rabbitmq.config:
...
  {rabbit, [
    {tcp_listen_options,
         [binary,
         {packet,        raw},
         {reuseaddr,     true},
         {backlog,       128},
         {nodelay,       true},
         {exit_on_close, false}]
    },
    {tcp_listen_options, [binary, {packet, raw}, {reuseaddr, true},
{backlog, 128}, {nodelay, true}, {exit_on_close, false}, {keepalive,
true}]},
...

Let's remove these duplicate entries and make sure that we use the
parameters for the puppet module to set the following value
explicitly (it's the only parameter where we do not use the default
setting from the puppet module):
keepalive = true -> rabbitmq::tcp_keepalive: true

All the other options that we set are the default in the puppet module:
{packet, raw}
{reuseaddr, true}
{backlog, 128}
{nodelay, true}
{exit_on_close, false}

Depends-On: I608477d5714a5081b3b4ab3b9fc2932bdd598301
Change-Id: I35921652bd84d1d6be0727051294983d4a0dde10
2016-10-21 07:46:12 +02:00
Jenkins
1bf2b3cc0b Merge "Balance Rabbitmq Queue Master Location on queue declaration with min-masters strategy"
2016-10-03 09:50:23 +00:00
Michele Baldessari
1c5d168544 Change rabbitmq queues HA mode from ha-all to ha-exactly
It turns out that reducing the number of rabbitmq queues in the cluster
significantly improves cluster performance, especially failover
recovery time. Right now the cluster uses ha-all mode for rabbitmq
queues.

It is best to change this to "ha-exactly" mode and reduce the number
of queue copies to ceil(N/2) where N is the number of controllers in the
cluster - so in a typical scenario of 3 controllers it would be 2 by
default.
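
The ceil(N/2) rule can be checked with a short Python sketch. The function names are hypothetical and this is not how the templates compute the value; it only reproduces the arithmetic and the JSON policy body shown in the pcs output further down.

```python
import json
import math


def ha_exactly_copies(controller_count):
    """Number of queue copies for ha-exactly mode: ceil(N/2)."""
    return math.ceil(controller_count / 2)


def ha_exactly_policy(controller_count):
    """Build the policy definition as compact JSON, without spaces."""
    policy = {
        "ha-mode": "exactly",
        "ha-params": ha_exactly_copies(controller_count),
    }
    return json.dumps(policy, separators=(",", ":"))


three_node_policy = ha_exactly_policy(3)
```

For 3 controllers this yields 2 copies; for 5 controllers it would yield 3.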

It does not make much sense to keep copies of the queues across the
whole cluster, since if the quorum of nodes is lost then the rest of the
cluster nodes will be stopped anyway. We let the user override this with a
parameter.

I.e. for a 3 node controlplane cluster we will go from this:
pcs resource show rabbitmq
 Resource: rabbitmq (class=ocf provider=heartbeat type=rabbitmq-cluster)
  Attributes: set_policy="ha-all ^(?!amq\.).* {"ha-mode":"all"}"

To this:
pcs resource show rabbitmq
 Resource: rabbitmq (class=ocf provider=heartbeat type=rabbitmq-cluster)
  Attributes: set_policy="ha-all ^(?!amq\.).* {"ha-mode":"exactly","ha-params":2}"

According to Marin Krcmarik's testing recovery time from failure was
reduced significantly.

Partial-Bug: #1628998
Change-Id: Iace6daf27a76cb8ef1050ada0de7ff1f530916c6
2016-10-01 00:28:34 +02:00
Michele Baldessari
5e41f15416 Balance Rabbitmq Queue Master Location on queue declaration with min-masters strategy
One of the controllers may become unavailable, in which case Queue
Masters will be located on the remaining available controllers during
queue declarations. Once the lost controller becomes available again,
masters of newly declared queues are not placed with priority on that
controller, even though it obviously hosts fewer queue masters. The
distribution may therefore become unbalanced, and after multiple
fail-overs one of the controllers may end up under significantly
higher load.

With rabbit 3.6.0 rabbitmq introduced a new HA feature of Queue masters
distribution - one of the strategies is min-masters, which picks the
node hosting the minimum number of masters.

One way to turn on the min-masters strategy is by adding the
following to the configuration file, rabbitmq.config:
{rabbit,[ ..
          {queue_master_locator, <<"min-masters">>},
          .. ]},

Change-Id: I61bcab0e93027282b62f2a97bec87cbb0a6e6551
Closes-Bug: #1629010
2016-09-29 18:50:39 +02:00
Michele Baldessari
859d74810b RabbitMQ threads should be configured dynamically
Currently in puppet/services/rabbitmq.yaml we hardcode the thread pool
size to 30 (via the +A30 snippet):
rabbitmq_environment:
    RABBITMQ_SERVER_ERL_ARGS: '"+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"'

Upstream rabbit has gained the ability to dynamically configure the
number of threads since 3.6.2 via the following commit:
41ce5ad808

Given that the default was hardcoded in rabbit from at least 3.4.0 up
until 3.6.2 (see LP bug associated to this commit), we can actually
remove this hardcoded value as it overrides a sane default.

Before the change:
/usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -A30 -P 1048576 ...

After the change:
/usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -P 1048576 ...

So effectively with this change we will have the following:
- With older rabbitmq versions we keep the +A30 default
- With rabbitmq versions >= 3.6.2 the thread number is dynamically
  computed to nr_cpus * 16

Change-Id: I8d30c7d141c29fcc439d40fc767498520be7966e
Closes-Bug: #1625486
2016-09-20 10:32:44 +02:00
Michele Baldessari
a1dcc16f3a Move rabbit's clustering port away from the ephemeral port range
Currently the RabbitMQ cluster uses a predefined port, 35672, for clustering.
This port belongs to the so-called ephemeral port range.

Ephemeral ports are the ports the kernel assigns to an application if it
doesn't specify which port to open. So there is a small chance that an
application started before RabbitMQ itself could grab this port.
While rather unlikely we did see this happen.
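
Whether a port falls into the ephemeral range can be checked with a small Python sketch. The range 32768-60999 is an assumption (a common Linux default); the range actually in effect on a host is exposed in /proc/sys/net/ipv4/ip_local_port_range. The function names are hypothetical.

```python
# Assumption: a common Linux default; the real range is host-specific.
ASSUMED_EPHEMERAL_RANGE = (32768, 60999)


def read_local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the ephemeral port range configured on a Linux host."""
    with open(path) as handle:
        low, high = handle.read().split()
    return int(low), int(high)


def is_ephemeral(port, port_range=ASSUMED_EPHEMERAL_RANGE):
    """Return True if port lies inside the (inclusive) ephemeral range."""
    low, high = port_range
    return low <= port <= high


old_port_is_ephemeral = is_ephemeral(35672)
new_port_is_ephemeral = is_ephemeral(25672)
```

Under that assumed range, 35672 lies inside the ephemeral range while 25672, the port listed in the SELinux policy below, lies outside it.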

Selinux change should already be in place. On my Centos 7 we have:
rabbitmq_port_t                tcp      25672
corenet_tcp_bind_rabbitmq_port(rabbitmq_t)
corenet_tcp_connect_rabbitmq_port(rabbitmq_t)

First noted via:
https://bugzilla.redhat.com/show_bug.cgi?id=1357522

Closes-Bug: #1623818

Depends-On: I0bcd0d063a7a766483426fdd5ea81cbe1dfaa348
Change-Id: I995bd96c2a17614e954ea5bbae4d58998ef420dc
2016-09-16 18:19:04 +02:00
Martin Mágr
25ad7b8e1e Availability monitoring agents support
- adds possibility to install sensu-client on all nodes
- each composable service has its own subscription

Co-Authored-By: Emilien Macchi <emilien@redhat.com>
Co-Authored-By: Michele Baldessari <michele@redhat.com>
Implements: blueprint tripleo-opstools-availability-monitoring
Change-Id: I6a215763fd0f0015285b3573305d18d0f56c7770
2016-08-31 09:22:59 -04:00
Dan Prince
92f2cfb162 Move RabbitMQ settings out of controller.yaml
This moves the config settings out of controller.yaml for RabbitMQ
and into puppet/services/rabbitmq.yaml.

Related-Bug: #1604414

Change-Id: I6b3d71653fb91b89b85dae7df4088afff22b71ac
2016-08-23 21:29:05 -04:00
Dan Prince
3b62761d2f Add DefaultPasswords to composable services
This patch adds a new DefaultPasswords parameter to
composable services. This is needed to help provide
access to top level password resources that overcloud.yaml
currently manages (passwords for Rabbit, Mysql, etc.).

Moving the RandomString resources into composable services
would cause them to regenerate within the stack. With this
approach we can leave them where they are while we deprecate
the top level mechanism and move the code that uses the
passwords into the composable services.

Change-Id: I4f21603c58a169a093962594e860933306879e3f
2016-08-18 12:45:30 -04:00
Giulio Fidente
885b37c80e Pass ServiceNetMap to services
This will be needed to pick the network where the service has
to bind to from within the service template.

Change-Id: I52652e1ad8c7b360efd2c7af199e35932aaaea8c
2016-08-18 12:36:18 -04:00
Emilien Macchi
315fa31963 Migrate Puppet Hieradata to composable services
Migrate puppet/hieradata/*.yaml parameters to puppet/services/*.yaml
except for some services that are not composable yet.

Co-Authored-By: Juan Antonio Osorio Robles <jaosorior@redhat.com>
Change-Id: I7e5f8b18ee9aa63a1dffc6facaf88315b07d5fd7
2016-07-27 12:23:38 -04:00
Dan Prince
5195d7f891 Composable firewall rules
Split out the firewall rules in puppet/hieradata/controller.yaml
into the composable services

Depends-On: Id370362ab57347b75b1ab25afda877885b047263
Change-Id: Icaecab100d3f278035fbbb3facb9bf6c62c76c03
2016-07-25 15:24:16 +02:00
Dan Prince
6b30ff11d4 Add 'service_name' to composable services
This patch adds a new service_name section to each composable
service. We now have an explicit unit test check to ensure that
service_name exists in tools/yaml-validate.py.

This patch also wires service_names into hieradata on each
of the roles so that tools can access the deployed services locally
during deployment and upgrades.

Change-Id: I60861c5aa760534db3e314bba16a13b90ea72f0c
2016-07-22 07:29:39 -04:00
Chris Jones
731940616a Increase RabbitMQ maximum file descriptors.
We now allow 65536 open file descriptors to better reflect the
real-world settings of downstream consumers of TripleO.

Change-Id: Ib04ea6afb2da1a9101839d9d70bc8891d69700ec
2016-06-24 15:20:55 +00:00
Giulio Fidente
a6438a2082 Pass MysqlVirtualIP via EndpointMap
By passing the MysqlVirtualIP via the EndpointMap we won't need it
to be provided as a parameter to the services.

This follows what is already happening for the glance registry
service with I9186e56cd4746a60e65dc5ac12e6595ac56505f0.

Change-Id: Iad2ab389bf64d0fc8b06eb0e7d29b5370ff27dff
Co-Authored-By: Juan Antonio Osorio Robles <jaosorior@redhat.com>
2016-05-30 10:22:59 +03:00
Emilien Macchi
5b95df3ee3 Deploy RabbitMQ as a composable role
Change the way to implement RabbitMQ, as a composable role.

Implements: blueprint refactor-puppet-manifests
Change-Id: I5fed5c437ad492af75791a9163f99ae292f58895
2016-05-18 21:43:31 +02:00