In Ussuri, nova stopped using separate Ceph keys for the volumes and vms
pools by default. Instead, we set ceph_nova_keyring to the value of
ceph_cinder_keyring by default, which is ceph.client.cinder.keyring.
This is in line with the Ceph OpenStack integration guide [1]. However,
the user used by nova to access the vms pool (ceph_nova_user) defaults
to nova, meaning that nova will still try to use a
ceph.client.nova.keyring, which probably does not exist. We did not see
this issue in CI, because we set ceph_nova_user to cinder.
This change fixes the issue by setting ceph_nova_user to the value of
ceph_cinder_user by default, which is cinder.
Closes-Bug: #1934145
Related-Bug: #1928690
[1] https://docs.ceph.com/en/latest/rbd/rbd-openstack/
Change-Id: I6aa8db2214e07906f1f3e035411fc80ba911a274
On machines with many cores, we were seeing excessive CPU load on systems
that were not very busy. With the following Erlang VM argument we saw
RabbitMQ CPU usage drop from about 150% to around 20%, on a system with
40 hyperthreads.
+S 2:2
By default RabbitMQ starts N schedulers where N is the number of CPU
cores, including hyper-threaded cores. This is fine when you assume all
your CPUs are dedicated to RabbitMQ. Its not a good idea in a typical
Kolla Ansible setup. Here we go for two scheduler threads.
More details can be found here:
https://www.rabbitmq.com/runtime.html#scheduling
and here:
https://erlang.org/doc/man/erl.html#emulator-flags
+sbwt none
This stops busy waiting of the scheduler, for more details see:
https://www.rabbitmq.com/runtime.html#busy-waiting
Newer versions of rabbit may need additional flags:
"+sbwt none +sbwtdcpu none +sbwtdio none"
But this patch should be back portable to older versions of RabbitMQ
used in Train and Stein.
Note that information on this tuning was found by looking at data from:
rabbitmq-diagnostics runtime_thread_stats
More details on that can be found here:
https://www.rabbitmq.com/runtime.html#thread-stats
Related-Bug: #1846467
Change-Id: Iced014acee7e590c10848e73feca166f48b622dc
In the Xena cycle it was decided to remove the Monasca
Grafana fork due to lack of maintenance. This commit removes
the service and provides a limited workaround using the
Monasca Grafana datasource with vanilla Grafana.
Depends-On: I9db7ec2df050fa20317d84f6cea40d1f5fd42e60
Change-Id: I4917ece1951084f6665722ba9a91d47764d3709a
Allow users to import custom grafana dashboards.
Dashboards as JSON files should be placed into
"{{ node_custom_config }}/grafana/dashboards/" folder.
Change-Id: Id0f83b8d08541b3b74649f097b10c9450201b426
This change allows a user to forward control plane logs
directly to Elasticsearch from Fluentd, rather than via
the Monasca Log API when Monasca is enabled. The Monasca
Log API can continue to handle tenant logs.
For many use cases this is simpler, reduces resource
consumption and helps to decouple control plane logging
services from tenant logging services.
It may not always be desired, so is optional and off by
default.
Change-Id: I195e8e4b73ca8f573737355908eb30a3ef13b0d6
The Monasca alerting pipeline provides multi-tenancy alerts and
notifications. It runs as an Apache Storm topology and generally
places a significant memory and CPU burden on monitoring hosts,
particularly when there are lot of metrics. This is fine if the
alerting service is in use, but sometimes it is not. For example
you may use Prometheus for monitoring the control plane, and
wish to offer tenants a monitoring service via Monasca without
alerting and notification functionality. In this case it makes
sense to disable this part of the Monasca pipeline and this patch
adds support for that.
If the service is ever re-enabled, all alerts and notifications
should spawn back automatically since they are persisted in the
central mysql database cluster.
Change-Id: I84aa04125c621712f805f41c8efbc92c8e156db9
The Log Metrics service is an admin only service. We now have
support in Fluentd via the Prometheus plugin to create metrics
from logs. These metrics can be scraped into Monasca or Prometheus.
It therefore makes sense to deprecate this service, starting by
disabling it by default, and then removing it in the Xena release.
This should improve the stability of the Monasca metrics pipeline
by ensuring that all metrics pass via the Monasca API for
validation, and ensure that metrics generated from logs are
available to both Prometheus and Monasca users by default.
Change-Id: I704feb4434c1eece3eb00c19dc5f934fd4bc27b4
Historically Monasca Log Transformer has been for log
standardisation and processing. For example, logs from different
sources may use slightly different error levels such as WARN, 5,
or WARNING. Monasca Log Transformer is a place where these could
be 'squashed' into a single error level to simplify log searches
based on labels such as these.
However, in Kolla Ansible, we do this processing in Fluentd so
that the simpler Fluentd -> Elastic -> Kibana pipeline also
benefits. This helps to avoid spreading out log parsing
configuration over many services, with the Fluentd Monasca output
plugin being yet another potential place for processing (which
should be avoided). It therefore makes sense to remove this
service entirely, and squash any existing configuration which
can't be moved to Fluentd into the Log Perister service. I.e.
by removing this pipeline, we don't loose any functionality,
we encourage log processing to take place in Fluentd, or at least
outside of Monasca, and we make significant gains in efficiency
by removing a topic from Kafka which contains a copy of all logs
in transit.
Finally, users forwarding logs from outside the control plane,
eg. from tenant instances, should be encouraged to process the
logs at the point of sending using whichever framework they are
forwarding them with. This makes sense, because all Logstash
configuration in Monasca is only accessible by control plane
admins. A user can't typically do any processing inside Monasca,
with or without this change.
Change-Id: I65c76d0d1cd488725e4233b7e75a11d03866095c
This option disables copy of certificates from the operator host to
kolla-ansible managed hosts.
This is especially useful if you already have some mechanisms to handle
your certificates directly on your hosts.
Co-Authored-By: Marc 'risson' Schmitt <marc.schmitt@risson.space>
Change-Id: Ie18b2464cb5a65a88c4ac191a921b8074a14f504
Deprecates support for Prometheus v1.x.
In Xena support for it will be removed from Kolla Ansible.
Change-Id: I027b19621196c698e09f79af294ba1b5dbfc0516