21 Commits

Author SHA1 Message Date
Will Szumski
d05578f59f Add extras directory to prometheus config
This provides a generic mechanism to include extra files
that you can reference in prometheus.yml, for example:

scrape_configs:
  - job_name: ipmi
    params:
      module: default
    scrape_interval: 1m
    scrape_timeout: 30s
    metrics_path: /ipmi
    scheme: http
    file_sd_configs:
    - files:
      - /etc/prometheus/extras/file_sd/ipmi-exporter-targets.yml
      refresh_interval: 5m
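
The file named under file_sd_configs holds a list of target groups, each with
a "targets" list and optional "labels". As a minimal sketch, assuming
hypothetical BMC addresses and a hypothetical "site" label (neither comes from
this commit), such a file could be generated like this:

```python
import json


def render_file_sd(groups):
    """Render Prometheus file_sd target groups as YAML text.

    Each group is a dict with a "targets" list and an optional "labels" dict.
    """
    lines = []
    for group in groups:
        lines.append("- targets:")
        for target in group["targets"]:
            # json.dumps yields a double-quoted string, a valid YAML scalar.
            lines.append("    - " + json.dumps(target))
        labels = group.get("labels", {})
        if labels:
            lines.append("  labels:")
            for key, value in sorted(labels.items()):
                lines.append("    " + json.dumps(key) + ": " + json.dumps(value))
    return "\n".join(lines) + "\n"


# Hypothetical addresses and label, for illustration only.
example = render_file_sd([
    {"targets": ["192.0.2.10", "192.0.2.11"], "labels": {"site": "example"}},
])
print(example)
```

The resulting text would be written to a file under the extras directory, such
as the ipmi-exporter-targets.yml path referenced in the example above.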

Change-Id: Ie2f085204b71725b901a179ee51541f1f383c6fa
Related: blueprint custom-prometheus-targets
2020-05-11 13:47:12 +01:00
Zuul
9995f2d89d Merge "Fix Prometheus mysqld exporter pointing to VIP address" 2020-03-01 13:07:33 +00:00
Radosław Piliszek
410fcc6363 Fix Prometheus mysqld exporter pointing to VIP address
Change-Id: I4f553bd0888e200ddf744604c5029e67a95ee2cd
Closes-bug: #1863041
2020-02-13 10:27:45 +01:00
Michal Nasiadka
4e6fe7a6da Remove kolla-ceph
Kolla-Ansible Ceph deployment mechanism has been deprecated in Train [1].

This change removes the Ansible code and associated CI jobs.

[1]: https://review.opendev.org/669214

Change-Id: Ie2167f02ad2f525d3b0f553e2c047516acf55bc2
2020-02-11 11:42:06 +01:00
Scott Solkhon
991bdc5f55 Fix Prometheus template generation
In a deployment where Prometheus is enabled and
Alertmanager is disabled the task "Copying over
prometheus config file" in
'ansible/roles/prometheus/tasks/config.yml' will
fail to template the Prometheus configuration file
'ansible/roles/prometheus/templates/prometheus.yml.j2'
as the variable 'prometheus_alert_rules' does not
contain the key 'files'. This commit fixes this bug.
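
The failure mode can be illustrated in plain Python. This is a sketch of one
way to guard against the missing key, not necessarily how this commit fixes
it. It assumes that 'prometheus_alert_rules' is a dict-like registered result
and that each entry under 'files' has a 'path' key; both are assumptions.

```python
def alert_rule_paths(prometheus_alert_rules):
    """Return rule file paths, tolerating a result without a 'files' key."""
    # Assumption: when Alertmanager is disabled the result has no 'files'
    # key, so fall back to an empty list instead of raising KeyError.
    files = prometheus_alert_rules.get("files", [])
    return [entry["path"] for entry in files]


# Hypothetical inputs: a result without 'files' and one with a single rule file.
without_files = {"skipped": True, "changed": False}
with_files = {"files": [{"path": "/etc/kolla/config/prometheus/example.rules"}]}

print(alert_rule_paths(without_files))
print(alert_rule_paths(with_files))
```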

Change-Id: Idbe1e52dd3693a6f168d475f9230a253dae64480
Closes-Bug: #1854540
2019-11-30 22:54:22 +00:00
Radosław Piliszek
bc053c09c1 Implement IPv6 support in the control plane
Introduce kolla_address filter.
Introduce put_address_in_context filter.

Add AF config to vars.

Address contexts:
- raw (default): <ADDR>
- memcache: inet6:[<ADDR>]
- url: [<ADDR>]
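
The three contexts listed above can be sketched as a small function. This is
an illustration only, not the filter shipped by this commit. It assumes that
IPv4 addresses are returned undecorated in every context and that the input
is a literal IP address rather than a hostname.

```python
import ipaddress


def put_address_in_context(address, context="raw"):
    """Wrap an IP address for the given context (illustrative sketch)."""
    if context not in ("raw", "memcache", "url"):
        raise ValueError("unknown context: " + context)
    # ip_address() raises ValueError for anything that is not a literal IP.
    is_ipv6 = ipaddress.ip_address(address).version == 6
    # Assumption: IPv4 addresses need no decoration in any context.
    if context == "raw" or not is_ipv6:
        return address
    if context == "memcache":
        return "inet6:[" + address + "]"
    # Remaining context is "url": bracket the IPv6 literal.
    return "[" + address + "]"


print(put_address_in_context("fd00::1"))
print(put_address_in_context("fd00::1", "memcache"))
print(put_address_in_context("fd00::1", "url"))
```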

Other changes:

globals.yml - mention just IP in comment

prechecks/port_checks (api_intf) - kolla_address handles validation

3x interface conditional (swift configs: replication/storage)

2x interface variable definition with hostname
(haproxy listens; api intf)

1x interface variable definition with hostname with bifrost exclusion
(baremetal pre-install /etc/hosts; api intf)

neutron's ml2 'overlay_ip_version' set to 6 for IPv6 on tunnel network

basic multinode source CI job for IPv6

prechecks for rabbitmq and qdrouterd use proper NSS database now

MariaDB Galera Cluster WSREP SST mariabackup workaround
(socat and IPv6)

Ceph naming workaround in CI
TODO: probably needs documenting

RabbitMQ IPv6-only proto_dist

Ceph ms switch to IPv6 mode

Remove neutron-server ml2_type_vxlan/vxlan_group setting
as it is not used (let's avoid any confusion)
and could break setups without proper multicast routing
if it started working (also IPv4-only)

haproxy upgrade checks for slaves based on ipv6 addresses

TODO:

ovs-dpdk grabs ipv4 network address (w/ prefix len / submask)
not supported, invalid by default because neutron_external has no address
No idea whether ovs-dpdk works at all atm.

ml2 for xenapi
Xen is not supported too well.
This would require working with XenAPI facts.

rp_filter setting
This would require meddling with ip6tables (there is no sysctl param).
By default nothing is dropped.
Unlikely we really need it.

ironic dnsmasq is configured IPv4-only
dnsmasq needs DHCPv6 options and testing in vivo.

KNOWN ISSUES (beyond us):

One cannot use IPv6 address to reference the image for docker like we
currently do, see: https://github.com/moby/moby/issues/39033
(docker_registry; docker API 400 - invalid reference format)
workaround: use hostname/FQDN

RabbitMQ may fail to bind to IPv6 if hostname resolves also to IPv4.
This is due to old RabbitMQ versions available in images.
IPv4 is preferred by default and may fail in the IPv6-only scenario.
This should be no problem in real life as IPv6-only is indeed IPv6-only.
Also, when new RabbitMQ (3.7.16/3.8+) makes it into images, this will
no longer be relevant as we supply all the necessary config.
See: https://github.com/rabbitmq/rabbitmq-server/pull/1982

For reliable runs, at least Ansible 2.8 is required (2.8.5 confirmed
to work well). Older Ansible versions are known to miss IPv6 addresses
in interface facts. This may affect redeploys, reconfigures and
upgrades which run after VIP address is assigned.
See: https://github.com/ansible/ansible/issues/63227

Bifrost Train does not support IPv6 deployments.
See: https://storyboard.openstack.org/#!/story/2006689

Change-Id: Ia34e6916ea4f99e9522cd2ddde03a0a4776f7e2c
Implements: blueprint ipv6-control-plane
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-10-16 10:24:35 +02:00
Dincer Celik
5ff7bab46b [prometheus] Added support for extra options
This change introduces a way to pass extra options to prometheus.

Currently, prometheus runs with nearly default options, and when clouds
start getting bigger, you need to pass extra parameters to prometheus.

Change-Id: Ic773c0b73062cf3b2285343bafb25d5923911834
2019-09-23 11:25:04 +03:00
Zuul
b7bbbae981 Merge "Adding Prometheus blackbox exporter" 2019-09-20 17:25:04 +00:00
Scott Solkhon
b22375ebfd Adding Prometheus blackbox exporter
This commit follows up the work in Kolla to deploy and configure the
Prometheus blackbox exporter.

An example blackbox-exporter module has been added (disabled by default)
called os_endpoint. This allows for the probing of endpoints over HTTP
and HTTPS. This can be used to monitor that OpenStack endpoints return a status
code of either 200 or 300, and the word 'versions' in the payload.
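
The pass condition described above can be restated as a short check. This is
plain Python for illustration, not blackbox exporter configuration, and the
function name is hypothetical.

```python
def endpoint_looks_healthy(status_code, body):
    """Return True if a probe response meets the os_endpoint criteria."""
    # The status code must be 200 or 300, and the payload must mention 'versions'.
    return status_code in (200, 300) and "versions" in body


# Hypothetical responses, for illustration only.
print(endpoint_looks_healthy(300, '{"versions": []}'))
print(endpoint_looks_healthy(503, "Service Unavailable"))
```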

This change introduces a new variable `prometheus_blackbox_exporter_endpoints`.
Currently no defaults are specified because the configuration is heavily
dependent on the deployment.

Co-authored-by: Jack Heskett <Jack.Heskett@gresearch.co.uk>
Change-Id: I36ad4961078d90e2fd70c9a3368f5157d6fd89cd
2019-09-18 11:06:19 +01:00
Mark Flynn
01eb7a63a5 Fix prometheus-alertmanager cluster bug
Edited the
ansible/roles/prometheus/templates/prometheus-alertmanager.json.j2 file
to change mesh.peer and mesh.listen-address to cluster.peer and
cluster.listen-address. This stopped alertmanager from crashing with the
error "--mesh.peer is an invalid flag".

Change-Id: Ia0447674b9ec377a814f37b70b4863a2bd1348ce
Signed-off-by: Mark Flynn <markandrewflynn@gmail.com>
2019-09-13 14:16:42 -04:00
Doug Szumski
9d495504be Set external web URL for Prometheus services
This change ensures that URLs returned from these services reference
the HAProxy endpoint, rather than the host on which the service is
running.

Closes-Bug: #1825150
Change-Id: I7f966ff749ea37620f1bde7019a598cb9505fa45
2019-04-17 11:24:52 +01:00
Erol Guzoglu
14ab9a7c4e Support the prometheus elasticsearch exporter
This patch implements the support for the elasticsearch-exporter in
kolla-ansible

The configuration and prechecks are reused from the other exporters

Depends-On: Id138f12e10102a6dd2cd8d84f2cc47aa29af3972
Change-Id: Iae0eac0179089f159804490bf71f1cf2c38dde54
2019-03-11 17:25:51 +03:00
Doug Szumski
a55769b00a Update arguments for starting Prometheus exporters
The patch that this depends on in the Kolla repo updates various Prometheus
exporters. In some cases the command line syntax has changed which prevents
them from starting. This commit updates the command line syntax in-line with
the new versions.

Depends-On: I846989b16fa7f76b11b309b7a9764cec8aaf538d
Change-Id: I1c8c56059e51442d7bf2248b9632021cb529b4ba
2019-02-28 09:41:32 +00:00
Jorge Niedbalski
6c64b7c732 [prometheus] Support the prometheus openstack exporter
This patch implements the initial support for the
openstack-exporter[0] in the kolla-ansible
prometheus monitoring system.

The configuration and prechecks are reused from the other
exporters and a new template is provided for generating
a os-client-config file required by the exporter.

The default scrape interval is 60 seconds, but it can
be extended via a configuration option.

[0] https://github.com/Linaro/openstack-exporter

Change-Id: I4a34c4bb56e74b5cd544972cbd6540d9acb6e4a1
2019-01-21 10:41:35 -03:00
Kien Nguyen
835368524e Add Prometheus as Vitrage datasource
Vitrage already supports Prometheus as a
datasource. Kolla can configure it automatically
with only a few small changes, for example in the
WSGI config file [1].

Co-Authored-By: Hieu LE <hieulq2@viettel.com.vn>

[1] https://review.openstack.org/#/c/584649/8/devstack/apache-vitrage.template

Change-Id: I64028a0dfd9887813b980a31c30c2c1b1046da61
2018-12-11 16:05:05 +07:00
Jorge Niedbalski
0ec41f2092 [prometheus] Allow custom alert rules to be configured.
This patch extends the configuration task for prometheus
to allow the operator to pass a set of prometheus alert
rule files that will be used by alertmanager to produce
alerts.

This functionality is only enabled when the prometheus-alertmanager
service is enabled.

Change-Id: I882759c3774f43640631c1058f8a9cb24e7a60d2
Closes-Bug: #1776529
Signed-off-by: Jorge Niedbalski <jorge.niedbalski@linaro.org>
2018-08-08 12:48:41 -04:00
Jorge Niedbalski
9d2770db11 [prometheus] Enable ceph mgr exporter
This patch enables the ceph mgr prometheus exporter.

If enable_prometheus_ceph_mgr_exporter is set to true,
the ceph mgr prometheus plugin is enabled on the hosts that are part
of the ceph-mgr group, then the exporter is added into the prometheus-server
configuration file.

Change-Id: Ia2f879401e585e6043f69cc5e3ab1a1f72f7f033
2018-07-23 05:39:52 +00:00
Jorge Niedbalski
1596475db6 [prometheus] Initial implementation of prometheus-alertmanager
This patch extends the prometheus role so that it can
deploy the prometheus-alertmanager[0] container.

The variable enable_prometheus_alertmanager
decides if the container should be deployed and enabled.

If enabled, the following configuration and actions are performed:

- The alerting section of the prometheus-server configuration
is added, pointing to the prometheus-alertmanager host group as targets.

- HAProxy is configured to load-balance over the prometheus-alertmanager
host group. (external/internal).

Please note that a default (dummy) configuration is provided that
allows the service to start. The operator should extend it via a node custom config.

[0] https://github.com/openstack/kolla/tree/master/docker/prometheus/prometheus-alertmanager

Change-Id: I3a13342c67744a278cc8d52900a913c3ccc452ae
Closes-Bug: 1774725
Signed-off-by: Jorge Niedbalski <jorge.niedbalski@linaro.org>
2018-07-11 16:20:35 -04:00
Mark Giles
41254b6c46 Add cAdvisor for Prometheus monitoring
cAdvisor (Container Advisor) provides metrics on resource usage and
performance characteristics of running containers.  This change
deploys a cadvisor container and configures prometheus to scrape
data from it.

Change-Id: I55dd4fee954f9be68efda397746861ddaaa0a565
Partially-Implements: blueprint prometheus
2018-05-29 08:55:58 -04:00
Jorge Niedbalski
3b61cc702d [prometheus] Add memcached_exporter.
This patch adds the prometheus_memcached_exporter[0] to the
list of available exporters, following the conventions
used by the previously integrated exporters.

[0] https://github.com/openstack/kolla/tree/master/docker/prometheus-memcached-exporter

Change-Id: I103b0ee19ef2fd17ce19a27d60773675ad234c1c
Closes-Bug: #1773303
Signed-off-by: Jorge Niedbalski <jorge.niedbalski@linaro.org>
2018-05-25 01:45:13 -04:00
Mathias Ewald
4d1f37359d Add role to deploy prometheus
This patch adds the ansible role to deploy the prometheus service, which
can be used to collect performance metrics across the environment.

Partially-Implements: blueprint prometheus
Change-Id: I908b9c9dad63ab5c9b80be1e3a80a4fc8191cb9e
2018-04-19 10:58:15 -04:00