92 Commits

Author SHA1 Message Date
Dmitriy Rabotyagov
3d2aed2c2d Use version test instead of version_compare
This test was changed to 'version' in ansible 2.5 [1].

[1] https://docs.ansible.com/ansible/2.8/user_guide/playbooks_tests.html#version-comparison

Change-Id: I21efa77fc743f9530d307dc06c8a345475d35dfa
2019-09-10 10:09:32 +00:00
cloudnull
c12168e419
updates for elk6.7x
Some of the plugins are irrelevant now, so with this release they've been
removed. Additionally, the machine-learning switch in the updated beats
no longer does anything, so it's also been removed.

Change-Id: Ibac0177a61af5392cb80888a8fca1fa9ebe3ad4b
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-04-15 11:38:09 -05:00
cloudnull
28cb67cf33
improve deployments on 14.04
Change-Id: Ic2c335d8c3ede9dad2edb86a76139bdb71bdb6f7
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-03-07 20:05:43 -06:00
Zuul
8702dca38c Merge "Update heartbeat config for the latest stable release" 2019-02-27 14:04:25 +00:00
Kevin Carter
aabf90d1a4 Update heartbeat config for the latest stable release
Change-Id: I0db06c07ac9320c5db927f23e32fdb8194e5106b
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-27 06:26:13 +00:00
Kevin Carter
280ff11746 Update auditbeat config for the latest stable release
Change-Id: I468992009f562ca7d48fb88aab41edb552e23831
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-27 06:26:09 +00:00
Kevin Carter
c74eed3845 update packetbeat config for the latest release
Change-Id: If370e015ec2ec33b6f6e744958d7bcbed041ab42
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-26 22:29:53 -06:00
Kevin Carter
2d3c0d55f4 Update metricbeat config for the latest release
Change-Id: I312a0c272143973050f81f34867471098cec3286
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-26 22:22:46 -06:00
Kevin Carter
4490ed3dea Update journalbeat config
The journalbeat configuration has been updated to make it
similar to all other beats. This change updates our config
so that it is functional with the latest journalbeat release.

Change-Id: Ic70a031bdeb57f2f5439763a3bf9f6b7001e6a31
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-26 22:03:12 -06:00
cloudnull
a3afb64654
Ensure the default version of Java is set
When installing ES and Logstash, the system version of Java needs to
match the version of Java that ES and Logstash expect to use. This
change uses the `update-alternatives` command to set the Java version
to the expected value when more than one Java exists on a system.
The necessity for this change came from OS-level upgrades within
environments running old versions of ES. After upgrading the base OS,
our playbooks could not complete an upgrade of ES because of the
Java expectations. Once the alternatives were set accordingly, the
upgrade completed without issues.

Change-Id: I9025967f723ee17940e11789f503e342cdad6f2a
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-26 11:36:22 -06:00
cloudnull
a4d2b3c1f9 Correct service name when running with upstart
The service name for packetbeat needs to be set correctly; this change
does that.

Change-Id: I39c10914ba2d0f16b6ebb94da480ad13f455a08f
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-19 21:39:36 +00:00
cloudnull
c6493a812b
Add monitorstack data collection into ES
The monitorstack data collection can export data into elasticsearch.
A playbook has been added to deploy the data collection probes, which
will leverage systemd timers to run the probes at regular intervals. The
systemd timers will be deployed per-probe and run within the utility,
compute, and memcached hosts. Wherever the probes are deployed, an
isolated user will fence the probes from the cluster and limit
access. OpenStack probes will only be deployed when an openstack-sdk
clouds config is found within the system.

Change-Id: Ic5cd5fd51a7e0763c0a2db40af4150b8851bc748
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-18 08:46:39 -06:00
Kevin Carter
326fde4895 Increment nginx check port
In the event that both NGINX and Apache co-exist on the same machine,
the status port check for both platforms will be the same, and that will
cause one of the services to not start. This change increments the NGINX
check port to ensure there are no conflicts.

Change-Id: I03d5d351fff2d6926f35ca860c01f5a075de42aa
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-14 20:42:56 -06:00
Kevin Carter
52c2702587 Fix dashboards and possible port conflicts
This change ensures we are not creating a port conflict for apache/nginx when
status is enabled. This change also ensures dashboards are created correctly
resolving an issue with index-patterns containing a regex.

Change-Id: I8228fc9832d02518d2db843c96abf6dffc63bdfc
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-14 18:14:28 -06:00
Kevin Carter
9c9efd9eb5 Change the q_mem and h_mem to lower and upper limits
This change removes the {h,q}_mem options in favor of a new variable which
clearly states the upper and lower limits for a given deployment. This change
also makes these options a lot more conservative by default which will allow
the deployment to better run on shared infra.

Change-Id: I169f457198c11edc4881a04df65312f6c4f67feb
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-02-14 08:50:29 -06:00
cloudnull
0a0a4a0880
Add the ability to enable or disable rollups / indexes
This change creates a new option to enable or disable rollup jobs. It
also provides default basic index patterns for Kibana and for the
Elasticsearch indexes.

Change-Id: I60e96a2cdbe27de760b54c4d9d43bcde4d09bbf5
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-11 23:14:51 -06:00
cloudnull
03d25dce3d
Add logstash ingestion for collectd
This change will allow logstash to ingest metrics from collectd. New
options have been added to enable the deployment and configure it.

Change-Id: I995c0db69fc68d5f5bcae27ce16956876368e2a8
Signed-off-by: cloudnull <kevin@cloudnull.com>
2019-02-11 00:13:37 -06:00
Zuul
38f817aee7 Merge "Omit dahsboard on elk setup by default" 2019-02-04 06:02:11 +00:00
Kevin Carter
6017fc0e89 Add the ability to set the JVM heap size
This change makes it possible for users to set the `elastic_heap_size_default`
value. Before this change, the option was unreachable due to a series of
fact-generated template values. The options `elastic_heap_size` and `logstash_heap_size`
have also been exposed, giving deployers the ability to define service-specific
heap sizes as needed.

Change-Id: Ida3a57fdcff388f8e4bb3f325b787205a6183970
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-30 09:53:20 -06:00
Kevin Carter
151d80382c Omit dahsboard on elk setup by default
With the introduction of the "infrastructure" panel and "canvas" becoming
stable, there's not a lot of reason to import the general beat dashboards.
The default dashboards are almost always in a state of disrepair and take a
long time to import on high-traffic clusters.

This change removes the default dashboards from the beat setup role by
default. If a deployer wishes to re-enable the default dashboards, or add any
other beat flags, the variable `elastic_setup_flags` can be used to extend
the setup.

Change-Id: If44845f53e4d0cb1e91ec804060316fb852b4bfa
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-27 20:13:31 -06:00
Kevin Carter
82cc72e166 Read the path for the logstash queue path
The queue path within logstash may be a symlink, which will fail to mount
as tmpfs. To ensure the queue path can be tmpfs, a readlink command is
used to fetch the true path, which will be used in a mount when necessary.
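
As a rough illustration of resolving a symlinked queue path before mounting,
the sketch below uses Python's `os.path.realpath` in place of the `readlink`
command. The function name `resolve_queue_path` and the directory names are
hypothetical and are not taken from the playbooks.

```python
import os
import tempfile


def resolve_queue_path(queue_path):
    """Return the real path behind queue_path, following any symlinks."""
    return os.path.realpath(queue_path)


if __name__ == "__main__":
    # Build a throwaway directory plus a symlink to it, then resolve the link.
    with tempfile.TemporaryDirectory() as base:
        real_dir = os.path.join(base, "queue-data")  # hypothetical target
        link = os.path.join(base, "queue")           # hypothetical symlink
        os.mkdir(real_dir)
        os.symlink(real_dir, link)
        print(resolve_queue_path(link))
```

The resolved path, rather than the symlink itself, is what a mount would
then be pointed at.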

Change-Id: I5fe6bf311e0621c98766ae458371b5f11f89a61f
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-25 13:53:41 -06:00
Kevin Carter
abd6661b4e
Update conditionals and namespaced options
This change implements namespaced variables and conditionals in needed
services. This will ensure systems running these playbooks are able to
be deployed in isolation without making osa specific assumptions.

Change-Id: Ia20b8514144f0b0bf925d405f06ef2ddc28f1003
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-23 09:38:40 -06:00
Kevin Carter
892d617dc6
Add a default retention policy for skydive indexes
Given that the ops tools now have a skydive deployment capability,
curator needs to be able to detect the addition of skydive indexes and
build a curator policy accordingly.

This change adds the new retention policy to the overlay inventory
providing a sane default for most environments.

The retention action files have been updated to remove the "-" as an
index separator. This was done because not all indexes use a dash as a
divider.
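
To illustrate why a hard-coded dash can hide indexes, here is a minimal
sketch of prefix matching. The helper `matching_indexes` and the sample
index names are hypothetical; the real retention action files are curator
configuration, not Python.

```python
def matching_indexes(prefix, index_names):
    """Return the index names that start with the given prefix."""
    return [name for name in index_names if name.startswith(prefix)]


if __name__ == "__main__":
    # Hypothetical index names: two dash-separated, one not.
    names = ["filebeat-2019.01.15", "skydive_topology", "metricbeat-2019.01.15"]
    # With the dash baked into the prefix, the skydive index is missed.
    print(matching_indexes("skydive-", names))
    # Without the dash, it is found.
    print(matching_indexes("skydive", names))
```

One trade-off of a bare prefix is that it also matches any longer index
name that happens to share the same leading characters.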

Change-Id: I5b61720f27da00e0c3b92341355b09ea6c01caba
2019-01-15 17:35:09 -06:00
Kevin Carter
b23ec9f8d9 Initial commit to add skydive
This commit adds playbooks and roles to the ops tooling setup to
build, deploy, and operate environments with skydive within
them.

Skydive is a network analyzer which will allow users to explore
their topology in real-time using a defined storage back-end for
captures, alerts, and more.

The initial implementation of skydive deploys agents throughout
the environment and wires them all back to a cluster of analyzers
which leverage elasticsearch for their persistent storage back-end.
Storage back-ends are load balanced from within the analyzer
nodes using the traefik light-weight reverse proxy. This setup
gives skydive a fully fault-tolerant deployment.

Tests have been added to ensure the binary installation process
is validated. While these jobs are non-voting today, they'll be
iterated on and made passing in the subsequent PRs. All jobs are
following the selective pattern which allows these tools to be
gated in the mono-repo without impacting all other tools within
the environment.

Change-Id: Iaa1152566f2b615d67a33dc94ebdbebb1b492a9d
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-14 03:03:08 -06:00
Kevin Carter
a1d6ebe4d3 remove dynamic ns.enable generators
The ns.enabled generators will fail when running packetbeat with a limit.
These generators were dynamically enabling/disabling packetbeat features
based on things discovered in the environment; however, they were
attempting to be a little too fancy, especially when running packetbeat
in a non-osa cloud. The values for the services have been reset to the
provider defaults, and should the deployer want to configure these options
they can use config_template.

Change-Id: I36d7298ca5142e8b5f926ab5d59ab8283704b5af
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2019-01-10 16:00:37 -06:00
Kevin Carter
7491b6df8e Update the embedded-ansible-setup process to be configurable
This change allows the embedded ansible process to be configurable by
the end user.
  * Python requirements and ansible roles will all now be user
    configurable.
  * Setup is now a local only playbook. This playbook replaces the bash
    commands we were rerunning when the `bootstrap-embedded-ansible.sh`
    script was executed.
  * Embedded ansible version is now 2.7.5 as default.
  * Deprecation warnings have been resolved.
  * Tests impacted by this change have been updated.

Change-Id: I4303c44e249cda31457a4f05a681e298d225a8b7
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-04 11:46:19 -06:00
Kevin Carter
5586d8a80f Convert template setup to a role
This change reduces code throughout the playbooks thereby speeding up
the task execution.
  * A new role named `elastic_beat_setup` was created to
    facilitate template setup as needed.
  * Beats retention policies are now defined on the elastic-logstash
    nodes instead of on all target hosts. This method will speed-up
    deployments on massive installations while streamlining all deployments.
  * Kibana variable assumptions have been fixed. This will allow for
    deployments without Kibana to be accomplished.

Change-Id: I36343264042e81dfcb68bad0f6c3a503e525eceb
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2019-01-02 20:38:47 -06:00
Kevin Carter
9a896aa81a Fence options before casting to json
These options could be "undefined" which is an object and not json
serializable. This change ensures if an option is undefined it defaults
to an empty set which will allow the option to be json serialized.
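
The idea can be sketched in Python as follows. The `Undefined` class is a
hypothetical stand-in for a templating engine's undefined placeholder, and
the sketch assumes an empty list as the "empty" default; neither is taken
from the playbooks.

```python
import json


class Undefined:
    """Hypothetical stand-in for a template engine's 'undefined' object."""


UNDEFINED = Undefined()


def to_json_fenced(value, default=()):
    """Serialize value to JSON, substituting an empty list when undefined."""
    safe = list(default) if isinstance(value, Undefined) else value
    return json.dumps(safe)


if __name__ == "__main__":
    print(to_json_fenced(UNDEFINED))
    print(to_json_fenced({"a": 1}))
```

Passing `UNDEFINED` straight to `json.dumps` raises `TypeError`, which is
the failure the fence avoids.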

Change-Id: I1a81bafa441aa6400bfbec50d57e56df4d09bda3
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-12-20 17:38:18 -06:00
Kevin Carter
eb3bcb8daa Add local facts for template creations
Template creation can take a lot of time when dealing with large data
sets. This change makes it so template creation will only happen on a
greenfield installation or when upgrading. A simple rerun of the
playbooks will not trigger template creation, which will allow deployers
to change or modify deployments without having to
worry about extended runtimes due to template interactions.

Change-Id: Ia9b77277553fbdbe0444737f39ec3de75f07cc0f
2018-12-19 17:59:21 -06:00
Kevin Carter
ad91d5773e Extend auto change detection
This change makes it possible for a deployer to modify the set of
indexes and the weights associated with them. If modified, the local
facts will be automatically updated.

Change-Id: Iaea1f22d8aad2abdd02801dd9acad5f969b78d0e
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-12-19 16:15:28 -06:00
Zuul
f7552334ba Merge "Add missing prometheus port for ceph auto discover" 2018-12-19 16:25:22 +00:00
Michael Vollman
0ea548f979 Add missing prometheus port for ceph auto discover
When ceph prometheus metrics are auto discovered, the metricbeat config
should point to the ceph mgr prometheus port. Adding missing
brackets around metricset so the default is treated as an array.
Dropping ceph dir detection for prometheus auto discover and relying on
the port availability and inventory group only.

Change-Id: Iaba0fdece00414e17bc172f39e624374a9d273e8
2018-12-19 09:56:42 -05:00
Kevin Carter
e08c58dd15 modify fact gathering to use local facts
This change makes the retention gathering operation faster by storing
the retention values as "local facts". The local facts are then
referenced in templates, loading from the local fact file instead of
running repetitive queries, which are slow and make very large deployments
cumbersome. To make the retention policy fact gathering process smarter,
it will now automatically refresh if undefined or if the
elasticsearch cluster size changes. This ensures we're improving
speed of execution while also catering to the needs of deployers to
grow, or shrink, elasticsearch cluster sizes.
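
The refresh rule (reuse cached values unless they are missing or the cluster
size has changed) can be sketched roughly as below. The function name, the
JSON cache layout, and the `gather` callback are all assumptions made for
illustration and do not reflect how the playbooks store Ansible local facts.

```python
import json
from pathlib import Path


def load_retention_facts(fact_file, cluster_size, gather):
    """Return cached retention values, regathering only when needed.

    A refresh happens when the fact file is absent or when the cluster
    size recorded in it differs from the current cluster size.
    """
    cached = None
    if fact_file.exists():
        cached = json.loads(fact_file.read_text())
    if cached is None or cached.get("cluster_size") != cluster_size:
        # Slow path: run the expensive queries once and store the result.
        cached = {"cluster_size": cluster_size, "retention": gather()}
        fact_file.write_text(json.dumps(cached))
    return cached["retention"]
```

A caller would pass a `Path` to the fact file, the current number of
elasticsearch nodes, and a function that performs the slow queries.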

Documentation has been added regarding the new option and why it may be
of use to deployers.

Change-Id: I3936ee94461ac39fb8bc78dc2c873e6067552461
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-12-19 01:37:06 -06:00
Kevin Carter
3a69a1c43d Update elk_6x for 6.5.x
This change updates the roles / playbooks to begin using Elasticsearch
release 6.5.x. Core to this change is the conversion of the journalbeat
role from custom compiled go to a simple package install, which was made
possible by the folks at elastic within this release. Because of the
conversion, the "beats-community" playbook has been removed given it's now
empty.

A change to the bootstrap script was made allowing it to parse an OS id
with a "-" in it, like "opensuse-tumbleweed".
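
For illustration only, the sketch below parses an `ID` field that contains a
dash. It assumes the OS id comes from os-release style `KEY=value` text; the
bootstrap script itself is shell and its actual parsing is not shown here, so
the function name and approach are hypothetical.

```python
def parse_os_id(os_release_text):
    """Return the value of the ID field from os-release style text."""
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        # Match the ID key exactly so that ID_LIKE or VERSION_ID are ignored.
        if sep and key.strip() == "ID":
            return value.strip().strip("\"'")
    raise ValueError("no ID field found")


if __name__ == "__main__":
    sample = 'NAME="openSUSE Tumbleweed"\nID="opensuse-tumbleweed"\n'
    print(parse_os_id(sample))
```

Splitting only on the first `=` keeps the whole value intact, so a dash
inside the id needs no special handling.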

Change-Id: Ic9b80234d6a6ce876bff885f3223874602d55dd6
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-11-27 18:56:22 +00:00
Michael Vollman
3f70fc76d9 Add auto discover for ceph prometheus metrics
Add to elk_metrics_6x the capability to auto discover the ceph mgr
prometheus plugin metric port.  Pull ceph metrics from prometheus when
the plugin is enabled.

Change-Id: I530a99f42e396ba7b2cd2c1b3d587f528ef84242
2018-11-09 12:42:28 -05:00
Michael Vollman
bc180319ba Metricbeat role check for a ceph api to monitor
Before enabling the metricbeat ceph module, first check that the
restapi port is up and responding.
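
A minimal sketch of such a pre-flight check is shown below. It only tests
that a TCP connection can be opened, which is weaker than confirming the API
responds; the function name, host, and port are hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint; enable the module only when this is True.
    print(port_is_open("127.0.0.1", 5000))
```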

Change-Id: Ic795df02b93ca22c19fe67d6d2319889dc0f06a0
2018-11-08 15:50:50 +00:00
Bjoern Teipel
7c819e05dc Fixes related to installation on Trusty
As the role-based macro links point to an unreachable destination
and are not required, they will be removed.
Additionally, the timeouts are increased for API commands to ES, along
with minor changes around the upstart system manager.

Change-Id: I2572bce230af2fd43261c9b0bf903bfd9655959e
2018-10-22 18:12:42 +00:00
Michael Vollman
20dcf18064 Add vars to configure prometheus module
Add new vars prometheus_enabled and prometheus_config to be used to
configure the metricbeat prometheus module. The module will be used to
query prometheus exporters and write the metrics pulled from prometheus
to elasticsearch.

Change-Id: I6c52e28cc982a13472aa8f6d3098a22adcb7d3c1
2018-10-17 12:59:54 -04:00
Michael Vollman
b56dbc6b4f Fix missing logstash device error
The findmnt command prints the fsroot as [/dir] at the end
of the source device, which causes a parsing error
because it is assumed that only the device string is returned, leading
to an error looking up an invalid device.
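
One way to picture the fix is to drop a trailing bracketed suffix from the
source string before using it as a device name. This is a hedged sketch in
Python rather than the playbook's actual task; `strip_fsroot` and the sample
device strings are hypothetical.

```python
import re

# Matches a trailing "[...]" suffix such as "[/dir]".
_FSROOT_SUFFIX = re.compile(r"\[[^\]]*\]$")


def strip_fsroot(source):
    """Return the device part of a source string, without any [fsroot]."""
    return _FSROOT_SUFFIX.sub("", source.strip())


if __name__ == "__main__":
    print(strip_fsroot("/dev/sda1[/dir]"))
    print(strip_fsroot("/dev/sda1"))
```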

Change-Id: If95f8e0ed8154ad0277972159afac9f967b79c8f
2018-10-15 14:28:31 -04:00
Kevin Carter
d3b53d6f80 add lxc3 support
Change-Id: I0275461719bbfde31534a352809c50c3f05d7daf
Signed-off-by: Kevin Carter <kevin@cloudnull.com>
2018-10-11 15:02:22 -05:00
Zuul
10c4c0a2d0 Merge "[Trivial Fix] Replace Chinese punctuation with English punctuation" 2018-09-28 05:30:11 +00:00
Zuul
c05b21dd26 Merge "Refactor Filebeat configuration file" 2018-09-28 03:04:36 +00:00
inspurericzhang
3af2caebb8 [Trivial Fix] Replace Chinese punctuation with English punctuation
Curly quotes (Chinese punctuation) are usually entered through a Chinese
input method. When read in an English context, they cause confusion.

Change-Id: I1b34eef0913dc0cda1c58d27e8f53ffdcfc3aa22
2018-09-28 09:41:38 +08:00
Mohammed Naser
aa647953e0 Refactor Filebeat configuration file
- Avoid checking item by item; modules and prospectors are always
  enabled, with an opt-in option to disable them
- Updated MySQL and Apache modules to point to the right path
- Improved and cleaned up tagging
- All the prospectors are managed using a variable

Change-Id: I2a091669d6a77fd2c89a073cf9071292793e2f6b
2018-09-27 14:54:51 -04:00
Kevin Carter
4c86cb9be2
Add changes to the sysconfig defaults file
These changes mirror systemd tunables for elasticsearch and are needed
to ensure any OS without systemd (like Ubuntu 14.04) has the same
capabilities as OSes with systemd. This also adds a specific sysctl
file to use when making sysctl changes. This will ensure we're not
subjecting our deployment to other changes from other sources, like an
OSA playbook run.

Change-Id: Ic0e0bc0f93a12298c1e2f634cf5a1b4c6be2995e
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-09-27 09:44:09 -05:00
Zuul
b2c9cf4221 Merge "Add host metadata to core beats output" 2018-09-25 15:59:17 +00:00
Zuul
27ef77f601 Merge "Add tags for beats setup tasks" 2018-09-25 15:24:03 +00:00
Jonathan Rosser
1a48236ced Add host metadata to core beats output
This change adds fields such as host OS, version and platform to the
core beats output, giving extra query/filter capabilities.

Change-Id: Iff61bb4402eaa45b8f1c134a6a39cebe6613cbf3
2018-09-25 13:34:18 +01:00
Jonathan Rosser
ac46b2be6a Fix journalbeat installation for mixed environments
The previous code would terminate the play immediately if any hosts
in the environment did not have a journal directory. This change runs
the journalbeat install role selectively on hosts that have the journal
directory, and skips hosts that do not.

In addition, a legacy task to stop the play after uninstallation is removed,
as this functionality is currently broken.

Change-Id: I412e3594c4b2292caafafb580bb4ede9ccfd3944
2018-09-25 12:34:21 +01:00
Jonathan Rosser
8cf20bfea2 Add tags for beats setup tasks
Previously the beat setup tasks were tagged with 'setup' but the include
statements were not, so the tasks were always skipped when using '--tags
setup'. This change adds tags to the includes so that the tasks are executed
as expected.

Change-Id: If16069cd273d84a22b229b8140e5a8d56eed86d1
2018-09-25 12:30:22 +01:00