Kevin Carter 0d4a4a92c7
Converg the logstash pipelines and enhance memory backed queues
The multi-logstash pipeline setup, while amazingly fast, was crashing
and causing index errors when under high load for a long period of time.
Because of the crashing behavior and the fact that the folks from
Elastic describe multi-pipeline queues to be "beta" at this time the
logstash pipelines have been converted back into a single pipeline.

The memory backed queue options are now limited by a ram disk (tmpfs)
which will ensure that a burst within the queue does not cause OOM
issues and ensures a highly performant deployment and limiting memory
usage at the same time. Memory backed queues will be enabled when the
underlying system is using "rotational" media as detected by ansible
facts. This will ensure a fast and consistent experience across all
deployment types.

Pipeline/ml/template/dashboard setup has been added to the beat
configurations which will ensure beats are properly configured even
when running in an isolated deployment and outside of normal operations
where beats are generally configured on the first data node.

Change-Id: Ie3c775f98b14f71bcbed05db9cb1c5aa46d9c436
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-09-16 23:44:58 -05:00

466 lines
17 KiB
Django/Jinja

{% macro output_elasticsearch(host, data_hosts) -%}
#-------------------------- Elasticsearch output -------------------------------
output.elasticsearch:
# Boolean flag to enable or disable the output module.
enabled: true
# Array of hosts to connect to.
# Scheme and port can be left out and will be set to the default (http and 9200)
# In case you specify and additional path, the scheme is required: http://localhost:9200/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
hosts: {{ data_hosts | to_json }}
# Set gzip compression level.
compression_level: 3
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
#password: "changeme"
# Dictionary of HTTP parameters to pass within the url with index operations.
#parameters:
#param1: value1
#param2: value2
# Number of workers per Elasticsearch host.
worker: 1
# Optional index name. The default is "apm" plus date
# and generates [apm-]YYYY.MM.DD keys.
# In case you modify this pattern you must update setup.template.name and setup.template.pattern accordingly.
#index: "apm-%{[beat.version]}-%{+yyyy.MM.dd}"
# Optional ingest node pipeline. By default no pipeline will be used.
#pipeline: ""
# Optional HTTP Path
#path: "/elasticsearch"
# Custom HTTP headers to add to each request
#headers:
# X-My-Header: Contents of the header
# Proxy server url
#proxy_url: http://proxy:3128
# The number of times a particular Elasticsearch index operation is attempted. If
# the indexing operation doesn't succeed after this many retries, the events are
# dropped. The default is 3.
#max_retries: 3
# The maximum number of events to bulk in a single Elasticsearch bulk API index request.
# The default is 50.
#bulk_max_size: 50
# Configure http request timeout before failing an request to Elasticsearch.
#timeout: 90
# Use SSL settings for HTTPS. Default is true.
#ssl.enabled: true
# Configure SSL verification mode. If `none` is configured, all server hosts
# and certificates will be accepted. In this mode, SSL based connections are
# susceptible to man-in-the-middle attacks. Use only for testing. Default is
# `full`.
#ssl.verification_mode: full
# List of supported/valid TLS versions. By default all TLS versions 1.0 up to
# 1.2 are enabled.
#ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]
# SSL configuration. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# Optional passphrase for decrypting the Certificate Key.
#ssl.key_passphrase: ''
# Configure cipher suites to be used for SSL connections
#ssl.cipher_suites: []
# Configure curve types for ECDHE based cipher suites
#ssl.curve_types: []
# Configure what types of renegotiation are supported. Valid options are
# never, once, and freely. Default is never.
#ssl.renegotiation: never
{%- endmacro %}
{% macro output_logstash(host, data_hosts, processors, named_index) -%}
output.logstash:
# Boolean flag to enable or disable the output module.
enabled: true
# The Logstash hosts
hosts: {{ data_hosts | to_json }}
# Number of workers per Logstash host.
worker: 1
# Set gzip compression level.
compression_level: 3
# Optional maximum time to live for a connection to Logstash, after which the
# connection will be re-established. A value of `0s` (the default) will
# disable this feature.
#
# Not yet supported for async connections (i.e. with the "pipelining" option set)
#ttl: 30s
# Optional load balance the events between the Logstash hosts. Default is false.
loadbalance: true
# Number of batches to be sent asynchronously to logstash while processing
# new batches.
pipelining: 2
# If enabled only a subset of events in a batch of events is transferred per
# transaction. The number of events to be sent increases up to `bulk_max_size`
# if no error is encountered.
slow_start: true
# The maximum number of events to bulk in a single Logstash request. The
# default is the number of cores multiplied by the number of threads,
# the resultant is then multiplied again by 128 which results in a the defined
# bulk max size. If the Beat sends single events, the events are collected
# into batches. If the Beat publishes a large batch of events (larger than
# the value specified by bulk_max_size), the batch is split. Specifying a
# larger batch size can improve performance by lowering the overhead of
# sending events. However big batch sizes can also increase processing times,
# which might result in API errors, killed connections, timed-out publishing
# requests, and, ultimately, lower throughput. Setting bulk_max_size to values
# less than or equal to 0 disables the splitting of batches. When splitting
# is disabled, the queue decides on the number of events to be contained in a
# batch.
bulk_max_size: {{ (processors | int) * 128 }}
{% if named_index is defined %}
# Optional index name. The default index name is set to {{ named_index }}
# in all lowercase.
index: '{{ named_index }}'
{% endif %}
# SOCKS5 proxy server URL
#proxy_url: socks5://user:password@socks5-server:2233
# Resolve names locally when using a proxy server. Defaults to false.
#proxy_use_local_resolver: false
# Enable SSL support. SSL is automatically enabled, if any SSL setting is set.
#ssl.enabled: true
# Configure SSL verification mode. If `none` is configured, all server hosts
# and certificates will be accepted. In this mode, SSL based connections are
# susceptible to man-in-the-middle attacks. Use only for testing. Default is
# `full`.
#ssl.verification_mode: full
# List of supported/valid TLS versions. By default all TLS versions 1.0 up to
# 1.2 are enabled.
#ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]
# Optional SSL configuration options. SSL is off by default.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# Optional passphrase for decrypting the Certificate Key.
#ssl.key_passphrase: ''
# Configure cipher suites to be used for SSL connections
#ssl.cipher_suites: []
# Configure curve types for ECDHE based cipher suites
#ssl.curve_types: []
# Configure what types of renegotiation are supported. Valid options are
# never, once, and freely. Default is never.
#ssl.renegotiation: never
{%- endmacro %}
{% macro setup_dashboards(beat_name) -%}
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards are disabled by default and can be enabled either by setting the
# options here, or by using the `-setup` CLI flag or the `setup` command.
setup.dashboards.enabled: false
# The directory from where to read the dashboards. The default is the `kibana`
# folder in the home path.
#setup.dashboards.directory: ${path.home}/kibana
# The URL from where to download the dashboards archive. It is used instead of
# the directory if it has a value.
#setup.dashboards.url:
# The file archive (zip file) from where to read the dashboards. It is used instead
# of the directory when it has a value.
#setup.dashboards.file:
# In case the archive contains the dashboards from multiple Beats, this lets you
# select which one to load. You can load all the dashboards in the archive by
# setting this to the empty string.
#setup.dashboards.beat: {{ beat_name }}
# The name of the Kibana index to use for setting the configuration. Default is ".kibana"
#setup.dashboards.kibana_index: .kibana
# The Elasticsearch index name. This overwrites the index name defined in the
# dashboards and index pattern. Example: testbeat-*
#setup.dashboards.index:
# Always use the Kibana API for loading the dashboards instead of autodetecting
# how to install the dashboards by first querying Elasticsearch.
#setup.dashboards.always_kibana: false
{%- endmacro %}
{% macro setup_template(beat_name, host, data_nodes, elasticsearch_replicas) -%}
# A template is used to set the mapping in Elasticsearch
# By default template loading is enabled and the template is loaded.
# These settings can be adjusted to load your own template or overwrite existing ones.
# Set to false to disable template loading.
setup.template.enabled: {{ host == data_nodes[0] }}
# Template name. By default the template name is "{{ beat_name }}-%{[beat.version]}"
# The template name and pattern has to be set in case the elasticsearch index pattern is modified.
setup.template.name: "{{ beat_name }}-%{[beat.version]}"
# Template pattern. By default the template pattern is "-%{[beat.version]}-*" to apply to the default index settings.
# The first part is the version of the beat and then -* is used to match all daily indices.
# The template name and pattern has to be set in case the elasticsearch index pattern is modified.
setup.template.pattern: "{{ beat_name }}-%{[beat.version]}-*"
# Path to fields.yml file to generate the template
setup.template.fields: "${path.config}/fields.yml"
# Overwrite existing template
setup.template.overwrite: {{ host == data_nodes[0] }}
{% set shards = ((data_nodes | length) * 3) | int %}
# Elasticsearch template settings
setup.template.settings:
# A dictionary of settings to place into the settings.index dictionary
# of the Elasticsearch template. For more details, please check
# https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
index:
number_of_shards: {{ shards }}
codec: best_compression
# This provides for an index split of up to 2 times the number of available shards
number_of_routing_shards: {{ (shards | int) * 2 }}
# The default number of replicas will be based on the number of data nodes
# within the environment with a limit of 2 replicas.
number_of_replicas: {{ elasticsearch_replicas | int }}
# A dictionary of settings for the _source field. For more details, please check
# https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html
_source:
enabled: true
{%- endmacro %}
{% macro setup_kibana(host) -%}
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
# Kibana Host
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify and additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
host: "{{ host }}"
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
#password: "changeme"
# Optional HTTP Path
#path: ""
# Use SSL settings for HTTPS. Default is true.
#ssl.enabled: true
# Configure SSL verification mode. If `none` is configured, all server hosts
# and certificates will be accepted. In this mode, SSL based connections are
# susceptible to man-in-the-middle attacks. Use only for testing. Default is
# `full`.
#ssl.verification_mode: full
# List of supported/valid TLS versions. By default all TLS versions 1.0 up to
# 1.2 are enabled.
#ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]
# SSL configuration. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# Optional passphrase for decrypting the Certificate Key.
#ssl.key_passphrase: ''
# Configure cipher suites to be used for SSL connections
#ssl.cipher_suites: []
# Configure curve types for ECDHE based cipher suites
#ssl.curve_types: []
{%- endmacro %}
{% macro beat_logging(beat_name) -%}
# There are three options for the log output: syslog, file, stderr.
# Under Windows systems, the log files are per default sent to the file output,
# under all other system per default to syslog.
# Sets log level. The default log level is info.
# Available log levels are: critical, error, warning, info, debug
#logging.level: info
# Enable debug output for selected components. To enable all selectors use ["*"]
# Other available selectors are "beat", "publish", "service"
# Multiple selectors can be chained.
#logging.selectors: [ ]
# Send all logging output to syslog. The default is false.
#logging.to_syslog: true
# If enabled, apm-server periodically logs its internal metrics that have changed
# in the last period. For each metric that changed, the delta from the value at
# the beginning of the period is logged. Also, the total values for
# all non-zero internal metrics are logged on shutdown. The default is true.
#logging.metrics.enabled: true
# The period after which to log the internal metrics. The default is 30s.
#logging.metrics.period: 30s
# Logging to rotating files. Set logging.to_files to false to disable logging to
# files.
logging.to_files: true
logging.files:
# Configure the path where the logs are written. The default is the logs directory
# under the home path (the binary location).
path: /var/log/beats
# The name of the files where the logs are written to.
name: {{ beat_name }}.log
# Configure log file size limit. If limit is reached, log file will be
# automatically rotated
#rotateeverybytes: 10485760 # = 10MB
# Number of rotated log files to keep. Oldest files will be deleted first.
keepfiles: 2
# The permissions mask to apply when rotating log files. The default value is 0600.
# Must be a valid Unix-style file permissions mask expressed in octal notation.
#permissions: 0600
# Set to true to log messages in json format.
#logging.json: false
{%- endmacro %}
{% macro xpack_monitoring_elasticsearch(host, data_hosts, processors) -%}
# metricbeat can export internal metrics to a central Elasticsearch monitoring cluster.
# This requires xpack monitoring to be enabled in Elasticsearch.
# The reporting is disabled by default.
# Set to true to enable the monitoring reporter.
xpack.monitoring.enabled: true
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well. Any setting that is not set is
# automatically inherited from the Elasticsearch output configuration, so if you
# have the Elasticsearch output configured, you can simply uncomment the
# following line, and leave the rest commented out.
xpack.monitoring.elasticsearch:
# Array of hosts to connect to.
# Scheme and port can be left out and will be set to the default (http and 9200)
# In case you specify and additional path, the scheme is required: http://localhost:9200/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
hosts: {{ data_hosts | to_json }}
# Set gzip compression level.
compression_level: 9
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "beats_system"
#password: "changeme"
# Dictionary of HTTP parameters to pass within the url with index operations.
#parameters:
#param1: value1
#param2: value2
# Custom HTTP headers to add to each request
headers:
X-Node-Name: {{ host }}
# Proxy server url
#proxy_url: http://proxy:3128
# The number of times a particular Elasticsearch index operation is attempted. If
# the indexing operation doesn't succeed after this many retries, the events are
# dropped. The default is 3.
max_retries: 5
# The maximum number of events to bulk in a single Elasticsearch bulk API index request.
# The default is 50.
bulk_max_size: {{ (processors | int) * 64 }}
# Configure http request timeout before failing an request to Elasticsearch.
timeout: 120
# Use SSL settings for HTTPS.
#ssl.enabled: true
# Configure SSL verification mode. If `none` is configured, all server hosts
# and certificates will be accepted. In this mode, SSL based connections are
# susceptible to man-in-the-middle attacks. Use only for testing. Default is
# `full`.
#ssl.verification_mode: full
# List of supported/valid TLS versions. By default all TLS versions 1.0 up to
# 1.2 are enabled.
#ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]
# SSL configuration. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# Optional passphrase for decrypting the Certificate Key.
#ssl.key_passphrase: ''
# Configure cipher suites to be used for SSL connections
#ssl.cipher_suites: []
# Configure curve types for ECDHE based cipher suites
#ssl.curve_types: []
# Configure what types of renegotiation are supported. Valid options are
# never, once, and freely. Default is never.
#ssl.renegotiation: never
{%- endmacro %}