Unrestrict octal values rule since benefits of file modes readability
exceed possible issues with yaml 1.2 adoption in future k8s versions.
These issues will be addressed when/if they occur.
Also ensure osh-infra is a required project for lint job, that matters
when running job against another project.
Change-Id: Ic5e327cf40c4b09c90738baff56419a6cef132da
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
This patchset is required for the patch set https://review.opendev.org/#/c/737629.
The kuberntes python api requires these permissions, for this script to work properly.
Change-Id: I69f2ca40ab6068295a4cb2d85073183ca348af1e
This commit rewrites lint job to make template linting available.
Currently yamllint is run in warning mode against all templates
rendered with default values. Duplicates detected and issues will be
addressed in subsequent commits.
Also all y*ml files are added for linting and corresponding code changes
are made. For non-templates warning rules are disabled to improve
readability. Chart and requirements yamls are also modified in the name
of consistency.
Change-Id: Ife6727c5721a00c65902340d95b7edb0a9c77365
The current copyright refers to a non-existent group
"openstack helm authors" with often out-of-date references that
are confusing when adding a new file to the repo.
This change removes all references to this copyright by the
non-existent group and any blank lines underneath.
Change-Id: I1882738cf9757c5350a8533876fd37b5920b5235
This allows the ability to set the domain name in the
Nagios deployment. This change goes along with a change
to imageswhich will allow the ability to append the
domain name to the host name in Nagios so the full
FQDN appears in the dashboard.
Change-Id: I512112921111e49345f19dfca70406b56dd55452
This change updates the prometheus alerting rules to use ranged vectors
in their expressions, to avoid situations wher missed scrapes would
cause scalar metrics to "go stale" - resetting the alert timer.
Only the ceph alerts are affected by this change.
Change-Id: Ib47866d12616aaa808e6a09c58aa4352e338a152
Co-Authored-By: Meghan Heisler <mkheisler93@gmail.com>
This change addresses the results that were found when running
bandit against the templated python files in the various charts.
This also makes the bandit gate only run when python template
files are changed as well as makes the job voting.
Change-Id: Ia158f5f9d6d791872568dafe8bce69575fece5aa
This patch set updates and tests the apiVersion for rbac.authorization.k8s.io
from v1beta1 to v1 in preparation for its removal in k8s 1.20.
Change-Id: I4e68db1f75ff72eee55ecec93bd59c68c179c627
Signed-off-by: Tin Lam <tin@irrational.io>
I noticed a some nagios service checks were checking prometheus
alerts which did not exist in our default prometheus configuration.
In one case a prometheus alert did not match the naming convention
of similar alerts.
One nagios service check, ceph_monitor_clock_skew_high, does not
have a corresponding alert at all, so I've changed it to check the
node_ntmp_clock_skew_high
alert, where a node has the label ceph-mon="enabled".
Change-Id: I2ebf9a4954190b8e2caefc8a61270e28bf24d9fa
This disables the Nagios page tours option. This option is enabled
by default, which results in a youtube video being overlaid on
each Nagios page.
Change-Id: Ifd80a8d122dcbe145315b37753a72e1309e1d210
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This adds support for arbitrary object definitions via the conf
key in the Nagios chart. This allows for customizing the
definitions required by different deployment targets instead of
assuming all nagios deployments are monitoring and targeting the
same hosts and executing the same service checks and commands.
This also adds reference overrides to the chart for elasticsearch,
postgresql, and openstack nagios objects that are deployed in the
single and multinode jobs here
Change-Id: I6475ca980447591b5b691220eb841a2ab958e854
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates charts that consume images built from osh-images to
use tags other than the :latest tags. This will be followed up
with the definition of jobs to allow for vetting out of updated
images, as reliance on :latest tags assumes any change merged into
osh-images will result in functionally correct behavior (which has
shown to not be the case traditionally)
Change-Id: I181aa56ed187604dc7583d8081e53cc69eb27310
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the ceph health check command in Nagios to use the
updated plugin that determines the active ceph-mgr instance
endpoint to use before querying for ceph's health. This results in
more robust and reliable reporting of ceph's overall health
Depends-On: https://review.opendev.org/#/c/693900/
Change-Id: I5eeb076e5af3c820dbdcc3cc321cefcb5f85ef8d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
It was observed in some charts' values.yaml that the values defining
lifecycle upgrade parameters were incorrectly placed.
This change aims to correct these instances by adding a deployment-
type subkey corresponding with the deployment types identified in
the chart's templates dir, and indenting the values appropriately.
Change-Id: Id5437b1eeaf6e71472520f1fee91028c9b6bfdd3
This change updates the selenium_tests container image
to one which installs python3.
The selenium-test.py template file has been refactored
to match the structure of the selenium tests in
/tools/gate/selenium
Depends On: https://review.opendev.org/688436
Change-Id: I49e0cfd05f27f868864a98e8e68ffe79e28c0f03
This updates the grafana and nagios helm test pod templates to
use the internal endpoints for their selenium tests instead of the
public endpoints when defined
Change-Id: I1138cb29a808894d3339bc1b07c3a60804b9546f
Signed-off-by: Steve Wilkerson <sw5822@att.com>
Added new X-Content-Type-Options: nosniff header to make sure the browser
does not try to detect a different Content-Type than what is actually
sent (can lead to XSS)
Added new X-Frame-Options: sameorigin header to protect against
drag and drop clickjacking attacks in older browsers
Added new Content-Security-Policy: script-src self for implementation
Added new HTTP Security header X-XSS-Protection:1 mode=block to
sanitize the page, when a XSS attack is detected, the browser will
prevent rendering of the page
Change-Id: Ic79bbb96484a7f1a497c001883783338fd26a47a
Earlier the Nagios alert monitor was percent based as in when the percent of OSD
down is greater than 80, it will send alert.
>check_prom_alert!ceph_osd_down_pct_high!CRITICAL- CEPH OSDs down is
more than 80 percent!OK- CEPH OSDs down is less than 80 percent
Updated the code in nagios values.yaml to send alert when even 1 OSD is
down:
>check_prom_alert!ceph_osd_down!CRITICAL- One or more CEPH OSDs are down
>for more than 5 minutes!OK- All the CEPH OSDs are up
Change-Id: Id24c4a0cca64674890dae3599edc0c90d9534e90
This updates the Nagios chart to include an init container for
generating the host and host group definitions Nagios requires to
function. The benefit is that Nagios does not need to constantly
attempt to update its host and host group definitions, which
currently triggers a restart of the Nagios service even in cases
where the host file hasn't changed. With the introduction of an
init container for handling this, we can also remove the service
check definition and command definition for executing the plugin
at periodic intervals
Depends-On: https://review.opendev.org/668197
Change-Id: Id1d63d8c99850b960eb352361d7796162bd6be2f
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the Nagios image used to the image that is built
out of openstack-helm-images instead of the image hosted in quay.
This new image includes the updated host definition plugin that
uses the kubernetes python client instead of prometheus queries,
so the check_prometheus_hosts command has also been updated to
reflect the change in required arguments
Change-Id: If3440ca9be3227fc48cd698a7d44501e6747bb1e
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This adds the affinity key to the pod spec for the grafana,
nagios, kube-state-metrics, and openstack-exporter charts as it
was previously missed
Change-Id: Ifefa88d7f33607b4d595effa5fbf72f3387e5081
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This adds selenium tests for the Nagios chart via a helm test
pod to help ensure the Nagios deployment is functional and
accessible
Change-Id: I44f30fbac274546abadba0290de029ed2b9d1958
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the Nagios chart to use the helm-toolkit template
renderer snippet for generating the Nagios configuration files.
This was done to make the exposure of the configuration files
simpler for those who are more familiar with traditional Nagios
configuration files, as well as allowing for values overrides for
adding custom host names or custom object definitions to nagios
objects (as Nagios doesn't easily allow for this via environment
accessible macros).
Change-Id: I84d5c83d84d6438af5f3ab57997e80e8b1fc8312
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This adds a basic egress policy to the charts run by the
network-policy check. A change was recently merged requiring
the eggress tag to be in the chart but did not add it, this
addresses that
Change-Id: I60669c9351db7854cba8c69723eb783a966d2a56
This PS adds emptydirs backing the /tmp directory in pods, which
is required in most cases for full operation when using a read only
filesystem backing the container.
Additionally some yaml indent issues are resolved.
Change-Id: I8b7f1614da059783254aa6efc09facf23fca3cad
Signed-off-by: Pete Birley <pete@port.direct>
This reverts commit e20242fbdb.
removing readOnlyRootFilesystem flag since pods are running to "crashLoopBackOff" state by implementing HTK functionality
when we have set the readOnly flag at pod without HTK functionality the changes were not effected. That is why it passed the gate.
Change-Id: I6027be601b4241b26b0fbc3c70c886714dac4a48
This adds the release-annotation to the pod spec for the charts in
openstack-helm-infra. This also adds missing configmap annotations
to charts in openstack-helm-infra
Change-Id: Ie23f0c16a7a21d3929e98928db2bbcef69ae6490
This updates the ingress controller image to v0.23.0, which was
required to add support for configuring cookie max age and expires
for ingresses via annotations on the ingress.
This also removes the --enable-dynamic-configuration flag, as the
flag is no longer valid in 0.23.0 due to the functionality being
a default behavior of the nginx ingress controller in recent
releases
Change-Id: I4917797c43ec973ed0bb311fc305b01f10abd4e5
This updates the logging format and configuration for the apache
reverse proxies used for elasticsearch, kibana, nagios and
prometheus to enable logging of the remote clients used to access
these services
Change-Id: Id07e4294ea18203fbb890b78424a232c2d59cb82