74 Commits

Author SHA1 Message Date
Zuul
7cff0c4d7f Merge "ceph-manage-journal: add support for mpath device" 2022-05-24 15:58:07 +00:00
Jackie Huang
f00e55b736 ceph-manage-journal: add support for mpath device
* Add the missing 's' to fix the format string error:

  File "/usr/sbin/ceph-manage-journal", line 200, in mount_data_partition
    print("Failed to mount %(node)s to %(path), aborting" % params)
ValueError: unsupported format character ',' (0x2c) at index 35

* Add a function to find mpath node in /dev/mapper
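
The format error in the traceback above can be reproduced in isolation. This is a minimal sketch, not the ceph-manage-journal code: the `params` values and the `render` helper are hypothetical, and only the two format strings mirror the message.

```python
# Hypothetical values; only the format strings mirror the traceback above.
params = {"node": "/dev/sda1", "path": "/var/lib/ceph/osd/ceph-0"}

broken = "Failed to mount %(node)s to %(path), aborting"
fixed = "Failed to mount %(node)s to %(path)s, aborting"


def render(template, values):
    """Return the formatted message, or the ValueError text if formatting fails."""
    try:
        return template % values
    except ValueError as exc:
        return "ValueError: %s" % exc


broken_result = render(broken, params)  # "%(path)," has no conversion type
fixed_result = render(fixed, params)    # "%(path)s" formats as a string
print(broken_result)
print(fixed_result)
```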

Test Plan:

PASS: AIO-SX with Ceph, 1 osd
PASS: AIO-SX with Ceph, 2 osd
PASS: AIO-SX with Ceph, 4 osd

Story: 2010046
Task: 45427

Signed-off-by: Jackie Huang <jackie.huang@windriver.com>
Signed-off-by: Thiago Miranda <ThiagoOliveira.Miranda@windriver.com>
Change-Id: I08f1f226343bf0140abb1ec8825533abb3f57e43
2022-05-24 12:40:41 +00:00
Andrei Suciu
0b3bdc6f66 Debian: replace ceph workarounds
Description:
- replace library path
- change call for getting stack trace
- update ownership for /var/lib/ceph
- remove ceph user creation

Test Plan:
PASSED: build packages and image /Debian
PASSED: bootstrap and unlock /Debian
PASSED: checked for failed processes /Debian
PASSED: check system application-list /Debian
PASSED: checked for ceph alarms /Debian
PASSED: checked ceph status, puppet logs /Debian
PASSED: checked ceph and system application-list status after
unlock /CentOS

Story: 2009965
Task: 45438

Change-Id: If864d288e5b63928f18a5b31551b4cd479b00fe8
2022-05-24 12:31:47 +00:00
Dan Voiculeasa
5bcfd552de debian: Fix ceph lsb script
This work is part of the Debian integration effort.
It only affects Debian, but can be ported to CentOS without
issues.

The issue prevents the Maintenance check for
/etc/services.d/controller/ceph.sh from completing successfully after
unlock, which results in a reboot.

Debian uses /lib/lsb/init-functions vs CentOS /etc/init.d/functions.
init-functions calls hooks from /lib/lsb/init-functions.d/.
One of the hooks redirects the lsb script call to a systemctl call.
Systemctl calls for the ceph service don't work on CentOS or Debian.
The script does not source /etc/init.d/functions, so it does not need
/lib/lsb/init-functions either.

Based on the reasoning above, drop the sourcing of /lib/lsb/init-functions.

Tests on AIO-SX:
CentOS: not affected, skip
Debian:
PASS: live patch controller, unlock, no unwanted reboot initiated by
Maintenance
PASS: build-pkgs, extract contents and check /etc/init.d/ceph

Story: 2009101
Task: 44791
Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Change-Id: I49b79e78b0f832096dca98ca2cfd68c454679b95
2022-03-16 15:42:51 +02:00
Yue Tao
4a709349a9 meta_data.yaml: add sha256sum checksum
Test Plan:
Pass: Verify sha256sum checksum via "download -s"

Story: 2008846
Task: 44578

Signed-off-by: Yue Tao <Yue.Tao@windriver.com>
Change-Id: I78d9dff2af0afb18c6db4e8d2d39ef79b5cf5864
2022-03-03 14:30:40 +08:00
Leonardo Fagundes Luz Serrano
83065c5298 Add debian package for Ceph
Add debian packaging infrastructure for
integ/ceph to build a debian package.

Test Plan: build-pkg; build-image; same contents as RPM

PASS build-pkg
PASS build-image
PASS same contents and permissions as RPM

Attention:

In order to avoid memory issues during the build,
please do one of the following:

- Developers with only 32G RAM will need to
temporarily unmount /var/lib/sbuild/build
so that the build system uses the disk instead of tmpfs

OR

- update /etc/fstab to set the size for
the sbuild tmpfs filesystem in the pkgbuilder container:

tmpfs /var/lib/sbuild/build tmpfs uid=sbuild,gid=sbuild,mode=2770,size=40G 0 0

Note:
Build times can be long. To speed up the build,
adjust the values of MINIKUBECPUS/MINIKUBEMEMORY
in the import-stx file (tools repo) before building
the containers with stx-init-env.

Depends-On: https://review.opendev.org/c/starlingx/tools/+/827884

Story: 2009101
Task: 44304

Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com>
Change-Id: Idc8ee1ebac5c973622c1c599f4a04c001bfa89a6
2022-02-11 17:19:41 +00:00
Zuul
459541141c Merge "Enable generation of Ceph's Python 3 packages" 2022-01-21 00:26:05 +00:00
Felipe Sanches Zanoni
94b8a78799 Ceph build failure
Ceph build failure after nspr library update.

This library is used only for library tests.
To fix this and prevent it from happening again, the tests
are no longer compiled.

Test Plan:
    PASS: Compile master branch without build-avoidance and
          verify it finishes with no errors.

Closes-Bug: 1958560
Signed-off-by: Felipe Sanches Zanoni <Felipe.SanchesZanoni@windriver.com>
Change-Id: I74046f1e76b242655f86c71354248f1bcb9ff76a
2022-01-20 14:52:12 -05:00
Delfino Curado
563c59599d Enable generation of Ceph's Python 3 packages
Changed ceph.spec to enable the generation of python 3 packages.
It's important to highlight that the python 2 packages will continue
to be generated and they are the ones used on StarlingX installation.

The python 3 packages will initially be used only on stx-base-image.

There is also a clean up on centos_tarball-dl.lst of commented lines
of ceph submodules that were updated.

Test plan:
Complete build run
Starlingx installation
stx-openstack apply - check that the helm chart can create ceph pools

Depends-On: https://review.opendev.org/c/starlingx/tools/+/824575
Story: 2009074
Task: 44281

Signed-off-by: Delfino Curado <delfinogomes.curadofilho@windriver.com>
Change-Id: I52dac30849a7072b80cad388b16d2b50ea22391a
2022-01-13 11:01:05 -05:00
Felipe Sanches Zanoni
b0b59243b2 Ceph mgr-restful-plugin has new server_port config location
Ceph mgr-restful-plugin was running ceph-mgr on port 8003 instead of
port 7999.

The problem was that mgr-restful-plugin was configuring the server
port at mgr/restful/server_port key in Mimic.
This key has changed to config/mgr/mgr/restful/server_port in
Nautilus.
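
The key move can be pictured as a per-release lookup. This is a sketch only: the two key names come from the message above, while the `SERVER_PORT_KEYS` table and the `server_port_key` helper are hypothetical and not part of mgr-restful-plugin.

```python
# Key names are taken from the commit message; the helper itself is hypothetical.
SERVER_PORT_KEYS = {
    "mimic": "mgr/restful/server_port",
    "nautilus": "config/mgr/mgr/restful/server_port",
}


def server_port_key(release):
    """Return the config key holding the restful server port for a release."""
    try:
        return SERVER_PORT_KEYS[release.lower()]
    except KeyError:
        raise ValueError("unknown Ceph release: %s" % release)


nautilus_key = server_port_key("Nautilus")
print(nautilus_key)
```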

Test Plan:
 - Tested on AIO-SX using netstat to check the port and curl to get
data using port 7999.

Story: 2009074
Task: 44160

Signed-off-by: Felipe Sanches Zanoni <Felipe.SanchesZanoni@windriver.com>
Change-Id: Ib534089bd30c5b1e2c7db98bbd2f495b1545f420
2021-12-09 21:23:36 +00:00
Felipe Sanches Zanoni
205b6e48b2 Fix mgr-restful-plugin not running correctly
After upgrading to ceph nautilus, the mgr-restful-plugin log shows a
message of command failure when running 'ceph config-key get
config/mgr/restful/controller-0/crt'.

This happens on both controllers and can lead to spotty access by
components that need REST API access.

Changing the path to the certificate from
'config/mgr/restful/controller-0/crt' to
'config/mgr/mgr/restful/controller-0/crt' and the path to the key from
'config/mgr/restful/controller-0/key' to
'config/mgr/mgr/restful/controller-0/key' fixed the problem.

Test plan:
 - Tested on AIO-DX

Story: 2009074
Task: 44100
Signed-off-by: Felipe Sanches Zanoni <Felipe.SanchesZanoni@windriver.com>
Change-Id: Ifb0d3c7b8b3669472ef3b579951b9850fdf4bbbc
2021-11-30 20:03:45 +00:00
Delfino Curado
a869978f09 Updating ceph build_srpm.data
Updating TIS_BASE_SRCREV to reflect the source rev of branch 14.2.22
of stx-ceph.

Updating TIS_PATCH_VER to account for the 43 previous packaging
changes that went in with Mimic.

Test plan:
 - Build ceph package and check the package name

Story: 2009074
Task: 44013

Signed-off-by: Delfino Curado <delfinogomes.curadofilho@windriver.com>
Change-Id: I6e51dedd62e851c4716bc27812a447d08694ed46
2021-11-19 17:45:59 -05:00
Delfino Curado
6db6fe5bbd Change ceph-mon configuration
Disable by default the warnings about monitors allowing
insecure global_id reclaim, and set
"auth allow insecure global id reclaim" to true by default on
all monitors. The main goal here is to allow a mixed set of
ceph versions.

A next step is to let the user, through service parameters,
mix in non-compliant ceph clients installed by other applications.

Gdisk was added again as it is necessary for StarlingX.

Test plan:

PASS: Build successfully
PASS: Install on AIO-SX, AIO-DX, Standard and Storage configs
successfully and without alarms (fm alarm-list) or ceph warnings
(ceph -s).
PASS: platform-integ-apps is applied successfully

Story: 2009074
Task: 43464

Signed-off-by: Delfino Curado <delfinogomes.curadofilho@windriver.com>
Change-Id: I5f3e432444b60ab73136431bb94bb6ab532ae0ab
2021-10-26 18:47:36 -04:00
Delfino Curado
0b038dae3c Add ceph-disk to build
This needs to be done because we want to keep compatibility with
puppet-ceph-2.4.1-1 and our current version of puppet.

As this version of puppet-ceph only uses ceph-disk we will keep it
until we are able to move on to ceph-volume. Probably this will be
possible when StarlingX is using version 3.1.1 of puppet-ceph.

Test plan:

PASS: Build successfully

Story: 2009074
Task: 43465

Signed-off-by: Delfino Curado <delfinogomes.curadofilho@windriver.com>
Change-Id: Ie9570f01728df28ee4ea357b1e618c5a4c0a3803
2021-10-26 17:11:56 -04:00
Delfino Curado
d92e321f71 Integrate ceph version 14 in StarlingX build
Add the upgraded submodules as dependencies in
centos_tarball-dl.lst file. It's important to highlight that dpdk is
added twice because seastar and SPDK depend on different versions of
dpdk.

    * boost_1_72_0.tar.bz2
    * c-ares-fd6124c74da0801f23f9d324559d8b66fb83f533.tar.gz
    * civetweb-bb99e93da00c3fe8c6b6a98520fb17cf64710ce7.tar.gz
    * dmclock-4496dbc6515db96e08660ac38883329c5009f3e9.tar.gz
    * dpdk-96fae0e24c9088d9690c38098b25646f861a664b.tar.gz
    * dpdk-a1774652fbbb1fe7c0ff392d5e66de60a0154df6.tar.gz
    * fmt-80021e25971e44bb6a6d187c0dac8a1823436d80.tar.gz
    * intel-ipsec-mb-134c90c912ea9376460e9d949bb1319a83a9d839.tar.gz
    * rocksdb-4c736f177851cbf9fb7a6790282306ffac5065f8.tar.gz
    * seastar-0cf6aa6b28d69210b271489c0778f226cde0f459.tar.gz
    * spawn-5f4742f647a5a33b9467f648a3968b3cd0a681ee.tar.gz
    * spdk-fd292c568f72187e172b98074d7ccab362dae348.tar.gz
    * zstd-b706286adbba780006a47ef92df0ad7a785666b6.tar.gz

Merged the changes of ceph 14 spec in this repo. For now python3
is disabled by default for StarlingX build but this will probably
change in the future. For python3 build to work, more dependencies
will be needed.

Test plan:

PASS: Build successfully

Depends-On: https://review.opendev.org/c/starlingx/tools/+/814591
Story: 2009074
Task: 42946
Signed-off-by: Delfino Curado <delfinogomes.curadofilho@windriver.com>
Change-Id: Iab9d0b57b00da4ba595d2b2f24194f058c850f5b
2021-10-26 17:09:26 -04:00
Charles Short
0acb956dce Fix python3 incompatibility
- socket requires bytes, so str must be explicitly converted to bytes.
- check_output() returns bytes on python3 but str on python2; passing
  universal_newlines=True makes it return str regardless of the python
  version.
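
Both points can be seen in a short standalone sketch. It assumes nothing about the patched scripts: the child command, the payload, and the socket pair are illustrative stand-ins.

```python
import socket
import subprocess
import sys

# Run a child Python process so the example does not depend on any ceph tooling.
cmd = [sys.executable, "-c", "print('HEALTH_OK')"]

raw = subprocess.check_output(cmd)                             # bytes on python3
text = subprocess.check_output(cmd, universal_newlines=True)   # str on python2 and python3

# Sockets only accept bytes on python3, so encode the str explicitly.
payload = "status".encode("utf-8")
left, right = socket.socketpair()
left.sendall(payload)
received = right.recv(16)
left.close()
right.close()
```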

Story: 2006796
Task: 42297

Signed-off-by: Charles Short <charles.short@windriver.com>
Change-Id: Ie3921c4ae6211a8b0d290bdbdb195ce07036afbc
(cherry picked from commit 26c16b3eb84297998f11fca9a2b92d2adaa60f0d)
2021-07-26 14:35:12 -04:00
Zuul
0f497f800e Merge "On AIO-DX only start Ceph MON and MDS via MTC" 2021-06-29 20:19:03 +00:00
Pedro Henrique Linhares
12d564b37d On AIO-DX only start Ceph MON and MDS via MTC
Defer start of Ceph OSDs to SM in order to avoid a race condition
between MTC and SM when starting OSDs. This is only required for AIO-DX
where SM manages the floating monitor and OSDs.

Closes-Bug: 1932351
Signed-off-by: Pedro Henrique Linhares <PedroHenriqueLinhares.Silva@windriver.com>
Change-Id: Ia718ae696d8158e63660ee54d226271a6bcb476e
2021-06-29 13:26:30 -04:00
Charles Short
3cec8b6ac9 Address python3 string issues with subprocess
This patch updates our Popen calls to enable universal newlines
for calls whose output we parse or consume.
Without universal_newlines=True, the output is treated as bytes
under python3, which leads to issues later where we use it as
strings.

See https://docs.python.org/3/glossary.html#term-universal-newlines

Story: 2006796
Task: 42696

Signed-off-by: Charles Short <charles.short@windriver.com>
Change-Id: I9b93907c05486b1f76aebe181af812c243285d6a
2021-06-25 12:19:10 -04:00
Mihnea Saracin
3225570530 Execute once the ceph services script on AIO
The MTC client manages ceph services via ceph.sh which
is installed on all node types in
/etc/services.d/{controller,worker,storage}/ceph.sh

Since the AIO controllers have both controller and worker
personalities, the MTC client will execute the ceph script
twice (/etc/services.d/worker/ceph.sh,
/etc/services.d/controller/ceph.sh), which causes issues.

We fix this by exiting the ceph script if it is the one from
/etc/services.d/worker on AIO systems.

Closes-Bug: 1928934
Change-Id: I3e4dc313cc3764f870b8f6c640a6033822639926
Signed-off-by: Mihnea Saracin <Mihnea.Saracin@windriver.com>
2021-05-20 18:08:47 +03:00
Robert Church
46d8d8fdf1 Add conditions to when RBD devices are unmounted
ceph-preshutdown.sh is called as a post operation when docker is
stopped/restarted. Based on current service dependencies, when docker is
restarted this will also trigger a restart of containerd.

Puppet manifests will restart containerd and docker for various
operations both on system boot and during runtime operations when their
configuration has changed.

This update adds conditions to ensure that the RBD devices are only
unmounted when the system is shutting down. This prevents RBD-backed
persistent volumes from being forcibly removed from running pods and
remounted read-only during these restart scenarios.

Change-Id: I7adfddf135debcc8bcaa1f93866e1a276b554c88
Closes-Bug: #1901449
Signed-off-by: Robert Church <robert.church@windriver.com>
2020-12-14 19:04:31 -05:00
Dongqi Chen
af359d4938 Add auto-versioning to starlingx/integ packages
This update makes use of the PKG_GITREVCOUNT variable
to auto-version the packages in this repo.

Story: 2007750
Task: 39951
Change-Id: I854419c922b9db4edbbf6f1e987a982ec2ec7b59
Signed-off-by: Dongqi Chen <chen.dq@neusoft.com>
2020-06-24 09:48:28 +08:00
Zuul
502e80c7fa Merge "Change ceph manager port" 2020-04-15 14:27:21 +00:00
Dan Voiculeasa
e7bbd7e7b1 Change ceph manager port
Free port 5001 to be used by keystone.

Story: 2007347
Task: 39392

Change-Id: Id789591bf22931494e970aaf3b12e9e5cbe223fa
Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>
2020-04-14 10:55:44 +03:00
Paul Vaduva
bed7388b67 Release FDs when stuck peering recovery
During stuck peering recovery, if file descriptors are
not released, the state machine does not advance to the
OPERATIONAL state.

Partial-bug: 1856064

Change-Id: I3fba7be661ebf223eac63608574323ad98d33b75
Signed-off-by: Paul Vaduva <Paul.Vaduva@windriver.com>
2020-03-11 08:11:51 -04:00
Dan Voiculeasa
11fd5d9cd4 ceph-init-wrapper: Detect stuck peering OSDs and restart them
OSDs might become stuck peering.
Recover from this state.

Closes-bug: 1851287

Change-Id: I2ef1a0e93d38c3d041ee0c5c1e66a4ac42785a68
Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>
2019-11-25 09:37:48 +00:00
Zuul
d51e846143 Merge "ceph: mgr-restful-plugin set ceph-mgr config file path" 2019-09-11 18:16:56 +00:00
Zuul
bc4877e5bb Merge "ceph: mgr restful plugin set certificate to match host name" 2019-09-11 16:35:05 +00:00
Daniel Badea
edc7f8495d ceph: mgr-restful-plugin set ceph-mgr config file path
Explicitly set ceph-mgr configuration file path to
/etc/ceph/ceph.conf to avoid surprises. ceph-mon
and ceph-osd are also started with '-c' (--conf)
pointing to /etc/ceph/ceph.conf.

Change-Id: I4915952f17b4d96a8fce3b4b96335693f9b6c76b
Closes-bug: 1843082
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-09-11 16:30:06 +00:00
Zuul
4b6a275e4f Merge "ceph-init-wrapper use flock instead of flag files" 2019-09-09 19:34:31 +00:00
Daniel Badea
fcaa49ecaf ceph: mgr restful plugin set certificate to match host name
python-cephclient certificate validation fails when connecting
to ceph-mgr restful plugin because server URL doesn't match
CommonName (CN) or SubjectAltName (SAN).

Setting CN to match server hostname fixes this issue but
raises a warning caused by missing SAN.

Using CN=ceph-restful and SAN=<hostname> fixes the issue
and clears the warning.

Change-Id: I6e8ca93c7b51546d134a6eb221c282961ba50afa
Closes-bug: 1828470
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-09-09 06:53:58 +00:00
Scott Little
062ec89dbb Relocated some packages to repo 'utilities'
List of relocated subdirectories:

ceph/ceph-manager
ceph/python-cephclient
filesystem/nfscheck
logging/logmgmt
security/tpm2-openssl-engine
security/wrs-ssl
tools/collector
tools/engtools/hostdata-collectors
utilities/build-info
utilities/namespace-utils
utilities/pci-irq-affinity-agent
utilities/platform-util
utilities/tis-extensions
utilities/update-motd

Story: 2006166
Task: 35687
Depends-On: I665dc7fabbfffc798ad57843eb74dca16e7647a3
Change-Id: I2bf543a235507a4eff644a7feabd646a99d1474f
Signed-off-by: Scott Little <scott.little@windriver.com>
Depends-On: I85dda6d09028f57c1fb0f96e4bcd73ab9b9550be
Signed-off-by: Scott Little <scott.little@windriver.com>
2019-09-05 20:31:36 -04:00
Daniel Badea
9faad45703 ceph-init-wrapper use flock instead of flag files
When a swact occurs and ceph-init-wrapper is slow to respond
to a status request, it gets killed by SM. This means the
corresponding flag file that marks status in progress is left
behind.

When the controller swacts back, ceph-init-wrapper sees a status
request in progress and waits for it to finish (with a timeout).
Because it does not respond fast enough, SM tries to start
ceph-init-wrapper again to get the ceph-mon service up and running.

This happens a couple of times until the service is declared
failed and controller swacts back.

To fix this we need to use flock instead of flag files, as the
locks are automatically released by the OS when the process
is killed.
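
The locking behaviour described above can be sketched with Python's `fcntl.flock`. This illustrates the mechanism only and is not the wrapper's code: the lock file path and the `try_lock` helper are hypothetical.

```python
import fcntl
import os
import tempfile

# Hypothetical lock file in a scratch directory.
lock_path = os.path.join(tempfile.mkdtemp(), "ceph-status.lock")


def try_lock(path):
    """Open path and try to take an exclusive, non-blocking flock.

    Returns the open file object on success, or None if the lock is held elsewhere.
    """
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


holder = try_lock(lock_path)      # first caller gets the lock
contender = try_lock(lock_path)   # fails: the lock is held via another open file
holder.close()                    # closing the file (or the process dying) drops the lock
retry = try_lock(lock_path)       # succeeds now, nothing stale is left behind
got_lock_after_release = retry is not None
retry.close()
```

Unlike a flag file, nothing has to be cleaned up after a kill: the kernel drops the lock when the last descriptor for that open file goes away.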

Change-Id: If1912e8575258a4f79321d8435c8ae1b96b78b98
Closes-bug: 1840176
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-08-27 14:53:32 +00:00
Zuul
7a8add1636 Merge "ceph: mgr-restful-plugin restarts on controller unlock" 2019-08-01 17:45:58 +00:00
Daniel Badea
d409d78cc1 ceph: mgr-restful-plugin restarts on controller unlock
When standby controller is unlocked its mgr-restful-plugin
service starts and generates node specific self-signed
certificates to be used by the restful plugin. This operation
triggers a restart of the "active" mgr restful plugin
which in turn causes Ceph REST API requests to fail.

This failure is handled on the active controller by
restarting the service. This happens while stx-openstack
is reapplied and is the reason why mariadb pod fails to start.

Change ceph-mgr and restful plugin config and startup
procedure so a secondary ceph-mgr service doesn't disrupt
the active one.

Closes-Bug: 1837581
Change-Id: Id8e5e56d48669498202ed319a9aad68365b51f23
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-07-31 13:47:19 +00:00
Stefan Dinescu
12f604b4dd Change ceph-init-wrapper wait logic
The stop, start and restart commands wait for any status
commands to finish before attempting the actual command.

This would cause issues, as some commands that are related to OSDs
only would wait for monitor status and vice versa.

Depending on the number of OSDs, the osd status command could take
too much time to finish, causing a "stop mon" command to
wait just as long, even though it didn't need to.

Changes in this commit:
- commands related to OSDs and monitors have their own wait times
  and separate flag files
- add improved logging to better see if the script is waiting
  for a certain function to finish

Change-Id: Ia03981b2b49f999e8a96aa12361209a418da4c50
Closes-bug: 1836075
Depends-On: I3ace73650e4fe9aafc84c82e2ffe048f2039305e
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
2019-07-31 11:34:07 +03:00
Zuul
a89dcc262f Merge "python-cephclient: populate items list for all nodes except osd" 2019-07-26 15:38:10 +00:00
Daniel Badea
03636c6fcb python-cephclient: populate items list for all nodes except osd
cephclient wrapper is converting a flat list of dictionaries returned
by Ceph Mimic's osd_crush_tree() to nested dictionaries (actual tree)
as expected by sysinv. While doing this it looks at the "children"
attribute and if there's none then it skips populating the current
node's "items".

For storage nodes that don't have any attached OSDs the corresponding
tree entry will not have an "items" attribute. When sysinv tries to
get OSDs by storage it tries to access it and crashes.

Fix by creating empty "items" attribute unless node type is "osd".
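
The conversion and the fix can be sketched as follows. This is an illustration under assumptions, not the python-cephclient code: the `build_tree` helper and the sample nodes are hypothetical, and the field names ("id", "type", "children") are assumed to match the flat output.

```python
def build_tree(flat_nodes):
    """Nest a flat list of CRUSH nodes by resolving 'children' ids into 'items'."""
    by_id = {node["id"]: dict(node) for node in flat_nodes}
    child_ids = set()
    for node in by_id.values():
        children = node.pop("children", [])
        child_ids.update(children)
        if node["type"] != "osd":
            # Always create 'items', even when the node has no children.
            node["items"] = [by_id[child] for child in children]
    # Roots are the nodes that no other node lists as a child.
    return [node for node_id, node in by_id.items() if node_id not in child_ids]


# Hypothetical sample: storage-1 has no OSDs attached.
flat = [
    {"id": -1, "name": "storage-tier", "type": "root", "children": [-2, -3]},
    {"id": -2, "name": "storage-0", "type": "host", "children": [0]},
    {"id": -3, "name": "storage-1", "type": "host"},
    {"id": 0, "name": "osd.0", "type": "osd"},
]
tree = build_tree(flat)
hosts = tree[0]["items"]
```

With the empty list in place, a caller can iterate over `hosts[1]["items"]` without checking whether the key exists.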

Closes-Bug: 1834539
Change-Id: Icc5988407c9773d10d2cd1078e08ae213075f793
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-07-24 16:41:10 +00:00
Daniel Badea
bc4bebfb92 ceph: mgr-restful-plugin pid file issues
mgr-restful-plugin is writing service pid to associated pid file
but does not flush or close the file descriptor.

When SM tries to read the pid from the pid file it fails because
the file exists but is empty.

Flush pid file after writing pid value.
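
A minimal sketch of the fix, assuming a scratch pid file path (hypothetical) rather than the one the service uses: a small buffered write is not visible to other readers until the writer flushes or closes the file.

```python
import os
import tempfile

# Hypothetical pid file location in a scratch directory.
pid_path = os.path.join(tempfile.mkdtemp(), "mgr-restful-plugin.pid")

pid_file = open(pid_path, "w")
pid_file.write("%d\n" % os.getpid())

# The write above typically sits in the userspace buffer at this point,
# so a separate reader may still see an empty file.
pid_file.flush()

# After flush() the content is visible even though the writer keeps the file open.
with open(pid_path) as reader:
    after_flush = reader.read()

pid_file.close()
```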

Closes-Bug: 1836897
Change-Id: If34293719f330d89c150fff8491c40a08581a58b
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-07-17 12:52:50 +00:00
Zuul
3903e7ca44 Merge "ceph: refactor mgr-restful-plugin state machine" 2019-07-10 16:26:00 +00:00
Daniel Badea
a4759b8e5b ceph: refactor mgr-restful-plugin state machine
Replace mgr-restful-plugin service monitoring based on state
machine with explicit transitions with a function that goes
through the following steps:
- wait for Ceph cluster to become available
- configure and start ceph-mgr
- configure and enable restful plugin
- send periodic requests to REST API

Procedure to recover from errors: restart ceph-mgr, update
certificates, run again through configuration steps, wait
for Ceph cluster.

mgr-restful-plugin components:

1. init script: parse command line parameters, start service
   monitoring if not already running, request status via
   control socket, stop service monitoring.

2. service monitoring: create process running ceph-mgr
   configuration steps, report status based on current
   rest api availability and init/recovery procedure,
   stop ceph-mgr and helper process

3. monitoring helper: successively run commands to configure
   and start ceph-mgr and restful plugin, send periodic requests
   to ceph REST API, update service monitoring with current
   failure status (no response from REST API or ceph-mgr
   restarts).

When ceph-mgr fails too many times mgr-restful-plugin exits
and relies on SM to restart it (ceph-mgr is also restarted).

Change-Id: Id5342624948024ce2891e32ee6648c910a6e7391
Closes-Bug: 1828024
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-05-29 13:12:53 +00:00
Scott Little
663edb4567 Ceph build script improvements to prevent needless rebuilds
Problem: The file $SRC_DIR/src/.git_version is created by the
custom build script and is not cleaned upon exit.  The presence
of the new file under $SRC_DIR will trigger a rebuild on the
next iteration.

Solution: Add a cleanup of the file to cover the normal exit case.
This means $SRC_DIR/src/.git_version will not be present at the
start of the next build.

Problem: A lot of tarballs are copied into the build by the
build script, rather than being listed in the COPY_LIST.  This
breaks the md5 checksum mechanism used to determine if a rebuild
is required.  A change to a tarball will be ignored and not
trigger a rebuild.

Solution: Move the code that generates the list of input tarballs
into build_srpm.data.  The file is sourced, so COPY_LIST can
be populated dynamically.

Problem: Script returns success when rpmbuild fails.

Solution: Propagate the error code to the final exit.

Change-Id: I2e760c24ecd3ce2d237863b948863c2a876d24fa
Closes-Bug: 1830130
Co-authored-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Scott Little <scott.little@windriver.com>
2019-05-24 13:10:28 -04:00
Daniel Badea
72c3fa95b0 python-cephclient: delete finished requests
ceph-mgr REST API supports synchronous and asynchronous requests.
In asynchronous mode clients can run multiple requests in parallel
then poll to get status of finished requests.

ceph-mgr restful plugin keeps a list of requests that were initiated
by the client and forwarded towards ceph-mgr. It expects the client
to delete finished requests after retrieving current status.

python-cephclient is making synchronous requests (using POST to
"/request?wait=1") but the server converts them to asynchronous
requests and then polls for status on its side. So after getting a
response back the client is still expected to DELETE "/request?id=...".

Currently it's not doing that and ceph-mgr restful plugin is
accumulating a list of all requests ever made by python-cephclient.
Change-Id: If8d5c8b27135fde45116e05bb04b655d9574c5ca
Closes-Bug: 1828549
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-05-10 12:18:47 +00:00
Ovidiu Poncea
d7dd9cf6a4 ceph: mtc process management improvement
These commits were lost accidentally when StarlingX code
was migrated from stx-ceph (ceph jewel version) repository
to stx-integ:

* keep Ceph up when locking a 1 node configuration

    On a 1 node configuration we need ceph to be operational when
    the node is locked, otherwise sysinv will give errors when trying
    to configure it.
    Implements:
    containerization-2002844-CEPH-persistent-storage-backend-for-Kubernetes

    Story: 2002844
    Task: 26877

* enable MTC management of Ceph processes on worker nodes

    Worker nodes may now have a ceph monitor enabled, so remove checks
    preventing it and make sure that ceph.sh is present in /etc/system.d/worker/.
    Origin patch: https://github.com/starlingx-staging/stx-ceph/
                  commit/4fa893a39b4025957f9725d3f75ea502d081ea76

Depends-On: Ibe104b32f568bb59a02b84c255983323d5d14757
Change-Id: Ifd7e6876b853f5629195451a0c4af240d40ebee8
Signed-off-by: Ovidiu Poncea <Ovidiu.Poncea@windriver.com>
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-04-26 08:56:45 +00:00
Daniel Badea
428a3ff771 ceph: ceph.conf set max pgs per osd
When installing StarlingX in AIO-SX configuration the Ceph cluster enters
"too many PGs per OSD" health warn status because of the small number of
OSDs available combined with the number of pools times pg_num and
replication factor.

In Ceph Luminous the default value for mon_max_pg_per_osd is set to 200
which is below minimum requirements for StarlingX AIO-SX.

See https://ceph.com/community/new-luminous-pg-overdose-protection/
"""
  There is now a mon_max_pg_per_osd limit (default: 200) that prevents
  you from creating new pools or adjusting pg_num or replica count for
  existing pools if it pushes you over the configured limit (as
  determined by dividing the total number of PG instances by the total
  number of “in” OSDs).
"""

To fix this issue we need to set higher values for: mon_max_pg_per_osd
and osd_max_pg_per_osd_hard_ratio.
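
The arithmetic behind the warning can be worked through with a small sketch. The pool layout below is hypothetical, and the helper assumes for simplicity that every pool uses the same pg_num and replication factor; only the default limit of 200 comes from the text above.

```python
DEFAULT_MON_MAX_PG_PER_OSD = 200  # Luminous default quoted above


def pgs_per_osd(pools, pg_num, replication, osds_in):
    """PG instances per OSD: total PG replicas divided by the number of 'in' OSDs."""
    return pools * pg_num * replication / float(osds_in)


# Hypothetical AIO-SX layout: 4 pools of 64 PGs each, replication 1, a single OSD.
load = pgs_per_osd(pools=4, pg_num=64, replication=1, osds_in=1)
over_limit = load > DEFAULT_MON_MAX_PG_PER_OSD
print(load, over_limit)
```

With a single OSD nothing dilutes the total, so even a handful of modest pools crosses the default limit.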

Story: 2003605
Task: 28860

Depends-On: I7b534e31868e53ec479c2321d6883604c12aa6d3
Change-Id: I302c850191d8ca9548ee12053f803df5abfdd5b4
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-04-26 08:56:28 +00:00
Daniel Badea
d45193bf25 ceph-manager: fix tox issues
Fix pep8 issues and remove py27 section because there is no test
defined.

Depends-On: I7c6bff4d8986c1fd75c3c9d353557c5eafcdcde0
Change-Id: I7b534e31868e53ec479c2321d6883604c12aa6d3
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-04-26 08:54:44 +00:00
Changcheng Liu
ac35cc5dc9 ceph-manager: update mgr-restful-plugin endpoint
ceph-rest-api was removed in Ceph Mimic. Update endpoint to remove the
"api/v0.1" path that's not needed by the ceph-mgr restful plugin.

Story: 2003605
Task: 28860

Depends-On: I1952a12032bcd08d17786bd817d1e15ce36d5afd
Change-Id: I7c6bff4d8986c1fd75c3c9d353557c5eafcdcde0
Co-Authored-By: Daniel Badea <daniel.badea@windriver.com>
Signed-off-by: Changcheng Liu <changcheng.liu@intel.com>
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-04-26 08:51:44 +00:00
Daniel Badea
aa94367086 python-cephclient: rewrite to use ceph-mgr restful plugin
This does not use any of the existing python-cephclient package
(it's a new package sharing the same name). Released
under Apache-2.0 license as part of StarlingX.

Additional commits:
- create session on first API request
- add missing osd_remove function
- add timeout when retrieving url and password
- sanitize osd ID input
- osd_crush_tree discards json output

Story: 2003605
Task: 28860

Depends-On: I31fb9aac89c44bbce24939197446caa987d395cb
Change-Id: I1952a12032bcd08d17786bd817d1e15ce36d5afd
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-04-26 08:51:19 +00:00
Daniel Badea
5fe6c52773 ceph: add mgr-restful-plugin service
ceph-rest-api was removed in Ceph Mimic.  Similar functionality is
provided by ceph-mgr restful plugin. This new service needs to be
configured and started instead of ceph-rest-api:

* enable mgr-restful-plugin in ceph.conf

* add mgr-restful-plugin service file

* add mgr-restful-plugin monitoring daemon
  waiting for ceph to become available before
  it configures and starts ceph-mgr daemon
  providing RESTful API endpoints

Story: 2003605
Task: 28860

Depends-On: Iaa3319a7647e5622037d12c53673da0e4199ceb4
Change-Id: I31fb9aac89c44bbce24939197446caa987d395cb
Co-Authored-By: Yong Hu <yong.hu@intel.com>
Co-Authored-By: Changcheng Liu <changcheng.liu@intel.com>
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-04-26 08:50:02 +00:00
Dehao Shang
27426985a3 ceph: build script with submodule archive support
Boost and Ceph submodules are stored as archives in SRPM.
They are unpacked in %prep stage so they can be picked up
by build process.

Ceph submodules are added as Source: dependencies with
a path prefix matching the subfolder where the corresponding
archive will be unpacked.

When adding a new submodule please follow this procedure:

1. make sure submodule.tar.gz is available in

   DOWNLOADS_DIR="$CGCS_BASE/downloads"

2. add corresponding source line in ceph.spec prefixed by
   the path in the source tree where the archive should
   be unpacked:

   Source29: src/path/to/submodule/submodule.tar.gz

3. add a new line to unpack the submodule:

   unpack_submodule "%{SOURCE29}" "%(dirname %{SOURCEURL29})"

Story: 2003605
Task: 28856

Depends-On: Ic5a03fe903c5119e6f01bd888093360e7e663bbb
Change-Id: Ic9c4aed8dbab5d3e141cf9c1b2b1892731b14779
Co-Authored-By: Scott Little <scott.little@windriver.com>
Co-Authored-By: Tingjie Chen <tingjie.chen@intel.com>
Co-Authored-By: Daniel Badea <daniel.badea@windriver.com>
Signed-off-by: Dehao Shang <dehao.shang@intel.com>
Signed-off-by: Changcheng Liu <changcheng.liu@intel.com>
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
2019-04-25 20:40:54 +00:00