This patch adds a new field 'instance_ids' to the payloads of two
cluster events:
- DBaaSClusterShrink (during start and end notification),
- DBaaSClusterGrow (during end notification).
Moreover, additional end notifications after growing and shrinking a
cluster have been added.
The purpose of this change is to enable better integration with
tools for monitoring resource usage.
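As a rough illustration, the grow end-notification payload might now
carry something like the following (the surrounding fields are only an
assumption; 'instance_ids' is the field added here):

    payload = {
        'cluster_id': cluster.id,
        # New field: IDs of the instances added (or removed), so external
        # tools can track resource usage per cluster member.
        'instance_ids': [instance.id for instance in instances],
    }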
Change-Id: I2c39b2c3bff65f88e46944eda22209bdc92803bc
Signed-off-by: Kasper Hasior <k.hasior@samsung.com>
Co-Authored-By: Kasper Hasior <k.hasior@samsung.com>
Story: #2005520
Task: #30639
Trove currently doesn't support specifying a keypair when creating a db
instance; the SSH key is injected into the guest agent image at
build time, which makes it very hard to manage.
This patch adds a config option `nova_keypair` that is used as the
keypair name when creating db instances. The old way of building the
image will be changed in subsequent patches.
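A minimal sketch of how the option might be registered with oslo.config
(the default and help text here are illustrative):

    from oslo_config import cfg

    common_opts = [
        cfg.StrOpt('nova_keypair',
                   default=None,
                   help='Name of an existing Nova keypair to use when '
                        'creating database instances.'),
    ]
    cfg.CONF.register_opts(common_opts)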
Change-Id: I41d4e41fc4bc413cdd48b8d761429b0204481932
Story: #2005429
Task: #30462
Use `management_networks` instead. `management_networks` will be used
as admin networks which will be attached to Trove instances
automatically.
Change-Id: I5c6004b568c3a428bc0f0a8b0e36665d3c5b3087
This adds the basic framework for the trove-status upgrade
check commands. For now it has only a "check_placeholder"
check implemented.
Real checks can be added to this tool in the future.
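A hedged sketch of what the placeholder check might look like with
oslo.upgradecheck (the class name and wiring are assumptions based on
how such status commands are usually implemented):

    from oslo_upgradecheck import upgradecheck

    class Checks(upgradecheck.UpgradeCommands):

        def _check_placeholder(self):
            # Real checks will replace this stub in the future.
            return upgradecheck.Result(upgradecheck.Code.SUCCESS)

        _upgrade_checks = (
            ('check_placeholder', _check_placeholder),
        )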
Change-Id: Idfeab4c06cba6f841c17ab6e255a29e8707bfa55
Story: 2003657
Task: 26162
Currently, listing instances only returns basic information about the
entities. To get the details, one needs to query the instance "show"
endpoint for each instance separately. This is inefficient and exposes
the API to a heavier load.
There are use cases in which we want to obtain detailed information
about all instances, in particular in services integrating with Trove.
For example, the Vitrage project requires this information to build
vertices and edges in the resource graph for RCA analysis.
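Assuming the new endpoint follows the Nova-style '/detail' convention
(the exact route and returned fields may differ), a client-side sketch
could look like:

    import requests

    url = 'http://trove-api:8779/v1.0/%s/instances/detail' % tenant_id
    resp = requests.get(url, headers={'X-Auth-Token': token})
    for inst in resp.json()['instances']:
        # Detailed entries carry data that previously required a separate
        # per-instance "show" call, e.g. volume and datastore information.
        print(inst['id'], inst.get('volume'), inst.get('datastore'))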
Change-Id: I33252cce41c27cc7302c860dde1f6448ecdf3991
Signed-off-by: Bartosz Zurkowski <b.zurkowski@samsung.com>
Currently we are not able to specify the endpoint_type
for the Neutron, Nova and Cinder clients with a single tenant.
publicURL is configured by default, but it would be nice
to have the possibility to choose anything else.
Change-Id: Ibb791cacc0e08de2d87b4348f84c9e573849ec51
Closes-Bug: #1776229
Currently, when creating a mongodb cluster, mongos and configsvr
use the volume_size of the replica-set nodes. But mongos and configsvr
are not data nodes; they don't need volume space as large as the data
nodes. This patch intends to help users specify the number, the volume
size and the volume type of mongos/configserver with the
extended_properties[1] argument when creating a mongodb cluster.
Currently, the supported parameters are num_configsvr, num_mongos,
configsvr_volume_size, configsvr_volume_type, mongos_volume_size
and mongos_volume_type.
[1] https://review.openstack.org/#/c/206931/
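For illustration, a cluster-create request could carry the new argument
roughly as follows (values are made up; only the parameter names come
from this patch):

    extended_properties = {
        'num_configsvr': 3,
        'num_mongos': 2,
        'configsvr_volume_size': 1,
        'configsvr_volume_type': 'lvmdriver-1',
        'mongos_volume_size': 1,
        'mongos_volume_type': 'lvmdriver-1',
    }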
Closes-Bug: #1734907
Signed-off-by: zhanggang <zhanggang@cmss.chinamobile.com>
Change-Id: Ie48f3961b21f926f983c6713a76b0492952cf4c7
When promoting a slave to be the new master in a replication group,
previously the old master was attached to the new one right after
the new master came up. For MariaDB, while attaching the old master to
the new one, a new GTID may be created on the old master and may also
be synced to some of the other replicas, as they are still connected to
the old master. That GTID does not exist on the new master, so these
slaves diverge from the master. After that, when a diverged slave
connects to the new master, 'START SLAVE' fails with logs like:
[ERROR] Error reading packet from server: Error: connecting slave
requested to start from GTID X-XXXXXXXXXX-XX, which is not in the
master's binlog. Since the master's binlog contains GTIDs with
higher sequence numbers, it probably means that the slave has
diverged due to executing extra erroneous transactions
(server_errno=1236)
These slaves will be left orphaned and errored after
promote_to_replica_source finishes.
Attaching the other replicas to the new master before dealing with the
old master will fix this problem and the failure of the
trove-scenario-mariadb-multi Zuul job as well.
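In pseudocode, the reordered promotion looks roughly like this (method
names are illustrative, not the actual taskmanager API):

    def promote_to_replica_source(old_master, new_master, replicas):
        # Attach the remaining replicas to the new master first, so no
        # new GTIDs from the old master can reach them in the meantime.
        for replica in replicas:
            replica.detach_from(old_master)
            replica.attach_to(new_master)
        # Only then demote the old master and attach it as a replica.
        old_master.demote()
        old_master.attach_to(new_master)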
Closes-Bug: #1754539
Change-Id: Ib9c01b07c832f117f712fd613ae55c7de3561116
Signed-off-by: Zhao Chao <zhaochao1984@gmail.com>
As no content is returned to the client when a root-disable request
succeeds, an HTTP 204 (No Content) response is more appropriate.
The Redis root-disable scenario test fails because it returns HTTP 204,
while all API related tests expect an HTTP 200. Although changing the
Redis root-disable API is a much simpler way to resolve the problem,
migrating from HTTP 200 to HTTP 204 is the better solution. Related
tests and documents are also updated accordingly.
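A minimal sketch of the controller change, assuming wsgi.Result is
Trove's usual response wrapper taking (data, status_code):

    from trove.common import wsgi

    class RootController(wsgi.Controller):
        def delete(self, req, tenant_id, instance_id):
            # Root access is disabled here; nothing is returned to the
            # client, so reply with 204 No Content instead of 200.
            return wsgi.Result(None, 204)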
APIImpact
Change-Id: If732a578009fd35436e810fb7ceceefd1ada3778
Signed-off-by: Zhao Chao <zhaochao1984@gmail.com>
The current Nova server volume support is broken. Nova has also
declared that 'os-volumes_boot' will be deprecated in the future. As
creating volumes with the cinderclient has been supported for a long
time, we can simply drop support for Nova server volumes.
This patch also migrates to the new block_device_mapping_v2 parameter
of the Nova server creation API.
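A hedged sketch of the new creation path (the volume is created via the
cinderclient first, then handed to Nova through block_device_mapping_v2;
fields simplified):

    volume = cinder_client.volumes.create(size=volume_size,
                                          name='trove-datastore-volume')
    bdm_v2 = [{
        'uuid': volume.id,
        'source_type': 'volume',
        'destination_type': 'volume',
        'device_name': 'vdb',
        'delete_on_termination': True,
    }]
    server = nova_client.servers.create(
        name, image_id, flavor_id, block_device_mapping_v2=bdm_v2)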
Closes-Bug: #1673408
Change-Id: I74d86241a5a0d0b1804b959313432168f68faf89
Signed-off-by: Zhao Chao <zhaochao1984@gmail.com>
In trove/common/strategies/cluster/experimental/galera_common/api.py,
the "shrink" method of the GaleraCommonCluster class should pass
deleted=False when calling DBInstance.find_all(); otherwise it may fail
to raise a ClusterShrinkMustNotLeaveClusterEmpty exception.
The same problem exists in galera_common/taskmanager.py: the
"shrink_cluster" method of GaleraCommonClusterTasks should call
DBInstance.find_all() with deleted=False to exclude deleted nodes and
avoid a NotFound error.
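In short, the calls change roughly as follows (context trimmed to the
relevant argument):

    # Before: deleted instances were still counted as cluster members.
    db_instances = DBInstance.find_all(cluster_id=cluster_id).all()

    # After: exclude deleted nodes so the emptiness check in "shrink" and
    # the node lookup in "shrink_cluster" only see live members.
    db_instances = DBInstance.find_all(cluster_id=cluster_id,
                                       deleted=False).all()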
Change-Id: Ibb377630b830da06485fc17a1a723dc1055d9b01
Closes-Bug: 1699953
Redis configuration validation for the 'repl-backlog-size' parameter
uses a wrong MIN value of '0'. When the parameter is set to a value
lower than 16384, that value appears in redis.conf[1], but
'config get *' still reports 16384[2], because the minimum value
enforced by Redis is 16384[3]. So the MIN value should be changed to
16384.
[1]: repl-backlog-size 0
[2]: 59) "repl-backlog-size"
60) "16384"
[3]:58f79e2ff4/src/server.h (L110)
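The corresponding entry in the Redis validation-rules file would then
look roughly like this (the surrounding keys are an assumption about the
rules-file layout; only the minimum bound changes):

    {
        "name": "repl-backlog-size",
        "restart_required": false,
        "min": 16384,
        "type": "integer"
    }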
Closes-Bug: #1697596
Change-Id: I81cb1c02943edf0af3d7bf67ff2f083a4c07d518
Server side support for the new 'reapply' command.
This reapplies a given module to all instances that it had
previously been applied to.
Originally, a module designated live-update would automatically
be re-applied whenever it was updated. Adding a specific
command, however, allows operators/users more control over
how the new payload is distributed. Old 'modules'
can be left as-is if desired, or updated with the new command.
Scenario tests were updated to test the new command.
DocImpact: update documentation to reflect module-reapply command
Change-Id: I4aea674ebe873a96ed22b5714263d0eea532a4ca
Depends-On: Ic4cc9e9085cb40f1afbec05caeb04886137027a4
Closes-Bug: #1554903
Fixed the module-instances command to return a paginated
list of instances. Also added a --count_only flag to the
command to return a summary of the applied instances
based on the MD5 of the module (this is most useful
for live_update modules, to see which ones haven't been
updated).
Also cleaned up the code a bit, putting some methods
into files where they made more sense (and would cause
fewer potential collisions during import).
Change-Id: I963e0f03875a1b93e2e1214bcb6580c507fa45fe
Closes-Bug: #1554900
Implement configuration attach and detach API for clusters.
Implement a rolling strategy for applying configuration changes
(both attach and detach follow the same pattern; see the sketch after
the steps below).
1. Persist the changes on all nodes (leaving nodes in RESTART_REQUIRED state).
2. Update Trove records.
3. Apply changes dynamically via one or all node(s) if possible
(and remove RESTART_REQUIRED flag from all nodes).
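For illustration, the rolling attach could be sketched as follows
(method names are loosely modeled on the guest API; this is pseudocode,
not the actual task-manager code):

    def attach_configuration(cluster, configuration):
        # 1. Persist on every node first; each node flips to
        #    RESTART_REQUIRED.
        for node in cluster.nodes:
            node.guest.update_overrides(configuration.values)

        # 2. Only after all nodes have persisted the changes, record the
        #    attachment in the Trove database.
        cluster.db_info.update(configuration_id=configuration.id)

        # 3. Apply dynamically where possible and clear RESTART_REQUIRED.
        if configuration.all_values_dynamic():
            for node in cluster.nodes:
                node.guest.apply_overrides(configuration.values)
                node.reset_task_status()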
Notes:
The single instance implementation has been restructured (similar to above)
such that it always leaves the instance in one of the three states:
a) Unchanged
b) Changes persisted but not applied
(Instance has configuration attached but requires a restart.
It is safe to restart manually or detach the group to avoid
any changes.)
c) Changes persisted and applied (if possible)
This implementation should always leave the cluster (and each instance)
in a consistent state.
Runtime configuration will not be changed until it is first persisted
on all nodes.
If there is a failure during step 1) the cluster is still running
the old configuration. Some instances may have new configuration
persisted, but not applied.
The cluster will not have configuration attached unless it can
be applied to all nodes.
The individual nodes will have configuration attached as soon as it is
persisted on the guest.
It is safe to retry; reapplying the same configuration on a node is a
noop.
It is safe to detach. Removing configuration from nodes without one
is a noop.
It is safe to detach the configuration from individual nodes via
single-instance API.
It is safe to attach the configuration to remaining nodes via
single-instance API and rerun cluster attach to update Trove records.
If 3) fails for whatever reason the instances are left
in the RESTART_REQUIRED state.
It is safe to retry or detach configuration or restart the
instances manually.
Also fixed various minor cluster issues.
Implements: blueprint cluster-configuration-groups
Change-Id: I7c0a22c6a0287128d0c37e100589c78173fd9c1a
The boolean values in module-list/show were returned
as 0/1; however, the OpenStack standard is to return
true/false, so these values have been modified.
Change-Id: Ib986e4adff0c06e96ea6533f9756928a0a055bfd
Closes-Bug: 1656398
Implement cluster rolling restart strategy.
Add support for Cassandra and PXC.
Add some missing cluster upgrade infrastructure.
Implements: blueprint cluster-restart
Co-Authored-By: Petr Malik <pmalik@tesora.com>
Co-Authored-By: Peter Stachowski <peter@tesora.com>
Change-Id: I21e654a8dd2dc6a74aa095604f78db4e96c70d64
A method for specifying 'priority' modules plus a way to rank the order
in which modules are applied has been added. Two new attributes
'priority_apply' and 'apply_order' are available in the payload
on create and update. In addition, an is_admin flag was added as an
automatic attribute, to be set when someone with admin credentials
creates a module or updates an existing module with 'admin-only'
options. This allows better control on the driver plugin
side with regards to security concerns, etc. The attribute is
now passed in to the guest 'apply' interface for use by the driver.
All three of these attributes are stored in the Trove database.
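For illustration, a module create/update payload fragment with the new
attributes might look like this (only priority_apply, apply_order and
is_admin come from this change; anything else is assumed context):

    module_body = {
        'name': 'my-module',
        'priority_apply': True,   # apply before non-priority modules
        'apply_order': 1,         # rank among modules of equal priority
        'is_admin': False,        # set automatically from the caller's creds
    }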
An admin can create a 'non-admin' module by passing in --full_access
on the command line (or python interface). This will cause an
error if any admin-only options are selected.
Scenario tests have been added to verify that the modules are
applied in the correct order. The timestamp for the 'updated'
field on the guest was also enhanced to allow for fractional
seconds, since most applies take less than a second.
The issue where modules were allowed to be applied even if
they belonged to a different datastore has been fixed and
scenario tests added to check for this case.
Change-Id: I7fcd0cf12790564ba62e7d6451fff96f763e539d
Implements: blueprint module-management-ordering
Cinder supports multiple volume types and volume types can be
explicitly requested in create requests. This change allows users to
restrict the allowed volume types for a given datastore/version in a
manner similar to flavors.
Co-Authored-By: amrith <amrith@tesora.com>
Change-Id: I790751ade042e271ba1cc902a8ef4d3c3a8dc557
Implements: blueprint associate-volume-type-datastore
Added support for the nic and az parameters in cluster grow for
mongodb.
Redis and cassandra already fully supported these fields.
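A hedged example of a grow request entry carrying the new fields (the
body layout is assumed to follow the existing cluster grow schema and
may differ; values are made up):

    grow_body = {
        'grow': [{
            'flavorRef': '7',
            'volume': {'size': 2},
            'nics': [{'net-id': 'NETWORK_UUID'}],
            'availability_zone': 'nova',
        }]
    }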
Change-Id: If1cecbd0a893bb493187cdad0c563e6ea681d250
Closes-Bug: #1641675
The Oslo Policy library provides support for RBAC policy
enforcement across all OpenStack services.
Update the devstack plugin to copy the default policy file
over to /etc/trove in the gate environments.
Note: Not adding a rule for 'reset-password' instance
action as that API was discontinued years ago
and is now just waiting for removal (Bug: 1645866).
DocImpact
Co-Authored-By: Ali Adil <aadil@tesora.com>
Change-Id: Ic443a4c663301840406cad537159eab7b0b5ed1c
Implements: blueprint trove-policy
The config file gets restored, so the config cache needs refreshing.
Touch the .guestagent.prepare.end file after upgrade to fix the problem
of the guestagent not knowing whether prepare has run.
Write the volume mount to fstab so it persists across subsequent
restarts.
Change-Id: I3831de12c999ef8818e80ecdb29f1d86ff8cd5c8
Closes-bug: #1645460
Depends-On: I5c1714b7839b2736c50f2daa2f4506c4006815a1
During guestagent volume operations the calls to unmount check whether
the volume is mounted first. The python function os.path.ismount returns
False if the directory is not readable by the current user, breaking
this functionality for some datastores.
The symptom of this failure is that the device ends up mounted twice
during prepare and then fails to unmount fully during resize.
The fix is to create a custom is_mount function that runs as the root user.
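A minimal sketch of such a helper, assuming Trove's execute utility can
run commands via sudo (the helper name and its location in the tree are
assumptions):

    from trove.common import utils

    def is_mount(path):
        # os.path.ismount fails when the mount point is not readable by
        # the trove user, so inspect /proc/mounts as root instead.
        stdout, _ = utils.execute_with_timeout(
            'grep', '%s ' % path, '/proc/mounts',
            run_as_root=True, root_helper='sudo',
            check_exit_code=[0, 1])
        return bool(stdout.strip())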
Change-Id: I151402717386230371bafcedc170d70b3588e912
Closes-Bug: #1645773
The Compute ID (server_id) and Volume ID (volume_id)
associated with a trove instance are useful information
for an administrator. This commit adds these fields to the
trove show output. They will only be visible to users
with admin rights.
Change-Id: I4a39b59ae610803f5aaf849f2e20ebb6e4ea1565
Closes-Bug: 1633581
There is a race condition in showing a cluster wherein the
server gets the list of instances from the db and then iterates
over the list to get the server information. If a shrink
operation is in progress, it can happen that one of the
instances is no longer present when trying to retrieve
the server info, and this causes the show command to throw
a NotFound error.
This is now trapped and the 'missing' server excluded from the list.
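The guarding logic is roughly as follows (a sketch only; the exact
exception type and lookup call in the server code may differ):

    from novaclient import exceptions as nova_exceptions

    servers = []
    for db_instance in db_instances:
        try:
            servers.append(
                nova_client.servers.get(db_instance.compute_instance_id))
        except nova_exceptions.NotFound:
            # The server was removed by a concurrent shrink operation;
            # exclude it from the listing instead of failing the request.
            pass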
Change-Id: I54edc4acac09ca2278f525c08ad0d87576f0549e
Closes-Bug: 1643002
When you do a cluster-show, the instance list contains all ips.
The summary (which is the one displayed and used for tests)
only lists the first ip from each instance. It should list them
all, so the user could potentially choose which one they wanted
(i.e. an IPv4 address vs. an IPv6 one).
Depends-On: I54edc4acac09ca2278f525c08ad0d87576f0549e
Change-Id: I2ede074eb9bdf26420750f19f3aa4b8d057c5d7d
Closes-Bug: 1642695
The module feature of Trove has been designed to be idempotent, in
that a module should be able to be applied more than once with no ill
side effects. Unfortunately a new instance_modules record is written
for each apply, potentially leaving records behind on module-remove
that make it impossible to delete the module.
This has been fixed and code put in place on module-apply and
module-delete to remove any extraneous records.
Depends-On: Ia4fc545a10c7c16532aefd73818dd7d90c9c271b
Change-Id: I09b301c1fb8311f9c5bee07d0af398071da3dd24
Closes-Bug: #1640010
This is an initial attempt at supporting multiple regions. It should
handle the mechanics of deploying an instance/volume to a remote
region. Additional changes may be required to allow the guest
agent on the instance to connect back to the originating region.
Co-Authored-By: Petr Malik <pmalik@tesora.com>
Change-Id: I780de59dae5f90955139ab8393cf7d59ff3a21f6
Removed an extra 'source-pgdata' argument and replaced
'-D' with the more verbose version of the same flag.
Give the replicator user superuser access.
Change-Id: Id8e3eefad60666e73c029a03ce59e765d390e908
Closes-Bug: 1633515
SafeConfigParser is deprecated since Python 3.2 and logs a warning
like: "DeprecationWarning: The SafeConfigParser class has
been renamed to ConfigParser in Python 3.2. This alias will be
removed in future versions. Use ConfigParser directly instead."
So use ConfigParser on Python 3.2+.
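A minimal version-guarded import along these lines (the patch may simply
switch call sites directly):

    import sys

    if sys.version_info >= (3, 2):
        from configparser import ConfigParser as Parser
    else:
        from ConfigParser import SafeConfigParser as Parser

    parser = Parser()
    parser.read('example.ini')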
Closes-Bug: #1618666
Change-Id: I30fe51324ffcc0afbd02799449daee8f628634b6