openstack-manuals/doc/common/get_started_data_processing.rst
KATO Tomoyuki 0261a73482 [common] Normalize get started contents
* file naming
* service name - to follow official service names
* heading - for inclusion:
    http://docs.openstack.org/mitaka/install-guide-ubuntu/keystone.html

Change-Id: Ifb65971526c2914bf8fc16a10fdafd1c864c38fa
2016-04-15 21:56:09 +09:00


================================
Data Processing service overview
================================

The Data Processing service for OpenStack (sahara) is designed to give
users a simple way to provision data processing (Hadoop, Spark) clusters
by specifying a few parameters, such as the Hadoop version, cluster
topology, and node hardware details. After a user fills in all the
parameters, the Data Processing service deploys the cluster in a few
minutes. Sahara can also scale already provisioned clusters by adding or
removing worker nodes on demand.

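
The parameters described above can be pictured as a small data
structure. The following is a minimal sketch only: ``NodeGroup``,
``ClusterSpec``, and ``scale_workers`` are hypothetical names invented
for illustration and are not part of the sahara API, and the version and
flavor strings are placeholder values.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeGroup:
    """Hypothetical description of one group of identical nodes."""
    name: str
    flavor: str  # node hardware details (placeholder flavor label)
    count: int


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical set of parameters a user fills in before provisioning."""
    name: str
    hadoop_version: str
    node_groups: tuple  # cluster topology as a tuple of NodeGroup


def scale_workers(spec, group_name, delta):
    """Return a new spec with `delta` nodes added to or removed from a group."""
    if not any(g.name == group_name for g in spec.node_groups):
        raise KeyError(group_name)
    groups = []
    for g in spec.node_groups:
        if g.name == group_name:
            new_count = g.count + delta
            if new_count < 0:
                raise ValueError("node count cannot be negative")
            g = replace(g, count=new_count)
        groups.append(g)
    return replace(spec, node_groups=tuple(groups))


# Illustrative values only; none of them come from a real deployment.
spec = ClusterSpec(
    name="demo",
    hadoop_version="2.7.1",
    node_groups=(
        NodeGroup("master", "m1.medium", 1),
        NodeGroup("worker", "m1.large", 3),
    ),
)
bigger = scale_workers(spec, "worker", 2)
```

Because the sketch treats a specification as immutable, scaling returns
a new specification and leaves the original untouched, which makes it
easy to compare the topology before and after a scaling request.
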

The solution addresses the following use cases:

* Fast provisioning of Hadoop clusters on OpenStack for development and
  QA.
* Utilization of unused compute power from a general-purpose OpenStack
  IaaS cloud.
* Analytics-as-a-Service for ad-hoc or bursty analytic workloads.


Key features are:

* Designed as an OpenStack component.
* Managed through a REST API, with a UI available as part of the
  OpenStack dashboard.
* Support for different Hadoop distributions:

  * Pluggable system of Hadoop installation engines.
  * Integration with vendor-specific management tools, such as Apache
    Ambari or Cloudera Management Console.

* Predefined templates of Hadoop configurations with the ability to
  modify parameters.
* User-friendly UI for ad-hoc analytics queries based on Hive or Pig.
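
The idea of a predefined template whose parameters can be modified can
be sketched as a set of defaults plus user overrides. This is an
illustration under stated assumptions, not sahara's implementation:
``DEFAULT_TEMPLATE`` and ``apply_overrides`` are hypothetical names, and
the two Hadoop properties and their values are chosen only as examples.

```python
from types import MappingProxyType

# Hypothetical predefined template; the values are illustrative only.
DEFAULT_TEMPLATE = MappingProxyType({
    "dfs.replication": 3,
    "mapreduce.map.memory.mb": 1024,
})


def apply_overrides(template, overrides):
    """Return a new config: template defaults with user overrides applied."""
    unknown = set(overrides) - set(template)
    if unknown:
        # Reject parameters the template does not define.
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**template, **overrides}


config = apply_overrides(DEFAULT_TEMPLATE, {"dfs.replication": 2})
```

The template is wrapped in a read-only mapping so that an override
produces a new configuration instead of changing the shared defaults.
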