First tentative of docbook tidy (tidy -xml -i filename.xml)

Change-Id: Ibb0392bbd1ab54c8db47fbc862eeeabdc7047c5b
This commit is contained in:
razique 2012-03-16 01:01:01 +01:00
parent c1286e5772
commit 10705e1084

@ -1,251 +1,638 @@
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="utf-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="ch_getting-started-with-openstack">
<title>Getting Started with OpenStack</title>
<para>OpenStack is a collection of open source technology that provides massively scalable open
source cloud computing software. Currently OpenStack develops two related projects:
OpenStack Compute, which offers computing power through virtual machine and network
management, and OpenStack Object Storage which is software for redundant, scalable object
storage capacity. Closely related to the OpenStack Compute project is the Image Service
project, named Glance. OpenStack can be used by corporations, service providers, VARS, SMBs,
researchers, and global data centers looking to deploy large-scale cloud deployments for
private or public clouds. </para>
<section xml:id="what-is-openstack">
<title>What is OpenStack?</title>
<para>OpenStack offers open source software to build public and private clouds. OpenStack is
a community and a project as well as open source software to help organizations run
clouds for virtual computing or storage. OpenStack contains a collection of open source
projects that are community-maintained including OpenStack Compute (code-named Nova),
OpenStack Object Storage (code-named Swift), and OpenStack Image Service (code-named
Glance). OpenStack provides an operating platform, or toolkit, for orchestrating clouds. </para>
<para>OpenStack is more easily defined once the concepts of cloud computing become
apparent, but we are on a mission: to provide scalable, elastic cloud computing for
both public and private clouds, large and small. At the heart of our mission is a
pair of basic requirements: clouds must be simple to implement and massively
scalable.</para>
<para>If you are new to OpenStack, you will undoubtedly have questions about installation,
deployment, and usage. It can seem overwhelming at first. But don't fear, there are
places to get information to guide you and to help resolve any issues you may run into
during the on-ramp process. Because the project is so new and constantly changing, be
aware of the revision time for all information. If you are reading a document that is a
few months old and you feel that it isn't entirely accurate, then please let us know
through the mailing list at <link xlink:href="https://launchpad.net/~openstack"
>https://launchpad.net/~openstack</link> so it can be updated or removed. </para>
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="ch_getting-started-with-openstack">
<title>Getting Started with OpenStack</title>
<para>OpenStack is a collection of open source technology that
provides massively scalable open source cloud computing software.
Currently OpenStack develops two related projects: OpenStack
Compute, which offers computing power through virtual machine and
network management, and OpenStack Object Storage which is software
for redundant, scalable object storage capacity. Closely related to
the OpenStack Compute project is the Image Service project, named
Glance. OpenStack can be used by corporations, service providers,
VARS, SMBs, researchers, and global data centers looking to deploy
large-scale cloud deployments for private or public clouds.</para>
<section xml:id="what-is-openstack">
<title>What is OpenStack?</title>
<para>OpenStack offers open source software to build public and
private clouds. OpenStack is a community and a project as well as
open source software to help organizations run clouds for virtual
computing or storage. OpenStack contains a collection of open
source projects that are community-maintained including OpenStack
Compute (code-named Nova), OpenStack Object Storage (code-named
Swift), and OpenStack Image Service (code-named Glance).
OpenStack provides an operating platform, or toolkit, for
orchestrating clouds.</para>
<para>OpenStack is more easily defined once the concepts of cloud
computing become apparent, but we are on a mission: to provide
scalable, elastic cloud computing for both public and private
clouds, large and small. At the heart of our mission is a pair of
basic requirements: clouds must be simple to implement and
massively scalable.</para>
<para>If you are new to OpenStack, you will undoubtedly have
questions about installation, deployment, and usage. It can seem
overwhelming at first. But don't fear, there are places to get
information to guide you and to help resolve any issues you may
run into during the on-ramp process. Because the project is so
new and constantly changing, be aware of the revision time for
all information. If you are reading a document that is a few
months old and you feel that it isn't entirely accurate, then
please let us know through the mailing list at
<link xlink:href="https://launchpad.net/~openstack">https://launchpad.net/~openstack</link>
so it can be updated or removed.</para>
</section>
<section xml:id="components-of-openstack">
<title>Components of OpenStack</title>
<para>There are currently three main components of OpenStack:
Compute, Object Storage, and Image Service. Let's look at each in
turn.</para>
<para>OpenStack Compute is a cloud fabric controller, used to
start up virtual instances for either a user or a group. It's
also used to configure networking for each instance, or for a
project that contains multiple instances.</para>
<para>OpenStack Object Storage is a system to store objects in a
massively scalable large capacity system with built-in redundancy
and failover. Object Storage has a variety of applications, such
as backing up or archiving data, serving graphics or videos
(streaming data to a user's browser), storing
secondary or tertiary static data, developing new applications
with data storage integration, storing data when predicting
storage capacity is difficult, and creating the elasticity and
flexibility of cloud-based storage for your web
applications.</para>
<para>OpenStack Image Service is a lookup and retrieval system
for virtual machine images. It can be configured in three ways:
using OpenStack Object Store to store images; using Amazon's
Simple Storage Service (S3) storage directly; or using S3
storage with Object Store as the intermediate for S3
access.</para>
<para>The following diagram shows the basic relationships between
the projects, how they relate to each other, and how they can
fulfill the goals of open source cloud computing.</para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="figures/OpenStackCore.png" />
</imageobject>
</mediaobject>
</informalfigure>
</section>
<section xml:id="openstack-architecture-overview">
<title>OpenStack Project Architecture Overview</title>
<para>by
<link xlink:href="http://ken.pepple.info">Ken
Pepple</link></para>
<para>Before we dive into the conceptual and logical architecture,
let's take a second to explain the OpenStack
project:</para>
<blockquote>
<para>OpenStack is a collection of open source technologies
delivering a massively scalable cloud operating system.</para>
</blockquote>
<para>You can think of it as software to power your own
Infrastructure as a Service (IaaS) offering like
<link xlink:href="http://aws.amazon.com">Amazon Web
Services</link>. It currently encompasses three main
projects:</para>
<itemizedlist>
<listitem>
<para>
<link xlink:href="https://launchpad.net/swift">Swift</link>,
which provides object/blob storage. This is
roughly analogous to Rackspace Cloud Files (from which it is
derived) or Amazon S3.</para>
</listitem>
<listitem>
<para>
<link xlink:href="https://launchpad.net/glance">Glance</link>,
which provides discovery, storage and retrieval
of virtual machine images for OpenStack Nova.</para>
</listitem>
<listitem>
<para>
<link xlink:href="https://launchpad.net/nova">Nova</link>,
which provides virtual servers upon demand. This
is similar to Rackspace Cloud Servers or Amazon EC2.</para>
</listitem>
</itemizedlist>
<para>While these three projects provide the core of the cloud
infrastructure, OpenStack is open and evolving:
<link xlink:href="http://wiki.openstack.org/Projects">there will
be more projects</link> (there are already related projects for
<link xlink:href="https://launchpad.net/horizon">web
interfaces</link> and a
<link xlink:href="http://wiki.openstack.org/QueueService">queue
service</link>). With that brief introduction,
let's delve into a conceptual architecture and
then examine how OpenStack Compute could map to it.</para>
<section xml:id="cloud-provider-conceptual-architecture">
<info>
<author>
<personname>
<firstname>Ken</firstname>
<lineage>Pepple</lineage>
</personname>
</author>
<title>Cloud Provider Conceptual Architecture</title>
</info>
<para>Imagine that we are going to build our own IaaS cloud and
offer it to customers. To achieve this, we would need to
provide several high level features:</para>
<orderedlist>
<listitem>
<para>Allow application owners to register for our cloud
services, view their usage and see their bill (basic
customer relations management functionality)</para>
</listitem>
<listitem>
<para>Allow Developers/DevOps folks to create and store
custom images for their applications (basic build-time
functionality)</para>
</listitem>
<listitem>
<para>Allow DevOps/Developers to launch, monitor and
terminate instances (basic run-time functionality)</para>
</listitem>
<listitem>
<para>Allow the Cloud Operator to configure and operate the
cloud infrastructure</para>
</listitem>
</orderedlist>
<para>While there are certainly many, many other features that
we would need to offer (especially if we were to follow a more
complete industry framework like
<link xlink:href="http://www.tmforum.org/BusinessProcessFramework/1647/home.html">
eTOM</link>), these four get to the very heart of providing
IaaS. Now assuming that you agree with these four top level
features, you might put together a conceptual architecture that
looks something like this:</para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata scale="70"
fileref="figures/nova-cactus-conceptual.png" />
</imageobject>
</mediaobject>
</informalfigure>
<para>In this model, I've imagined four sets
of users (developers, devops, owners and operators) that need
to interact with the cloud and then separated out the
functionality needed for each. From there,
I've followed a pretty common tiered
approach to the architecture (presentation, logic and
resources) with two orthogonal areas (integration and
management). Let's explore each a little
further:</para>
<itemizedlist>
<listitem>
<para>As with presentation layers in more typical
application architectures, components here interact with
users to accept and present information. In this layer, you
will find web portals to provide graphical interfaces for
non-developers and API endpoints for developers. For more
advanced architectures, you might find load balancing,
console proxies, security and naming services present here
also.</para>
</listitem>
<listitem>
<para>The logic tier would provide the intelligence and
control functionality for our cloud. This tier would house
orchestration (workflow for complex tasks), scheduling
(determining mapping of jobs to resources), policy (quotas
and such), image registry (metadata about instance
images), and logging (events and metering).</para>
</listitem>
<listitem>
<para>There will need to be integration functions within
the architecture. It is assumed that most service providers
will already have customer identity and billing systems.
Any cloud architecture would need to integrate with these
systems.</para>
</listitem>
<listitem>
<para>As with any complex environment, we will need a
management tier to operate the environment. This should
include an API to access the cloud administration features
as well as some forms of monitoring. It is likely that the
monitoring functionality will take the form of integration
into an existing tool. While I've
highlighted monitoring and an admin API for our fictional
provider, in a more complete architecture you would see a
vast array of operational support functions like
provisioning and configuration management.</para>
</listitem>
<listitem>
<para>Finally, since this is a compute cloud, we will need
actual compute, network and storage resources to provide to
our customers. This tier provides these services, whether
they be servers, network switches, network attached storage
or other resources.</para>
</listitem>
</itemizedlist>
<para>With this model in place, let's shift
gears and look at OpenStack Compute's
logical architecture.</para>
</section>
<section xml:id="components-of-openstack"><title>Components of OpenStack</title>
<para>There are currently three main components of OpenStack: Compute, Object Storage, and
Image Service. Let's look at each in turn.</para>
<para>OpenStack Compute is a cloud fabric controller, used to start up virtual instances for
either a user or a group. It's also used to configure networking for each instance or
project that contains multiple instances for a particular project. </para>
<para>OpenStack Object Storage is a system to store objects in a massively scalable large
capacity system with built-in redundancy and failover. Object Storage has a variety of
applications, such as backing up or archiving data, serving graphics or videos
(streaming data to a user's browser), storing secondary or tertiary static data,
developing new applications with data storage integration, storing data when predicting
storage capacity is difficult, and creating the elasticity and flexibility of
cloud-based storage for your web applications.</para>
<para>OpenStack Image Service is a lookup and retrieval system for virtual machine images.
It can be configured in three ways: using OpenStack Object Store to store images; using
Amazon's Simple Storage Service (S3) storage directly; or using S3 storage with Object
Store as the intermediate for S3 access.</para>
<para>The following diagram shows the basic relationships between the projects, how they
relate to each other, and how they can fulfill the goals of open source cloud computing. </para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="figures/OpenStackCore.png"/>
</imageobject>
</mediaobject></informalfigure>
<section xml:id="openstack-nova-logical-architecture">
<title>OpenStack Compute Logical Architecture</title>
<para>Now that we've looked at a proposed
conceptual architecture, let's see how
OpenStack Compute is logically architected. At the time of this
writing, Cactus was the newest release (which means if you are
viewing this after around July 2011, this may be out of date).
There are several logical components of OpenStack Compute
architecture but the majority of these components are
custom-written Python daemons of two varieties:</para>
<itemizedlist>
<listitem>
<para>WSGI applications to receive and mediate API calls
(<code>nova-api</code>,
<code>glance-api</code>, etc.)</para>
</listitem>
<listitem>
<para>Worker daemons to carry out orchestration tasks
(<code>nova-compute</code>,
<code>nova-network</code>,
<code>nova-schedule</code>, etc.)</para>
</listitem>
</itemizedlist>
<para>However, two essential pieces of the logical
architecture are neither custom written nor Python based: the
messaging queue and the database. These two components
facilitate the asynchronous orchestration of complex tasks
through message passing and information sharing. Putting this
all together we get a picture like this:</para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata scale="70"
fileref="figures/nova-cactus-logical.png" />
</imageobject>
</mediaobject>
</informalfigure>
<para>This is a complicated, but not overly informative, diagram,
as it can be summed up in three sentences:</para>
<itemizedlist>
<listitem>
<para>End users (DevOps, Developers and even other
OpenStack components) talk to
<code>nova-api</code> to interface with OpenStack
Compute</para>
</listitem>
<listitem>
<para>OpenStack Compute daemons exchange info through the
queue (actions) and database (information) to carry out API
requests</para>
</listitem>
<listitem>
<para>OpenStack Glance is basically a completely separate
infrastructure, which OpenStack Compute interfaces with
through the Glance API</para>
</listitem>
</itemizedlist>
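<para>As a rough illustration of the three points above, the
following minimal sketch mimics the flow from an API call through
the queue to a worker that updates shared state. It is not Nova
code: the in-process <code>queue.Queue</code> stands in for the
message queue, a plain dictionary stands in for the database, and
the function names are hypothetical.</para>

```python
import queue

# Hypothetical stand-ins: an in-process queue instead of RabbitMQ and a
# dict instead of the SQL database. None of these names come from Nova.
message_queue = queue.Queue()
database = {}


def api_run_instance(instance_id):
    """Play the role of nova-api: record the request, then queue an action."""
    database[instance_id] = "scheduling"
    message_queue.put({"action": "run_instance", "instance_id": instance_id})


def worker_step():
    """Play the role of a worker daemon: take one action and update state."""
    message = message_queue.get()
    if message["action"] == "run_instance":
        # A real worker would run system commands here (for example,
        # launching a KVM instance); this sketch only records the result.
        database[message["instance_id"]] = "running"
    message_queue.task_done()


api_run_instance("i-0001")
worker_step()
print(database["i-0001"])  # prints "running"
```

<para>The point of the sketch is the separation of concerns: the
API side only records intent and enqueues an action, while the
worker side consumes actions and writes the resulting state.</para>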
<para>Now that we see the overview of the processes and their
interactions, let's take a closer look at
each component.</para>
<itemizedlist>
<listitem>
<para>The
<code>nova-api</code> daemon is the heart of OpenStack
Compute. You may see it illustrated on many pictures of
OpenStack Compute as API and "Cloud
Controller". While this is partly true, cloud
controller is really just a class (specifically the
CloudController in trunk/nova/api/ec2/cloud.py) within the
<code>nova-api</code> daemon. It provides an endpoint for
all API queries (either
<link xlink:href="http://docs.rackspacecloud.com/api/">OpenStack API</link> or
<link xlink:href="http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/">EC2 API</link>),
initiates most of the orchestration
activities (such as running an instance) and also enforces
some policy (mostly quota checks).</para>
</listitem>
<listitem>
<para>The
<code>nova-schedule</code> process is conceptually the
simplest piece of code in OpenStack Compute: it takes a virtual
machine instance request from the queue and determines
where it should run (specifically, which compute server
host it should run on). In practice, however, I am sure this
will grow to be the most complex, as it needs to factor in the
current state of the entire cloud infrastructure and apply
complicated algorithms to ensure efficient usage. To that
end,
<code>nova-schedule</code> implements a pluggable
architecture that lets you choose (or
write) your own algorithm for scheduling. Currently, there
are several to choose from (simple, chance, etc.) and it is
an area of hot development for the future releases of
OpenStack Compute.</para>
</listitem>
<listitem>
<para>The
<code>nova-compute</code> process is primarily a worker
daemon that creates and terminates virtual machine
instances. The process by which it does so is fairly
complex
(<link xlink:href="http://www.laurentluce.com/?p=227">see
this blog post by Laurent Luce for the gritty
details</link>) but the basics are simple: accept actions
from the queue and then perform a series of system commands
(like launching a KVM instance) to carry them out while
updating state in the database.</para>
</listitem>
<listitem>
<para>As you can gather by the name,
<code>nova-volume</code> manages the creation, attaching and
detaching of persistent volumes to compute instances
(similar functionality to
<link xlink:href="http://aws.amazon.com/ebs/">Amazon's Elastic Block Store</link>).
It can use volumes from a variety of providers, such as
iSCSI.</para>
</listitem>
<listitem>
<para>The
<code>nova-network</code> worker daemon is very similar to
<code>nova-compute</code> and
<code>nova-volume</code>. It accepts networking tasks from
the queue and then performs tasks to manipulate the network
(such as setting up bridging interfaces or changing
<code>iptables</code> rules).</para>
</listitem>
<listitem>
<para>The queue provides a central hub for passing messages
between daemons. This is currently implemented with
<link xlink:href="http://www.rabbitmq.com/">RabbitMQ</link>,
but theoretically could be any
<link xlink:href="http://www.amqp.org/confluence/display/AMQP/Advanced+Message+Queuing+Protocol">AMQP message queue</link>
supported by the Python
<link xlink:href="http://barryp.org/software/py-amqplib/">amqplib</link>.</para>
</listitem>
<listitem>
<para>The
<link xlink:href="http://en.wikipedia.org/wiki/SQL">SQL
database</link> stores most of the build-time and run-time
state for a cloud infrastructure. This includes the
instance types that are available for use, instances in
use, networks available and projects. Theoretically,
OpenStack Compute can support any database supported by
<link xlink:href="http://www.sqlalchemy.org/">SQLAlchemy</link>,
but the only databases currently being widely used are
<link xlink:href="http://www.sqlite.org/">sqlite3</link>
(only appropriate for test and development work),
<link xlink:href="http://mysql.com/">MySQL</link> and
<link xlink:href="http://www.postgresql.org/">PostgreSQL</link>.</para>
</listitem>
<listitem>
<para>OpenStack Glance is a separate project from OpenStack
Compute, but as shown above, complementary. While it is an
optional part of the overall compute architecture, I
can't imagine that most OpenStack
Compute installations will not be using it (or a
complementary product). There are three pieces to Glance:
<code>glance-api</code>,
<code>glance-registry</code> and the image store. As you can
probably guess,
<code>glance-api</code> accepts API calls, much like
<code>nova-api</code>, and the actual image blobs are
placed in the image store. The
<code>glance-registry</code> stores and retrieves metadata
about images. The image store can be a number of different
object stores, including OpenStack Object Storage
(Swift).</para>
</listitem>
<listitem>
<para>Finally, another optional project that we will need
for our fictional service provider is a user dashboard. I
have picked the OpenStack Dashboard here, but there are
also several other web front ends available for OpenStack
Compute. The OpenStack Dashboard provides a web interface
into OpenStack Compute to give application developers and
devops staff similar functionality to the API. It is
currently implemented as a
<link xlink:href="http://www.djangoproject.com/">Django</link>
web application.</para>
</listitem>
</itemizedlist>
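<para>The pluggable scheduling idea described above for
<code>nova-schedule</code> can be sketched as follows. This is a
minimal illustration under stated assumptions, not the Nova
implementation: the class names, the <code>DRIVERS</code> table and
the per-host load numbers are hypothetical, and only the driver
names "simple" and "chance" come from the text.</para>

```python
import random

# Hypothetical sketch: the class names and the per-host load values are
# illustrative and are not taken from the Nova source tree.


class ChanceScheduler:
    """Pick any host at random."""

    def pick_host(self, hosts):
        return random.choice(sorted(hosts))


class SimpleScheduler:
    """Pick the host that currently reports the lowest load."""

    def pick_host(self, hosts):
        return min(hosts, key=lambda name: hosts[name])


# The driver is selected by name, so a different algorithm can be swapped in.
DRIVERS = {"chance": ChanceScheduler, "simple": SimpleScheduler}


def schedule(driver_name, hosts):
    """Return the host chosen by the configured scheduling driver."""
    return DRIVERS[driver_name]().pick_host(hosts)


hosts = {"compute1": 5, "compute2": 2, "compute3": 9}
print(schedule("simple", hosts))  # prints "compute2"
```

<para>Because every driver exposes the same
<code>pick_host</code> method, adding a new algorithm in this
sketch only means registering another class under a new
name.</para>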
<para>This logical architecture represents just one way to
architect OpenStack Compute. With its pluggable architecture,
we could easily swap out OpenStack Glance with another image
service or use another dashboard. In the coming releases of
OpenStack, expect to see more modularization of the code
especially in the network and volume areas.</para>
</section>
<section xml:id="openstack-architecture-overview"><title>OpenStack Project Architecture Overview</title>
<para>by <link xlink:href="http://ken.pepple.info">Ken Pepple</link></para><para>Before we dive into the conceptual and logical architecture, let's take a second to explain the OpenStack project: </para><blockquote><para>OpenStack is a collection of open source technologies delivering a massively scalable cloud operating system.</para></blockquote><para>You can think of it as software to power your own Infrastructure as a Service (IaaS) offering like <link xlink:href="http://aws.amazon.com">Amazon Web Services</link>. It currently encompasses three main projects:</para><itemizedlist><listitem><para><link xlink:href="https://launchpad.net/swift">Swift</link>, which provides object/blob storage. This is roughly analogous to Rackspace Cloud Files (from which it is derived) or Amazon S3.</para></listitem><listitem><para><link xlink:href="https://launchpad.net/glance">Glance</link>, which provides discovery, storage and retrieval of virtual machine images for OpenStack Nova.</para></listitem><listitem><para><link xlink:href="https://launchpad.net/nova">Nova</link>, which provides virtual servers upon
demand. This is similar to Rackspace Cloud Servers or Amazon EC2.</para></listitem></itemizedlist><para>While these three projects provide the core of the cloud infrastructure, OpenStack is open and
evolving — <link xlink:href="http://wiki.openstack.org/Projects">there will be more
projects</link> (there are already related projects for <link
xlink:href="https://launchpad.net/horizon">web interfaces</link> and a
<link xlink:href="http://wiki.openstack.org/QueueService">queue service</link>).
With that brief introduction, let's delve into a conceptual architecture and then
examine how OpenStack Compute could map to it. </para>
<section xml:id="cloud-provider-conceptual-architecture">
<info><author><personname><firstname>Ken</firstname><lineage>Pepple</lineage></personname></author><title>Cloud Provider Conceptual Architecture</title></info><para>Imagine that we are going to build our own IaaS cloud and offer it to customers. To achieve this, we would need to provide several high level features:</para><orderedlist><listitem><para>Allow application owners to register for our cloud services, view their usage and see their bill (basic customer relations management functionality)</para></listitem><listitem><para>Allow Developers/DevOps folks to create and store custom images for their applications (basic build-time functionality)</para></listitem><listitem><para>Allow DevOps/Developers to launch, monitor and terminate instances (basic run-time functionality)</para></listitem><listitem><para>Allow the Cloud Operator to configure and operate the cloud infrastructure</para></listitem></orderedlist><para>While there are certainly many, many other features that we
would need to offer (especially if we were to follow a
more complete industry framework like <link
xlink:href="http://www.tmforum.org/BusinessProcessFramework/1647/home.html"
>eTOM</link>), these four get to the very heart of
providing IaaS. Now assuming that you agree with these
four top level features, you might put together a
conceptual architecture that looks something like
this:</para>
<informalfigure><mediaobject><imageobject><imagedata scale="70" fileref="figures/nova-cactus-conceptual.png"/></imageobject></mediaobject></informalfigure>
<para>In this model, I've imagined four sets of users (developers, devops, owners and operators)
that need to interact with the cloud and then separated out the functionality needed
for each. From there, I've followed a pretty common tiered approach to the
architecture (presentation, logic and resources) with two orthogonal areas
(integration and management). Let's explore each a little further: </para><itemizedlist><listitem><para>As with presentation layers in more typical application architectures, components here interact with users to accept and present information. In this layer, you will find web portals to provide graphical interfaces for non-developers and API endpoints for developers. For more advanced architectures, you might find load balancing, console proxies, security and naming services present here also.</para></listitem><listitem><para>The logic tier would provide the intelligence and control functionality for our cloud. This tier would house orchestration (workflow for complex tasks), scheduling (determining mapping of jobs to resources), policy (quotas and such), image registry (metadata about instance images), and logging (events and metering). </para></listitem><listitem><para>There will need to be integration functions within the architecture. It is assumed that most service providers will already have customer identity and billing systems. Any cloud architecture would need to integrate with these systems.</para></listitem><listitem><para>As with any complex environment, we will need a management tier to operate the environment. This should include an API to access the cloud administration features as well as some forms of monitoring. It is likely that the monitoring functionality will take the form of integration into an existing tool. While I've highlighted monitoring and an admin API for our fictional provider, in a more complete architecture you would see a vast array of operational support functions like provisioning and configuration management.</para></listitem><listitem><para>Finally, since this is a compute cloud, we will need actual compute, network and storage resources to provide to our customers. 
This tier provides these services, whether they be servers, network switches, network attached storage or other resources.</para></listitem></itemizedlist><para>With this model in place, let's shift gears and look at OpenStack Compute's logical
architecture.</para></section><section xml:id="openstack-nova-logical-architecture"><title>OpenStack Compute Logical Architecture</title><para>Now that we've looked at a proposed conceptual architecture, let's see how OpenStack Compute
is logically architected. At the time of this writing, Cactus was the newest release
(which means if you are viewing this after around July 2011, this may be out of
date). There are several logical components of OpenStack Compute architecture but
the majority of these components are custom-written Python daemons of two
varieties:</para><itemizedlist><listitem><para>WSGI applications to receive and mediate API calls (<code>nova-api</code>, <code>glance-api</code>, etc.)</para></listitem><listitem><para>Worker daemons to carry out orchestration tasks (<code>nova-compute</code>, <code>nova-network</code>, <code>nova-schedule</code>, etc.)</para></listitem></itemizedlist><para>However, two essential pieces of the logical architecture are neither custom written nor Python based: the messaging queue and the database. These two components facilitate the asynchronous orchestration of complex tasks through message passing and information sharing. Putting this all together we get a picture like this:</para><informalfigure><mediaobject><imageobject><imagedata scale="70" fileref="figures/nova-cactus-logical.png"/></imageobject></mediaobject></informalfigure><para>This is a complicated, but not overly informative, diagram, as it can be summed up in three sentences:</para><itemizedlist><listitem><para>End users (DevOps, Developers and even other OpenStack components) talk to
<code>nova-api</code> to interface with OpenStack Compute</para></listitem><listitem><para>OpenStack Compute daemons exchange info through the queue (actions) and database (information)
to carry out API requests</para></listitem><listitem><para>OpenStack Glance is basically a completely separate infrastructure, which OpenStack Compute
interfaces with through the Glance API</para></listitem></itemizedlist><para>Now that we see the overview of the processes and their interactions, let's take a closer look at each component.</para><itemizedlist><listitem><para>The <code>nova-api</code> daemon is the heart of OpenStack Compute. You may see it
illustrated on many pictures of OpenStack Compute as API and “Cloud
Controller”. While this is partly true, cloud controller is really just a
class (specifically the CloudController in trunk/nova/api/ec2/cloud.py)
within the <code>nova-api</code> daemon. It provides an endpoint for all API
queries (either <link xlink:href="http://docs.rackspacecloud.com/api/"
>OpenStack API</link> or <link
xlink:href="http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/"
>EC2 API</link>), initiates most of the orchestration activities (such
as running an instance) and also enforces some policy (mostly quota
checks).</para></listitem><listitem><para>The <code>nova-schedule</code> process is conceptually the simplest piece of code in OpenStack
Compute: it takes a virtual machine instance request from the queue and
determines where it should run (specifically, which compute server host it
should run on). In practice, however, I am sure this will grow to be the most
complex, as it needs to factor in the current state of the entire cloud
infrastructure and apply complicated algorithms to ensure efficient usage. To
that end, <code>nova-schedule</code> implements a pluggable architecture
that lets you choose (or write) your own algorithm for scheduling.
Currently, there are several to choose from (simple, chance, etc.) and it is
an area of hot development for the future releases of OpenStack
Compute.</para></listitem><listitem><para>The <code>nova-compute</code> process is primarily a worker daemon that creates and terminates virtual machine instances. The process by which it does so is fairly complex (<link xlink:href="http://www.laurentluce.com/?p=227">see this blog post by Laurent Luce for the gritty details</link>) but the basics are simple: accept actions from the queue and then perform a series of system commands (like launching a KVM instance) to carry them out while updating state in the database.</para></listitem><listitem><para>As you can gather by the name, <code>nova-volume</code> manages the creation, attaching and detaching of persistent volumes to compute instances (similar functionality to <link xlink:href="http://aws.amazon.com/ebs/">Amazon's Elastic Block Store</link>). It can use volumes from a variety of providers, such as iSCSI.</para></listitem><listitem><para>The <code>nova-network</code> worker daemon is very similar to <code>nova-compute</code> and <code>nova-volume</code>. It accepts networking tasks from the queue and then performs tasks to manipulate the network (such as setting up bridging interfaces or changing <code>iptables</code> rules).</para></listitem><listitem><para>The queue provides a central hub for passing messages between daemons. This is currently implemented with <link xlink:href="http://www.rabbitmq.com/">RabbitMQ</link>, but theoretically could be any <link xlink:href="http://www.amqp.org/confluence/display/AMQP/Advanced+Message+Queuing+Protocol">AMQP message queue</link> supported by the Python <link xlink:href="http://barryp.org/software/py-amqplib/">amqplib</link>.</para></listitem><listitem><para>The <link xlink:href="http://en.wikipedia.org/wiki/SQL">SQL database</link> stores most of the
build-time and run-time state for a cloud infrastructure. This includes the
instance types that are available for use, instances in use, networks
available and projects. Theoretically, OpenStack Compute can support any
database supported by <link xlink:href="http://www.sqlalchemy.org/"
>SQL-Alchemy</link> but the only databases currently being widely used
are <link xlink:href="http://www.sqlite.org/">sqlite3</link> (only
appropriate for test and development work), <link
xlink:href="http://mysql.com/">MySQL</link> and <link
xlink:href="http://www.postgresql.org/">PostgreSQL</link>.</para></listitem><listitem><para>OpenStack Glance is a separate project from OpenStack Compute, but as shown above,
complementary. While it is an optional part of the overall compute
architecture, I can't imagine that most OpenStack Compute installations will
not be using it (or a complementary product). There are three pieces to
Glance: <code>glance-api</code>, <code>glance-registry</code> and the image
store. As you can probably guess, <code>glance-api</code> accepts API calls,
much like <code>nova-api</code>, and the actual image blobs are placed in
the image store. The <code>glance-registry</code> stores and retrieves
metadata about images. The image store can be a number of different object
stores, including OpenStack Object Storage (Swift).</para></listitem><listitem><para>Finally, another optional project that we will need for our fictional service provider is a
user dashboard. I have picked the OpenStack Dashboard here, but there are
also several other web front ends available for OpenStack Compute. The
OpenStack Dashboard provides a web interface into OpenStack Compute to give
application developers and devops staff similar functionality to the API. It
is currently implemented as a <link
xlink:href="http://www.djangoproject.com/">Django</link> web
application.</para></listitem></itemizedlist><para>This logical architecture represents just one way to architect OpenStack Compute. With its
pluggable architecture, we could easily swap out OpenStack Glance for another image
service or use another dashboard. In the coming releases of OpenStack, expect to see
more modularization of the code, especially in the network and volume areas.</para></section>
<section xml:id="nova-conceptual-mapping"><title>Nova Conceptual Mapping</title><para>Now that we've seen a conceptual architecture for a fictional cloud provider and examined the logical architecture of OpenStack Nova, it is fairly easy to map the OpenStack components to the conceptual areas to see what we are lacking:</para><informalfigure><mediaobject><imageobject><imagedata scale="50" fileref="figures/nova-cactus-conceptual-coverage.png"/></imageobject></mediaobject></informalfigure><para>As you can see from the illustration, I've overlaid logical components of OpenStack Nova, Glance and Dashboard to denote functional coverage. For each of the overlays, I've added the name of the logical component within the project that provides the functionality. While all of these judgements are highly subjective, you can see that we have a majority coverage of the functional areas with a few notable exceptions:</para><itemizedlist><listitem><para>The largest gap in our functional coverage is logging and billing. At the moment, OpenStack Nova doesn't have a billing component that can mediate logging events, rate the logs and create/present bills. That being said, most service providers will already have one (or <emphasis>many</emphasis>) of these so the focus is really on the logging and integration with billing. This could be remedied in a variety of ways: augmentations of the code (which should happen in the next release “Diablo”), integration with commercial products or services (perhaps <link xlink:href="http://www.zuora.com/">Zuora</link>) or custom log parsing. </para></listitem><listitem><para>Identity is also a point which will likely need to be augmented. Unless we are running a stock
LDAP for our identity system, we will need to integrate our solution with
OpenStack Compute. Having said that, this is true of almost all cloud
solutions.</para></listitem><listitem><para>The customer portal will also be an integration point. While OpenStack Compute provides a user
dashboard (to see running instances, launch new instances, etc.), it doesn't
provide an interface to allow application owners to sign up for service,
track their bills and lodge trouble tickets. Again, this is probably
something that is already in place at our imaginary service provider. </para></listitem><listitem><para>Ideally, the Admin API would replicate all functionality that we'd be able to do via the
command line interface (which in this case is mostly exposed through the
nova-manage command). This will get better in the Diablo release with the
<link xlink:href="http://wiki.openstack.org/NovaAdminAPI">Admin
API</link> work.</para></listitem><listitem><para>Cloud monitoring and operations will be an important area of focus for our service provider. A
key to any good operations approach is good tooling. While OpenStack Compute
provides nova-instancemonitor, which tracks compute node utilization, we're
really going to need a number of third-party tools for monitoring. </para></listitem><listitem><para>Policy is an extremely important area but very provider specific. Everything from quotas
(which are supported) to quality of service (QoS) to privacy controls can
fall under this. I've given OpenStack Nova partial coverage here, but that
might vary depending on the intricacies of the provider's needs. For the
record, the Cactus release of OpenStack Compute provides quotas for instances
(number and cores used), volumes (size and number), floating IP addresses and
metadata.</para></listitem><listitem><para>Scheduling within OpenStack Compute is fairly rudimentary for larger installations today. The
pluggable scheduler supports chance (random host assignment), simple (least
loaded) and zone (random nodes within an availability zone). As with most
areas on this list, this will be greatly augmented in Diablo. In development
are distributed schedulers and schedulers that understand heterogeneous
hosts (for support of GPUs and differing CPU architectures).</para></listitem></itemizedlist><para>As you can see, OpenStack Compute provides a fair basis for our mythical service provider, as
long as the mythical service providers are willing to do some integration here and
there. </para>
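<para>The pluggable scheduling strategies listed above can be illustrated with a short Python sketch. This is not Nova's scheduler code: the host table, the function names and the <code>SCHEDULERS</code> registry are hypothetical, and "load" is assumed here to mean the number of running instances on each host.</para>
<programlisting language="python">import random

# Hypothetical host table: host name mapped to running instance count.
HOSTS = {"compute1": 4, "compute2": 1, "compute3": 7}


def chance_scheduler(hosts):
    """Pick any host at random (the 'chance' strategy)."""
    return random.choice(sorted(hosts))


def simple_scheduler(hosts):
    """Pick the least-loaded host (the 'simple' strategy)."""
    return min(sorted(hosts), key=lambda name: hosts[name])


# Registry that makes the strategy selectable by name (illustrative only).
SCHEDULERS = {"chance": chance_scheduler, "simple": simple_scheduler}


def schedule(strategy, hosts):
    """Look up a strategy by name and ask it for a target host."""
    return SCHEDULERS[strategy](hosts)


if __name__ == "__main__":
    print(schedule("simple", HOSTS))
    print(schedule("chance", HOSTS))
</programlisting>
<para>Keeping each strategy behind the same one-argument interface is what would let a provider add its own algorithm without touching the code that calls <code>schedule</code>.</para>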
<para>Note that since the time of this writing, OpenStack Identity Service has been
added.</para></section></section>
<section xml:id="why-cloud">
<title>Why Cloud?</title>
<para>In data centers today, many computers are underutilized in both computing
power and networking bandwidth. For example, projects may need a large amount of
computing capacity to complete a computation, but no longer need the computing power
after completing the computation. You want cloud computing when you want a service
that's available on-demand with the flexibility to bring it up or down through
automation or with little intervention. The phrase "cloud computing" is often
represented with a diagram that contains a cloud-like shape indicating a layer where
responsibility for service goes from user to provider. The cloud in these types of
diagrams contains the services that afford computing power harnessed to get work done.
Much like the electrical power we receive each day, cloud computing provides subscribers
or users with access to a shared collection of computing resources: networks for
transfer, servers for storage, and applications or services for completing tasks. </para>
<para>These are the compelling features of a cloud:</para>
<itemizedlist spacing="compact">
<listitem>
<para>On-demand self-service: Users can provision servers and networks with little
human intervention. </para></listitem>
<listitem>
<para>Network access: Any computing capabilities are available over the network.
Many different devices are allowed access through standardized mechanisms. </para></listitem>
<listitem>
<para>Resource pooling: Multiple users can access clouds that serve other consumers
according to demand. </para></listitem>
<listitem>
<para>Elasticity: Provisioning is rapid and scales out or in based on need. </para></listitem>
<listitem>
<para>Metered or measured service: Just like utilities that are paid for by the
hour, clouds should optimize resource use and control it for the level of
service or type of servers such as storage or processing.</para></listitem>
</itemizedlist>
<para>Cloud computing offers different service models depending on the capabilities a
consumer may require. </para>
<itemizedlist>
<listitem><para>SaaS: Software as a Service. Provides the consumer the ability to use the software
in a cloud environment, such as web-based email. </para></listitem>
<listitem><para>PaaS: Platform as a Service. Provides the consumer the ability to deploy
applications through a programming language or tools supported by the cloud platform
provider. An example of platform as a service is an Eclipse/Java programming
platform provided with no downloads required. </para></listitem>
<listitem><para>IaaS: Infrastructure as a Service. Provides infrastructure such as compute
instances, network connections, and storage so that people can run any software or
operating system. </para></listitem>
</itemizedlist>
<para>When you hear terms such as public cloud or private cloud, these refer to the
deployment model for the cloud. A private cloud operates for a single organization, but
can be managed on-premise or off-premise. A public cloud has an infrastructure that is
available to the general public or a large industry group and is likely owned by a cloud
services company. NIST also defines a community cloud as one shared by several
organizations that support a specific community with shared concerns. </para>
<para>Clouds can also be described as hybrid. The term can refer to a deployment model
that combines public and private clouds, or to a cloud computing model that
involves both virtual and physical servers. </para>
<para>What have people done with cloud computing? Cloud
computing can help with large-scale computing needs or can
lead consolidation efforts by virtualizing servers to make
more use of existing hardware and potentially release old
hardware from service. People also use cloud computing for
collaboration because of its high availability through
networked computers. Productivity suites for word
processing, number crunching, email communications,
and more are also available through cloud computing. Cloud
computing also makes additional storage available to the cloud
user, avoiding the need for additional hard drives on each
user's desktop and enabling access to huge data storage
capacity online in the cloud. </para>
<para>For a more detailed discussion of cloud computing's essential
characteristics and its models of service and deployment, see <link
xlink:href="http://www.nist.gov/itl/cloud/"
>http://www.nist.gov/itl/cloud/</link>, published by the US
National Institute of Standards and Technology.</para>
</section>
</chapter>