Copyright 2014 Hewlett-Packard Development Company, L.P.
This work is licensed under a Creative Commons Attribution 3.0
Unported License.
http://creativecommons.org/licenses/by/3.0/legalcode
Docs Publishing via Swift
Story: https://storyboard.openstack.org/#!/story/168
We need to update the docs publishing pipeline to add features the documentation team needs, and to provide a process that does not rely on Jenkins, since we intend to retire it.
Problem Description
Juno summit session: https://etherpad.openstack.org/p/summit-b301-ci-doc-automation
The current doc publishing process uses Jenkins to FTP the results of doc builds to a Rackspace cloud site. Multiple versions of the docs (e.g., for releases or branches) are handled by publishing to a subdirectory of the destination. The publishing job for the nova developer documentation is configured to publish everything that appears in doc/build/html to developer/nova, so that content generated from the master branch appears at http://docs.openstack.org/developer/nova/. However, the doc build script detects when it is run on a change that has merged to a stable branch and moves the output into a subdirectory before uploading. Havana documentation therefore ends up in doc/build/html/havana, and the entire doc/build/html tree is still uploaded to developer/nova. Since no content is present above the havana subdirectory, the documentation from the master branch is not touched, while the new havana docs appear at http://docs.openstack.org/developer/nova/havana/. Similar approaches may be used for other kinds of documentation builds.
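As a rough illustration of the overlay behaviour described above (this is not the exact logic of the current build script; the branch-detection variable and paths are assumptions for illustration only):

  # Hypothetical sketch of the stable-branch handling in the current build step.
  # The ZUUL_BRANCH variable and exact paths are assumptions, not the real script.
  if [[ "$ZUUL_BRANCH" == stable/* ]]; then
      BRANCH=${ZUUL_BRANCH#stable/}            # e.g. "havana"
      mkdir -p "doc/build/$BRANCH"
      mv doc/build/html/* "doc/build/$BRANCH/"
      mv "doc/build/$BRANCH" doc/build/html/   # html/ now contains only havana/
  fi
  # The entire doc/build/html tree is then uploaded to developer/nova, leaving
  # master's content untouched and adding developer/nova/havana/.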
Put another way, the docs jobs do not have a holistic view of the site to which they are publishing, but instead operate only within the context of one subdirectory and are expected not to interfere with jobs publishing to other locations in the same site.
The major drawback for the docs team is that this mechanism does not support automatically deleting a file when it has been removed from the documentation. Even if such a file is no longer linked internally, it may still remain in search engine indexes, and users may find it.
The infrastructure team wants to remove Jenkins, and both the current FTP- and SCP-based publishers depend on Jenkins's unique security arrangements to prevent exposure of the credentials used. Our intended replacement has no such feature, and we expect workers to be completely untrusted.
The current system is also neither atomic nor concurrency-safe: multiple publishing jobs may run at the same time, uploading simultaneously over SCP or FTP.
This process is also very similar to log and artifact publishing, and keeping alignment with those processes is desirable.
Proposed Change
The new system will use a dedicated virtual server running Apache and serving content from LVM cinder volumes. The existing build+publish Jenkins job will be replaced with two jobs: a build job (which will only build the documents and upload them to an intermediate storage location) and a publish job (which will run locally on the docs server, retrieve the content, and publish it to its final location).
As part of the effort to migrate away from Jenkins, we have developed a script that uploads logs from test runs to an OpenStack Swift container. Each job is granted limited permission to upload logs under a prefix specific to its id (like a subdirectory) within the container. We will use the same script to upload the results of the build job to Swift. For jobs that run in the check queue and currently publish to docs-draft, this will be the end of the process (Zuul can link directly to the Swift URL). [Alternative: use os-loganalyze to proxy from swift at a more friendly URL.]
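For illustration only, the upload might look something like the following (the container name, prefix layout, and use of the python-swiftclient CLI are assumptions; the real jobs will use the shared upload script with pre-signed credentials):

  # Upload the built docs under a job-specific prefix in a Swift container.
  # "docs-draft" and the prefix layout below are assumptions, not the real names.
  swift upload docs-draft doc/build/html \
      --object-name "$ZUUL_PROJECT/$ZUUL_CHANGE/$ZUUL_PATCHSET"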
For final publishing, further action is needed. The general approach will be to have a dedicated Zuul worker on the publishing server that will download files from swift and rsync them to the correct location.
The publish job needs to know what files to download from swift (as it cannot rely on index pages). It should also do this without using any Swift credentials (mostly for simplicity; there is not much of a security concern with the publish worker accessing the Swift API if it needs to). In order to retrieve the full set of files from swift, the build job should create a manifest file called .manifest.txt at the root of the output directory containing a recursive listing of all files to be published. The publish job can then retrieve that file (by constructing the URL from the ZUUL_* environment variables) and subsequently all of the contents that it lists. [Alternative: have the build job upload a tarball, but that means we can't use exactly the same process for draft and final publishing.]
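A minimal sketch of both halves, assuming the uploaded objects are reachable over plain HTTP at a base URL that can be derived from the ZUUL_* variables (the URL scheme shown here is an assumption):

  # Build job: list every file to be published, relative to the output root.
  cd doc/build/html
  find . -type f ! -name .manifest.txt | sed 's|^\./||' > .manifest.txt

  # Publish job: fetch the manifest, then each file it lists (URL layout assumed).
  BASE_URL="https://swift.example.org/docs/$ZUUL_PROJECT/$ZUUL_UUID"
  curl -sf "$BASE_URL/.manifest.txt" -o .manifest.txt
  while read -r path; do
      mkdir -p "$(dirname "$path")"
      curl -sf "$BASE_URL/$path" -o "$path"
  done < .manifest.txt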
The publish job also needs to determine the target directory for the rsync. To do this, it will take the contents of the ZUUL_PROJECT environment variable as a base, and to this it will append the contents of the .target.txt file that it will expect to find in the content it downloaded from swift. The file should be empty for rsyncing to the project root ("openstack/nova"), or contain, e.g., the string "havana" to rsync to "openstack/nova/havana". This lets the build jobs specify the final publishing location, but only within the context of the project (e.g., nova can't accidentally overwrite keystone's documentation).
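For example, the destination might be computed roughly as follows (/srv/docs is the assumed publishing root on the server):

  # Assumed publishing root on the docs server.
  PUBLISH_ROOT=/srv/docs
  TARGET=$(cat .target.txt)                     # empty, or e.g. "havana"
  DEST="$PUBLISH_ROOT/$ZUUL_PROJECT${TARGET:+/$TARGET}"
  # e.g. /srv/docs/openstack/nova or /srv/docs/openstack/nova/havana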
Before the publish job rsyncs the downloaded data into its final location, it must first build a list of directories that should not be deleted. This way, if an entire directory has been removed from a document's output it will still be removed from the website, but directories which are themselves the roots of other documents (e.g., the havana branch) are not removed. A marker file at the root of each such directory will accomplish this; in fact, simply leaving either the .target.txt or .manifest.txt files in place after copying to the destination will suffice. The publishing job should find each of those files in the destination hierarchy and add their containing directories to a list of directories to exclude from the rsync. Then an rsync command of the form:
rsync -a --delete-after --exclude-from=/path/to/exclude-file /src/ /dest/
should safely update and delete only the relevant data.
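A rough illustration of how the exclusion list might be generated, continuing the hypothetical $DEST variable from the sketch above (the marker-file names follow the text; everything else is an assumption):

  # Find marker files below (but not at) the top of the destination and exclude
  # their containing directories, so other documents' subtrees are never deleted.
  EXCLUDES=$(mktemp)
  find "$DEST" -mindepth 2 \( -name .target.txt -o -name .manifest.txt \) \
      -printf '%h\n' | sort -u | sed "s|^$DEST||" > "$EXCLUDES"
  # Run from the directory containing the downloaded content.
  rsync -a --delete-after --exclude-from="$EXCLUDES" ./ "$DEST/"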
The openstack-manuals repo builds multiple manuals, in separate subdirectories, from a single repository. Using an overlay method similar to what is used for the developer documentation, the current build/publishing job updates only the changed documents and places them in appropriate target directories. The approach described above assumes the build job will output the full contents of a single "module" each time; having missing directories in that output risks the publisher removing them in the rsync step. One of the following approaches will need to be chosen to deal with this:
- Reconfigure the docs build job to build all of the manuals for publication (the optimization to only build changed manuals could be retained for efficiency in gating). This wastes some CPU time on build hosts but perhaps not that much (and should be quantified).
- Alter the above approach to handle multiple rsync source and destination path outputs for a single job. This makes the build/publishing process more complex.
- Allow the build job to provide an initial exclusion list for the rsync command, so that it can add directories that it knows are under its control but are not being updated in this run (see the sketch after this list).
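If the third option were chosen, it might look something like the following (the .exclude.txt file name and the example manual names are hypothetical, not part of this proposal):

  # Build job (openstack-manuals): list manuals not rebuilt in this run so the
  # publish job leaves them alone.  ".exclude.txt" is a hypothetical name.
  printf '%s\n' /security-guide /ops-guide > doc/build/html/.exclude.txt

  # Publish job: append the build job's exclusions to the generated list.
  [ -f .exclude.txt ] && cat .exclude.txt >> "$EXCLUDES"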
After the rsync is complete, the documents will be in a location that does not necessarily map to the desired URL. The apache process on the docs server can be configured to rewrite URLs as necessary. For instance:
docs.openstack.org/developer/nova/ -> /srv/docs/openstack/nova/
docs.openstack.org/icehouse/install-guide -> /srv/docs/openstack/openstack-manuals/icehouse/install-guide
These mappings could be achieved with rewrite rules like:
/developer/(.*) -> /srv/docs/openstack/$1
anything else: -> /srv/docs/openstack/openstack-manuals/
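As a rough sketch only (the exact vhost layout, DocumentRoot, and directives are assumptions to be settled during implementation), the docs.openstack.org virtual host might look something like:

  <VirtualHost *:80>
      ServerName docs.openstack.org
      # Developer docs live directly under the per-project tree.
      AliasMatch ^/developer/(.*)$ /srv/docs/openstack/$1
      # Everything else is served from the openstack-manuals content.
      DocumentRoot /srv/docs/openstack/openstack-manuals
      <Directory /srv/docs>
          Require all granted
      </Directory>
  </VirtualHost>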
Finally, in the rare cases of major restructuring, or the need to delete an entire "module" from the site, a member of infra-root can log in and manually remove anything needed.
The developer.openstack.org and specs.openstack.org sites are currently published using the same mechanisms as docs.openstack.org. Under the new system, we can create Apache virtual hosts for these sites that connect the appropriate URLs with their publishing locations on disk.
Alternatives
The above implementation has several minor alternative changes noted within. In addition to this approach, we also considered the following:
ReadTheDocs
While ReadTheDocs does handle docs publishing, including being version-aware, it is specific to Python-based Sphinx documentation and would not be useful for openstack-manuals (or other artifacts). It is also considered quite complex to set up.
AFS
The Andrew File System is a global distributed filesystem that would work quite well in this instance. Workers could be granted limited ACLs to publish to specific locations, so we could use the current combined build+publish job approach, but the worker could rsync directly to the final publishing location in AFS, and volume replication could be used to make atomic updates to the entire site. A static web server would then serve files out of AFS; more web servers can be added as needed to scale.
This approach requires some investment in creating and maintaining an AFS cell for OpenStack, as well as some enhancement work to Nodepool and Zuul to deal with Kerberos credentials. This is all work that we would like to do for other reasons (including mirrors), but is more substantial than what would be needed for the selected approach. Moreover, it should not be difficult to move from the selected approach to use AFS later should a cell materialize.
Implementation
Assignee(s)
- Primary assignee: TBD
Work Items
- Create a new publish.openstack.org server (this server name will not be publicized; instead, we will use Apache virtual hosts for the public hostnames, which will be CNAME DNS entries).
- Create apache vhosts for docs.openstack.org, developer.openstack.org, and specs.openstack.org on publish.openstack.org
- Create new doc build job that publishes to swift; start running this in addition to current publishing jobs on at least one project and openstack-manuals
- Enhance the doc publishing jobs to create .target.txt files
- Enhance the swift-upload tool to create .manifest.txt files
- Write and install the Zuul worker that will run on the docs server
- After testing, add the new jobs to all projects
- Copy data from the FTP site
- Change DNS to point to the new server
- Remove old build jobs
- Remove Rackspace cloud sites instances
Repositories
N/A.
Servers
publish.openstack.org will be a new server with LVM-managed cinder volumes, perhaps using SSD.
DNS Entries
publish.openstack.org will need to point to the new server. docs.openstack.org, developer.openstack.org, and specs.openstack.org will need their TTLs lowered in advance of the moves. On moving, they will become CNAME entries for publish.openstack.org.
Documentation
Infra documentation will need to be written for the new server and this process.
Security
The build jobs will have no special access and will only be able to put content in swift. The publishing job will run locally on the docs server, but will run no user-supplied code, and will constrain the publishing of content to a project-specific area.
Testing
This can operate in parallel with the current system without disruption.
Dependencies
We should finalize the log publishing system first (this is nearly done at the time of writing).