ARROW-905 [Docs] Dockerize document generation
This PR has two parts. The first part uses Docker to generate the API documentation for C (GLib), C++, Python, and Java; this part is now complete.
The second part makes it possible to run the site locally and is still in progress.

Author: Heimir Sverrisson <[email protected]>

Closes apache#1162 from heimir-sverrisson/hs/dockerize_api_docs and squashes the following commits:

f758fdc [Heimir Sverrisson] ARROW-905 [Docs] Add Dockerfile for reproducible documentation generation
heimir-sverrisson authored and wesm committed Oct 12, 2017
1 parent 4cb3e97 commit 60cb1c3
Showing 13 changed files with 401 additions and 1 deletion.
1 change: 1 addition & 0 deletions c_glib/.gitignore
@@ -30,6 +30,7 @@ Makefile.in
/doc/reference/*.stamp
/doc/reference/html/
/doc/reference/xml/
/doc/reference/tmpl/
/libtool
/m4/
/stamp-h1
32 changes: 31 additions & 1 deletion dev/README.md
@@ -109,4 +109,34 @@ Studio 2015):

```
dev/release/verify-release-candidate.bat apache-arrow-0.7.0.tar.gz
```

## Creating API documentation

API documentation generation for `C++`, `C GLib`, `Python`,
and `Java` has been Dockerized. To generate the API documentation,
run the following command:

```shell
bash dev/gen_apidocs.sh
```

This script assumes that the `parquet-cpp` Git repository
(https://github.com/apache/parquet-cpp) has been cloned
beside the Arrow repository, that the Arrow clone lives in a directory
named `arrow`, and that the current user can create a `dist` directory
at the same level. Most of the software must be built before the
documentation can be created, so this step may take some time,
especially on the first run, when the Docker image also has to be built.
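
The layout assumptions above can be checked before starting the long
build. The following is a minimal sketch and not part of the Arrow
scripts; the function name `check_doc_layout` and its single argument
(the directory that holds both clones) are hypothetical.

```shell
# Hypothetical helper: verify the directory layout that dev/gen_apidocs.sh expects.
# $1 is the directory that contains both the `arrow` and `parquet-cpp` clones.
check_doc_layout() {
  parent="$1"
  if [ ! -d "${parent}/arrow" ]; then
    echo "missing ${parent}/arrow" >&2
    return 1
  fi
  if [ ! -d "${parent}/parquet-cpp" ]; then
    echo "missing ${parent}/parquet-cpp (clone it beside arrow)" >&2
    return 1
  fi
  # The build creates a `dist` directory here, so the parent must be writable.
  if [ ! -w "${parent}" ]; then
    echo "cannot create ${parent}/dist: directory is not writable" >&2
    return 1
  fi
  echo "layout ok"
}
```

From the Arrow root, one possible use is
`check_doc_layout .. && bash dev/gen_apidocs.sh`.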

After the API documentation has been created, the website can be
run locally so that the documentation can be browsed from the top-level
`Documentation` menu. To run the website, issue the command:

```shell
bash dev/run_site.sh
```

The local URL for the website running inside the Docker container
is shown as `Server address:` in the output of the command.
To stop the server, press `Ctrl-C` in that window.
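
If the server output is captured to a file or a pipe, the URL can be
pulled out of it with a small filter. This is only a sketch: the name
`extract_server_address` is hypothetical, and the exact shape of the
log line (a `Server address:` label followed by the URL) is an
assumption based on the description above.

```shell
# Hypothetical helper: read log lines on stdin and print the URL that
# follows the "Server address:" label, e.g. from a line such as
#   "    Server address: http://0.0.0.0:4000/"
# Lines without the label are ignored.
extract_server_address() {
  sed -n 's/^.*Server address:[[:space:]]*\([^[:space:]]*\).*$/\1/p'
}
```

Because `dev/docker-compose.yml` maps port 4000 of the container to
port 4000 of the host, the site should also be reachable from the host
on port 4000 when the server listens on that port.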
30 changes: 30 additions & 0 deletions dev/docker-compose.yml
@@ -0,0 +1,30 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

version: '3'
services:
  gen_apidocs:
    build:
      context: gen_apidocs
    volumes:
      - ../..:/apache-arrow
  run_site:
    build:
      context: run_site
    ports:
      - "4000:4000"
    volumes:
      - ../..:/apache-arrow
21 changes: 21 additions & 0 deletions dev/gen_apidocs.sh
@@ -0,0 +1,21 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Pass the service name to run_docker_compose.sh,
# which validates the environment and runs the service
exec "$(dirname "${BASH_SOURCE[0]}")"/run_docker_compose.sh gen_apidocs
83 changes: 83 additions & 0 deletions dev/gen_apidocs/Dockerfile
@@ -0,0 +1,83 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
FROM ubuntu:14.04
ADD . /apache-arrow
WORKDIR /apache-arrow
# Prerequisites for apt-add-repository
RUN apt-get update && apt-get install -y \
    software-properties-common python-software-properties
# Basic OS dependencies
RUN apt-add-repository -y ppa:ubuntu-toolchain-r/test && \
    apt-get update && apt-get install -y \
      wget \
      rsync \
      git \
      gcc-4.9 \
      g++-4.9 \
      build-essential
# This will install conda in /home/ubuntu/miniconda
RUN wget -O /tmp/miniconda.sh \
      https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    bash /tmp/miniconda.sh -b -p /home/ubuntu/miniconda && \
    rm /tmp/miniconda.sh
# C++ dependencies
RUN /home/ubuntu/miniconda/bin/conda install -c conda-forge \
      boost-cpp \
      doxygen \
      maven \
      cmake \
      zlib \
      thrift-cpp
# C_Glib dependencies
RUN apt-get install -y \
      libgtk2.0-dev \
      gtk-doc-tools \
      gobject-introspection \
      libgirepository1.0-dev \
      autogen \
      autoconf-archive
# Python dependencies
RUN apt-get install -y \
      pkg-config
# Create Conda environment
RUN /home/ubuntu/miniconda/bin/conda create -y -q -n pyarrow-dev \
      # Python
      python=3.6 \
      numpy \
      pandas \
      pytest \
      cython \
      ipython \
      matplotlib \
      numpydoc \
      sphinx \
      sphinx_bootstrap_theme \
      six \
      setuptools \
      # C++
      cmake \
      flatbuffers \
      rapidjson \
      thrift-cpp \
      snappy \
      zlib \
      brotli \
      jemalloc \
      lz4-c \
      zstd \
      -c conda-forge
CMD arrow/dev/gen_apidocs/create_documents.sh
110 changes: 110 additions & 0 deletions dev/gen_apidocs/create_documents.sh
@@ -0,0 +1,110 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Set up environment and output directory for C++ libraries
cd /apache-arrow
rm -rf dist
mkdir dist
export ARROW_BUILD_TYPE=release
export ARROW_HOME=$(pwd)/dist
export PARQUET_HOME=$(pwd)/dist
CONDA_BASE=/home/ubuntu/miniconda
export LD_LIBRARY_PATH=$(pwd)/dist/lib:${CONDA_BASE}/lib:${LD_LIBRARY_PATH}
export THRIFT_HOME=${CONDA_BASE}
export BOOST_ROOT=${CONDA_BASE}
export PATH=${CONDA_BASE}/bin:${PATH}

# Prepare the asf-site before copying api docs
pushd arrow/site
rm -rf asf-site
export GIT_COMMITTER_NAME="Nobody"
export GIT_COMMITTER_EMAIL="[email protected]"
git clone --branch=asf-site \
    https://git-wip-us.apache.org/repos/asf/arrow-site.git asf-site
popd

# Make Python documentation (Depends on C++ )
# Build Arrow C++
source activate pyarrow-dev
rm -rf arrow/cpp/build
mkdir arrow/cpp/build
pushd arrow/cpp/build
cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DARROW_PYTHON=on \
      -DARROW_PLASMA=on \
      -DARROW_BUILD_TESTS=OFF \
      ..
make -j4
make install
popd

# Build Parquet C++
rm -rf parquet-cpp/build
mkdir parquet-cpp/build
pushd parquet-cpp/build
cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
      -DCMAKE_INSTALL_PREFIX=$PARQUET_HOME \
      -DPARQUET_BUILD_BENCHMARKS=off \
      -DPARQUET_BUILD_EXECUTABLES=off \
      -DPARQUET_BUILD_TESTS=off \
      ..
make -j4
make install
popd

# Now Python documentation can be built
pushd arrow/python
rm -rf build/*
rm -rf doc/_build
python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
    --with-plasma --with-parquet --inplace
python setup.py build_sphinx -s doc/source
mkdir -p ../site/asf-site/docs/python
rsync -r doc/_build/html/ ../site/asf-site/docs/python
popd

# Build c_glib documentation
pushd arrow/c_glib
rm -rf doc/reference/html/*
./autogen.sh
./configure \
    --with-arrow-cpp-build-dir=$(pwd)/../cpp/build \
    --with-arrow-cpp-build-type=$ARROW_BUILD_TYPE \
    --enable-gtk-doc
LD_LIBRARY_PATH=$(pwd)/../cpp/build/$ARROW_BUILD_TYPE make GTK_DOC_V_XREF=": "
mkdir -p ../site/asf-site/docs/c_glib
rsync -r doc/reference/html/ ../site/asf-site/docs/c_glib
popd

# Make C++ documentation
pushd arrow/cpp/apidoc
rm -rf html/*
doxygen Doxyfile
mkdir -p ../../site/asf-site/docs/cpp
rsync -r html/ ../../site/asf-site/docs/cpp
popd

# Make Java documentation
pushd arrow/java
rm -rf target/site/apidocs/*
mvn -Drat.skip=true install
mvn -Drat.skip=true site
mkdir -p ../site/asf-site/docs/java/
rsync -r target/site/apidocs ../site/asf-site/docs/java/
popd
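
Each language section in the script above ends with the same two steps:
create the target directory under `asf-site/docs` and copy the
generated HTML into it. The following sketch factors that pattern into
one function. The name `publish_docs` is hypothetical, and `cp -R`
stands in for the script's `rsync -r` so that the sketch also runs
where rsync is not installed.

```shell
# Hypothetical helper mirroring the "mkdir -p + copy" step used for each language.
# $1: directory holding the generated HTML
# $2: destination directory under the site checkout
publish_docs() {
  src="$1"
  dest="$2"
  if [ ! -d "${src}" ]; then
    echo "no generated docs in ${src}" >&2
    return 1
  fi
  mkdir -p "${dest}"
  # Copy the contents of src (not the directory itself),
  # similar to `rsync -r src/ dest` with a trailing slash on the source.
  cp -R "${src}/." "${dest}/"
}
```

A call such as `publish_docs html ../../site/asf-site/docs/cpp` would
then correspond to the C++ step, under the assumptions stated above.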
40 changes: 40 additions & 0 deletions dev/run_docker_compose.sh
@@ -0,0 +1,40 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

if [ $# -lt 1 ]; then
  echo "This script takes one argument - the docker service to run" >&2
  exit 1
fi

# Make sure this is always run in the directory above Arrow root
cd "$(dirname "${BASH_SOURCE[0]}")/../.." || exit 1

if [ ! -d arrow ]; then
  echo "Please make sure that the top level Arrow directory" >&2
  echo "is named 'arrow'" >&2
  exit 1
fi

if [ ! -d parquet-cpp ]; then
  echo "Please clone the Parquet repo next to the Arrow repo" >&2
  exit 1
fi

GID=$(id -g ${USERNAME})
docker-compose -f arrow/dev/docker-compose.yml run \
    -u "${UID}:${GID}" "${1}"
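
The `-u` option above runs the container process with the numeric user
and group IDs of the invoking host user, presumably so that files
written into the mounted volume are not owned by root. A sketch of
building that value with `id` alone follows; the name `host_user_spec`
is hypothetical, and using `id -u`/`id -g` instead of bash's `UID`
variable and `USERNAME` is a substitution, not what the script does.

```shell
# Hypothetical helper: print "<uid>:<gid>" for the current user,
# in the form accepted by `docker-compose run -u`.
host_user_spec() {
  printf '%s:%s\n' "$(id -u)" "$(id -g)"
}
```

The result could be passed as `-u "$(host_user_spec)"`.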
21 changes: 21 additions & 0 deletions dev/run_site.sh
@@ -0,0 +1,21 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Pass the service name to run_docker_compose.sh,
# which validates the environment and runs the service
exec "$(dirname "${BASH_SOURCE[0]}")"/run_docker_compose.sh run_site
34 changes: 34 additions & 0 deletions dev/run_site/Dockerfile
@@ -0,0 +1,34 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

FROM ubuntu:14.04
ADD . /apache-arrow
WORKDIR /apache-arrow
# Prerequisites for apt-add-repository
RUN apt-get update && apt-get install -y \
    software-properties-common python-software-properties
# Set up Ruby repository
RUN apt-add-repository ppa:brightbox/ruby-ng
# The publication tools
# (&& instead of ; so that a failed `apt-get update` stops the build)
RUN apt-get update && apt-get install -y \
      apt-transport-https \
      ruby2.2-dev \
      ruby2.2 \
      zlib1g-dev \
      make \
      gcc
RUN gem install jekyll bundler
CMD arrow/dev/run_site/run_site.sh