Server Install
These are the installation instructions for a server. They add steps aimed at a production environment that are not described in the [Developer-Install](developer install).
Prerequisites:
- currently, we've deployed the application only on Ubuntu Linux distributions
- a fresh installation of Ubuntu trusty (14.04) or later
- three additional partitions (so that each one can be resized independently):
- /data/log, 1-5GB depending on usage
- /data/mysql, up to 5GB
- /data/neo4j, up to 20GB
- let the $HOME of the less privileged user be '/home/user'
- All command boxes start as the less privileged user and in their $HOME.
- Steps that require root are explicitly marked (with su rather than sudo ...). Note that using sudo before each line is not sufficient to yield the same results as with su. Using su may require you to set a password for root if not already done!
Note: some commands require user input; this is not an unattended installation.
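Before starting, it can help to confirm that the three data directories from the prerequisites exist. The following is a minimal sketch, not part of the d:swarm tooling; the function name missing_dirs is hypothetical, and the check only tests for existing directories, not for separate partitions.

```shell
#!/bin/sh
# Print every directory from the argument list that does not exist.
# Prints nothing when all directories are present.
missing_dirs() {
    for dir in "$@"; do
        [ -d "$dir" ] || printf '%s\n' "$dir"
    done
    return 0
}

# Example call with the three mount points named in the prerequisites:
#   missing_dirs /data/log /data/mysql /data/neo4j
```

An empty output means all listed directories are in place.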
This step requires root level access
su
apt-get install --no-install-recommends --yes git-core maven nodejs npm build-essential
These steps require less privileged access
Watch out for the correct path (/home/user).
cd /home/user
git clone --depth 1 --branch builds/unstable https://github.com/dswarm/dswarm.git
git clone --depth 1 --branch master https://github.com/dswarm/dswarm-graph-neo4j.git
git clone --depth 1 --branch builds/unstable https://github.com/dswarm/dswarm-backoffice-web.git
These steps require root level access
- D:SWARM requires Java 8, which is no longer available in the default package sources. Follow these steps
su
add-apt-repository ppa:webupd8team/java
apt-get update
apt-get install oracle-java8-installer oracle-java8-set-default
You can verify your java version with
java -version 2>&1 | grep -q 'version "1\.8' && echo "OK, Java 8 is available" || echo "Uh oh, Java 8 is not available"
Add $JAVA_HOME in /etc/environment:
JAVA_HOME="/usr/lib/jvm/java-8-oracle"
Earlier versions of Tomcat (< 7.0.30) do not run with Java 8 albeit being advertised to do so (related to this bug). Ubuntu 12.04 Precise includes Tomcat in version 7.0.26, which therefore must be updated. If you run precise, execute these steps:
wget https://launchpad.net/ubuntu/+archive/primary/+files/libservlet3.0-java_7.0.52-1ubuntu0.1_all.deb
wget https://launchpad.net/ubuntu/+archive/primary/+files/libtomcat7-java_7.0.52-1ubuntu0.1_all.deb
wget https://launchpad.net/ubuntu/+archive/primary/+files/tomcat7-admin_7.0.52-1ubuntu0.1_all.deb
wget https://launchpad.net/ubuntu/+archive/primary/+files/tomcat7-common_7.0.52-1ubuntu0.1_all.deb
wget https://launchpad.net/ubuntu/+archive/primary/+files/tomcat7_7.0.52-1ubuntu0.1_all.deb
su
dpkg -i libservlet3.0-java_7.0.52-1ubuntu0.1_all.deb
dpkg -i libtomcat7-java_7.0.52-1ubuntu0.1_all.deb
dpkg -i tomcat7-common_7.0.52-1ubuntu0.1_all.deb
dpkg -i tomcat7-admin_7.0.52-1ubuntu0.1_all.deb
dpkg -i tomcat7_7.0.52-1ubuntu0.1_all.deb
If you run a more recent version, just install tomcat from the official sources:
su
apt-get install tomcat7
If the installation fails because tomcat7 cannot find $JAVA_HOME, then set $JAVA_HOME in /etc/default/tomcat7 and try the installation again:
JAVA_HOME=/usr/lib/jvm/java-8-oracle
su
apt-get install --no-install-recommends --yes mysql-server nginx curl
currently, we rely on Neo4j version 2.3.2
su
wget -O - http://debian.neo4j.org/neotechnology.gpg.key | apt-key add -
echo 'deb http://debian.neo4j.org/repo stable/' > /etc/apt/sources.list.d/neo4j.list
apt-get update
apt-get install --no-install-recommends --yes neo4j=2.3.2
You can open the Neo4j Browser at http://localhost:7474/browser/
to check that the correct version has been installed.
Make sure Neo4j does not get updated when updating packages. You can use apt-pinning to do so. As root, create a file
su
touch /etc/apt/preferences.d/neo4j.pref
and add the following lines to this file.
Package: neo4j
Pin: version 2.3.2
Pin-Priority: 1000
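The pin file can also be written in one step instead of using touch and an editor. This is a sketch under assumptions: the helper name write_apt_pin is hypothetical, and it overwrites the target file without asking.

```shell
#!/bin/sh
# Write an apt pin for one package into the given preferences file.
# Arguments: <file> <package> <version>
write_apt_pin() {
    cat > "$1" <<EOF
Package: $2
Pin: version $3
Pin-Priority: 1000
EOF
}

# As root, with the file name and version used above:
#   write_apt_pin /etc/apt/preferences.d/neo4j.pref neo4j 2.3.2
#   apt-cache policy neo4j   # the candidate version should be 2.3.2
```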
su
chown -R tomcat7:tomcat7 /data/log
chown -R mysql:mysql /data/mysql
chown -R neo4j:adm /data/neo4j
su
ln -s /usr/bin/nodejs /usr/bin/node
npm install -g grunt-cli karma bower
Create a database and a user for d:swarm. To customize the settings, edit dswarm/persistence/src/main/resources/create_database.sql. Do not commit this file if you modify it. Hint: remember the settings for step 13 (configure d:swarm).
mysql -uroot -p < dswarm/persistence/src/main/resources/create_database.sql
Then, open /etc/mysql/my.cnf (Ubuntu 14.04) or /etc/mysql/mysql.conf.d/mysqld.cnf (Ubuntu 16.04) and add the following line to the section [mysqld] (around line 45):
wait_timeout = 1209600
In the same file and section, change datadir to /data/mysql (around line 40):
datadir = /data/mysql
Add some InnoDB performance tweaks (for MySQL > 5.5):
innodb-read-io-threads=1
innodb-write-io-threads=1
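After editing, the values MySQL will read can be checked without opening the file again. A minimal sketch, assuming simple "key = value" lines; the function name mysqld_setting is hypothetical, it ignores section headers, and it matches the key literally (MySQL itself treats dashes and underscores in option names as interchangeable).

```shell
#!/bin/sh
# Print the value of an uncommented "key = value" line from a MySQL
# configuration file. If the key occurs several times, the last one wins.
# Prints nothing if the key is absent.
# Arguments: <file> <key>
mysqld_setting() {
    sed -n -E "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*(.*)$/\1/p" "$1" | tail -n 1
}

# Example (use the mysqld.cnf path instead on Ubuntu 16.04):
#   mysqld_setting /etc/mysql/my.cnf datadir        # expected: /data/mysql
#   mysqld_setting /etc/mysql/my.cnf wait_timeout   # expected: 1209600
```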
add this directory to AppArmor
su
echo "alias /var/lib/mysql/ -> /data/mysql/," >> /etc/apparmor.d/tunables/alias
/etc/init.d/apparmor reload
Then stop the MySQL service and copy the whole MySQL data directory to the new location:
service mysql stop
cp -pr /var/lib/mysql/* /data/mysql/
Create /etc/nginx/sites-available/dswarm and add the following block:
server {
    listen 80 default_server;
    root /var/www/dswarm;

    location / {
        try_files $uri $uri/ =404;
    }

    location /dmp {
        # Allow OPTIONS without Authorization
        # This is required for proper CORS preflight support
        # see http://www.w3.org/TR/cors/#preflight-request ('Exclude user credentials')
        if ($request_method = OPTIONS) {
            # return 599;
            add_header Access-Control-Allow-Origin "http://localhost:9999";
            add_header Access-Control-Allow-Methods "GET, OPTIONS, HEAD, PUT, POST, DELETE";
            add_header Access-Control-Allow-Headers "Accept, Authorization, Origin, X-Requested-With, Content-Type";
            add_header Access-Control-Allow-Credentials "true";
            add_header Content-Length 0;
            add_header Content-Type text/plain;
            return 200;
        }
        client_max_body_size 100M;
        proxy_pass http://127.0.0.1:8080$uri$is_args$args;
        proxy_read_timeout 600s;
        proxy_send_timeout 300s;
    }

    location /neo {
        auth_basic "Restricted";
        auth_basic_user_file /data/.htaccess;
        proxy_pass http://127.0.0.1:7474/browser;
    }

    location /db/ {
        proxy_pass http://127.0.0.1:7474/db/;
    }

    location /dmp/api-docs {
        client_max_body_size 100M;
        proxy_pass http://127.0.0.1:8080$uri$is_args$args;
    }

    location /docs {
        alias /home/user/git/dswarm/controller/src/docs/ui/dist;
    }

    error_page 599 = @dmpnoauth;
    location @dmpnoauth {
        client_max_body_size 100M;
        proxy_pass http://127.0.0.1:8080$uri$is_args$args;
    }
}
For very long-running processes, set appropriate timeouts such as proxy_read_timeout; see http://nginx.org/en/docs/http/ngx_http_proxy_module.html.
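If the read timeout has to be raised later, the directive can be changed from the command line. This is only a sketch: the function name set_proxy_read_timeout and the value 1800s are illustrative assumptions, and the function changes every proxy_read_timeout directive in the file.

```shell
#!/bin/sh
# Replace the value of every proxy_read_timeout directive in an nginx
# configuration file. The original file is kept as <file>.bak.
# Arguments: <file> <timeout>, e.g. 1800s
set_proxy_read_timeout() {
    sed -i.bak -E "s/(proxy_read_timeout)[[:space:]]+[^;]+;/\1 $2;/" "$1"
}

# As root, followed by a syntax check and a reload:
#   set_proxy_read_timeout /etc/nginx/sites-available/dswarm 1800s
#   nginx -t && service nginx reload
```

Running nginx -t before the reload keeps a syntax error from taking the site down.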
su
ln -s /etc/nginx/sites-available/dswarm /etc/nginx/sites-enabled/000-dswarm
mkdir -p /var/www
Note: replace user in /home/user/... with the home directory of the user of your d:swarm installation. You may also need to delete the default site of nginx:
su
rm /etc/nginx/sites-enabled/default
Open /etc/tomcat7/server.xml at line 33 and add driverManagerProtection="false" so that the line reads
<Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener" driverManagerProtection="false" />
At line 73 of the same file, add the option maxPostSize="104857600" so that the Connector block reads
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxPostSize="104857600"
           URIEncoding="UTF-8"
           redirectPort="8443" />
Then, give Tomcat some more memory (note that Java 8 ignores -XX:MaxPermSize and only prints a warning):
su
echo 'export CATALINA_OPTS="-Xms4G -Xmx4G -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:MaxPermSize=512M"' >> /usr/share/tomcat7/bin/setenv.sh
And finally, you have to tell Tomcat about Java 8. Open the file /etc/default/tomcat7 and, around line 12, add this setting:
# The home directory of the Java development kit (JDK). You need at least
# JDK version 1.5. If JAVA_HOME is not set, some common directories for
# OpenJDK, the Sun JDK, and various J2SE 1.5 versions are tried.
JAVA_HOME=/usr/lib/jvm/java-8-oracle
Increase the open file limit in /etc/security/limits.conf:
root soft nofile 40000
root hard nofile 40000
In addition, add ulimit -n 40000 to your neo4j service script (under /etc/init.d, e.g., /etc/init.d/neo4j-service) before the daemon is started.
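Whether a raised limit is actually in effect can be checked from a shell of the user in question. A small sketch; the function name nofile_at_least is hypothetical, and it only inspects the limit of the shell it runs in, not that of an already running daemon.

```shell
#!/bin/sh
# Succeed if the open-file limit of the current shell is at least <minimum>.
# Argument: <minimum>, e.g. 40000
nofile_at_least() {
    limit=$(ulimit -n)
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$1" ]
}

# Example:
#   nofile_at_least 40000 && echo "limit OK" || echo "limit too low: $(ulimit -n)"
```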
Edit /etc/neo4j/neo4j.properties and:
- insert some storage tweaks
dbms.pagecache.memory=8g
keep_logical_logs=false
Edit /etc/neo4j/neo4j-server.properties and:
- change the database location
org.neo4j.server.database.location=/data/neo4j/data/graph.db
- disable authentication
dbms.security.auth_enabled=false
- change the rrd database location
org.neo4j.server.webadmin.rrdb.location=/data/neo4j/data/rrd
- add our graph extension
org.neo4j.server.thirdparty_jaxrs_classes=org.dswarm.graph.resources=/graph
- (optional) specify IP address
org.neo4j.server.webserver.address=0.0.0.0
Edit /etc/neo4j/neo4j-wrapper.conf and:
- insert an additional parameter (if your server is x64)
wrapper.java.additional.1=-d64
- adjust the Java heap size to a value appropriate for your server's RAM, e.g.,
wrapper.java.initmemory=512
wrapper.java.maxmemory=8192
then, create a symlink from the previous log location to the external partition
su
mv /var/lib/neo4j/data/log{,-old}
ln -s /data/neo4j/log /var/lib/neo4j/data/log
mkdir /data/neo4j/log
chown -R neo4j:adm /data/neo4j/log
By default, the Neo4j Server is bundled with a Web server that binds to host localhost on port 7474, answering only requests from the local machine. If you need remote access to the Neo4j Browser or the D:SWARM Graph Extension API, see Secure the port and remote client connection accepts
These steps require less privileged access
Follow the instructions in d:swarm Configuration to create a custom d:swarm config file. Make sure that you've added a reference to your d:swarm config file in the Context section of context.xml configuration of Tomcat. Furthermore, make sure that the user that runs your Tomcat server (e.g. tomcat7) has read access to the d:swarm config file and read and write access to the folders that are configured in the paths section of your d:swarm config. You can do this, for example, by changing the owner of these folders to the user that runs your Tomcat server:
su
chown -R tomcat7:tomcat7 /path/to/your-folder-in-paths-section-of-dswarm-config
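To confirm the result, the access rights can be tested from a shell running as the Tomcat user. The sketch below makes two assumptions: the function name dirs_are_rw is hypothetical, and the Tomcat user has no login shell, so root would open one with su -s /bin/sh tomcat7.

```shell
#!/bin/sh
# Succeed if every given path is a directory that the current user can
# read, write and enter. Stops at the first path that fails.
dirs_are_rw() {
    for dir in "$@"; do
        [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ] || return 1
    done
}

# Example, run as the Tomcat user:
#   dirs_are_rw /path/to/your-folder-in-paths-section-of-dswarm-config && echo "access OK"
```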
pushd dswarm-graph-neo4j
mvn -U -PRELEASE -DskipTests clean package
popd
mv dswarm-graph-neo4j/target/graph-1.3-jar-with-dependencies.jar dswarm-graph-neo4j.jar
pushd dswarm
mvn -U -DskipTests clean install -Dconfig.file=/path/to/dswarm.conf
pushd controller
mvn -U -DskipTests war:war -Dconfig.file=/path/to/dswarm.conf
popd; popd
mv dswarm/controller/target/dswarm-controller-0.1-SNAPSHOT.war dmp.war
Note: specify the path to your custom d:swarm config if it is not located in the root directory of the d:swarm backend repository. Otherwise, you can run the Maven task with the argument -Pdswarm-conf (which looks in the root directory of the d:swarm backend repository for a d:swarm config named dswarm.conf).
pushd dswarm-backoffice-web; pushd yo
npm install
bower install
STAGE=unstable DMP_HOME=../../dswarm grunt build
popd
rsync --delete --verbose --recursive yo/dist/ yo/publish
popd
Note: npm install may need to be executed as root.
set symbolic link to web root directory of the frontend (this step only needs to be done once)
su
ln -s /home/user/dswarm-backoffice-web/yo/publish /var/www/dswarm
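Because a package update can replace the web root, it is useful to be able to check that the link still points at the published frontend. A minimal sketch; the function name link_points_to is hypothetical.

```shell
#!/bin/sh
# Succeed if <link> is a symbolic link that resolves to the same location
# as <target>.
# Arguments: <link> <target>
link_points_to() {
    [ -L "$1" ] && [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Example with the paths used above:
#   link_points_to /var/www/dswarm /home/user/dswarm-backoffice-web/yo/publish \
#       && echo "web root link OK" || echo "web root link missing or wrong"
```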
These steps require root level access
Watch out for the correct path (/home/user).
su
rm /var/lib/tomcat7/webapps/dmp.war
rm -r /var/lib/tomcat7/webapps/dmp
cp /home/user/dmp.war /var/lib/tomcat7/webapps/
cp /home/user/dswarm-graph-neo4j.jar /usr/share/neo4j/plugins/
su
/etc/init.d/mysql restart
/etc/init.d/neo4j-service restart
/etc/init.d/nginx restart
/etc/init.d/tomcat7 restart
This step requires less privileged access
When running the backend for the first time, the Metadata Repository (MySQL database) needs to be initialized. After an update, a reset is required if the schema or initial data has changed. Watch out for the correct path (/home/user).
pushd dswarm/dev-tools
python reset-dbs.py \
--persistence-module=../persistence \
--user=dmp \
--password=dmp \
--db=dmp \
--neo4j=http://localhost:7474/graph
Or provide the credentials and values you configured.
Check python reset-dbs.py --help for additional information.
The Task Processing Unit (TPU) allows for processing larger amounts of data with mappings that were created via the d:swarm Back Office. Please have a look at the TPU documentation for further details on its usage and install instructions.
.. to easily explore the d:swarm backend HTTP API.
1. Go to the root directory of your d:swarm backend repository
cd /home/user/dswarm
2. Fetch submodules of this repository
git submodule update --init --recursive
(this command fetches a copy of the Swagger UI, which is linked as sub module from the backend controller)
3. Edit /etc/nginx/sites-available/default and add this just below the location / block
location /dmp/api-docs {
client_max_body_size 100M;
proxy_pass http://127.0.0.1:8080$uri$is_args$args;
}
to forward the Swagger description of the d:swarm backend HTTP API and
location /docs {
    alias [INSERT_HERE_THE_ROOT_DIRECTORY_OF_YOUR_DSWARM_BACKEND_REPOSITORY]/controller/src/docs/ui/dist;
}
to point to the local Swagger UI installation.
Note: you need to insert the correct path to the d:swarm backend repository (e.g. /home/user/dswarm).
4. Edit /home/user/dswarm/controller/src/docs/ui/dist/index.html, line 31, to insert the URL of your local d:swarm backend HTTP API Swagger description:
url = "http://localhost:/dmp/api-docs/"
Now you should be able to open and explore the d:swarm backend HTTP API via
http://localhost/docs
pushd dswarm; git pull; popd
pushd dswarm-graph-neo4j; git pull; popd
pushd dswarm-backoffice-web; git pull; popd
2. Repeat steps 13 (Building D:SWARM Graph Extension) to 18 (Init/reset Metadata Repository + Data Hub) from the installation as necessary.
First, find out which of the four components (frontend, backend, Metadata Repository (MySQL) and Data Hub (Neo4j database)) is not running. If you already know, skip this list.
- frontend: open http://localhost:9999 (port defaults to 80 for a server installation) in a browser. The front end should be displayed.
- backend: open http://localhost:8087/dmp/_ping (port defaults to 8080 for a server installation) in a browser. The expected response is a page with the word pong.
- Metadata Repository (MySQL database): open a terminal and type mysql -udmp -p dmp to open a connection to MySQL and select the database dmp. Hint: check for the correct user name, password and database name in case you did not use the default values. If you can log in, type select * from DATA_MODEL;. At least three internal data models should be listed.
- Data Hub (Neo4j database): open http://localhost:7474/browser/ in a browser. The Neo4j Browser should open.
- D:SWARM Graph Extension: open http://localhost:7474/graph/gdm/ping in a browser. The expected response is a page with the word pong.
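On a server without a browser, the two ping checks can be run with curl. A sketch under assumptions: the helper name is_pong is hypothetical, and the ports are the server defaults mentioned in the list (8080 for the backend, 7474 for Neo4j).

```shell
#!/bin/sh
# Succeed if the text on standard input contains the word "pong".
is_pong() {
    grep -qw pong
}

# Usage (curl must be installed):
#   curl -fsS http://localhost:8080/dmp/_ping | is_pong && echo "backend OK"
#   curl -fsS http://localhost:7474/graph/gdm/ping | is_pong && echo "graph extension OK"
```

The -f option makes curl fail on HTTP error codes, so an error page is never mistaken for a valid answer.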
Now that you know which component does not run, go through the following checklist:
- Is curl installed?
- Did you choose a database name other than the default? If yes, you currently have to modify init_internal_schema.sql, which is used internally by the script reset-dbs.py, and change the USE database statement (this should be improved).
- When building the projects with Maven, did you use the -U option to update project dependencies?
- Check your dswarm Configuration. Are database name and password correct, i.e., the ones used when installing the Metadata Repository (MySQL; step Setup Metadata Repository)? Compare dswarm/persistence/src/main/resources/create_database.sql with dswarm/dswarm.conf or any other configuration option you use.
- Can Tomcat read the d:swarm configuration file?
- Initialize/reset the Metadata Repository + Data Hub. They may be empty or contain corrupted data caused by failed unit tests.
- Did you miss an update of, e.g., the Neo4j version? Compare your installed version with the required version (see step 5).
- Do the folders that you've configured in the paths section of your d:swarm config exist, and are they accessible (read + write) for the user that runs your Tomcat server?
- If you specified a root path folder in the config, make sure it contains a tmp/resources and log folder (if you've not specified other folders in the paths section of your d:swarm config)
- Did you set the maximum file-size for uploads (see Step 9) to a sufficient value for your scenario?
- Did you set the server proxy timeout (see Step 9) to a sufficient value for your scenario?
- In order to access the D:SWARM Graph Extension (e.g. .../graph/gdm/ping) you may have to allow access from other than localhost (see step Setup Data Hub).
- if nginx package was updated, then (probably) the nginx webserver root was overwritten as well, i.e., you need to set the symbolic link to your d:swarm backoffice ui directory again (see Step 9)