Readmev2 (aws#21)

* Gremlin and SPARQL images
* Updated README: new instructions for Blazegraph, Gremlin Server, and Neptune; new images showing graph-notebook features
* Update README.md
bauer-at-work · Nov 19, 2020 · 23775ba
1 parent 97324b0
commit 23775ba
Showing 3 changed files with 90 additions and 43 deletions.
diff --git a/README.md b/README.md
@@ -1,20 +1,65 @@
-## graph-notebook
+## Graph Notebook: easily query and visualize graphs 
 
-Python package integrating Jupyter notebooks with various graph-stores including
-[Apache TinkerPop](https://tinkerpop.apache.org/) and [RDF SPARQL](https://www.w3.org/TR/rdf-sparql-query/).
+The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the [Apache TinkerPop](https://tinkerpop.apache.org/) or the [RDF SPARQL](https://www.w3.org/TR/rdf-sparql-query/) graph model. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including [knowledge graphs](https://aws.amazon.com/neptune/knowledge-graphs-on-aws/) and [identity graphs](https://aws.amazon.com/neptune/identity-graphs-on-aws/).
 
-## Requirements
+### Visualizing Gremlin queries:
 
-- Python 3.6.1 or higher, Python 3.7
-- Jupyter Notebook
+![Gremlin query and graph](./images/GremlinQueryGraph.png)
 
-## Introduction
-The graph-notebook provides a way to interact using a Jupyter notebook with any graph database that follows the Gremlin Server or RDF HTTP protocols. These databases could be running locally on your laptop, in a private data center or in the cloud. This project was initially created as a way to work with Amazon Neptune but is not limited to that database engine. For example you can connect to a Gremlin Server running on your laptop using this solution. The instructions below describe the process for connecting to Amazon Neptune. We encourage others to contribute configurations they find useful. There is an [`additional-databases`](additional-databases) folder where such information can be found. 
+### Visualizing SPARQL queries:
+
+![SPARQL query and graph](./images/SPARQLQueryGraph.png)
+
+Instructions for connecting to the following graph databases:
+
+|             Endpoint            |       Graph model       |   Query language    |
+| :-----------------------------: | :---------------------: | :-----------------: | 
+|[Gremlin Server](#gremlin-server)|     property graph      |       Gremlin       |
+|    [Blazegraph](#blazegraph)    |            RDF          |       SPARQL        |
+|[Amazon Neptune](#amazon-neptune)|  property graph or RDF  |  Gremlin or SPARQL  |
+
+We encourage others to contribute configurations they find useful. There is an [`additional-databases`](https://github.com/aws/graph-notebook/blob/main/additional-databases) folder where more information can be found.
+
+## Features
+
+#### Notebook cell 'magic' extensions in the IPython 3 kernel
+`%%sparql` - Executes a SPARQL query against your configured database endpoint.
+
+`%%gremlin` - Executes a Gremlin query against your database using web sockets. The results are similar to what the Gremlin console would return.
+
+**TIP** :point_right:  There is syntax highlighting for both `%%sparql` and `%%gremlin` queries to help you structure your queries more easily.
+
+#### Notebook line 'magic' extensions in the IPython 3 kernel
+`%graph_notebook_config` - Returns a JSON object that contains connection information for your host.
+
+`%query_mode` - Lets you set the query mode for your queries to one of:
+
+* `query` (the default): Executes the query against the normal SPARQL or Gremlin endpoint.
+* `explain`: Returns an explanation of the query plan instead of the query's results (valid for both SPARQL and Gremlin).
+* `profile`: Returns a profile of the query's operation, but does not actually execute the query (valid only for Gremlin).
+
+`%seed` - Provides a form to add data to your graph without the use of a bulk loader. Both SPARQL and Gremlin have an airport routes dataset.
+
+**TIP** :point_right: You can list all the magics installed in the Python 3 kernel using the `%lsmagic` command.
+
+
+## Prerequisites
+
+You will need:
+
+* [Python](https://www.python.org/downloads/) 3.6.1-3.6.12
+* [Jupyter Notebook](https://jupyter.org/install) 5.7.10
+* [Tornado](https://pypi.org/project/tornado/) 4.5.3
+* A graph database that provides a SPARQL 1.1 Endpoint or a Gremlin Server
 
 
 ## Installation
 
 ```
+# pin specific versions of the Jupyter and Tornado dependencies
+pip install notebook==5.7.10
+pip install tornado==4.5.3
+
 # install the package
 pip install graph-notebook
 
@@ -27,70 +72,72 @@ python -m graph_notebook.static_resources.install
 python -m graph_notebook.nbextensions.install
 
 # copy premade starter notebooks
-python -m graph_notebook.notebooks.install --destination /notebook/destination/dir  
+python -m graph_notebook.notebooks.install --destination ~/notebook/destination/dir  
 
 # start jupyter
-jupyter notebook /notebook/destination/dir
+jupyter notebook ~/notebook/destination/dir
 ```
 
-## Configuration
+## Connecting to a graph database
 
-In order to connect to your graph database, you have three configuration options.
+### Gremlin Server 
 
-1. Change the host setting in your opened Jupyter notebook by running the following in a notebook cell:
+In a new cell in the Jupyter notebook, change the configuration using `%%graph_notebook_config` and modify the fields for `host`, `port`, and `ssl`. For a local Gremlin Server (HTTP or WebSockets), you can use the following configuration:
 
 ```
-%graph_notebook_host you-endpoint-here
+%%graph_notebook_config
+{
+  "host": "localhost",
+  "port": 8182,
+  "auth_mode": "DEFAULT",
+  "iam_credentials_provider_type": "ROLE",
+  "load_from_s3_arn": "",
+  "ssl": false,
+  "aws_region": "us-east-1"
+}
 ```
 
-2. Change your configuration entirely grabbing the current configuration, making edits, and saving it to your notebook by running the following cells:
+To set up a new local Gremlin Server for use with the graph notebook, check out [`additional-databases/gremlin-server`](additional-databases/gremlin-server).
 
-```
-# 1. print your configuration
-%graph_notebook_config
+### Blazegraph
 
-# default config will be printed if nothing else is set:
-{
-    "host": "change-me",
-    "port": 8182,
-    "auth_mode": "DEFAULT",
-    "iam_credentials_provider_type": "ROLE",
-    "load_from_s3_arn": "",
-    "ssl": true,
-    "aws_region": "us-east-1"
-}
+Change the configuration using `%%graph_notebook_config` and modify the fields for `host`, `port`, and `ssl`. For a local Blazegraph database, you can use the following command:
 
-# 2. in a new cell, change the configuration by using %%graph_notebook_config (note the two leading %% instead of one)
+```
 %%graph_notebook_config
 {
-  "host": "changed-my-endpoint",
-  "port": 8182,
+  "host": "localhost",
+  "port": 9999,
   "auth_mode": "DEFAULT",
-  "iam_credentials_provider_type": "ENV",
+  "iam_credentials_provider_type": "ROLE",
   "load_from_s3_arn": "",
-  "ssl": true,
+  "ssl": false,
   "aws_region": "us-east-1"
 }
 ```
+To set up a new local Blazegraph database for use with the graph notebook, check out the [Quick Start](https://github.com/blazegraph/database/wiki/Quick_Start) from Blazegraph.
+
+### Amazon Neptune
+
+Change the configuration using `%%graph_notebook_config` and modify the defaults as they apply to your Neptune cluster:
 
-3. Store a configuration under ~/graph_notebook_config.json
 ```
-echo "{
-  "host": "changed-my-endpoint",
+%%graph_notebook_config
+{
+  "host": "your-neptune-endpoint",
   "port": 8182,
   "auth_mode": "DEFAULT",
   "iam_credentials_provider_type": "ENV",
   "load_from_s3_arn": "",
   "ssl": true,
-  "aws_region": "us-east-1"
-}" >> ~/graph_notebook_config.json
+  "aws_region": "your-neptune-region"
+}
 ```
+To set up a new Amazon Neptune cluster, check out the [AWS documentation](https://docs.aws.amazon.com/neptune/latest/userguide/manage-console-launch.html).
 
-### Connecting to a local graph store
-As mentioned in the introduction, it is possible to connect [`graph-notebook`](src/graph_notebook) to a graph database running on your local machine, an example being Gremlin Server. There are additional instructions regarding the use of local servers in the [`additional-databases`](additional-databases) folder.
-
+When connecting the graph notebook to Neptune, make sure your network is set up to communicate with the VPC that Neptune runs in. If not, you can follow [this guide](https://github.com/aws/graph-notebook/tree/main/additional-databases/neptune).
 
-## Authentication
+## Authentication (Amazon Neptune)
 
 If you are running a SigV4 authenticated endpoint, ensure that the config field `iam_credentials_provider_type` is set
 to `ENV` and that you have set the following environment variables:
@@ -101,7 +148,7 @@ to `ENV` and that you have set the following environment variables:
 - AWS_SESSION_TOKEN (OPTIONAL. Use if you are using temporary credentials)
 
 
-## Security
+## Contributing Guidelines
 
 See [CONTRIBUTING](https://github.com/aws/graph-notebook/blob/main/CONTRIBUTING.md) for more information.
 

diff --git a/images/GremlinQueryGraph.png b/images/GremlinQueryGraph.png
diff --git a/images/SPARQLQueryGraph.png b/images/SPARQLQueryGraph.png
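
The three `%%graph_notebook_config` cells added to the README in this commit differ only in a few fields (`host`, `port`, `ssl`, `iam_credentials_provider_type`, `aws_region`). As a rough illustration of that pattern, the sketch below generates the cell text from one shared set of defaults. The helper `build_config_cell` and the preset names are hypothetical and not part of graph-notebook; only the field names and values are taken from the diff above.

```python
import json

# Field names and values copied from the Gremlin Server cell in the README diff.
BASE_CONFIG = {
    "host": "localhost",
    "port": 8182,
    "auth_mode": "DEFAULT",
    "iam_credentials_provider_type": "ROLE",
    "load_from_s3_arn": "",
    "ssl": False,
    "aws_region": "us-east-1",
}

# Per-endpoint differences as shown in the README diff.
# The preset keys themselves are made-up labels for this sketch.
PRESETS = {
    "gremlin-server": {},
    "blazegraph": {"port": 9999},
    "neptune": {"iam_credentials_provider_type": "ENV", "ssl": True},
}


def build_config_cell(endpoint, **overrides):
    """Return the text of a %%graph_notebook_config cell (hypothetical helper)."""
    if endpoint not in PRESETS:
        raise ValueError(f"unknown endpoint preset: {endpoint!r}")
    unknown = set(overrides) - set(BASE_CONFIG)
    if unknown:
        raise ValueError(f"unknown config fields: {sorted(unknown)}")
    # Later dictionaries win: defaults, then preset, then caller overrides.
    config = {**BASE_CONFIG, **PRESETS[endpoint], **overrides}
    return "%%graph_notebook_config\n" + json.dumps(config, indent=2)


if __name__ == "__main__":
    # Placeholder values mirror the Neptune example in the README.
    print(build_config_cell(
        "neptune",
        host="your-neptune-endpoint",
        aws_region="your-neptune-region",
    ))
```

The printed text can be pasted into a notebook cell as-is. `json.dumps` is used so that Python's `True`/`False` are written as the JSON `true`/`false` the cell expects.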