This project serves as a tool to test the quality of Natural Language to Visualization (NL2Viz) models based on existing benchmarks. Read the project report here. View the final presentation here. The project was for 6.S079 in the Spring 2022 semester.
The app uses a React front-end (bootstrapped with create-react-app) and a Python backend using the Flask web framework. The benchmark is provided by nvBench [1], which gives a list of
The tool is deployed at https://nl2viz.herokuapp.com/. Because these models are resource-intensive, however, the deployed application is likely to crash from time to time. Use at your own risk!
A high-level overview of the project structure is shown below.
📦final-project
┣ 📂client
┃ ┣ 📂public
┃ ┣ 📂src
┃ ┃ ┣ 📂components
┃ ┃ ┗ 📂pages
┣ 📂server
┃ ┣ 📂benchmark
┃ ┃ ┣ 📂data
┃ ┃ ┗ 📜benchmark_meta.json
┃ ┣ 📂models
┃ ┃ ┣ 📂ncNet
┃ ┃ ┗ 📂nl4dv
┃ ┣ 📂scripts
┃ ┃ ┣ 📜config.py
┃ ┃ ┣ 📜get_benchmark_meta.py
┃ ┃ ┗ 📜sqlite_to_csv.py
┃ ┣ 📜api.py
┃ ┗ 📜model_setup.py
┣ 📜app.py
┣ 📜nltk.txt
┣ 📜requirements.txt
┗ 📜runtime.txt
This project is separated into a client directory and a server directory. The client handles user interaction, while the server handles the actual data processing and keeps track of the NL2Viz model instances.
The app.py file is the entry point for the server, while client/src/index.js is the entry point for the client, as is standard for React apps.
The datasets available in this tool are found here. This directory contains only datasets that have an associated benchmark. Some benchmarks require more than one dataset to produce the final result; these benchmarks are excluded from this tool. Datasets that are never used by an included benchmark are therefore left out as well.
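The filtering rule described above can be sketched as follows. This is a minimal illustration, not the project's actual script: the record layout (a datasets list on each benchmark entry) and the name select_benchmarks are assumptions made for the example, and the real benchmark_meta.json may be structured differently.

```python
from typing import Dict, List, Set, Tuple


def select_benchmarks(benchmarks: List[Dict]) -> Tuple[List[Dict], Set[str]]:
    """Keep single-dataset benchmarks and collect the datasets they use.

    Each entry is assumed (hypothetically) to look like
    {"id": "...", "datasets": ["name", ...]}.
    """
    # Benchmarks that need more than one dataset are excluded.
    kept = [b for b in benchmarks if len(b["datasets"]) == 1]
    # Only datasets referenced by a kept benchmark are retained.
    used = {b["datasets"][0] for b in kept}
    return kept, used


if __name__ == "__main__":
    sample = [
        {"id": "a", "datasets": ["cars"]},
        {"id": "b", "datasets": ["cars", "dealers"]},  # excluded: needs two
        {"id": "c", "datasets": ["films"]},
    ]
    kept, used = select_benchmarks(sample)
    print([b["id"] for b in kept])
    print(sorted(used))
```

In this sample, the dealers dataset is dropped because the only benchmark that references it also needs a second dataset.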
Follow the steps below to get started:
- Clone the repository into your local workspace.
- Set up a virtual environment with a Python version no greater than 3.9. This is required for the models to work, since they both use older versions of libraries that have been deprecated in newer versions of Python.
- Start the virtual environment.
- Install the dependencies in your virtual environment by running pip install -r requirements.txt. NOTE: requirements.txt installs the CPU version of pytorch. This is necessary for the production environment, but may yield slower processing times in development. If you plan to do a lot of development, also install the GPU version by installing the requirements in dev-requirements.txt. This requirements file also contains modules required to perform evaluation, such as Dask and scikit-image. If you choose to re-run evaluation, also install the node modules by running npm i in the project root directory.
- For the nl4dv model to work, install the following dependencies separately (see the nl4dv documentation for more details):
  python -m nltk.downloader popular
  python -m spacy download en_core_web_sm
- In the root directory, run flask run to start the server. To start it in development mode, create a .flaskenv file in the root directory and add FLASK_ENV=development. The server is served at http://localhost:5000 by default.
- Navigate to the client/ directory and run npm i to install the dependencies for the React frontend.
- Still in client/, run npm start to start the client, served at http://localhost:3000.
- For full access to all of the features, still in client/, run npm run build to build the latest version of the frontend. Then, instead of starting the client separately, simply start the server and navigate to http://localhost:5000, which serves the static build.
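Several of the steps above have preconditions that are easy to miss, so a small pre-flight check may help before starting the server. The sketch below is not part of the repository. The function names are hypothetical, and the client/build/index.html location is an assumption based on the default create-react-app output directory; the server may look for the build elsewhere.

```python
import importlib.util
import sys
from pathlib import Path
from typing import List, Tuple

MAX_PYTHON = (3, 9)  # upper bound stated in the setup steps


def python_supported(version: Tuple[int, int]) -> bool:
    """True if the (major, minor) version is no greater than 3.9."""
    return tuple(version[:2]) <= MAX_PYTHON


def spacy_model_installed(name: str = "en_core_web_sm") -> bool:
    """True if the spaCy model package can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def frontend_built(root: Path) -> bool:
    """True if a frontend build exists (client/build is an assumed location)."""
    return (root / "client" / "build" / "index.html").is_file()


def preflight(root: Path, version: Tuple[int, int]) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not python_supported(version):
        problems.append("Python %d.%d is newer than 3.9" % tuple(version[:2]))
    if not spacy_model_installed():
        problems.append("spaCy model en_core_web_sm is not installed")
    if not frontend_built(root):
        problems.append("no frontend build found; run npm run build in client/")
    return problems


if __name__ == "__main__":
    results = preflight(Path.cwd(), sys.version_info[:2])
    for line in results or ["all checks passed"]:
        print(line)
```

Run from the project root, the script prints one line per problem it finds. It does not check the NLTK data download, since that data is not installed as an importable package.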