superset reads duckdb
added in .env-local.example

fix: docker config was being ignored

tidying

duckdb works again locally; better setup instructions

fixed proxies (again!)
docsteveharris committed Feb 13, 2025
1 parent f3592e8 commit 8fd3641
Showing 11 changed files with 164 additions and 27 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -117,7 +117,8 @@ superset/translations/**/messages.mo
docker/requirements-local.txt

cache/
docker/*local*
docker/.env-local
docker/.env

.temp_cache

67 changes: 54 additions & 13 deletions README-UCLH.md
@@ -1,40 +1,81 @@

## Docker & Compose Settings
[docs](https://superset.apache.org/docs/installation/docker-compose)

> Note that docker/.env sets the default environment variables for all the docker images used by docker compose, and that docker/.env-local can be used to override those defaults. Also note that docker/.env-local is referenced in our .gitignore, preventing developers from risking committing potentially sensitive configuration to the repository.
Create a _docker/.env-local_ file with the following keys:
via [docs](https://superset.apache.org/docs/installation/docker-compose)

## Environment Variables

You **must** copy `docker/.env.example` to `docker/.env`

```bash
cp docker/.env.example docker/.env
```
COMPOSE_PROJECT_NAME=EMAP-Insights

Then make local edits in the `docker/.env-local` file, which overrides values from `docker/.env`.

For example, create a _docker/.env-local_ file with the following keys:
```
# Must be lowercase with only alphanumeric characters, hyphens, and underscores
COMPOSE_PROJECT_NAME=emap-insights
# Provide the name of the host machine (also HOSTNAME)
HOST_NAME=<GAEXX>
# Set this to a unique secure random value on production
DATABASE_PASSWORD=superset
SUPERSET_LOAD_EXAMPLES=false
SUPERSET_LOAD_EXAMPLES=yes
# Make sure you set this to a unique secure random value on production
# using something like `openssl rand -base64 42`
SUPERSET_SECRET_KEY=TEST_NON_DEV_SECRET
# Specify the Superset image tag to use
TAG=4.1.1
```
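
The override order (defaults in `docker/.env`, then `docker/.env-local`) can be sketched in a few lines of Python. This is an illustration only, not how Docker Compose is implemented: the parser is a simplified assumption that ignores quoting and variable interpolation, the sample values are made up, and `parse_env`, `merge_env`, and `is_valid_project_name` are hypothetical helper names. The project-name check follows the rule in the comment above and additionally assumes the name must start with a letter or digit.

```python
import re


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments (simplified)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def merge_env(*sources):
    """Later sources override earlier ones, like .env-local over .env."""
    merged = {}
    for source in sources:
        merged.update(parse_env(source))
    return merged


# Assumed rule: lowercase letters, digits, hyphens, and underscores,
# starting with a letter or digit.
PROJECT_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]*")


def is_valid_project_name(name):
    return PROJECT_NAME_RE.fullmatch(name) is not None


# Illustrative file contents (not the real files).
defaults = "COMPOSE_PROJECT_NAME=superset\nSUPERSET_LOAD_EXAMPLES=yes\n"
local = "# local overrides\nCOMPOSE_PROJECT_NAME=emap-insights\nTAG=4.1.1\n"
env = merge_env(defaults, local)
```

Here `env["COMPOSE_PROJECT_NAME"]` comes from the local file, while `SUPERSET_LOAD_EXAMPLES` keeps its default because the local file does not set it. A mixed-case name such as `EMAP-Insights` fails the check.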

## Build or run
The shell scripts simply specify the tag, and then call docker compose with the appropriate docker-compose.yml file.

```bash
./build.sh
```

## Run
```bash
./up.sh
```

is a quick way of writing
```bash
TAG=4.1.1 docker compose -f docker-compose-image-tag.yml up
```

## Add DuckDB databases
You can now do this from the UI.
It's simplest to use a SQLAlchemy connection string.
You must put the database file in `./data/duckdb` (a mounted volume, visible inside the container as `/var/data/duckdb`).

```
duckdb:////var/data/duckdb/camino-gold.db
```
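
The four slashes in the connection string are easy to get wrong: the first three belong to the URI prefix (`duckdb:///`) and the fourth is the leading slash of the absolute path inside the container. A minimal sketch, assuming `./data` is mounted at `/var/data` as in the compose file; `duckdb_uri` is a hypothetical helper, not part of Superset or duckdb-engine.

```python
from pathlib import PurePosixPath

# Assumption: ./data on the host is mounted at /var/data in the container.
CONTAINER_DATA_DIR = PurePosixPath("/var/data")


def duckdb_uri(filename, subdir="duckdb"):
    """Build a SQLAlchemy URI for a DuckDB file under the mounted volume."""
    path = CONTAINER_DATA_DIR / subdir / filename
    # "duckdb:///" followed by an absolute path gives four slashes in total.
    return f"duckdb:///{path}"


uri = duckdb_uri("camino-gold.db")
print(uri)
```

The resulting string is what you would paste into the SQLAlchemy URI field when adding the database from the UI.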


## Notes

See the details here for tag specification
https://superset.apache.org/docs/installation/docker-builds
- See the details here for tag specification: https://superset.apache.org/docs/installation/docker-builds
- e.g. 4.1.1 is lean (roughly 250MB); 4.1.1-dev is not (roughly 1GB, but it includes postgres drivers and more)
- You may get warnings during initialisation about flask migrations.

e.g
- 4.1.1 is lean ... 250MB ish
- 4.1.1-dev is not! (but includes postgres drivers and more) ... 1GB
```bash
superset_init | ERROR [flask_migrate] Error: Can't locate revision identified by '74ad1125881c'
```
These can probably be ignored, but you can always delete the `emap-insights_db_data` volume if you want to be sure.
A lean start ...
```bash
TAG=4.1.1 docker compose -f docker-compose-image-tag.yml up
```
docker compose down
docker volume rm emap-insights_db_data
```
8 changes: 8 additions & 0 deletions build.sh
@@ -0,0 +1,8 @@
# Run the docker compose file
set -a # automatically export all variables
source docker/.env
source docker/.env-local
set +a # stop automatically exporting

echo "Building Superset at tag $TAG"
docker compose -f docker-compose-image-tag.yml build
14 changes: 11 additions & 3 deletions docker-compose-image-tag.yml
@@ -15,10 +15,10 @@
# limitations under the License.
#

x-host-name: &host-name ${HOST_NAME}
x-host-name: &host-name ${HOSTNAME}
x-http-proxy: &http-proxy ${HTTP_PROXY}
x-https-proxy: &https-proxy ${HTTPS_PROXY}
x-no-proxy: &no-proxy ${HOST_NAME},${HOST_NAME}.xuclh.nhs.uk,localhost,127.0.0.1
x-no-proxy: &no-proxy ${HOSTNAME}.xuclh.nhs.uk,localhost,127.0.0.1
x-network: &network
  HOST_NAME: *host-name
  HTTP_PROXY: *http-proxy
@@ -43,7 +43,7 @@ x-superset-volumes:
  - ./docker:/app/docker
  - superset_home:/app/superset_home
  # https://github.com/apache/superset/issues/9748#issuecomment-2099107789
  - ./data:/var/mydata
  - ./data:/var/data
  - ./dashboards:/var/dashboards

services:
@@ -58,9 +58,13 @@ services:
    env_file:
      - docker/.env # default
      - docker/.env-local # optional override
    environment:
      <<: *network
    image: postgres:15
    container_name: superset_db
    restart: unless-stopped
    ports:
      - 8089:5432
    volumes:
      - db_home:/var/lib/postgresql/data
      - ./docker/docker-entrypoint-initdb.d:/docker-entrypoint-initdb.d
@@ -69,6 +73,8 @@ services:
    env_file:
      - docker/.env # default
      - docker/.env-local # optional override
    environment:
      <<: *network
    image: *superset-image
    container_name: superset_app
    command: ["/app/docker/docker-bootstrap.sh", "app-gunicorn"]
@@ -87,6 +93,8 @@ services:
      - docker/.env # default
      - docker/.env-local # optional override
    depends_on: *superset-depends-on
    environment:
      <<: *network
    user: "root"
    volumes: *superset-volumes
    healthcheck:
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -42,7 +42,7 @@ x-superset-volumes: &superset-volumes
  - superset_home:/app/superset_home
  - ./tests:/app/tests
  # https://github.com/apache/superset/issues/9748#issuecomment-2099107789
  - ./data:/var/mydata
  - ./data:/var/data

x-common-build: &common-build
  context: .
24 changes: 24 additions & 0 deletions docker/.env-local.example
@@ -0,0 +1,24 @@
# Must be lowercase with only alphanumeric characters, hyphens, and underscores
COMPOSE_PROJECT_NAME=emap-insights

# HOST_NAME is the name of the host machine (also HOSTNAME)
HOST_NAME=
HOSTNAME=

# Set this to a unique secure random value on production
# DATABASE_PASSWORD=

SUPERSET_LOAD_EXAMPLES=yes
SUPERSET_ENV=production
DEV_MODE=false

# Make sure you set this to a unique secure random value on production
# using something like `openssl rand -base64 42`
# SUPERSET_SECRET_KEY=

# Specify the Superset image tag to use
# 4.1.1 is lean (250MB)
# 4.1.1-dev includes postgres drivers and more (1GB)
TAG=4.1.1

PYTHONPATH=/app/pythonpath:/app/docker/pythonpath_dev
4 changes: 4 additions & 0 deletions docker/.env → docker/.env.example
@@ -67,3 +67,7 @@ ENABLE_PLAYWRIGHT=false
PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
BUILD_SUPERSET_FRONTEND_IN_DOCKER=true
SUPERSET_LOG_LEVEL=info

# You need this for docker/superset_config_docker.py to work
# ... again edits here will override docker/superset_config.py
PYTHONPATH=/app/pythonpath:/app/docker/pythonpath_dev
1 change: 1 addition & 0 deletions docker/pythonpath_dev/.gitignore
@@ -20,4 +20,5 @@
# DON'T ignore the .gitignore
!.gitignore
!superset_config.py
!superset_config_docker.py
!superset_config_local.example
9 changes: 0 additions & 9 deletions docker/pythonpath_dev/superset_config.py
@@ -29,15 +29,6 @@

logger = logging.getLogger()

# Allow sqlite to be used
# via https://github.com/apache/superset/issues/9748
# Superset configuration file
PREVENT_UNSAFE_DB_CONNECTIONS=False

ENABLE_PROXY_FIX = False
WTF_CSRF_ENABLED = False
TALISMAN_ENABLED = False

DATABASE_DIALECT = os.getenv("DATABASE_DIALECT")
DATABASE_USER = os.getenv("DATABASE_USER")
DATABASE_PASSWORD = os.getenv("DATABASE_PASSWORD")
52 changes: 52 additions & 0 deletions docker/pythonpath_dev/superset_config_docker.py
@@ -0,0 +1,52 @@
import logging
import os
import sys
import subprocess

logger = logging.getLogger()

# Allow sqlite to be used
# via https://github.com/apache/superset/issues/9748
# Superset configuration file
PREVENT_UNSAFE_DB_CONNECTIONS=False

# Add DuckDB setup
def setup_duckdb():
    try:
        subprocess.check_call([
            sys.executable,
            "-m",
            "pip",
            "install",
            "duckdb-engine>=0.9.5,<0.10",
            "--quiet"
        ])
        logger.info("DuckDB engine installed successfully")
    except subprocess.CalledProcessError as e:
        logger.error(f"Failed to install DuckDB engine: {e}")
        raise

# Install DuckDB when config is loaded
setup_duckdb()

# Update DuckDB configuration to use existing mounted volume
DUCKDB_DATA_PATH = "/var/data/duckdb" # Using the existing mount point
DUCKDB_CONN_PATH = os.getenv("DUCKDB_CONN_PATH", os.path.join(DUCKDB_DATA_PATH, "superset.db"))

# Ensure the directory exists
os.makedirs(DUCKDB_DATA_PATH, exist_ok=True)

# Add DuckDB to the databases dictionary
DATABASES = {
    'duckdb': {
        'allow_csv_upload': True,
        'allow_ctas': True,
        'allow_cvas': True,
        'allow_dml': True,
        'configuration_method': 'sqlalchemy_form',
        'default_driver': 'duckdb',
    }
}

# Add DuckDB to allowed databases
PREFERRED_DATABASES = ['sqlite', 'postgresql', 'duckdb', 'mssql']
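
Because `setup_duckdb()` runs `pip install` every time this config module is loaded, one possible refinement is to check first whether the driver is already importable. The following is a sketch under assumptions, not the committed code: `is_importable`, `pip_install`, and `ensure_package` (with its `installer` parameter) are hypothetical names, and it assumes the `duckdb-engine` distribution is importable as `duckdb_engine`.

```python
import importlib.util
import subprocess
import sys


def is_importable(module_name):
    """Return True if the top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pip_install(requirement):
    """Install a requirement with the running interpreter's own pip."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", requirement, "--quiet"]
    )


def ensure_package(module_name, requirement, installer=pip_install):
    """Install `requirement` only when `module_name` is missing.

    Returns True if an install was attempted, False otherwise.
    """
    if is_importable(module_name):
        return False
    installer(requirement)
    return True


# Example call (commented out so the sketch has no side effects):
# ensure_package("duckdb_engine", "duckdb-engine>=0.9.5,<0.10")
```

Passing the installer as a parameter keeps the check testable without network access. Baking the driver into the image at build time would avoid the runtime install altogether.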
7 changes: 7 additions & 0 deletions up.sh
@@ -0,0 +1,7 @@
# Run the docker compose file
set -a # automatically export all variables
source docker/.env-local
set +a # stop automatically exporting

echo "Starting Superset at tag $TAG"
docker compose -f docker-compose-image-tag.yml up
