We're hoping that the Airflow instance will be ephemeral, so we need a mechanism for backing up the data that has to survive a rebuild (e.g. the metadata database).
One quick way to do this to start would be to use a SQLite database rather than Postgres, then set up an hourly DAG that copies the database file to Spaces. That would give us an easy recovery mechanism if we ever needed to tear down the instance.
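Here's a minimal sketch of what that hourly backup DAG could look like, assuming the metadata DB lives at `/opt/airflow/airflow.db` and that Spaces is reached through its S3-compatible API via boto3. The bucket name, endpoint, and paths are placeholders, not anything we've set up yet:

```python
from datetime import datetime
import sqlite3
import tempfile

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator

DB_PATH = "/opt/airflow/airflow.db"  # placeholder path to the metadata DB
BUCKET = "airflow-backups"           # placeholder Spaces bucket
ENDPOINT = "https://nyc3.digitaloceanspaces.com"  # placeholder region endpoint


def backup_db_to_spaces():
    # Snapshot with SQLite's online backup API first -- copying a live
    # database file directly can capture a torn write.
    with tempfile.NamedTemporaryFile(suffix=".db") as snapshot:
        src = sqlite3.connect(DB_PATH)
        dst = sqlite3.connect(snapshot.name)
        with dst:
            src.backup(dst)
        src.close()
        dst.close()
        # Spaces speaks the S3 protocol, so boto3 works against its endpoint.
        client = boto3.client("s3", endpoint_url=ENDPOINT)
        key = f"airflow-db/{datetime.utcnow():%Y-%m-%dT%H}.db"
        client.upload_file(snapshot.name, BUCKET, key)


with DAG(
    dag_id="backup_metadata_db",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="backup_db_to_spaces",
        python_callable=backup_db_to_spaces,
    )
```

Recovery would then just be pulling the most recent object back down and pointing the connection string at it before restarting the instance.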
Imo it's fine to just use SQLite, because this is sort of production but not really. It's not a big deal if it falls over and has to be rebuilt.
Turns out that we can't use the LocalExecutor with SQLite, only the SequentialExecutor. This means we can't run tasks concurrently on SQLite; we'd need Postgres for that (see the config sketch below). Again, I think that's fine to start, but Postgres will be necessary down the line.
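For reference, the two settings involved. The values below assume Airflow 2.x with the connection string under `[core]` (it moved to a `[database]` section in later releases), and the SQLite path is the same placeholder as above:

```ini
# airflow.cfg -- starting point; moving to Postgres later means changing
# these two lines, at which point LocalExecutor becomes available.
[core]
executor = SequentialExecutor
sql_alchemy_conn = sqlite:////opt/airflow/airflow.db
```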
Note that the official Airflow docs say never to use SQLite in production, so ideally we'll move to a better setup with Postgres and WAL replication or similar down the line (CC @mepholic @mpuckett159).