initial data connectors #82

StanChan03 · 2025-01-15T04:08:54Z

Initial SQL Integration

sidjha1 · 2025-01-18T02:15:29Z

lotus/databases/connectors.py

+class DatabaseConnector:
+    def __init__(self):
+        self.sql_engine = None
+        self.nosql_client = None


I think we should remove all the NoSQL code, since there is no single interface that covers all the NoSQL offerings.

sidjha1 · 2025-01-18T02:16:53Z

examples/db_examples/sql_db.py

+
+connector = DatabaseConnector()
+connector.connect_sql("sqlite:///example_movies.db")
+lotus_db = LotusDB(connector)


Do we need LotusDB? Can't we just have a connect.load_as_pandas(table_name: str) directly?

Maybe DatabaseConnector just provides a staticmethod

@staticmethod
def load_from_db(connection_url: str, table_name: str) -> pd.DataFrame

And then all the code here becomes

df = DatabaseConnector.load_from_db("sqlite:///example_movies.db", "movies")

However, what if someone has a specific query that's not just select * from table? Maybe have something like

def load_from_db(connection_url: str, query: str) -> pd.DataFrame

Also if someone has a very large dataset should we consider processing it in batches, and then concat them together?

> def load_from_db(connection_url: str, query: str) -> pd.DataFrame

This works

> Also if someone has a very large dataset should we consider processing it in batches, and then concat them together?

If someone has a very large dataset, then I think lotus simply will not work, since it requires that the data can be loaded into an in-memory pandas DataFrame.

makes sense
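For reference, a minimal sketch of the static-method shape discussed above, taking a connection URL and a query and returning a DataFrame. It assumes SQLAlchemy and pandas are available; the body is an illustration only, not the code that was committed in this PR.

```python
import pandas as pd
from sqlalchemy import create_engine, text


class DatabaseConnector:
    @staticmethod
    def load_from_db(connection_url: str, query: str) -> pd.DataFrame:
        """Run `query` against the database at `connection_url` and
        return the full result as an in-memory DataFrame.

        Sketch only: the engine handling here is an assumption, not
        the implementation from the PR.
        """
        engine = create_engine(connection_url)
        try:
            # text() marks the string as raw SQL, which works with
            # both SQLAlchemy 1.4 and 2.x connections.
            with engine.connect() as conn:
                return pd.read_sql(text(query), conn)
        finally:
            # Release pooled connections, since the engine is not reused.
            engine.dispose()
```

With this shape the example script reduces to a single call such as `DatabaseConnector.load_from_db("sqlite:///example_movies.db", "SELECT * FROM movies")`, and the whole result set is materialized in memory, consistent with the point above about dataset size.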

initial sql integration

ab1a8e6

StanChan03 requested a review from sidjha1 January 15, 2025 04:08

add sql db tests

668c90e

StanChan03 linked an issue Jan 16, 2025 that may be closed by this pull request

Support for SQL and NoSQL databases. #72

Open

improve logging

cc4ddcf

sidjha1 reviewed Jan 18, 2025

StanChan03 added 3 commits January 17, 2025 19:36

implement static method

7de0f94

Merge branch 'main' of github.com:stanford-futuredata/lotus into sc/s…

7b48536

…ql_integration

s3 connection

60742c0

StanChan03 changed the title ~~initial sql integration~~ initial data connectors Jan 20, 2025

StanChan03 and others added 4 commits January 20, 2025 13:03

add boto3 requirement

df52f74

add s3 connection example

e02e360

add testing

aa9acf1

docs

badfdd5

StanChan03 force-pushed the sc/sql_integration branch from 0048eb1 to badfdd5 Compare January 23, 2025 22:52

testing

15a2015

StanChan03 requested a review from liana313 January 25, 2025 03:26
