Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only latest table name is saved #50

Closed
surister opened this issue Jul 2, 2024 · 4 comments · Fixed by #51
Closed

Only latest table name is saved #50

surister opened this issue Jul 2, 2024 · 4 comments · Fixed by #51
Assignees
Labels
enhancement New feature or request javascript About the Javascript package python About the Python package

Comments

@surister
Copy link
Collaborator

surister commented Jul 2, 2024

Sql queries can have several schemas/table names.

>>> stmt = sqlparse("""SELECT a FROM sys.t JOIN doc.b ON t.a = b.a""")[0]
>>> print(stmt.metadata)
Metadata(schema='doc', table_name= 'b', parameterized_properties={}, with_properties={})

sys.t table information is lost, it's easy to fix, the question is about the resulting API

Metadata(schema=['sys', 'doc'], table_name=['t', 'b'], parameterized_properties={}, with_properties={}) would this be good enough, where schemas and table_name are ordered?

If no schema it'd be:
Metadata(schema=[None, 'doc'], table_name=['t', 'b'], parameterized_properties={}, with_properties={})

Now that you are integrating into this, wdyt? @amotl

@surister surister added enhancement New feature or request python About the Python package javascript About the Javascript package labels Jul 2, 2024
@amotl
Copy link
Member

amotl commented Jul 2, 2024

Thanks for your support. I need this table name extraction feature over there right now, and probably later for other spots as well.

So, when looking at this...

schema=['sys', 'doc'], table_name=['t', 'b']

... would I need to coalesce them together again to get full-qualified table names like sys.t and doc.b? In this case, the client would need to know about CrateDB's quoting rules. So, wouldn't a representation like "sys"."t" and "doc"."b" possibly be optimal?

/cc @matriv, @hlcianfagna, @hammerhead, @seut

@surister
Copy link
Collaborator Author

surister commented Jul 2, 2024

Idea:

Metadata(tables=[Table(schema='doc', name='t'), ...], parameterized_properties={}, with_properties={}) where the Table dataclass has some kind of .quoted so:

>>> t = Table(schema='doc', name='t')
>>> t.quoted
"doc"."t"

Would also enable new stuff in the future like aliases.

>>> stmt = sqlparse('SELECT a FROM sys.t alias_a JOIN doc.b alias_b ON t.a = b.a')[0]
>>> t = stmt.tables[0]
>>> t
Table(schema='sys', name='t')
>>> t.quoted
'"sys"."t"'
>>> t.alias
{"alias_b": 'sys.t'}

@amotl
Copy link
Member

amotl commented Jul 2, 2024

OO FTW. Excellent, je vous remercie.

Suggestion: Maybe use t.fqn instead of t.quoted, to better convey it is the "full" representation, including schema- + table-name? That it's also a properly quoted value, might be an implicit feature, not necessarily reflected by its name.

@amotl
Copy link
Member

amotl commented Jul 2, 2024

JFYI: Maybe CrateDB Toolkit's quote_relation_name() function is helpful in any way? Please let me know, I planned to refactor it to sqlalchemy-cratedb earlier or later. You may not necessarily need it in your context, but if you do, I would be convinced enough to make it happen earlier than later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request javascript About the Javascript package python About the Python package
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants