mezantrop/tSQLike

tSQLike

SQL-like interface to tabular structured data

Description

tSQLike is a Python 3 module intended to make tabular data processing easier by using SQL-like primitives.

Notes

No longer at an early stage, but still in development: it may contain bugs.

Usage

import tsqlike

t1 = tsqlike.Table(data=[['h1', 'h2', 'h3', 'h4'],
                        ['a', 'b', 'c', 'd'],
                        ['b', 'c', 'd', 'd'],
                        ['f', 'g', 'h', 'i']],
                   name='first')
t2 = tsqlike.Table().import_list_dicts(data=[{'h1': 1, 'h2': 2, 'h3': 3},
                                            {'h1': 'd', 'h2': 'e', 'h3': 'f'}],
                                       name='second')
t3 = t1.join(t2, on='first.h4 == second.h1').select('*').order_by('second.h2', direction=tsqlike.ORDER_BY_DEC)
t3.write_csv(dialect='unix')

"first.h1", "first.h2", "first.h3", "first.h4", "second.h1", "second.h2", "second.h3"
"b", "c", "d", "d", "d", "e", "f"
"a", "b", "c", "d", "d", "e", "f"

Installation

pip install tsqlike

Functionality

Table class

The main class of the module

Data processing methods

Name        Description
column_map  Apply a function to a column
join        Join two Tables (self and table) on an expression. Powerful but slow; the on= expression is run with eval(), see WARNING
join_lt     Light, limited, fast and safe join that doesn't use eval()
select      Select column(s) from the Table. The where= expression is run with eval(), see WARNING
select_lt   eval()-free version of select
order_by    ORDER BY primitive of SQL SELECT: sort the Table by a column
group_by    GROUP BY primitive of SQL SELECT: apply an aggregate function to a column
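
To illustrate what an equality-only, eval()-free join involves, here is a minimal sketch in plain Python over list-of-lists tables. It is not tSQLike's implementation of join_lt: the function name `equi_join` and its parameters are hypothetical, and the dot-prefixed header naming is an assumption modelled on the output shown in Usage.

```python
def equi_join(left, right, left_col, right_col, left_name, right_name):
    """Join two list-of-lists tables (first row is the header) on
    left[left_col] == right[right_col]. Hypothetical helper, no eval()."""
    lh, rh = left[0], right[0]
    li, ri = lh.index(left_col), rh.index(right_col)

    # Index the right table by join key so each left row is matched by lookup
    index = {}
    for row in right[1:]:
        index.setdefault(row[ri], []).append(row)

    # Prefix column names with the table name, e.g. "first.h1" (assumed convention)
    header = [f"{left_name}.{c}" for c in lh] + [f"{right_name}.{c}" for c in rh]
    result = [header]
    for row in left[1:]:
        for match in index.get(row[li], []):
            result.append(row + match)
    return result


first = [['h1', 'h2', 'h3', 'h4'],
         ['a', 'b', 'c', 'd'],
         ['b', 'c', 'd', 'd'],
         ['f', 'g', 'h', 'i']]
second = [['h1', 'h2', 'h3'],
          [1, 2, 3],
          ['d', 'e', 'f']]

joined = equi_join(first, second, 'h4', 'h1', 'first', 'second')
for row in joined:
    print(row)
```

Because the condition is fixed to equality on one column pair, no user-supplied expression ever needs to be evaluated, which is the trade-off the table describes: less expressive, but fast and safe.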

Import methods

Name               Description
import_dict_lists  Import a dictionary of lists into the Table object
import_list_dicts  Import a list of horizontally arranged dictionaries into the Table
import_list_lists  Import list(list_1(), list_n()) with optional first row as the header

Export methods

Name               Description
export_dict_lists  Export a dictionary of lists
export_list_dicts  Export a list of dictionaries
export_list_lists  Export list(list_1(), list_n()) with optional first row as the header
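
The three layouts named by the import and export methods can be hard to tell apart by name alone. The sketch below shows the same two-row table in each layout and converts between them in plain Python. The helper functions are hypothetical and only illustrate the data shapes; they are not the library's methods.

```python
def list_dicts_to_list_lists(rows):
    """List of dicts (one dict per row) -> list of lists with a header row."""
    header = list(rows[0].keys())
    return [header] + [[r.get(c) for c in header] for r in rows]


def list_lists_to_dict_lists(table):
    """List of lists with a header row -> dict mapping column name to values."""
    header, body = table[0], table[1:]
    return {c: [row[i] for row in body] for i, c in enumerate(header)}


def dict_lists_to_list_dicts(columns):
    """Dict of column lists -> list of dicts (one dict per row)."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


list_dicts = [{'h1': 1, 'h2': 2, 'h3': 3},
              {'h1': 'd', 'h2': 'e', 'h3': 'f'}]

list_lists = list_dicts_to_list_lists(list_dicts)
dict_lists = list_lists_to_dict_lists(list_lists)
round_trip = dict_lists_to_list_dicts(dict_lists)

print(list_lists)
print(dict_lists)
print(round_trip == list_dicts)
```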

Write methods

Name           Description
write_csv      Make CSV from the Table object and write it to a file or stdout
write_json     Write JSON to a file or stdout. The export_f= argument is run with eval(), see WARNING
write_json_lt  eval()-free version of Table.write_json
write_xml      Write XML. NB: Do we need this?

Header manipulation methods

Name             Description
get_column       Return either a column name by index or an index by name. None if not found
rename_column    Rename a column in the header
make_shortnames  Return the header with the dot-prefix removed from column names
set_shortnames   Remove the dot-prefix from column names in the Table's own header
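
As a rough illustration of what removing a dot-prefix means, the sketch below turns names such as "first.h1" into "h1". The function `strip_dot_prefix` is hypothetical, and splitting at the first dot is an assumption; the library may handle names differently.

```python
def strip_dot_prefix(header):
    """Drop everything up to and including the first dot in each column name.
    Names without a dot are returned unchanged. Hypothetical helper."""
    return [name.split('.', 1)[-1] for name in header]


header = ['first.h1', 'first.h4', 'second.h1', 'plain']
short = strip_dot_prefix(header)
print(short)
```

Note that short names can collide after a join: here both "first.h1" and "second.h1" become "h1".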

Private methods

Name          Description
_redimension  Recalculate dimensions of the Table.table

EvalCtrl class

Controls what arguments are available to eval() function

Name              Description
blacklisted       Check whether a stanza contains any of the blacklisted words
blacklist_add     Add a new word to the blacklist
blacklist_remove  Remove a word from the blacklist

Standalone functions

Name         Description
open_file    Open a file
close_file   Close a file
read_json    Read JSON file
read_csv     Read CSV file
read_xml     Read XML. NB: Do we need XML support?
str_to_type  Convert a str to a proper type: int, float or bool
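
Type conversion matters because values read from CSV arrive as strings. The sketch below shows one plausible way to convert a string to bool, int or float, falling back to the original string. The function `to_typed` is hypothetical and its rules (for example, which spellings count as booleans) are assumptions, not the behaviour of str_to_type.

```python
def to_typed(value):
    """Convert a string to bool, int or float when it looks like one;
    otherwise return it unchanged. Hypothetical helper."""
    lowered = value.strip().lower()
    if lowered in ('true', 'false'):
        return lowered == 'true'
    # Try int before float so that '42' does not become 42.0
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


for raw in ('42', '3.14', 'True', 'abc'):
    print(repr(raw), '->', repr(to_typed(raw)))
```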

WARNING

The methods Table.join(on=), Table.select(where=) and Table.write_json(export_f=) use the eval() function to run the specified expressions within the program. ANY expression, including one that is potentially DANGEROUS from a security point of view, can be passed as the value of these arguments. It is your duty to ensure the correctness and safety of these arguments; EvalCtrl helps to block potentially dangerous function and method names.

Alternatively, you can use Table.join_lt(), Table.select_lt() and Table.write_json_lt(). They are significantly less powerful, but do not use eval().
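
To show why name screening helps but is not a complete defence, here is a minimal sketch of blacklist checking before eval() in plain Python. It is not EvalCtrl's implementation: `BLACKLIST`, `find_blacklisted` and `guarded_eval` are hypothetical names, and the word list is illustrative only.

```python
import re

# Illustrative word list only; a real blacklist would need careful curation
BLACKLIST = {'import', 'exec', 'eval', 'open', 'compile', 'os', 'sys'}


def find_blacklisted(expression, blacklist=BLACKLIST):
    """Return the sorted blacklisted identifiers found in the expression.
    Any identifier starting with a double underscore is flagged as well."""
    identifiers = re.findall(r'[A-Za-z_]\w*', expression)
    return sorted({name for name in identifiers
                   if name in blacklist or name.startswith('__')})


def guarded_eval(expression, variables):
    """Evaluate the expression only if no blacklisted name appears in it."""
    hits = find_blacklisted(expression)
    if hits:
        raise ValueError(f'blacklisted names in expression: {hits}')
    # Empty __builtins__ narrows, but does not eliminate, what eval() can reach
    return eval(expression, {'__builtins__': {}}, dict(variables))


print(guarded_eval('h4 == h1', {'h4': 'd', 'h1': 'd'}))
print(find_blacklisted("__import__('os').system('id')"))
```

Word matching of this kind is coarse: it also flags blacklisted words inside string literals, and it cannot anticipate every dangerous construct. Treat it as a safety net and prefer the eval()-free methods for untrusted input.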

TODO

  • Rework: Table Names, Header Column Names, Dot-Prefixes
  • Documentation!

Contacts

If you have an idea, a question, or have found a problem, do not hesitate to open an issue or mail me directly: Mikhail Zakharov [email protected]