An AWS Athena library for tf.data.Dataset
If you don't know tf.data, take a look at the documentation and this example.
Installation is as simple as a pip install:
```
pip install tf-data-athena
```
Using it is almost as simple as any other tf.data.Dataset implementation. You just need to create a dataset with the function create_athena_dataset; no explicit authentication setup is required (it follows the AWS authentication chain in boto3).
```python
# imports
from tf_data_athena import create_athena_dataset

# connector parameters
s3_output_location = "s3://my-bucket/my-folder/athena-outputs"  # Athena output bucket folder
waiting_interval = 0.1  # time (in seconds) to wait between query state checks

# query
query = "select * from my_namespace.my_table"

# create dataset
dataset = create_athena_dataset(query, s3_output_location, waiting_interval=waiting_interval)
```
Now, dataset is an instance of tf.data.Dataset containing the query results.
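A minimal sketch of consuming it, assuming the standard tf.data iteration protocol (the exact element structure per row depends on how the library parses the result file):

```python
# Peek at the first few rows of the query result.
for row in dataset.take(5):
    print(row)
```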
The factory function create_athena_dataset has the following parameters:

- query: The query to be run in Athena.
- s3_output_location: An S3 path, writable by the current account, where the query results file will be saved.
- waiting_interval: A float, the number of seconds to wait between query status requests to Athena.
- num_parallel_calls: Argument for tf.data.Dataset.map (see the documentation) used while parsing result rows.
- Other named arguments: Any other named argument is passed to the tf.data.TextLineDataset constructor; please see the documentation.
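A call that exercises every parameter might look like the sketch below; the query, bucket, and numeric values are illustrative, and buffer_size is just one example of a keyword argument forwarded to tf.data.TextLineDataset as described above:

```python
import tensorflow as tf
from tf_data_athena import create_athena_dataset

dataset = create_athena_dataset(
    "select * from my_namespace.my_table",
    "s3://my-bucket/my-folder/athena-outputs",
    waiting_interval=0.5,                 # poll Athena every half second
    num_parallel_calls=tf.data.AUTOTUNE,  # let TensorFlow tune row-parsing parallelism
    buffer_size=8 * 1024 * 1024,          # forwarded to tf.data.TextLineDataset
)
```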
This library uses boto3 behind the scenes, so it follows the same authentication/authorization chain. The authorized user or service needs permission to create and execute Athena queries, and to create and read S3 objects in the folder defined by s3_output_location.
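As a quick pre-flight check (hypothetical, not part of this library's API), you can verify that boto3 resolves credentials and can reach Athena before building a dataset:

```python
import boto3

# Verify the default boto3 credential chain resolves and Athena is reachable.
session = boto3.session.Session()
assert session.get_credentials() is not None, "no AWS credentials found"

athena = session.client("athena")
workgroups = athena.list_work_groups()["WorkGroups"]
print(f"Athena reachable; {len(workgroups)} workgroup(s) visible")
```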