MLentory ETL Pipeline Code

This folder contains the core implementation of the MLentory ETL (Extract, Transform, Load) pipeline, designed to collect and process machine learning model metadata from various sources.

[Figure: MLentory pipeline architecture]

Project Structure

code/
├── extractors/   
├── transform/   
└── load/

1. Extractors

Platform-specific modules that extract ML model metadata from different sources:

HuggingFace Hub extractor
Future extractors for other platforms

For detailed information, see the extractors documentation.
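One way to picture platform-specific extractors behind a shared interface is sketched below. This is only an illustration: `BaseExtractor`, `InMemoryExtractor`, `run_extractors`, and the record fields are hypothetical names, not taken from the MLentory code, and the in-memory source stands in for a real platform client.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, Iterator, List


class BaseExtractor(ABC):
    """Hypothetical common interface for platform-specific extractors."""

    platform: str = "unknown"

    @abstractmethod
    def extract(self) -> Iterator[Dict[str, Any]]:
        """Yield one raw metadata record per model."""


class InMemoryExtractor(BaseExtractor):
    """Stand-in source used for illustration; a real extractor would call a platform API."""

    platform = "in_memory"

    def __init__(self, records: Iterable[Dict[str, Any]]) -> None:
        self._records = list(records)

    def extract(self) -> Iterator[Dict[str, Any]]:
        for record in self._records:
            # Tag each record with the platform it came from.
            yield {"platform": self.platform, **record}


def run_extractors(extractors: Iterable[BaseExtractor]) -> List[Dict[str, Any]]:
    """Collect raw records from every configured extractor."""
    collected: List[Dict[str, Any]] = []
    for extractor in extractors:
        collected.extend(extractor.extract())
    return collected


sample = InMemoryExtractor([{"modelId": "org/model-a", "downloads": 10}])
raw_records = run_extractors([sample])
```

Under this assumed design, adding a new platform means adding one more `BaseExtractor` subclass; the downstream stages only see plain dictionaries.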

2. Transform

Transforms extracted data into a standardized schema:

Configurable transformation rules
Field processing and validation
Schema mapping

For detailed information, see the transform documentation.
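The idea of configurable rules, field processing, validation, and schema mapping can be sketched roughly as follows. The rule table, the field names, and `transform_record` are assumptions made for illustration; the actual MLentory rules and target schema are described in the transform documentation.

```python
from typing import Any, Callable, Dict, Set, Tuple

# Hypothetical rules: target field -> (source field, converter).
RULES: Dict[str, Tuple[str, Callable[[Any], Any]]] = {
    "name": ("modelId", str),
    "download_count": ("downloads", int),
    "tags": ("tags", lambda values: sorted(set(values))),
}

# Hypothetical validation rule: target fields that must be present.
REQUIRED: Set[str] = {"name"}


def transform_record(
    raw: Dict[str, Any],
    rules: Dict[str, Tuple[str, Callable[[Any], Any]]] = RULES,
    required: Set[str] = REQUIRED,
) -> Dict[str, Any]:
    """Map a raw record onto the target schema and validate required fields."""
    out: Dict[str, Any] = {}
    for target, (source, convert) in rules.items():
        value = raw.get(source)
        if value is not None:
            out[target] = convert(value)
    missing = required - set(out)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return out


result = transform_record(
    {"modelId": "org/model-a", "downloads": "10", "tags": ["nlp", "nlp", "bert"]}
)
```

Keeping the rules in a data structure rather than in code paths is what makes the mapping configurable in this sketch: changing the schema means editing `RULES`, not `transform_record`.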

3. Load

Handles storage and versioning of processed data:

PostgreSQL for relational data
Virtuoso for RDF triples
Elasticsearch for search capabilities

For detailed information, see the load documentation.
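As a rough illustration of storage with versioning, the sketch below writes a new version of a record only when its content changes. It makes several assumptions: SQLite from the Python standard library stands in for PostgreSQL so the example is self-contained, and the `model_versions` table, the hash-based change detection, and `load_record` are hypothetical, not the MLentory implementation.

```python
import hashlib
import json
import sqlite3
from typing import Any, Dict


def content_hash(record: Dict[str, Any]) -> str:
    """Stable digest of a record, used to detect changes between runs."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def load_record(conn: sqlite3.Connection, name: str, record: Dict[str, Any]) -> int:
    """Store the record as a new version if it changed; return its version number."""
    digest = content_hash(record)
    row = conn.execute(
        "SELECT version, digest FROM model_versions "
        "WHERE name = ? ORDER BY version DESC LIMIT 1",
        (name,),
    ).fetchone()
    if row is not None and row[1] == digest:
        return row[0]  # Unchanged: keep the latest stored version.
    version = 1 if row is None else row[0] + 1
    conn.execute(
        "INSERT INTO model_versions (name, version, digest, payload) "
        "VALUES (?, ?, ?, ?)",
        (name, version, digest, json.dumps(record, sort_keys=True)),
    )
    conn.commit()
    return version


# In-memory SQLite database as a stand-in for the relational store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE model_versions ("
    "name TEXT, version INTEGER, digest TEXT, payload TEXT, "
    "PRIMARY KEY (name, version))"
)
v1 = load_record(conn, "org/model-a", {"download_count": 10})
v2 = load_record(conn, "org/model-a", {"download_count": 10})
v3 = load_record(conn, "org/model-a", {"download_count": 11})
```

Here the second call leaves the store untouched because the record is identical, while the third call creates a second version.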

Run the project

To run the full extraction, transformation, and loading process, follow the instructions in the deployment documentation.

To run a single component, first install the prerequisites listed in the deployment documentation, then follow the instructions in that component's folder.