Merge pull request #13 from qalita-io/dev
Dev
Showing 29 changed files with 615 additions and 91 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,42 +1,39 @@ | ||
# Accuracy | ||
|
||
## Overview | ||
This pack assesses the precision of float columns within a dataset, providing a granular view of data quality. The script computes the maximum number of decimal places for each float column and generates a normalized score representing the precision level of the data. The results are saved in `metrics.json`, with each float column's precision score detailed individually. | ||
|
||
## Features | ||
This pack assesses the precision of float columns within a dataset, providing a granular view of data quality. The script computes the maximum number of decimal places for each float column and generates a normalized score representing the precision level of the data. | ||
|
||
## Input 📥 | ||
|
||
### Configuration ⚙️ | ||
|
||
| Name | Type | Required | Default | Description | | ||
| ---------------------- | ------ | -------- | ------- | -------------------------------------------------------- | | ||
| `job.source.skiprows`  | `int`  | no       | `0`     | The number of rows to skip at the beginning of the file. | | ||
| `job.id_columns`       | `list` | no       | `[]`    | The list of columns to use as identifiers.               | | ||
|
||
### Source type compatibility 🧩 | ||
|
||
This pack is compatible with **files** 📁 (``csv``, ``xlsx``). | ||
|
||
## Analysis 🕵️‍♂️ | ||
|
||
- **Precision Calculation**: Computes the maximum number of decimal places for each float value in float columns. | ||
- **Score Normalization**: Normalizes the precision values to a 0-1 scale, providing a standardized precision score for each column. | ||
- **Metrics Generation**: Outputs a `metrics.json` file containing precision scores for each float column, enhancing the interpretability of data quality. | ||
|
||
## Setup | ||
Before running the script, ensure that the following files are properly configured: | ||
- `source_conf.json`: Configuration file for the source data. | ||
- `pack_conf.json`: Configuration file for the pack. | ||
- Data file: The data to be analyzed, loaded using `opener.py`. | ||
|
||
## Usage | ||
To use this pack, follow these steps: | ||
1. Ensure all prerequisite files (`source_conf.json`, `pack_conf.json`, and the data file) are in place. | ||
2. Run the script with the appropriate Python interpreter. | ||
3. Review the generated `metrics.json` for precision metrics of the dataset. | ||
|
||
## Output | ||
- `metrics.json`: Contains precision scores for each float column in the dataset. The structure of the output is as follows: | ||
|
||
```json | ||
[ | ||
{ | ||
"key": "decimal_precision", | ||
"value": "<precision_score>", | ||
"scope": { | ||
"perimeter": "column", | ||
"value": "<column_name>" | ||
} | ||
}, | ||
... | ||
] | ||
``` | ||
|
||
# Contribute | ||
|
||
[This pack is part of Qalita Open Source Assets (QOSA) and is open to contribution. You can help us improve this pack by forking it and submitting a pull request here.](https://github.com/qalita-io/packs) | ||
|
||
| Name | Description | Scope | Type | | ||
| ------------------- | ------------------------------------------------- | ------- | ------- | | ||
| `score` | Accuracy score | Dataset | `float` | | ||
| `decimal_precision` | Maximum number of decimals seen for this variable | Column | `int` | | ||
| `proportion_score` | Proportion of values with maximum decimals | Column | `float` | | ||
|
||
## Output 📤 | ||
|
||
### Report 📊 | ||
|
||
This pack doesn't generate any output or report. | ||
|
||
# Contribute 💡 | ||
|
||
[This pack is part of Qalita Open Source Assets (QOSA) and is open to contribution. You can help us improve this pack by forking it and submitting a pull request here.](https://github.com/qalita-io/packs) 👥🚀 |
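The README above describes two column-level metrics: `decimal_precision` (the maximum number of decimals seen) and `proportion_score` (the proportion of values with that maximum). The following is a minimal sketch of how they could be computed, not the pack's actual implementation. The function names `decimal_places` and `column_metrics` are hypothetical, values are assumed to arrive as strings, and reusing the `key`/`value`/`scope` record shape for `proportion_score` is an assumption.

```python
from decimal import Decimal, InvalidOperation


def decimal_places(text):
    """Count digits after the decimal point; return None for non-numeric text."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return None
    if not value.is_finite():  # skip NaN and infinities
        return None
    # A negative exponent is the number of decimal digits as written,
    # so "4.000" counts as 3 decimals (trailing zeros are kept).
    return max(0, -value.as_tuple().exponent)


def column_metrics(name, values):
    """Build hypothetical metric records for one column of string values."""
    places = [p for p in (decimal_places(v) for v in values) if p is not None]
    if not places:
        return []
    max_places = max(places)
    # Share of numeric values written with the maximum number of decimals.
    share = places.count(max_places) / len(places)
    scope = {"perimeter": "column", "value": name}
    return [
        {"key": "decimal_precision", "value": max_places, "scope": scope},
        {"key": "proportion_score", "value": round(share, 2), "scope": scope},
    ]


metrics = column_metrics("price", ["1.5", "2.25", "3.125", "4.000", "n/a"])
```

Working on the textual form keeps trailing zeros visible. A loader that parses values to binary floats first would typically drop them, so the two approaches can disagree on columns such as `4.000`.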
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,6 @@ | ||
{ | ||
"job": { | ||
"id_columns": [], | ||
"source": { | ||
"skiprows": 0 | ||
} | ||
|
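As a rough illustration of how the two options in this config fragment might be consumed, the sketch below reads `id_columns` and `source.skiprows` from under the `job` key, falling back to the defaults listed in the README table. The helper names `load_job_options` and `read_rows` are hypothetical, and the pack's real loader may work differently.

```python
import csv
import json


def load_job_options(conf_text):
    """Return (id_columns, skiprows) from a pack config, using README defaults."""
    job = json.loads(conf_text).get("job", {})
    id_columns = job.get("id_columns", [])
    skiprows = job.get("source", {}).get("skiprows", 0)
    return id_columns, skiprows


def read_rows(csv_text, skiprows=0):
    """Parse CSV text into dicts after dropping the first `skiprows` lines."""
    lines = csv_text.splitlines()[skiprows:]
    return list(csv.DictReader(lines))


conf = '{"job": {"id_columns": ["id"], "source": {"skiprows": 1}}}'
id_columns, skiprows = load_job_options(conf)
rows = read_rows("exported 2024\nid,price\n1,2.50\n", skiprows)
```

Here the first line of the sample file is a banner rather than a header, which is the kind of case `skiprows` is meant to handle.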
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
description: Compute accuracy metrics | ||
description: Compute decimal accuracy metrics | ||
icon: icon.png | ||
name: accuracy | ||
type: accuracy | ||
url: https://github.com/qalita-io/packs/tree/main/accuracy_pack | ||
version: 1.1.0 | ||
version: 1.1.13 | ||
visibility: public |