UBC-MDS/wrangle_in_py

wrangle_in_py

A package for wrangling and tidying data in Python.

This package consists of the following functions:

  • column_standardizer: returns a copy of the input dataframe with standardized column names.
  • string_standardizer: returns a lowercase copy of the input string in which non-alphanumeric characters (including spaces and punctuation) are replaced with underscores.
  • resulting_duplicates: identifies which strings became duplicates after standardization.
  • extracting_ymd: returns a copy of the input dataframe with three new columns (year, month, and day) extracted from the specified datetime column.
  • extracting_hms: returns a copy of the input dataframe with three new columns (hour, minute, and second) extracted from the specified datetime column.
  • remove_duplicates: removes duplicate rows from a dataframe based on specified columns.
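
To make the standardization rule and the resulting-duplicates idea concrete, here is a minimal standard-library sketch. It does not use the wrangle_in_py API: the helper names `standardize_string` and `find_resulting_duplicates` are hypothetical, and the sketch assumes that every non-alphanumeric character is replaced one-for-one (runs are not collapsed) and that only ASCII letters and digits count as alphanumeric.

```python
import re
from collections import defaultdict


def standardize_string(text: str) -> str:
    """Lowercase `text` and replace each non-alphanumeric character with '_'.

    Assumption: only ASCII letters and digits are kept as-is.
    """
    return re.sub(r"[^a-z0-9]", "_", text.lower())


def find_resulting_duplicates(originals: list[str]) -> dict[str, list[str]]:
    """Map each standardized name shared by two or more originals to those originals."""
    groups = defaultdict(list)
    for name in originals:
        groups[standardize_string(name)].append(name)
    # Keep only the standardized names that more than one original maps to.
    return {std: names for std, names in groups.items() if len(names) > 1}


columns = ["Jack Fruit", "jack.fruit", "Apple Pie!", "dragon-fruit"]
standardized = [standardize_string(c) for c in columns]
duplicates = find_resulting_duplicates(columns)

print(standardized)  # ['jack_fruit', 'jack_fruit', 'apple_pie_', 'dragon_fruit']
print(duplicates)    # {'jack_fruit': ['Jack Fruit', 'jack.fruit']}
```

Here "Jack Fruit" and "jack.fruit" are distinct before standardization but collide afterwards, which is the situation resulting_duplicates is described as detecting.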

This package fills a niche in the Python ecosystem by offering specialized tools for tidying and wrangling data: standardizing column names and strings, detecting duplicates that appear after standardization, and extracting components from datetime columns. General-purpose libraries such as pandas can do all of this, but renaming columns or splitting a datetime column usually takes several steps or a custom script. wrangle_in_py combines these focused tasks into a single lightweight package for data preprocessing.
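
For comparison, the multi-step pandas approach might look like the sketch below. It uses only standard pandas calls (`Series.dt` accessors and `DataFrame.drop_duplicates`), not wrangle_in_py itself; the column names `id` and `timestamp` and the sample data are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 2],
    "timestamp": pd.to_datetime([
        "2024-01-15 08:30:45",
        "2024-02-20 17:05:10",
        "2024-02-20 17:05:10",
    ]),
})

# Work on a copy so the original dataframe is left untouched.
out = df.copy()

# Each datetime component needs its own assignment.
out["year"] = out["timestamp"].dt.year
out["month"] = out["timestamp"].dt.month
out["day"] = out["timestamp"].dt.day
out["hour"] = out["timestamp"].dt.hour
out["minute"] = out["timestamp"].dt.minute
out["second"] = out["timestamp"].dt.second

# Drop rows that repeat the same values in the chosen columns.
deduped = out.drop_duplicates(subset=["id", "timestamp"]).reset_index(drop=True)

print(out[["year", "month", "day", "hour", "minute", "second"]])
print(len(deduped))
```

Six assignments plus a deduplication call are needed here, which is the kind of repetition the package's functions are meant to wrap.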

Installation

$ pip install wrangle_in_py

Usage

  • TODO

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

wrangle_in_py was created by Shannon Pflueger, Stephanie Ta, Wai Ming Wong, and Yixuan (Clara) Gao. It is licensed under the terms of the MIT license.

Credits

wrangle_in_py was created with cookiecutter and the py-pkgs-cookiecutter template.
