Change the default data_home_path and update the docs #69

Merged 4 commits on Jun 28, 2024
7 changes: 4 additions & 3 deletions README.md
@@ -95,7 +95,8 @@ tsdb.delete_cache(dataset_name='physionet_2012')
# or you can delete all cache with delete_cache() to free disk space
tsdb.delete_cache()

# to avoid taking up too much space if downloading many datasets,
# The default cache directory is ~/.pypots/tsdb under the user's home directory.
# To avoid taking up too much space if downloading many datasets,
# TSDB cache directory can be migrated to an external disk
tsdb.migrate_cache("/mnt/external_disk/TSDB_cache")
```
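What migrating a cache directory involves can be illustrated with the standard library alone. The following is only a sketch under stated assumptions: `move_cache` is a hypothetical helper, not the implementation behind `tsdb.migrate_cache`, and the directory names are placeholders created inside a temporary folder.

```python
import os
import shutil
import tempfile


def move_cache(old_dir: str, new_dir: str) -> None:
    """Hypothetical helper: move every entry of old_dir into new_dir."""
    os.makedirs(new_dir, exist_ok=True)
    for name in os.listdir(old_dir):
        target = os.path.join(new_dir, name)
        if os.path.exists(target):
            # Refuse to overwrite data that is already at the destination.
            raise FileExistsError(f"{target} already exists")
        shutil.move(os.path.join(old_dir, name), target)
    # The old directory is empty at this point, so it can be removed.
    os.rmdir(old_dir)


with tempfile.TemporaryDirectory() as workspace:
    old_cache = os.path.join(workspace, "old_cache")
    new_cache = os.path.join(workspace, "external_disk", "TSDB_cache")

    # Create a placeholder dataset folder standing in for cached data.
    os.makedirs(os.path.join(old_cache, "physionet_2012"))
    with open(os.path.join(old_cache, "physionet_2012", "data.txt"), "w") as f:
        f.write("placeholder")

    move_cache(old_cache, new_cache)
    moved_entries = sorted(os.listdir(new_cache))
    old_still_exists = os.path.exists(old_cache)
```

The sketch refuses to overwrite existing entries rather than merging them, which is a deliberately conservative choice for illustration.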
@@ -145,9 +146,9 @@ year={2023},
}
```
or
> Wenjie Du. (2023).
> Wenjie Du.
> PyPOTS: a Python toolbox for data mining on Partially-Observed Time Series.
> arXiv, abs/2305.18811. https://arxiv.org/abs/2305.18811
> arXiv, abs/2305.18811, 2023.



25 changes: 13 additions & 12 deletions docs/index.rst
@@ -49,7 +49,7 @@ Welcome to TSDB documentation!
.. image:: https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FWenjieDu%2FTime_Series_Database&count_bg=%2379C83D&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=Visits+since+April+2022&edge_flat=false
:alt: Visit num

📣 TSDB now supports a total of 1️⃣6️⃣9️⃣ time-series datasets ‼️
📣 TSDB now supports a total of 1️⃣7️⃣2️⃣ time-series datasets ‼️

.. image:: https://pypots.com/figs/pypots_logos/PyPOTS/logo_FFBG.svg
:width: 120
@@ -100,12 +100,15 @@ or install from source code:
tsdb.download_and_extract('physionet_2012', './save_it_here')
# datasets you once loaded are cached, and you can check them with list_cache()
tsdb.list_cache()
# you can delete only one specific dataset and preserve others
# you can delete only one specific dataset's pickled cache
tsdb.delete_cache(dataset_name='physionet_2012', only_pickle=True)
# you can delete only one specific dataset's raw files and preserve others
tsdb.delete_cache(dataset_name='physionet_2012')
# or you can delete all cache with delete_cache() to free disk space
tsdb.delete_cache()

# to avoid taking up too much space if downloading many datasets,
# The default cache directory is ~/.pypots/tsdb under the user's home directory.
# To avoid taking up too much space if downloading many datasets,
# TSDB cache directory can be migrated to an external disk
tsdb.migrate_cache("/mnt/external_disk/TSDB_cache")

@@ -124,6 +127,8 @@ That's all. Simple and efficient. Enjoy it! 😃
`Electricity Load Diagrams <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/electricity_load_diagrams>`_ :cite:`trindade2015electricity` Forecasting, Imputation
`Electricity Transformer Temperature (ETT) <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/electricity_transformer_temperature>`_ :cite:`zhou2021informer` Forecasting, Imputation
`Vessel AIS data <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/vessel_ais>`_ :cite:`grgicevic2023ais` Forecasting, Imputation, Classification
`PeMS Traffic <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/pems_traffic>`_ Forecasting, Imputation
`Solar Alabama <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/solar_alabama>`_ Forecasting, Imputation
`UCR & UEA Datasets <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/ucr_uea_datasets>`_ (all 163 datasets) :cite:`bagnall2018uea` :cite:`dau2018ucr` Classification
========================================================================================================================================================================== ==========================================

@@ -145,22 +150,18 @@ please cite it as below and 🌟star `PyPOTS repository <https://github.com/Wenj
.. code-block:: bibtex
:linenos:

@article{du2023PyPOTS,
title={{PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time Series}},
@article{du2023pypots,
title={{PyPOTS: a Python toolbox for data mining on Partially-Observed Time Series}},
author={Wenjie Du},
journal={arXiv preprint arXiv:2305.18811},
year={2023},
eprint={2305.18811},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2305.18811},
doi={10.48550/arXiv.2305.18811},
}

or

Wenjie Du. (2023).
Wenjie Du.
PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time Series.
arXiv, abs/2305.18811. https://doi.org/10.48550/arXiv.2305.18811
arXiv, abs/2305.18811, 2023.


.. toctree::
2 changes: 1 addition & 1 deletion tsdb/__init__.py
@@ -21,7 +21,7 @@
#
# Dev branch marker is: 'X.Y.dev' or 'X.Y.devN' where N is an integer.
# 'X.Y.dev0' is the canonical version of 'X.Y.dev'
__version__ = "0.4"
__version__ = "0.5"

from .data_processing import (
CACHED_DATASET_DIR,
2 changes: 1 addition & 1 deletion tsdb/config.ini
@@ -1,2 +1,2 @@
[path]
data_home = ~/.tsdb
data_home = ~/.pypots/tsdb
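The new `data_home` value is stored with a literal `~`, so it has to be expanded before it can be used as a path. A minimal sketch of reading such a setting with the standard library, assuming an in-memory copy of the two-line config above (`read_data_home` and `CONFIG_TEXT` are hypothetical names, not part of TSDB):

```python
import configparser
import os

# In-memory copy of the config fragment; the real file is tsdb/config.ini.
CONFIG_TEXT = """
[path]
data_home = ~/.pypots/tsdb
"""


def read_data_home(config_text: str) -> str:
    """Return the configured data home with a leading '~' expanded."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    raw_path = parser.get("path", "data_home")
    # os.path.expanduser expands only a leading '~' (or '~user').
    return os.path.expanduser(raw_path)


data_home = read_data_home(CONFIG_TEXT)
print(data_home)
```

On a typical POSIX system this prints an absolute path ending in `/.pypots/tsdb`. The exact prefix depends on the current user's home directory.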
26 changes: 17 additions & 9 deletions tsdb/utils/file.py
@@ -103,25 +103,33 @@ def determine_data_home():
data_home_path = config.get("path", "data_home")
# replace '~' with the absolute path if existing in the path
data_home_path = data_home_path.replace("~", os.path.expanduser("~"))
old_cached_dataset_dir = os.path.join(

# old cached dataset dir path used in TSDB v0.2
old_cached_dataset_dir_02 = os.path.join(
os.path.expanduser("~"), ".tsdb_cached_datasets"
)
# old cached dataset dir path used in TSDB v0.4
old_cached_dataset_dir_04 = os.path.join(os.path.expanduser("~"), ".tsdb")

if os.path.exists(old_cached_dataset_dir):
# use the old path and warn the user
if os.path.exists(old_cached_dataset_dir_02) or os.path.exists(
old_cached_dataset_dir_04
):
logger.warning(
"‼️ Detected the home dir of the old version TSDB. "
"Since v0.3, TSDB has changed the default cache dir to '~/.tsdb'. "
"Auto migrating downloaded datasets to the new path. "
"‼️ Detected the home dir of the old version TSDB. Auto migrating... Please wait."
)
cached_dataset_dir = data_home_path
migrate(old_cached_dataset_dir, cached_dataset_dir)
if os.path.exists(old_cached_dataset_dir_02):
migrate(old_cached_dataset_dir_02, cached_dataset_dir)
else:
migrate(old_cached_dataset_dir_04, cached_dataset_dir)
logger.info("🌟 Migrating finished.")
elif os.path.exists(data_home_path):
# use the path directly, may be in a portable disk
cached_dataset_dir = data_home_path
else:
# use the default path
default_path = os.path.join(os.path.expanduser("~"), ".tsdb")
# use the default path for initialization,
# e.g. `data_home_path` in a portable disk but the disk is not connected
default_path = os.path.join(os.path.expanduser("~"), ".pypots", "tsdb")
cached_dataset_dir = default_path
if os.path.abspath(data_home_path) != os.path.abspath(default_path):
logger.warning(
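
The resolution order visible in this hunk (legacy directory found, configured path exists, otherwise the default) can be condensed into a small stand-alone function. This is a simplified sketch, not the code of `determine_data_home()`: `resolve_cache_dir` is a hypothetical name, it only reports which legacy directory would need migrating instead of moving any data, and it runs against temporary directories rather than the real home directory.

```python
import os
import tempfile


def resolve_cache_dir(configured, legacy_dirs, default):
    """Return (cache_dir, legacy_dir_to_migrate_or_None)."""
    # 1. A legacy directory takes priority: its data should move to `configured`.
    for legacy in legacy_dirs:
        if os.path.exists(legacy):
            return configured, legacy
    # 2. The configured path already exists, possibly on a portable disk.
    if os.path.exists(configured):
        return configured, None
    # 3. Otherwise fall back to the default, e.g. when that disk is not connected.
    return default, None


with tempfile.TemporaryDirectory() as home:
    legacy_02 = os.path.join(home, ".tsdb_cached_datasets")  # layout used in v0.2
    legacy_04 = os.path.join(home, ".tsdb")  # layout used in v0.4
    default = os.path.join(home, ".pypots", "tsdb")

    # Fresh install: nothing exists yet, so the default path is chosen.
    fresh = resolve_cache_dir(default, [legacy_02, legacy_04], default)

    # Once a v0.4-style directory exists, it is flagged for migration.
    os.makedirs(legacy_04)
    upgraded = resolve_cache_dir(default, [legacy_02, legacy_04], default)
```

As in the diff, only the first legacy directory found is reported, so a machine holding both old layouts would need a second pass to migrate the other one.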