Updating package version number and release docs
Added release notes and migration guide
Yevgeni Litvin authored and selitvin committed Dec 10, 2018
1 parent 347e194 commit 2c2585c
Showing 5 changed files with 74 additions and 1 deletion.
1 change: 1 addition & 0 deletions docs/autodoc/index.rst
@@ -24,6 +24,7 @@ training frameworks.
benchmarks_tutorial_include
troubleshoot_include
release_notes_include
migration_guide_include


Indices and tables
3 changes: 3 additions & 0 deletions docs/autodoc/migration_guide_include.rst
@@ -0,0 +1,3 @@
Migration notes
===============
.. include:: ../migrating-0.5.0.rst
45 changes: 45 additions & 0 deletions docs/migrating-0.5.0.rst
@@ -0,0 +1,45 @@
.. inclusion-marker-start-do-not-remove
==================
To petastorm 0.5.0
==================

Petastorm 0.5.0 has some breaking changes from previous versions. These include:

- Users should use :func:`~petastorm.reader.make_reader` instead of instantiating :class:`~petastorm.reader.Reader`
  directly to create a new instance.
- It is still possible (although discouraged in most cases) to instantiate :class:`~petastorm.reader.Reader` directly.
  Some of its arguments have changed.

Use :func:`~petastorm.reader.make_reader` to instantiate a reader instance
--------------------------------------------------------------------------

Use :func:`~petastorm.reader.make_reader` to create a new instance of a reader. :func:`~petastorm.reader.make_reader`
takes arguments that are nearly identical to the constructor arguments of :class:`~petastorm.reader.Reader`. The following
list enumerates the differences:

- ``reader_pool``: takes one of the strings ``'thread'``, ``'process'`` or ``'dummy'``
  (instead of ``ThreadPool()``, ``ProcessPool()`` and ``DummyPool()`` object instances). Pass the number of workers using
  the ``workers_count`` argument.
- ``training_partition`` and ``num_training_partitions`` were renamed to ``cur_shard`` and ``shard_count``.
- ``shuffle`` and ``shuffle_options`` were replaced by ``shuffle_row_groups=True, shuffle_row_drop_partitions=1``.

For example, change:

.. code-block:: python

    # Pre-0.5.0 API: dataset_url, ThreadPool and ShuffleOptions are assumed to be defined/imported already.
    from petastorm.reader import Reader
    reader = Reader(dataset_url,
                    reader_pool=ThreadPool(5),
                    training_partition=1, num_training_partitions=5,
                    shuffle_options=ShuffleOptions(shuffle_row_groups=False))

To:

.. code-block:: python

    # 0.5.0 API: the pool is selected by name and sized with workers_count.
    from petastorm import make_reader
    reader = make_reader(dataset_url,
                         reader_pool='thread',
                         workers_count=5,
                         cur_shard=1, shard_count=5,
                         shuffle_row_groups=False)
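
The argument changes listed in this guide are mechanical, so they can be expressed as a small translation step. The following is a minimal sketch and not part of petastorm: ``translate_reader_kwargs`` is a hypothetical helper, and the old ``ShuffleOptions`` object is represented by a plain dict so that the sketch runs without petastorm installed. Because the old API passed a pool object rather than a name, the pool type and worker count have to be supplied by the caller.

```python
def translate_reader_kwargs(old_kwargs, pool_type='thread', workers_count=None):
    """Hypothetical helper (not a petastorm API): map pre-0.5.0 ``Reader``
    keyword arguments to ``make_reader`` keyword arguments."""
    # Renames described in the migration notes.
    renamed = {
        'training_partition': 'cur_shard',
        'num_training_partitions': 'shard_count',
    }

    # Pool instances are replaced by a pool name plus workers_count.
    new_kwargs = {'reader_pool': pool_type}
    if workers_count is not None:
        new_kwargs['workers_count'] = workers_count

    for name, value in old_kwargs.items():
        if name == 'reader_pool':
            # The old pool object is dropped; see pool_type above.
            continue
        if name == 'shuffle_options':
            # Assumption: shuffle options are given here as a plain dict.
            new_kwargs['shuffle_row_groups'] = value.get('shuffle_row_groups', True)
            new_kwargs['shuffle_row_drop_partitions'] = value.get(
                'shuffle_row_drop_partitions', 1)
            continue
        # Rename where needed, otherwise pass the argument through unchanged.
        new_kwargs[renamed.get(name, name)] = value
    return new_kwargs


old = {
    'reader_pool': object(),  # stand-in for ThreadPool(5)
    'training_partition': 1,
    'num_training_partitions': 5,
    'shuffle_options': {'shuffle_row_groups': False},
}
new = translate_reader_kwargs(old, pool_type='thread', workers_count=5)
print(new)
```

The resulting dictionary could then be passed as ``make_reader(dataset_url, **new)``, assuming ``dataset_url`` points at an existing dataset.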
24 changes: 24 additions & 0 deletions docs/release-notes.rst
@@ -4,6 +4,30 @@
Release notes
=============

Release 0.5.0
=============

Breaking changes
----------------
- :func:`~petastorm.reader.make_reader` should be used to create a new instance of a reader.
- It is still possible, but in most cases not recommended, to use :class:`~petastorm.reader.Reader` directly. Its
  constructor arguments have changed:

  - ``training_partition`` and ``num_training_partitions`` were renamed to ``cur_shard`` and ``shard_count``.
  - ``shuffle`` and ``shuffle_options`` were replaced by ``shuffle_row_groups=True, shuffle_row_drop_partitions=1``.
  - The ``sequence`` argument was removed.
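
To illustrate what the ``cur_shard`` and ``shard_count`` names express, here is a minimal sketch of splitting row groups among shards. The round-robin assignment rule and the function name ``row_groups_for_shard`` are assumptions made for illustration; the release notes do not state how petastorm assigns row groups to shards.

```python
def row_groups_for_shard(row_group_count, cur_shard, shard_count):
    """Illustrative only: return the row-group indexes a given shard would read,
    assuming a simple round-robin assignment (index % shard_count == cur_shard)."""
    if shard_count <= 0:
        raise ValueError('shard_count must be positive')
    if not 0 <= cur_shard < shard_count:
        raise ValueError('cur_shard must be in the range [0, shard_count)')
    return [index for index in range(row_group_count)
            if index % shard_count == cur_shard]


# Each of the 5 shards gets a disjoint subset of the 10 row groups.
shards = [row_groups_for_shard(10, shard, 5) for shard in range(5)]
print(shards)
```

Under this assumed rule, every row group is read by exactly one shard, which is the property a sharded training job typically relies on.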


New features and bug fixes
--------------------------
- It is possible to read non-Petastorm Parquet datasets (created externally to Petastorm). Currently, most
  scalar types are supported.
- Support ``s3`` as the protocol in dataset URL strings (e.g. ``'s3://...'``).
- PyTorch: support collating decimal scalars.
- PyTorch: promote integer types that are not supported by PyTorch to the next larger integer type that is supported
  (e.g. int8 -> int16). Booleans are promoted to uint8.
- Support running ``petastorm-generate-metadata.py`` on datasets created by Hive.
- Fix incorrect dataset sharding when using Python 3.

Release 0.4.3
=============

2 changes: 1 addition & 1 deletion petastorm/__init__.py
@@ -14,4 +14,4 @@

from petastorm.reader import make_reader, make_batch_reader # noqa: F401

__version__ = '0.4.3'
__version__ = '0.5.0rc0'
