add globbing support in makeReader and CreateDataSource #729

Open
wants to merge 14 commits into
base: master

Conversation

m-fila
Contributor

@m-fila m-fila commented Jan 27, 2025

BEGINRELEASENOTES

  • POSIX glob patterns can be used in makeReader and CreateDataSource. Added standalone helper podio::utilities::expand_glob to resolve globs.
  • Added support for passing a list of files to get_reader

ENDRELEASENOTES

This is a proposal to make the makeReader and CreateDataSource overloads that take a single filename also accept and expand a POSIX glob pattern.

The behaviour is:

  • if the input isn't a glob pattern, it's forwarded without expansion (regardless of whether it exists or not)
  • if no matches are found, an exception is thrown
  • if any other error (e.g. memory allocation) is reported, an exception is thrown
  • if some expanded paths can't be read, a message with the path name and the corresponding error code is printed to cerr. The problematic paths are then skipped in the results, but the expansion overall isn't aborted.
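
For illustration, the four rules above could be sketched on top of POSIX glob(3) roughly as follows. This is a minimal sketch, not the code in this PR: the names looks_like_glob, report_unreadable and expand_glob_sketch are hypothetical, and the check for what counts as a glob pattern (an unescaped `*`, `?` or `[`) is an assumption.

```cpp
#include <glob.h>

#include <cstring>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: true if the string contains an unescaped glob
// metacharacter. What exactly counts as "a glob pattern" is an assumption.
bool looks_like_glob(const std::string& s) {
  bool escaped = false;
  for (const char c : s) {
    if (escaped) {
      escaped = false;
      continue;
    }
    if (c == '\\') {
      escaped = true;
      continue;
    }
    if (c == '*' || c == '?' || c == '[') {
      return true;
    }
  }
  return false;
}

// Error callback for glob(3). Returning 0 tells glob to skip the unreadable
// path and carry on with the expansion.
int report_unreadable(const char* path, int err) {
  std::cerr << "glob: cannot read '" << path << "': " << std::strerror(err)
            << " (errno " << err << ")\n";
  return 0;
}

// Hypothetical name; not the helper added by this PR.
std::vector<std::string> expand_glob_sketch(const std::string& pattern) {
  // Rule 1: not a glob pattern, forward as-is without checking existence.
  if (!looks_like_glob(pattern)) {
    return {pattern};
  }
  glob_t results{};
  const int rc = glob(pattern.c_str(), 0, report_unreadable, &results);
  // Rule 2: no matches is an error.
  if (rc == GLOB_NOMATCH) {
    globfree(&results);
    throw std::runtime_error("no files match pattern '" + pattern + "'");
  }
  // Rule 3: any other failure (e.g. GLOB_NOSPACE) is an error as well.
  if (rc != 0) {
    globfree(&results);
    throw std::runtime_error("glob failed for pattern '" + pattern + "'");
  }
  // Rule 4 is handled by report_unreadable: skipped paths never show up here.
  std::vector<std::string> paths(results.gl_pathv,
                                 results.gl_pathv + results.gl_pathc);
  globfree(&results);
  return paths;
}
```

Since GLOB_ERR is not passed and the callback returns 0, an unreadable directory only produces a message on cerr and does not abort the expansion.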

Closes #686

  • podio::makeReader
  • podio::CreateDataFrame
    - [x] Python side - podio.get_reader
  • Python side - podio.CreateDataFrame

@m-fila m-fila marked this pull request as ready for review January 29, 2025 07:27
@tmadlener
Collaborator

Just for completeness, bringing the conclusions from today's meeting here:

  • podio::detail::expand_glob should become a publicly available utility (also available from python)
  • makeReader(std::string) will keep the implicit globbing (via calling the aforementioned utility). The same applies to the CreateDataFrame overload that takes only a single string. In both cases the versions that take a file list should not change.
  • The python side should not have implicit globbing (i.e. the "hidden" glob that is currently present should be removed).
  • The python interface should still be able to take a list of input files, though.
  • (As a bonus goal to be done in another PR, we should check and see if the c++ and python API can be harmonized a bit better).
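
A rough sketch of the agreed C++ split, assuming nothing about the real podio signatures: ReaderSketch, make_reader_sketch and expand_glob_standin are hypothetical stand-ins, and the stand-in utility is deliberately simplified (it does not report unreadable paths).

```cpp
#include <glob.h>

#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-in for the public glob utility (hypothetical name).
// Strings without glob metacharacters are forwarded unchanged.
std::vector<std::string> expand_glob_standin(const std::string& pattern) {
  if (pattern.find_first_of("*?[") == std::string::npos) {
    return {pattern};
  }
  glob_t results{};
  const int rc = glob(pattern.c_str(), 0, nullptr, &results);
  if (rc != 0) {
    globfree(&results);
    throw std::runtime_error("glob expansion failed for '" + pattern + "'");
  }
  std::vector<std::string> paths(results.gl_pathv,
                                 results.gl_pathv + results.gl_pathc);
  globfree(&results);
  return paths;
}

// Hypothetical stand-in for a reader: it only records the files it was given.
struct ReaderSketch {
  std::vector<std::string> files;
};

// File-list overload: the list is used exactly as given, no globbing.
ReaderSketch make_reader_sketch(const std::vector<std::string>& files) {
  return ReaderSketch{files};
}

// Single-string overload: implicit globbing through the public utility,
// then delegation to the unchanged file-list overload.
ReaderSketch make_reader_sketch(const std::string& pattern) {
  return make_reader_sketch(expand_glob_standin(pattern));
}
```

With this layout a caller who wants globbing together with a file list would call the utility explicitly and pass the result on, which is also what the python side would have to do once the hidden glob is removed.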

@m-fila
Contributor Author

m-fila commented Feb 5, 2025

I think that should now cover all the points from the meeting, except for harmonizing the Python bindings for readers.

Successfully merging this pull request may close these issues.

Support for globbing in podio::CreateDataFrame, podio::makeReader and podio.reading.get_reader
2 participants