Feature Request: OBO Enumerator #3

Open
Thyra opened this issue Jun 19, 2020 · 1 comment

Comments

@Thyra

Thyra commented Jun 19, 2020

I just tried parsing the Gene Ontology, which is pretty huge, and my MacBook almost had a heart attack. What do you think about adding lazy parsing (i.e. only parsing one Term at a time, whenever somebody asks for it), either as a separate method or as an option such as lazy=true? I've been using this super-simple Python OBO parser until now, but now that I'm packaging my software as a Ruby gem, everything should of course be pure Ruby. I also don't like the idea of multiple obo_parser equivalents floating around, or of everybody starting their own thing from scratch.
I understand this would make many of the sanity/crossref checks impossible, but I only care about very specific parts of the terms anyway, and in this case I would rather have it parse quickly than safely.

@Thyra
Author

Thyra commented Jun 20, 2020

I just noticed: what I'm asking for is actually not lazy parsing but transient parsing, where only one Stanza is kept in memory at any time. Something like iterate_over_obo(IO) that returns an Enumerator, usable like this:

iterate_over_obo(File.open("go.obo")).each do |term|
  puts term.id.value
  # ...
end

names = iterate_over_obo(File.open("go.obo")).map do |term|
  term.name.value
end

Do you think that would be a valuable addition to the gem, and would you be able to implement it?
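
To make the idea concrete, here is a rough, standalone sketch of such an enumerator. It does not use or reflect obo_parser's internals: the `Stanza` struct and its `value` helper are hypothetical stand-ins for the gem's term objects, the header is skipped, and the trailing `! comment` handling is deliberately naive (it ignores escaping and quoted strings). It only illustrates that a stanza can be yielded as soon as the next one begins, so that a single stanza is held in memory at a time.

```ruby
require "stringio"

# Hypothetical stand-in for the gem's term objects (not the obo_parser API).
# `tags` maps a tag name to the list of values seen for it in one stanza.
Stanza = Struct.new(:type, :tags) do
  # First value recorded for a tag, or nil if the tag is absent.
  def value(tag)
    tags.fetch(tag, []).first
  end
end

# Yields one Stanza at a time while reading `io` line by line.
# Without a block it returns an Enumerator, so #each, #map, #lazy etc. work.
def iterate_over_obo(io)
  return enum_for(:iterate_over_obo, io) unless block_given?

  current = nil
  io.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?("!") # blank or comment-only line

    if (m = line.match(/\A\[(\w+)\]\z/))
      yield current if current # the previous stanza is complete
      current = Stanza.new(m[1], {})
    elsif current # lines before the first stanza (the header) are skipped
      tag, value = line.split(/:\s*/, 2)
      next if value.nil?
      value = value.sub(/\s+!\s.*\z/, "") # naive removal of a trailing "! comment"
      (current.tags[tag] ||= []) << value
    end
  end
  yield current if current # the last stanza has no following "[...]" line
end

# Tiny inline sample instead of go.obo, so the sketch runs on its own.
sample = StringIO.new(<<~OBO)
  format-version: 1.2

  [Term]
  id: GO:0000001
  name: mitochondrion inheritance
  is_a: GO:0048308 ! organelle inheritance

  [Term]
  id: GO:0000002
  name: mitochondrial genome maintenance
OBO

names = iterate_over_obo(sample).map { |term| term.value("name") }
```

With a real file, the same call could be wrapped in `File.open("go.obo") { |io| ... }` so the handle is closed afterwards. As noted above, cross-reference checks are not possible in this mode, because earlier stanzas are discarded once they have been yielded.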

Thyra changed the title from "Feature Request: Lazy Parsing" to "Feature Request: OBO Enumerator" on Jun 20, 2020