Hi @danich1, thank you so much for telling me about this approach you developed and I am really amazed looking through this code!
I was wondering if I could ask what is the starting dataset (or starting script to generate the dataset) that you're using? Everything I've looked at seems to make sense, but I just can't figure out if there's something I'm supposed to have downloaded initially to get it to run or whether I'm just missing something.
Thank you again!
Ah, the dataset for this code is the bioRxiv XML dump. This repository intentionally omits the dump because the bioRxiv group asked me not to share it with anybody until they were ready to go public. Plus, the dump is 2 terabytes, so definitely not a size GitHub would be happy with.