Large Dataset Pre-processing documentation #500

Closed Answered by ilyes319
tjgiese asked this question in Q&A
Hey @tjgiese,

Thank you for catching this typo in the docs. I have fixed it and clarified the documentation.

If you use multiple threads for the multiprocessing (which is your case), train with the following command:

python <mace_repo_dir>/mace/cli/run_train.py \
--name="MACE_on_big_data" \
--num_workers=16 \
--train_file="./processed_data/train" \
--valid_file="./processed_data/valid" \
--test_dir="./processed_data/test" \
--statistics_file="./processed_data/statistics.json"

Note that --train_file, --valid_file, and --test_dir each take the full folder path, not the path to a single file inside it.
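
If you launch training from a script, a small helper can assemble the same arguments and check that the folders exist first. This is only a sketch: the function names `build_train_command` and `missing_inputs` are hypothetical, the folder layout (`train`, `valid`, `test`, `statistics.json`) is taken from the command above, and only the flags shown there are used.

```python
from pathlib import Path


def build_train_command(mace_repo_dir, processed_dir,
                        name="MACE_on_big_data", num_workers=16):
    """Assemble the run_train.py argument list shown above (hypothetical helper)."""
    processed = Path(processed_dir)
    return [
        "python",
        str(Path(mace_repo_dir) / "mace" / "cli" / "run_train.py"),
        f"--name={name}",
        f"--num_workers={num_workers}",
        # The split arguments point at folders, not at individual files.
        f"--train_file={processed / 'train'}",
        f"--valid_file={processed / 'valid'}",
        f"--test_dir={processed / 'test'}",
        f"--statistics_file={processed / 'statistics.json'}",
    ]


def missing_inputs(processed_dir):
    """Return the names of expected inputs that are absent from processed_dir."""
    processed = Path(processed_dir)
    missing = [n for n in ("train", "valid", "test")
               if not (processed / n).is_dir()]
    if not (processed / "statistics.json").is_file():
        missing.append("statistics.json")
    return missing
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once `missing_inputs` comes back empty.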
