Change split from bytes to lines #14

Open
Not-C-Developer opened this issue Sep 13, 2020 · 2 comments

@Not-C-Developer

Hi.
Please change
split -b100M rdns.rev.lowercase.txt fileChunk
to
split -l2000000 rdns.rev.lowercase.txt fileChunk
in scripts/fdns_a.sh and scripts/rdns.sh,
because the byte-based split loses some records when sorting.
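
For anyone who wants to see the difference, here is a minimal sketch on a tiny file. The file name sample.txt, the chunk prefixes, and the sizes are made up for illustration and are not taken from the project's scripts. It only shows that -b cuts at a byte offset regardless of newlines, which is one likely way a record ends up broken before sorting.

```shell
#!/bin/sh
# Sketch: byte-based vs line-based splitting on a tiny, hypothetical file.
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# Three records, one per line (18 + 18 + 20 bytes).
printf 'alpha.example.com\nbravo.example.com\ncharlie.example.com\n' > sample.txt

split -b 25 sample.txt byteChunk   # cut every 25 bytes, newlines ignored
split -l 2 sample.txt lineChunk    # cut every 2 lines, records stay whole

# Second line of the first chunk produced by each method.
broken=$(sed -n '2p' byteChunkaa)
whole=$(sed -n '2p' lineChunkaa)

echo "byte split: $broken"
echo "line split: $whole"
```

With -b the second record is truncated inside the first chunk and its remainder starts the next chunk, so neither piece is a valid record on its own.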

@j0eii

j0eii commented Sep 17, 2020

Yes, true, the byte-based split causes data loss.
This change fixed my issue too.

@redNixon

From split's man page:
-C, --line-bytes=SIZE
put at most SIZE bytes of records per output file

That seems like the best option: it gives you the desired chunk size without splitting records across files, so there is no risk of data loss.
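
A quick sketch of how -C behaves, assuming GNU coreutils split (other split implementations may not support -C). The file name, chunk prefix, and the 25-byte limit are hypothetical and chosen only to keep the example small.

```shell
#!/bin/sh
# Sketch: size-capped splitting that keeps lines whole (GNU split -C).
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# Three records, one per line (18 + 18 + 20 bytes).
printf 'alpha.example.com\nbravo.example.com\ncharlie.example.com\n' > sample.txt

# At most 25 bytes of complete lines per output file.
split -C 25 sample.txt cChunk

# Two 18-byte lines do not fit in 25 bytes, so each chunk holds one full line.
first=$(cat cChunkaa)
chunks=$(ls cChunk* | wc -l | tr -d ' ')

echo "first chunk: $first"
echo "number of chunks: $chunks"
```

In the scripts this would presumably look like split -C100M in place of split -b100M, but that is untested against the project.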
