Move db writer to a separate process with a single queue #32

Open
robobenklein opened this issue May 28, 2021 · 2 comments
Assignees: robobenklein
Labels: enhancement, performance

Comments

@robobenklein
Member

All worker processes will append the docs they want inserted to a single queue, which is consumed by a dedicated writer process.

This helps prevent write-write conflicts and lock contention on db updates, which could otherwise result in dedup conflicts.

It would also simplify writing to file, since every doc written passes through the queue.

The writer process will need to be extremely efficient, though, in order to handle thousands of docs/second.
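
A rough sketch of what the writer side could look like, assuming Python's `multiprocessing`. None of these names come from the project: `writer_loop`, `start_writer`, `SENTINEL`, `batch_size`, and `idle_timeout` are hypothetical, and `insert_batch` stands in for whatever actually performs the db insert. Batching is one possible way to keep up with the docs/second target, not a measured result.

```python
import multiprocessing as mp
import queue
from typing import Callable, List

SENTINEL = None  # hypothetical end-of-stream marker put on the queue at shutdown


def writer_loop(doc_queue, insert_batch: Callable[[List[dict]], None],
                batch_size: int = 500, idle_timeout: float = 0.5) -> int:
    """Drain doc_queue and pass docs to insert_batch in batches.

    Returns the number of docs handed to insert_batch.
    """
    written = 0
    batch: List[dict] = []
    while True:
        try:
            doc = doc_queue.get(timeout=idle_timeout)
        except queue.Empty:
            # Queue went quiet: flush the partial batch instead of holding docs back.
            if batch:
                insert_batch(batch)
                written += len(batch)
                batch = []
            continue
        if doc is SENTINEL:
            break
        batch.append(doc)
        if len(batch) >= batch_size:
            insert_batch(batch)
            written += len(batch)
            batch = []
    if batch:  # final partial batch after the sentinel
        insert_batch(batch)
        written += len(batch)
    return written


def start_writer(insert_batch: Callable[[List[dict]], None]):
    """Start the single writer process; workers call doc_queue.put(doc).

    insert_batch must be a picklable, module-level function when the
    'spawn' or 'forkserver' start method is in use.
    """
    doc_queue = mp.Queue()
    proc = mp.Process(target=writer_loop, args=(doc_queue, insert_batch))
    proc.start()
    return doc_queue, proc
```

Workers would only ever call `doc_queue.put(doc)`, and the coordinator would put `SENTINEL` once and then `proc.join()`. Call `start_writer` from under an `if __name__ == "__main__":` guard.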

@robobenklein added the enhancement and performance labels May 28, 2021
@robobenklein self-assigned this May 28, 2021
@robobenklein
Member Author

Inserts that fail will be re-added to a retry queue along with a list of their errors. When that list becomes too large for any doc, stop the entire procedure, since writes are failing.
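
The retry idea could be sketched roughly like this. Everything here is illustrative: `insert_with_retries`, `TooManyFailures`, and the `max_errors` threshold are hypothetical names, and `insert_one` stands in for the real db insert call.

```python
from collections import deque
from typing import Callable, Iterable, List


class TooManyFailures(RuntimeError):
    """Raised when one doc has accumulated too many insert errors."""

    def __init__(self, doc, errors: List[str]):
        super().__init__(f"giving up after {len(errors)} failed inserts")
        self.doc = doc
        self.errors = errors


def insert_with_retries(docs: Iterable, insert_one: Callable[[object], None],
                        max_errors: int = 3) -> int:
    """Insert every doc, re-queueing failures together with their error list.

    Returns the number of docs written. Raises TooManyFailures as soon as
    any single doc has failed max_errors times, which stops the whole run.
    """
    # Each entry is (doc, errors seen so far for that doc).
    pending = deque((doc, []) for doc in docs)
    written = 0
    while pending:
        doc, errors = pending.popleft()
        try:
            insert_one(doc)
        except Exception as exc:  # assumption: any exception counts as a failed write
            errors.append(repr(exc))
            if len(errors) >= max_errors:
                raise TooManyFailures(doc, errors)
            pending.append((doc, errors))  # back of the queue, errors attached
        else:
            written += 1
    return written
```

Keeping the error list next to the doc means the final exception can report every failure that doc hit, not just the last one.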

@robobenklein
Member Author

This is a bad idea.

Don't use a single IPC queue for passing documents: performance is worse than my home connection to the DB...

Need some method to write documents from each worker in parallel; maybe FileLock is performant enough? Until then, manual checks on Arango it is.
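
For the file-writing side, a lock-guarded append from each worker might look something like the sketch below. This is an assumption-heavy stand-in: it uses `fcntl.flock` from the standard library (Unix only) rather than FileLock itself, `append_docs_locked` is a hypothetical name, and whether any of this is fast enough is untested.

```python
import fcntl
import json
from typing import Iterable


def append_docs_locked(path: str, docs: Iterable[dict]) -> int:
    """Append docs to path as JSON lines under an exclusive advisory lock.

    Each worker calls this directly, so no docs cross an IPC queue.
    Returns the number of docs appended.
    """
    # Serialize before taking the lock so the lock is held only for the write.
    lines = [json.dumps(doc) + "\n" for doc in docs]
    with open(path, "a", encoding="utf-8") as fh:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)  # blocks while another worker holds it
        try:
            fh.writelines(lines)
            fh.flush()  # push buffered data out before releasing the lock
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)
    return len(lines)
```

Passing a batch of docs per call keeps the number of lock acquisitions low, which is likely where most of the overhead would be.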


1 participant