[Suggestions] Much faster imports #35

Open
Squareys opened this issue Jul 19, 2024 · 1 comment

Comments

@Squareys

Squareys commented Jul 19, 2024

Hey all,

The rate limit being hit is actually the limit of 300 requests / minute per webhook, so running the importer for each channel in parallel gives a huge speedup. You will need to create a webhook with a per-channel name for that.
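
A minimal sketch of the per-channel parallelism, assuming an async importer. `send_via_webhook` and the `slack-import-<channel>` naming scheme are hypothetical stand-ins, not the importer's actual API; a real run would create one Discord webhook per channel and post through it.

```python
import asyncio


async def send_via_webhook(webhook_name: str, message: str) -> str:
    # Hypothetical stand-in for posting one message through a Discord webhook.
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{webhook_name}: {message}"


async def import_channel(channel: str, messages: list[str]) -> list[str]:
    # One webhook per channel, so each channel has its own rate-limit budget.
    webhook_name = f"slack-import-{channel}"  # assumed naming scheme
    sent = []
    for message in messages:  # messages within a channel stay in order
        sent.append(await send_via_webhook(webhook_name, message))
    return sent


async def import_all(channels: dict[str, list[str]]) -> dict[str, list[str]]:
    # Run every channel concurrently; gather returns results in input order.
    results = await asyncio.gather(
        *(import_channel(name, msgs) for name, msgs in channels.items())
    )
    return dict(zip(channels, results))
```

Calling `asyncio.run(import_all({"general": [...], "random": [...]}))` would import both channels at once while keeping each channel's own message order.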

Additionally, a message can be appended to the previous message if the user matches and the previous message has no replies, files, or reactions attached (remap the thread_ts for these appended messages). Beware of the max message length.
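
The merging rule could look roughly like the sketch below. It assumes messages are dicts with Slack-export-style keys (`user`, `ts`, `text`, `thread_ts`, `reply_count`, `files`, `reactions`) and a 2000-character Discord limit; `merge_messages` and its helpers are hypothetical names, not code from the importer.

```python
MAX_LEN = 2000  # assumed Discord message length limit


def _is_plain(msg: dict) -> bool:
    # True when the message has no replies, files, or reactions attached.
    return not (msg.get("reply_count") or msg.get("files") or msg.get("reactions"))


def _parent_ts(msg: dict):
    # thread_ts of the parent if msg is a thread reply, else None.
    t = msg.get("thread_ts")
    return t if t and t != msg["ts"] else None


def merge_messages(messages: list[dict], max_len: int = MAX_LEN) -> list[dict]:
    merged: list[dict] = []
    ts_map: dict[str, str] = {}  # ts of appended message -> ts of its host
    for msg in messages:
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev["user"] == msg["user"]
            and _parent_ts(prev) == _parent_ts(msg)
            and _is_plain(prev)
            and len(prev["text"]) + 1 + len(msg["text"]) <= max_len
        ):
            prev["text"] += "\n" + msg["text"]
            # Carry attachments over so nothing further is appended after them.
            for key in ("files", "reactions", "reply_count"):
                if msg.get(key):
                    prev[key] = msg[key]
            ts_map[msg["ts"]] = prev["ts"]
        else:
            merged.append(dict(msg))  # copy so the input is not mutated
    # Replies that pointed at an appended message now point at its host.
    for msg in merged:
        if msg.get("thread_ts") in ts_map:
            msg["thread_ts"] = ts_map[msg["thread_ts"]]
    return merged
```

The length check runs before appending, so a merged message never exceeds `max_len`; a message that would overflow simply starts a new post.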

I have a very scrappy diff for this, which I might attach later. It's probably too scrappy for a merge request, though.

Best,
Jonathan

@Checkroth

I wonder if you couldn't also just create multiple webhooks for each channel and loop over them. Each message would still be sent in sequence, but each webhook would get a resting period. That would move the bottleneck from waiting for the rate limit to the actual response time of the webhook for each message.
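
A rough sketch of that rotation, assuming an async sender. `post` is a hypothetical stand-in for the real webhook call, and the webhook identifiers are placeholders.

```python
import asyncio
from itertools import cycle


async def post(webhook: str, message: str) -> tuple[str, str]:
    # Hypothetical stand-in for an HTTP POST to a Discord webhook.
    await asyncio.sleep(0)
    return (webhook, message)


async def import_channel_round_robin(
    webhooks: list[str], messages: list[str]
) -> list[tuple[str, str]]:
    log = []
    # cycle() hands out webhooks in rotation: w0, w1, ..., w0, w1, ...
    for webhook, message in zip(cycle(webhooks), messages):
        # Awaiting each send before the next keeps messages strictly in order.
        log.append(await post(webhook, message))
    return log
```

With N webhooks in the rotation, each one sees roughly 1/N of the channel's requests.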

Checkroth added a commit to Checkroth/slack-to-discord that referenced this issue Jan 11, 2025
This takes the suggestion from gh issue pR0Ps#35 and runs with it as-is. _import_channel contains the exact code that used to live in the per-channel for loop block, now as its own async function, and all channels are run concurrently as a task group.

This PR also builds on top of the previous commit that adds simple JSON file-based state management, allowing idempotent re-runs. State management is a blocker for this task because channels import at unpredictable speeds -- a failure would leave each channel imported up to a potentially different timestamp, rendering the start date parameter useless.
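
For illustration only, per-channel state of this kind might be kept as below. The file layout (channel name mapped to the last imported `ts`) and the function names are assumptions, not taken from the commit.

```python
import json
from decimal import Decimal
from pathlib import Path
from typing import Optional


def load_state(path: Path) -> dict[str, str]:
    # Maps channel name -> ts of the last message successfully imported.
    if path.exists():
        return json.loads(path.read_text())
    return {}


def save_state(path: Path, state: dict[str, str]) -> None:
    # Write to a temp file, then rename, so a crash cannot leave a
    # half-written state file behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    tmp.replace(path)


def pending_messages(messages: list[dict], last_ts: Optional[str]) -> list[dict]:
    # Slack ts values are decimal strings; Decimal avoids float rounding.
    if last_ts is None:
        return list(messages)
    cutoff = Decimal(last_ts)
    return [m for m in messages if Decimal(m["ts"]) > cutoff]
```

Saving the state after each successfully sent message lets a re-run skip everything already imported for that channel, regardless of how far the other channels got.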
Checkroth added a commit to Checkroth/slack-to-discord that referenced this issue Jan 13, 2025