Splitting tokenization part to reduce time #432
yusufcakmakk started this conversation in Ideas
Replies: 1 comment
-
This can be achieved by modifying the code here.
-
Hi all,
I have a suggestion for the tokenization part here. Currently, when we start the process with a different block size, all tokens are computed again from scratch and only then grouped. Instead, the tokenization step could run once, and the tokenized output could be reused and grouped according to each block size.
What do you think about this improvement?
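As a rough illustration of the split, here is a minimal sketch assuming the Hugging Face `datasets` and `transformers` APIs; the function names (`tokenize_function`, `group_texts`) and the sample dataset are only illustrative, not the repo's actual code:

```python
# Sketch: tokenize once, then group into blocks separately per block size.
# Assumes `datasets` and `transformers` are installed; names are illustrative.
from itertools import chain

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
raw_datasets = load_dataset("wikitext", "wikitext-2-raw-v1")

def tokenize_function(examples):
    # Expensive step, but it no longer depends on block_size,
    # so it runs (and is cached) only once.
    return tokenizer(examples["text"])

tokenized_datasets = raw_datasets.map(
    tokenize_function,
    batched=True,
    remove_columns=raw_datasets["train"].column_names,
)

def group_texts(examples, block_size):
    # Concatenate all token sequences, then split into fixed-size blocks.
    concatenated = {k: list(chain(*examples[k])) for k in examples.keys()}
    total_length = (len(concatenated["input_ids"]) // block_size) * block_size
    result = {
        k: [v[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, v in concatenated.items()
    }
    result["labels"] = result["input_ids"].copy()
    return result

# Only this cheap grouping step is repeated per block size;
# the tokenized dataset above is reused across runs.
for block_size in (512, 1024):
    lm_dataset = tokenized_datasets.map(
        lambda ex: group_texts(ex, block_size),
        batched=True,
    )
```

With this split, changing the block size only re-runs the grouping map, while the tokenized dataset is picked up from the cache.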