Write simple additional performance tests #105

Draft · wants to merge 4 commits into base: main

Conversation


@spigo900 spigo900 commented Dec 4, 2024

Addresses part of #45.

cc @benlebrun

(locally, anyway)

This counts the overall test time, including the time to load the model. If
I find a way to load the model once for the entire benchmark, that overhead
will go away and I can scale up the test "size" again.

spigo900 commented Dec 4, 2024

@benlebrun As it stands:

  1. I run the tests using `make benchmark` to run them all, or `pytest perf_tests/test_inference -k test_name_substring` to run a single test.
  2. The long sequences test should be short enough -- it ran in 1 minute 45 seconds locally, total, when I ran only that test.
  3. The permissive grammar test takes ~5 minutes as it stands, so I would shorten the token limit from 100 tokens to 40 and adjust from there.
  4. I don't know where the "many particles" benchmark stands.
  5. The benchmarks seem to take longer than the benchmark printout reports, probably because of time spent loading the model. Putting model loading into a fixture might fix this, but I don't know if there is a nice way to do that, because each test needs its own grammar in the inference setup.

So, the tests are in place but the parameters still need adjusting, and we might be able to increase the test parameters if we can factor out the model loading/inference setup somehow.
