This repository has been archived by the owner on Dec 9, 2024. It is now read-only.

Support/README for lnx4555 #363

Merged
merged 1 commit into from
Jan 25, 2024

Conversation

GNiendorf
Member

Small PR that adds the relevant flags to optimize for the L4 and L40 GPUs on lnx4555 and updates the README with the relevant info. Below is the timing for the L4. I also checked the timing with more streams, but it plateaus at 8 streams.

[Image: Screenshot 2024-01-22 at 11 07 08 AM]

Contributor

@VourMa VourMa left a comment

The PR looks good to me. I will merge tonight unless someone has comments or suggestions in the meantime.

@GNiendorf
Member Author

/run standalone

The PR was built and run successfully in standalone mode. Here are some of the comparison plots.

[Images: efficiency vs pT comparison, efficiency vs eta comparison, fake rate vs pT comparison, fake rate vs eta comparison, duplicate rate vs pT comparison, duplicate rate vs eta comparison]

The full set of validation and comparison plots can be found here.

@VourMa VourMa merged commit 25e08e7 into master Jan 25, 2024
2 checks passed
@GNiendorf GNiendorf deleted the lnx4555_startup branch March 26, 2024 14:05