
Replace Second Delta Beta Cut with DNN #126

Conversation

@GNiendorf (Member) commented Nov 19, 2024

On the 1000-event RelVal plots (not shown here yet), this PR gives higher efficiency at a lower fake rate. However, considering that issue #123 was fixed, I think it only makes sense to merge this PR if I can replace both delta beta cuts with a performance improvement. Leaving this PR as a draft for now.

This is a continuation of PR #122, except that here I heavily downsample 80%-matched tracks during training to get better overall performance and a DNN cut that behaves more like the existing delta beta cuts.
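For reference, a minimal sketch of what the downsampling step could look like (this is not the actual training notebook; the array names, the 0.8 match-fraction convention, and the keep fraction are all assumptions):

```python
# Hypothetical sketch: downsample the 80%-matched class before training so the
# loss is not dominated by it. Names and thresholds are illustrative only.
import numpy as np

rng = np.random.default_rng(42)

def downsample_matched(features, match_frac, keep_frac=0.2):
    """Keep every track except a random `keep_frac` subset of those whose
    truth-match fraction is 0.8 (the '80% matched' class)."""
    is_80 = np.isclose(match_frac, 0.8)
    keep = ~is_80 | (rng.random(len(match_frac)) < keep_frac)
    return features[keep], match_frac[keep]

# Toy example with random data, just to show the shapes.
X = rng.normal(size=(10_000, 10))
frac = rng.choice([0.0, 0.8, 1.0], size=10_000, p=[0.3, 0.5, 0.2])
X_ds, frac_ds = downsample_matched(X, frac)
print(len(X), "->", len(X_ds))
```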

Edit: It looks like this method of downsampling 80%-matched tracks works for the second delta beta cut, but not for the first. The more fundamental issue seems to be network size when training on the larger sample: increasing the network size substantially fixes the issue without any downsampling, but I think it would take a substantial amount of effort to make a larger model whose timing is competitive with the current hybrid approach. Closing this PR for now. I'm going to shift my focus to loading the weights properly and hopefully getting a timing improvement with lower-precision weights.

@GNiendorf (Member Author)

/run all


The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

[Comparison plots: efficiency, fake rate, and duplicate rate vs pT and eta]

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     43.4    319.4    115.6     73.3    118.6    508.7    126.4    138.2    140.2      2.4    1586.1    1034.1+/- 275.2     434.9   explicit_cache[s=4] (target branch)
   avg     46.7    324.9    116.8     73.9    114.1    498.2    125.3    137.9    144.9      3.1    1585.7    1040.8+/- 272.3     434.5   explicit_cache[s=4] (this PR)


The PR was built and ran successfully with CMSSW. Here are some plots.

[OOTB All Tracks: efficiency and fake rate vs pT, eta, and phi]

The full set of validation and comparison plots can be found here.

@GNiendorf (Member Author)

@slava77 It looks like this method of downsampling 80%-matched tracks works for the second delta beta cut, but not for the first. The more fundamental issue seems to be network size when training on the larger sample: increasing the network size substantially fixes the issue without any downsampling, but I think it would take a substantial amount of effort to make a larger model whose timing is competitive with the current hybrid approach. Closing this PR for now. I'm going to shift my focus to loading the weights properly and hopefully getting a timing improvement with lower-precision weights.
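For the lower-precision idea, a rough sketch of what this could look like on the training side (the architecture below is a placeholder, not the actual T5 DNN from this PR):

```python
# Hedged sketch: cast a trained PyTorch MLP to float16 before exporting its
# weights, e.g. for embedding into the C++/CUDA side. Layer sizes are made up.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)

# After training, convert all parameters to half precision.
model_fp16 = model.half()

# Export the weights as numpy arrays for later serialization.
weights = {name: p.detach().cpu().numpy() for name, p in model_fp16.named_parameters()}
for name, w in weights.items():
    print(name, w.dtype, w.shape)
```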

@GNiendorf closed this Nov 21, 2024
@GNiendorf (Member Author)

404795f

Here is the big DNN code with the associated training notebook, for future reference.

@slava77 commented Nov 21, 2024

> 404795f
>
> Here is the big DNN code with the associated training notebook, for future reference.

Two more layers and double the hidden features, right?
How much slower is it?

@GNiendorf (Member Author) commented Nov 21, 2024

> Two more layers and double the hidden features, right? How much slower is it?

The number of parameters increases by a factor of ~7.6x, and the T5 timing increases by a factor of 12x (1 ms -> 12 ms). I didn't check whether a smaller network would also fix the issue, though.
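For context, a quick way to sanity-check parameter-count ratios like this is to count the weights and biases of the dense stack directly. The layer widths below are purely hypothetical, since the exact sizes of the small and big networks are not spelled out in this thread:

```python
# Hypothetical sketch: parameter count of a fully connected network given its
# list of layer widths (input, hidden..., output).
def mlp_params(widths):
    """Weights + biases of a dense stack, e.g. widths = [10, 32, 32, 1]."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths[:-1], widths[1:]))

small = mlp_params([10, 32, 32, 1])          # hypothetical baseline
big   = mlp_params([10, 64, 64, 64, 64, 1])  # doubled width, two extra layers
print(small, big, f"ratio ~{big / small:.1f}x")
```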

@slava77 commented Nov 22, 2024

> Two more layers and double the hidden features, right? How much slower is it?
>
> The number of parameters increases by a factor of ~7.6x, and the T5 timing increases by a factor of 12x (1 ms -> 12 ms). I didn't check whether a smaller network would also fix the issue, though.

Running a profiler may help.
2x more hidden features would give roughly 2^3 in the matrix computation, and with 2x the layers that's ~16x. Still, I thought the DNN computation was not the leading term within the original 1 ms.
I suspect memory constraints, i.e. that the new weights no longer fit in the SM.
Perhaps only 50% more hidden features would fit (at least for one per-layer matrix).
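As a rough illustration of the memory question (not from the PR itself; the layer widths and the memory budget below are assumptions), one can estimate the raw weight storage and compare it to a typical per-block shared memory limit, which is on the order of a few tens of KiB on current NVIDIA GPUs:

```python
# Hypothetical sketch: raw weight storage for assumed layer widths, in fp32 vs fp16.
def weight_bytes(widths, bytes_per_param):
    params = sum(n_in * n_out + n_out for n_in, n_out in zip(widths[:-1], widths[1:]))
    return params * bytes_per_param

for label, widths in [("baseline (hypothetical)", [10, 32, 32, 1]),
                      ("bigger (hypothetical)",   [10, 64, 64, 64, 64, 1])]:
    print(label,
          f"fp32: {weight_bytes(widths, 4) / 1024:.1f} KiB,",
          f"fp16: {weight_bytes(widths, 2) / 1024:.1f} KiB")
# Whether a given configuration fits depends on the real widths and on how much
# shared memory the rest of the kernel already uses.
```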
