Skip to content
This repository has been archived by the owner on Dec 9, 2024. It is now read-only.

Move Maps to CSV Based Method #374

Merged
merged 7 commits into from
Mar 17, 2024
Merged

Move Maps to CSV Based Method #374

merged 7 commits into from
Mar 17, 2024

Conversation

GNiendorf
Copy link
Member

@GNiendorf GNiendorf commented Mar 6, 2024

This PR is associated with the corresponding PR on LSTGeometry: SegmentLinking/LSTGeometry#10

  1. Deleted all files in the current /data/ directory.
  2. Made a new README in the /data/ directory, and put the new module maps in the /OT800_IT615_pt0.8/ directory in /data/.
  3. Moved any imports in the code (in trkCore.cc and LST.cc) to the new map's directory.
  4. Improved the loading functions in TiltedGeometry.cc and EndcapGeometry.cc by getting rid of variables which were imported but never used, and made the import functions more robust to prevent similar issues described in Endcap Orientation Load Failure #375.
  5. Got rid of the if statement below, since I checked and confirmed it is never used and therefore not needed anymore (i.e. All endcap detid's are present in the endcap map now, as they should be).

    TrackLooper/SDL/Hit.h

    Lines 256 to 258 in 1b7eccf

    // Unclear why these are not in map, but CPU map returns phi = 0 for all exceptions.
    if (found_index != -1)
    phi = geoMapPhi[found_index];
  6. Updated pix_tot and endcap_size in Constants.h as required by the new maps. endcap_size went down by 1 because in the previous maps there was a comment with the column names at the top of the file being incorrectly loaded in as actual data. pix_tot changed slightly because of new CSV based geometry leading to slightly different pixel maps.
  7. Made naming improvement changes by changing the badly name "slope" variables to "dxdy" or something similar just like we have "drdz" variables.

@GNiendorf
Copy link
Member Author

/run standalone

Copy link

github-actions bot commented Mar 6, 2024

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

@GNiendorf
Copy link
Member Author

GNiendorf commented Mar 6, 2024

I think there is an error on the master branch. The code expects that the centroid values are attached to the endcap orientation file, see below.

ss >> detid >> avgr2 >> yl >> sl >> yh >> sh >> cr >> cp >> cz;
// std::cout << " detid: " << detid << " avgr2: " << avgr2 << " yl: " << yl << " sl: " << sl << " yh: " << yh << " sh: " << sh << std::endl;
avgr2s_[detid] = avgr2;
yls_[detid] = yl;
sls_[detid] = sl;
yus_[detid] = yh;
sus_[detid] = sh;
centroid_rs_[detid] = cp;
centroid_phis_[detid] = cr;
centroid_zs_[detid] = cz;

But if you look at the file the master branch is using,

TString::Format("%s/data/endcap_orientation_data_CMSSW_12_2_0_pre2.txt", trackLooperDir().Data()).Data());

# detid average_r2s y_intercept_low_hits slope_low_hits y_intercept_high_hits slope_high_hits
411591806 2220.16867469 -158.103584386 3.29655422422 -166.217904238 3.29655904166
411591802 2224.28374024 -97.3598468111 1.87086918017 -102.356491661 1.87086915128

It doesn't have the centroid values appended and so it is filling those values with 0's. The centroid phi's are used in the endcap map, so those values are used by the algorithm. @VourMa I'll fix this in this PR.

@GNiendorf GNiendorf linked an issue Mar 6, 2024 that may be closed by this pull request
@GNiendorf
Copy link
Member Author

/run cmssw

Copy link

github-actions bot commented Mar 6, 2024

There was a problem while building and running with CMSSW. The logs can be found here.

@GNiendorf
Copy link
Member Author

GNiendorf commented Mar 6, 2024

There was a problem while building and running with CMSSW. The logs can be found here.

Looks like the master branch failed? @ariostas

@ariostas
Copy link
Member

ariostas commented Mar 6, 2024

Huh that's weird. I'm just gonna rerun the job and see if it works.

Copy link

github-actions bot commented Mar 6, 2024

There was a problem while building and running with CMSSW. The logs can be found here.

@ariostas
Copy link
Member

ariostas commented Mar 6, 2024

I'm not sure what's happening. I'll look into it.

@ariostas
Copy link
Member

ariostas commented Mar 6, 2024

Oh it's because since SegmentLinking/cmssw#16 CMSSW uses the LST headers, including Constants.h. So, since some numbers changed there, it would have to be recompiled. I'll implement that since it might be good to always recompile the LST part between PRs and master runs. I'll let you know when it's done.

@slava77
Copy link
Contributor

slava77 commented Mar 6, 2024

was there a good reason to put new txt files in data/output ? why not directly in data/ ?
output is not a particularly meaningful additional layer.

@slava77
Copy link
Contributor

slava77 commented Mar 6, 2024

was there a good reason to put new txt files in data/output ? why not directly in data/ ? output is not a particularly meaningful additional layer.

also, with 2M lines proposed to be added, perhaps it's time to think of migrating to binary files from plain text for numbers.

@GNiendorf
Copy link
Member Author

GNiendorf commented Mar 7, 2024

was there a good reason to put new txt files in data/output ? why not directly in data/ ? output is not a particularly meaningful additional layer.

This and other stuff will get cleaned up in the next commit. Still a draft PR for now.

@GNiendorf
Copy link
Member Author

All CSV, This PR: Here
All CSV, but with old endcap file (i.e. with bug above): Here
Comparison plots: Here

I'm just showing in the plots linked above that almost all of the performance differences we see are coming from fixing the bug in #375. The difference in switching from hit-based to CSV is small and only present in a couple of bins, as I've shown before.

@GNiendorf
Copy link
Member Author

GNiendorf commented Mar 7, 2024

I removed all of the old geometry files in the /data/ folder and moved the new CSV files from /output/ to /OT800_IT615_pt0.8/ with a new README file in /data/.

I'll look at @slava77's request now of moving from .txt files to binary files.

Copy link

github-actions bot commented Mar 7, 2024

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

@GNiendorf
Copy link
Member Author

/run standalone

@GNiendorf GNiendorf marked this pull request as ready for review March 8, 2024 19:13
@GNiendorf GNiendorf requested a review from VourMa March 8, 2024 19:13
Copy link

github-actions bot commented Mar 8, 2024

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

@GNiendorf
Copy link
Member Author

/run standalone

Copy link

github-actions bot commented Mar 8, 2024

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

@ariostas
Copy link
Member

/run cmssw CMSSW_13_3_0_pre3_LST_X

Just making sure that the CMSSW workflow works well for 100 events.

Also, just leaving an extra reminder to squash and merge since there are some very large intermediate diffs.

Copy link

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

@GNiendorf
Copy link
Member Author

/run cmssw CMSSW_13_3_0_pre3_LST_X

Just making sure that the CMSSW workflow works well for 100 events.

Also, just leaving an extra reminder to squash and merge since there are some very large intermediate diffs.

Did you get rid of the comparison to master? Or is that in a different command.

Copy link

There was a problem while building and running with CMSSW. The logs can be found here.

@ariostas
Copy link
Member

Did you get rid of the comparison to master? Or is that in a different command.

Now everything is switched to CMSSW_14_1_0_pre0, so I had to specify the older branch CMSSW_13_3_0_pre3_LST_X. When running with a non-default branch it doesn't run comparisons with master since there might be many things that changed so pretty much everything would have to be done again.

@VourMa
Copy link
Contributor

VourMa commented Mar 13, 2024

@GNiendorf While I am starting to look into the PR, could you please update the PR description with the actual changes that go into this PR? It is a bit hard to understand when there are 53 files changed and issues and fixes are mentioned in a incoherent way across O(30) comments. Thank you.

Copy link
Contributor

@VourMa VourMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good as well. Apart from the comments, I have a couple of general questions:

  1. What is the plan on replacing the .txt with binary files? Is it something for this PR?
  2. My very naive expectation would be no changes in physics performance, so I can't say it is clear to me why validation plots show differences. Could you please explain?

SDL/Constants.h Show resolved Hide resolved
SDL/EndcapGeometry.cc Outdated Show resolved Hide resolved
@GNiendorf
Copy link
Member Author

Summary of changes @VourMa:

  1. Deleted all files in the current /data/ directory.
  2. Made a new README in the /data/ directory, and put the new module maps in the /OT800_IT615_pt0.8/ directory in /data/.
  3. Moved any imports in the code (in trkCore.cc and LST.cc) to the new map's directory.
  4. Improved the loading functions in TiltedGeometry.cc and EndcapGeometry.cc by getting rid of variables which were imported but never used, and made the import functions more robust to prevent similar issues described in Endcap Orientation Load Failure #375.
  5. Got rid of the if statement below, since I checked and confirmed it is never used and therefore not needed anymore (i.e. All endcap detid's are present in the endcap map now, as they should be).

    TrackLooper/SDL/Hit.h

    Lines 256 to 258 in 1b7eccf

    // Unclear why these are not in map, but CPU map returns phi = 0 for all exceptions.
    if (found_index != -1)
    phi = geoMapPhi[found_index];
  6. Updated pix_tot and endcap_size in Constants.h as required by the new maps. endcap_size went down by 1 because in the previous maps there was a comment with the column names at the top of the file being incorrectly loaded in as actual data. pix_tot changed slightly because of new CSV based geometry leading to slightly different pixel maps.
  7. Made naming improvement changes by changing the badly name "slope" variables to "dxdy" or something similar just like we have "drdz" variables.

@GNiendorf
Copy link
Member Author

Thanks, this looks good as well. Apart from the comments, I have a couple of general questions:

  1. What is the plan on replacing the .txt with binary files? Is it something for this PR?
  2. My very naive expectation would be no changes in physics performance, so I can't say it is clear to me why validation plots show differences. Could you please explain?

We talked about binary files in one of the weekly meetings and came to the conclusion that it should be left for a different PR in the future. Also for number 2, the performance changes are coming from the bug I fixed in #375, where the endcap maps expected that the centroid phi position was being loaded but it wasn't, so those values were all being set to 0's.

@GNiendorf
Copy link
Member Author

/run standalone
/run cmssw

Copy link

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

Copy link

There was a problem while building and running with CMSSW. The logs can be found here.

1 similar comment
Copy link

There was a problem while building and running with CMSSW. The logs can be found here.

@ariostas
Copy link
Member

A fatal system signal has occurred: segmentation violation
./run.sh: line 76: 4133 Segmentation fault (core dumped) cmsRun step3_RAW2DIGI_RECO_VALIDATION_DQM.py

I'm not sure why there's a segfault. I'll take a look.

@GNiendorf
Copy link
Member Author

@VourMa I thought there was another commit I had to make, but it looks like this PR is ready to be merged in as well (unless I missed something).

Copy link
Contributor

@VourMa VourMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the replies to the comments, @GNiendorf. Approving and merging!

@VourMa VourMa merged commit e2a0a04 into master Mar 17, 2024
2 of 3 checks passed
@GNiendorf GNiendorf deleted the csv_maps branch March 26, 2024 14:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Endcap Orientation Load Failure
4 participants