-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Embedding update for run3 #47299
base: master
Are you sure you want to change the base?
Embedding update for run3 #47299
Conversation
…orStateFilter during HLT step
…r_run3. This fixes one problem embedding has with some hlt filters.
cms-bot internal usage |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43620 Code check has found code style and quality issues which could be resolved by applying following patch(s)
|
@@ -639,6 +608,19 @@ def customiseGenerator_HLT(process, changeProcessname=True, reselect=False): | |||
process.embeddingHltPixelVertices.clone() | |||
) | |||
|
|||
# replace the original detector state filters in the HLT with a dummy module | |||
process.hltPixelTrackerHVOn = cms.EDFilter("EmbeddingDetectorStateFilter", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to admit I am having difficulties to understand the logic here (@cms-sw/hlt-l2 FYI).
If you want to replace these two filters with something that always pass, I guess you can just use this configuration:
process.hltPixelTrackerHVOn = cms.EDFilter("HLTBool",
result = cms.bool(True)
)
without the need of creating a new module EmbeddingDetectorStateFilter
that does nothing.
But more in general can you explain what goes wrong with the regular module we use at HLT?
In the embedded event from where do the unpacked online metadata digis come from? From the real data event or the embedded MC event?
For the record, in the current menu we configure these modules as:
fragment.hltPixelTrackerHVOn = cms.EDFilter( "DetectorStateFilter",
DebugOn = cms.untracked.bool( False ),
DetectorType = cms.untracked.string( "pixel" ),
acceptedCombinations = cms.untracked.vstring( ),
DcsStatusLabel = cms.untracked.InputTag( "" ),
DCSRecordLabel = cms.untracked.InputTag( "hltOnlineMetaDataDigis" )
)
fragment.hltStripTrackerHVOn = cms.EDFilter( "DetectorStateFilter",
DebugOn = cms.untracked.bool( False ),
DetectorType = cms.untracked.string( "sistrip" ),
acceptedCombinations = cms.untracked.vstring( ),
DcsStatusLabel = cms.untracked.InputTag( "" ),
DCSRecordLabel = cms.untracked.InputTag( "hltOnlineMetaDataDigis" )
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for giving us this hint. We are going to change the implementation according to your proposal! The reason why we introduced a module which always evaluates to True
is an issue with the DetectorStateFilter
when processing embedded events. The two filters reject 100% of our events. (The results of corresponding study can be found in this presentation: https://indico.cern.ch/event/1389181/contributions/5841911/attachments/2817132/4918642/2024-03-11-tau-cqm-meeting-triggers-in-mu-to-tau-embedded-events.pdf.) To our understanding, this is related to the fact that embedded events are "real data" events, but the detector in the HLT simulation is a simulated detector. Thus, the module always tries to read out the detector status of the real detector, while a simulated detector is present in our HLT simulation of the embedded event. With our implementation, we want to emulate the behavior of this module on MC events, which always pass this filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@moritzmolch is there a cmsDriver that can be used to reproduce such behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you want to achieve with a cmsDriver command.
First of all the tau embedding datasets are completely separated from normal cms or MC datasets.
You can find those datasets for Run2 UL here as described in this twiki.
In those some HLT filters perform worse or have an efficiency of 0%.
This is because the tau embedding method simulates two tau decays in an otherwise empty detector. If the HLT step now runs on these events, it makes perfect sense that some filters that expect a busy background will not work properly in this empty detector with only the tau decays.
This is therefore expected behavior up to a certain point. We solve this problem by switching off these problematic filters or changing them so that they allow all events.
Sorry if this was not clear and feel free to ask if you want to know more about this.
I also plan to give a presentation about the tau embedding method in the next RECO meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the HLT step now runs on these events, it makes perfect sense that some filters that expect a busy background will not work properly in this empty detector with only the tau decays.
OK, but in this PR we're discussing specifically these two filters: hltPixelTrackerHVOn
and hltStripTrackerHVOn
.
These filters don't care about event occupancy. That you have 10k tracks or none at all, it doesn't matter.
What matters is the aggregate DCS state of the detector which can be fully ON, partially ON or fully OFF.
Now, in MC events the filter is designed to accept all the events, see
cmssw/DQM/TrackerCommon/plugins/DetectorStateFilter.cc
Lines 308 to 314 in 2e0d63c
} else { | |
detectorOn_ = true; | |
nSelectedEvents_++; | |
if (verbose_) { | |
edm::LogInfo("DetectorStatusFilter") << "Total MC Events " << nEvents_ << " Selected Events " << nSelectedEvents_ | |
<< " Detector States " << detectorOn_ << std::endl; | |
} |
so you wouldn't need to bypass them.
In real data, it depends on what was the actual state of the detector in that particular event, but assuming you are able to reconstruct taus in it, it looks to me very unlikely that any of the tracker partitions was OFF.
So all in all, I am sorry, but this doesn't make any sense to me.
Now coming back to this:
I'm not sure what you want to achieve with a cmsDriver command.
what I want to achieve is to see for myself the answer to the question I asked at #47299 (comment)
In the embedded event from where do the unpacked online metadata digis come from? From the real data event or the embedded MC event?
as I haven't received any answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, in MC events the filter is designed to accept all the events, ... In real data, it depends on what was the actual state of the detector in that particular event, but assuming you are able to reconstruct taus in it, it looks to me very unlikely that any of the tracker partitions was OFF. So all in all, I am sorry, but this doesn't make any sense to me.
Ah, now this makes sense. Thanks for pointing that out. While looking at the code you linked, I realized that this is decided by evt.isRealData()
, which calls a function in EventAuxiliary. I looked into our embedding files and there isRealData
is always True
. So that could be the reason this trigger performs not as expected.
what I want to achieve is to see for myself the answer to the question I asked at #47299 (comment)
Sorry for this, I'm not an expert on triggers and didn't understand your goal with this.
I will try to answer at my understanding:
But more in general can you explain what goes wrong with the regular module we use at HLT?
I guess it is connected to the evt.isRealData()
as described above.
In the embedded event from where do the unpacked online metadata digis come from? From the real data event or the embedded MC event?
Can you please tell me what are unpacked online metadata digis
?
And for the cmsDriver commands: I have described how to produce embedding samples in my presentation at a TAU POG meeting last week. You can find the slides here and on slide 25 the cmsDriver command for the HLT step. As input file for the HLT step you can use root://xrootd-cms.infn.it//store/user/cwinter/run3_embedding/2022postEE/MuMu/gensim/0_gensim.root
.
And also thanks a lot for your effort to get to the ground of our problems here, I'm very grateful of your help with all this.
* Revert changes of commit 064da79 and replace detector state checks with simple 'True' filters * add a explaining comment --------- Co-authored-by: Chris Winter <[email protected]>
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43634 |
A new Pull Request was created by @winterchristian for master. It involves the following packages:
@civanch, @cmsbuild, @kpedro88, @mdhildreth can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
The current policy/procedure is to try to retain backward compatibility. Removing Run 2 collections would potentially break that. In the offline workflows, we handle per-run changes using Eras. I am not sure if it should be done that way here because of the interplay with HLT. @mmusich ? |
TSG doesn't guarantee one given HLT menu to run in any other release other than it's native release while taking data. |
@kpedro88 , should we agree with this PR? |
Before you merge, I would like to implement what @kpedro88 suggested. So that we could use the same code for different eras. Therefore, I will convert it to draft until I implemented these small changes. |
thanks @winterchristian . It is still not clear to me if anything will break with the current PR version, but it's always better to make sure. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43670 |
Pull request #47299 was updated. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43681 |
Pull request #47299 was updated. |
I thought I have implemented it as @kpedro88 suggested, but it doesn't work. All I'm using the following cmsDriver command: cmsDriver.py TauAnalysis/MCEmbeddingTools/python/EmbeddingPythia8Hadronizer_cfi.py \
--step GEN,SIM,DIGI,L1,DIGI2RAW \
--mc \
--beamspot Realistic25ns13p6TeVEarly2022Collision \
--geometry DB:Extended \
--era Run3 \
--conditions auto:phase1_2022_realistic_postEE \
--eventcontent RAWSIM \
--datatier RAWSIM \
--customise \
TauAnalysis/MCEmbeddingTools/customisers.customiseGenerator_preHLT \
--customise_commands 'process.generator.HepMCFilter.filterParameters.MuMuCut = cms.string("(Mu.Pt > 18 && Had.Pt > 18 && Mu.Eta < 2.2 && Had.Eta < 2.4)");process.generator.HepMCFilter.filterParameters.Final_States = cms.vstring("MuHad");process.generator.nAttempts = cms.uint32(1000);' \
--filein file:lhe_and_cleaned.root \
--fileout file:simulated_and_cleaned_prehlt.root \
-n -1 \
--python_filename generator_preHLT.py and it already fails before loading the input file with the following error: |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43685 |
Pull request #47299 was updated. @civanch, @cmsbuild, @kpedro88, @mdhildreth can you please check and sign again. |
@winterchristian the Eras are defined in a sequential manner, so Run3 builds on Run2_2018: https://github.com/cms-sw/cmssw/blob/master/Configuration/Eras/python/Era_Run3_cff.py; and Run2_2018 builds on Run2_2017: https://github.com/cms-sw/cmssw/blob/master/Configuration/Eras/python/Era_Run2_2018_cff.py; etc. If you want something to happen only for Run2, then you either have to use one of the specific modifiers that gets dropped in Run3, or you have to do something like |
PR description:
This PR updates the tau embedding method (
TauAnalysis/MCEmbeddingTools
) so that it is possible to produce RUN 3 tau embedding samples. It is together with #43871 part of the ongoing effort to produce RUN 3 tau embedding samples.The tau embedding method is used to estimate the genuine di-tau background from data. It is a common method used in analyses with tau leptons. More information about tau embedding can be found in the paper: https://doi.org/10.1088/1748-0221/14/06/P06032
In principle, one can split the method into 4 steps, but technically there are at least 6 steps that have to be executed.
The following things where changed in this pull request: