Embedding update for run3 #47299

winterchristian · 2025-02-07T19:10:55Z

PR description:

This PR updates the tau embedding method (TauAnalysis/MCEmbeddingTools) so that Run 3 tau embedding samples can be produced. Together with #43871, it is part of the ongoing effort to produce Run 3 tau embedding samples.

The tau embedding method is used to estimate the genuine di-tau background from data. It is a common method used in analyses with tau leptons. More information about tau embedding can be found in the paper: https://doi.org/10.1088/1748-0221/14/06/P06032
In principle, one can split the method into 4 steps, but technically there are at least 6 steps that have to be executed.

The following things were changed in this pull request:

- For the embedding merging step: removed some collections that were only needed for Run 2.
- For the embedding generator HLT step: replaced a filter with a simple dummy that ignores the fact that the tau leptons are simulated in an otherwise empty detector.

cmsbuild · 2025-02-07T19:11:19Z

cms-bot internal usage

cmsbuild · 2025-02-07T19:13:19Z

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43620

Code check has found code style and quality issues which could be resolved by applying following patch(s)

code-format:
https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43620/code-format.patch
e.g. curl -k https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43620/code-format.patch | patch -p1
You can also run scram build code-format to apply code format directly

mmusich · 2025-02-07T19:35:19Z

TauAnalysis/MCEmbeddingTools/python/customisers.py

@@ -639,6 +608,19 @@ def customiseGenerator_HLT(process, changeProcessname=True, reselect=False):
        process.embeddingHltPixelVertices.clone()
    )

+    # replace the original detector state filters in the HLT with a dummy module
+    process.hltPixelTrackerHVOn = cms.EDFilter("EmbeddingDetectorStateFilter",


I have to admit I am having difficulty understanding the logic here (@cms-sw/hlt-l2 FYI).
If you want to replace these two filters with something that always passes, I guess you can just use this configuration:

process.hltPixelTrackerHVOn = cms.EDFilter("HLTBool", result = cms.bool(True) )

without the need of creating a new module EmbeddingDetectorStateFilter that does nothing.

More generally, can you explain what goes wrong with the regular module we use at HLT?
In the embedded event, where do the unpacked online metadata digis come from? From the real data event or the embedded MC event?

For the record, in the current menu we configure these modules as:

fragment.hltPixelTrackerHVOn = cms.EDFilter( "DetectorStateFilter",
    DebugOn = cms.untracked.bool( False ),
    DetectorType = cms.untracked.string( "pixel" ),
    acceptedCombinations = cms.untracked.vstring( ),
    DcsStatusLabel = cms.untracked.InputTag( "" ),
    DCSRecordLabel = cms.untracked.InputTag( "hltOnlineMetaDataDigis" )
)
fragment.hltStripTrackerHVOn = cms.EDFilter( "DetectorStateFilter",
    DebugOn = cms.untracked.bool( False ),
    DetectorType = cms.untracked.string( "sistrip" ),
    acceptedCombinations = cms.untracked.vstring( ),
    DcsStatusLabel = cms.untracked.InputTag( "" ),
    DCSRecordLabel = cms.untracked.InputTag( "hltOnlineMetaDataDigis" )
)

moritzmolch · 2025-02-10

Hi, thanks for giving us this hint. We are going to change the implementation according to your proposal! We introduced a module that always evaluates to True because of an issue with the DetectorStateFilter when processing embedded events: the two filters reject 100% of our events. (The results of the corresponding study can be found in this presentation: https://indico.cern.ch/event/1389181/contributions/5841911/attachments/2817132/4918642/2024-03-11-tau-cqm-meeting-triggers-in-mu-to-tau-embedded-events.pdf.) To our understanding, this is related to the fact that embedded events are "real data" events, but the detector in the HLT simulation is a simulated detector. Thus, the module always tries to read out the detector status of the real detector, while a simulated detector is present in our HLT simulation of the embedded event. With our implementation, we want to emulate the behavior of this module on MC events, which always pass this filter.

mmusich · 2025-02-10

@moritzmolch is there a cmsDriver command that can be used to reproduce this behavior?

winterchristian · 2025-02-11

I'm not sure what you want to achieve with a cmsDriver command.
First of all, the tau embedding datasets are completely separate from normal CMS or MC datasets.
You can find those datasets for Run 2 UL here as described in this twiki.
In those datasets, some HLT filters perform worse or have an efficiency of 0%.

This is because the tau embedding method simulates two tau decays in an otherwise empty detector. If the HLT step now runs on these events, it makes perfect sense that some filters that expect a busy background will not work properly in this empty detector with only the tau decays.

This is therefore expected behavior up to a certain point. We solve this problem by switching off these problematic filters or changing them so that they allow all events.

Sorry if this was not clear and feel free to ask if you want to know more about this.
I also plan to give a presentation about the tau embedding method in the next RECO meeting.

mmusich · 2025-02-11

> If the HLT step now runs on these events, it makes perfect sense that some filters that expect a busy background will not work properly in this empty detector with only the tau decays.

OK, but in this PR we're discussing specifically these two filters: hltPixelTrackerHVOn and hltStripTrackerHVOn.
These filters don't care about event occupancy: whether you have 10k tracks or none at all doesn't matter.
What matters is the aggregate DCS state of the detector, which can be fully ON, partially ON or fully OFF.
Now, in MC events the filter is designed to accept all the events, see

cmssw/DQM/TrackerCommon/plugins/DetectorStateFilter.cc

Lines 308 to 314 in 2e0d63c

  } else {
    detectorOn_ = true;
    nSelectedEvents_++;
    if (verbose_) {
      edm::LogInfo("DetectorStatusFilter") << "Total MC Events " << nEvents_ << " Selected Events " << nSelectedEvents_
                                           << " Detector States " << detectorOn_ << std::endl;
    }

so you wouldn't need to bypass them.
In real data, it depends on what was the actual state of the detector in that particular event, but assuming you are able to reconstruct taus in it, it looks to me very unlikely that any of the tracker partitions was OFF.
So all in all, I am sorry, but this doesn't make any sense to me.

Now coming back to this:

> I'm not sure what you want to achieve with a cmsDriver command.

what I want to achieve is to see for myself the answer to the question I asked at #47299 (comment)

> In the embedded event, where do the unpacked online metadata digis come from? From the real data event or the embedded MC event?

as I haven't received any answer.

winterchristian · 2025-02-12

> Now, in MC events the filter is designed to accept all the events, ... In real data, it depends on what was the actual state of the detector in that particular event, but assuming you are able to reconstruct taus in it, it looks to me very unlikely that any of the tracker partitions was OFF. So all in all, I am sorry, but this doesn't make any sense to me.

Ah, now this makes sense. Thanks for pointing that out. While looking at the code you linked, I realized that this is decided by evt.isRealData(), which calls a function in EventAuxiliary. I looked into our embedding files, and there isRealData is always True. So that could be the reason why these filters do not perform as expected.
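The suspicion above can be illustrated with a small toy model in plain Python. It is only a sketch of the branch structure visible in the quoted DetectorStateFilter.cc lines, not the CMSSW implementation: ToyEvent, tracker_hv_on and toy_detector_state_filter are hypothetical names, and treating missing DCS information as "detector off" is an assumption that this thread does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyEvent:
    """Hypothetical stand-in for an event; not a CMSSW class."""
    is_real_data: bool
    # None means that no DCS information is available for this event.
    tracker_hv_on: Optional[bool]


def toy_detector_state_filter(event: ToyEvent) -> bool:
    """Mimic the data/MC branching discussed in the thread, nothing more."""
    if event.is_real_data:
        # Data branch: the decision depends on the recorded detector state.
        # Assumption: missing DCS information counts as "not on".
        return bool(event.tracker_hv_on)
    # MC branch: every event is accepted (detectorOn_ = true in the snippet).
    return True


mc_event = ToyEvent(is_real_data=False, tracker_hv_on=None)
embedded_event = ToyEvent(is_real_data=True, tracker_hv_on=None)

print(toy_detector_state_filter(mc_event))        # True
print(toy_detector_state_filter(embedded_event))  # False
```

Under these assumptions, an embedded event that is flagged as real data but carries no usable detector state ends up in the data branch and is rejected, while a plain MC event always passes.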

> what I want to achieve is to see for myself the answer to the question I asked at #47299 (comment)

Sorry for this, I'm not an expert on triggers and didn't understand your goal with this.
I will try to answer to the best of my understanding:

> More generally, can you explain what goes wrong with the regular module we use at HLT?

I guess it is connected to evt.isRealData(), as described above.

> In the embedded event, where do the unpacked online metadata digis come from? From the real data event or the embedded MC event?

Can you please tell me what unpacked online metadata digis are?

And for the cmsDriver commands: I have described how to produce embedding samples in my presentation at a TAU POG meeting last week. You can find the slides here and on slide 25 the cmsDriver command for the HLT step. As input file for the HLT step you can use root://xrootd-cms.infn.it//store/user/cwinter/run3_embedding/2022postEE/MuMu/gensim/0_gensim.root.

And also thanks a lot for your effort to get to the bottom of our problems here. I'm very grateful for your help with all this.

cmsbuild · 2025-02-10T13:49:42Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43634

Found files with invalid states:
- TauAnalysis/MCEmbeddingTools/plugins/EmbeddingDetectorStateFilter.cc:
  - Added: 878160f
  - Deleted: d27b7fe

cmsbuild · 2025-02-10T13:50:04Z

A new Pull Request was created by @winterchristian for master.

It involves the following packages:

TauAnalysis/MCEmbeddingTools (simulation)

@civanch, @cmsbuild, @kpedro88, @mdhildreth can you please review it and eventually sign? Thanks.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

kpedro88 · 2025-02-10T13:56:04Z

please test

kpedro88 · 2025-02-10T13:57:04Z

The current policy/procedure is to try to retain backward compatibility. Removing Run 2 collections would potentially break that. In the offline workflows, we handle per-run changes using Eras. I am not sure if it should be done that way here because of the interplay with HLT. @mmusich ?

mmusich · 2025-02-10T14:00:14Z

> I am not sure if it should be done that way here because of the interplay with HLT.

TSG doesn't guarantee that a given HLT menu will run in any release other than its native release while taking data.
Having said that, I don't see any removal of HLT collections in this PR.

civanch · 2025-02-11T15:44:50Z

@kpedro88 , should we agree with this PR?

winterchristian · 2025-02-11T16:55:40Z

Before you merge, I would like to implement what @kpedro88 suggested, so that we can use the same code for different eras. Therefore, I will convert this PR to a draft until I have implemented these small changes.

kpedro88 · 2025-02-11T16:57:02Z

thanks @winterchristian . It is still not clear to me if anything will break with the current PR version, but it's always better to make sure.

cmsbuild · 2025-02-12T18:43:01Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43670

Found files with invalid states:
- TauAnalysis/MCEmbeddingTools/plugins/EmbeddingDetectorStateFilter.cc:
  - Added: 878160f
  - Deleted: d27b7fe

cmsbuild · 2025-02-12T18:43:25Z

Pull request #47299 was updated.

cmsbuild · 2025-02-13T16:47:26Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43681

Found files with invalid states:
- TauAnalysis/MCEmbeddingTools/plugins/EmbeddingDetectorStateFilter.cc:
  - Added: 878160f
  - Deleted: d27b7fe

cmsbuild · 2025-02-13T16:47:52Z

Pull request #47299 was updated.

winterchristian · 2025-02-13T19:43:40Z

I thought I had implemented it as @kpedro88 suggested, but it doesn't work. All toModify() functions are executed even if I use the newest CMSSW version with Run 3 conditions. Can you please tell me what I've done wrong?

I'm using the following cmsDriver command:

cmsDriver.py TauAnalysis/MCEmbeddingTools/python/EmbeddingPythia8Hadronizer_cfi.py \
    --step GEN,SIM,DIGI,L1,DIGI2RAW \
    --mc \
    --beamspot Realistic25ns13p6TeVEarly2022Collision \
    --geometry DB:Extended \
    --era Run3 \
    --conditions auto:phase1_2022_realistic_postEE \
    --eventcontent RAWSIM \
    --datatier RAWSIM \
    --customise \
    TauAnalysis/MCEmbeddingTools/customisers.customiseGenerator_preHLT \
    --customise_commands 'process.generator.HepMCFilter.filterParameters.MuMuCut = cms.string("(Mu.Pt > 18 && Had.Pt > 18 && Mu.Eta < 2.2 && Had.Eta < 2.4)");process.generator.HepMCFilter.filterParameters.Final_States = cms.vstring("MuHad");process.generator.nAttempts = cms.uint32(1000);' \
    --filein file:lhe_and_cleaned.root \
    --fileout file:simulated_and_cleaned_prehlt.root \
    -n -1 \
    --python_filename generator_preHLT.py

and it already fails before loading the input file with the following error: AttributeError: 'PSet' object has no attribute 'castor'

cmsbuild · 2025-02-13T19:50:23Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47299/43685

Found files with invalid states:
- TauAnalysis/MCEmbeddingTools/plugins/EmbeddingDetectorStateFilter.cc:
  - Added: 878160f
  - Deleted: d27b7fe

cmsbuild · 2025-02-13T19:50:49Z

Pull request #47299 was updated. @civanch, @cmsbuild, @kpedro88, @mdhildreth can you please check and sign again.

kpedro88 · 2025-02-13T19:56:54Z

@winterchristian the Eras are defined in a sequential manner, so Run3 builds on Run2_2018: https://github.com/cms-sw/cmssw/blob/master/Configuration/Eras/python/Era_Run3_cff.py; and Run2_2018 builds on Run2_2017: https://github.com/cms-sw/cmssw/blob/master/Configuration/Eras/python/Era_Run2_2018_cff.py; etc. If you want something to happen only for Run2, then you either have to use one of the specific modifiers that gets dropped in Run3, or you have to do something like (run2_common & ~run3_common).toModify(...).
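To make the sequential behaviour concrete, here is a toy model in plain Python. It is a sketch only and does not use the CMSSW Modifier classes: ToyModifier, choose, to_modify and the parameter keep_run2_collections are hypothetical names, and the toy Run 3 era is assumed to activate both run2_common and run3_common, as the explanation above implies.

```python
class ToyModifier:
    """Hypothetical stand-in for an era modifier; not the CMSSW class."""

    def __init__(self, predicate=None):
        self._chosen = False
        # A plain modifier reports its own flag; a combined one evaluates
        # the predicate built by the & and ~ operators below.
        self._predicate = predicate or (lambda: self._chosen)

    def choose(self):
        self._chosen = True

    def is_chosen(self):
        return self._predicate()

    def __and__(self, other):
        return ToyModifier(lambda: self.is_chosen() and other.is_chosen())

    def __invert__(self):
        return ToyModifier(lambda: not self.is_chosen())

    def to_modify(self, params, **changes):
        # Apply the changes only if this modifier is active.
        if self.is_chosen():
            params.update(changes)


run2_common = ToyModifier()
run3_common = ToyModifier()

# Assumption: the Run 3 era builds on the Run 2 eras, so both are active.
for modifier in (run2_common, run3_common):
    modifier.choose()

plain = {"keep_run2_collections": False}
run2_common.to_modify(plain, keep_run2_collections=True)

guarded = {"keep_run2_collections": False}
(run2_common & ~run3_common).to_modify(guarded, keep_run2_collections=True)

print(plain)    # {'keep_run2_collections': True}
print(guarded)  # {'keep_run2_collections': False}
```

In this toy setup, the plain run2_common modification still fires under the Run 3 era, which matches the symptom reported above, while the combined (run2_common & ~run3_common) form leaves the parameters untouched.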

Christian Winter and others added 6 commits February 7, 2025 18:01

remove not necessary merging steps

1937801

Introduce dummy DetectorStateFilter for embedding, which is passed by…

878160f

… each events

Replace original DetectorStateFilter in HLT path with EmbeddingDetect…

a059d00

…orStateFilter during HLT step

remove not necessary merging steps

24ff9bc

change muon HLT trigger in selection decission

69a3e80

Merge branch 'hlt_detector_state_filter_fix' into embedding_update_fo…

064da79

…r_run3. This fixes one problem embedding has with some hlt filters.

cmsbuild added this to the CMSSW_15_1_X milestone Feb 7, 2025

cmsbuild added simulation-pending pending-signatures tests-pending orp-pending code-checks-pending labels Feb 7, 2025

cmsbuild added code-checks-rejected and removed code-checks-pending labels Feb 7, 2025

mmusich reviewed Feb 7, 2025

moritzmolch mentioned this pull request Feb 10, 2025

Simplify implementation of dummy detector state filters KIT-CMS/cmssw#4

Merged

Simplify implementation of dummy detector state filters (#4)

d27b7fe

* Revert changes of commit 064da79 and replace detector state checks with simple 'True' filters
* add a explaining comment

---------

Co-authored-by: Chris Winter <[email protected]>

cmsbuild added code-checks-pending and removed code-checks-rejected labels Feb 10, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Feb 10, 2025

cmsbuild added tests-started and removed tests-pending labels Feb 10, 2025

winterchristian marked this pull request as draft February 11, 2025 16:55

readd stuff from run2 with era modifier

c55bde0

cmsbuild added tests-pending code-checks-pending and removed tests-approved code-checks-approved labels Feb 12, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Feb 12, 2025

change Trigger conditions in selection step and add comments

ac65461

cmsbuild added code-checks-pending and removed code-checks-approved labels Feb 13, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Feb 13, 2025

winterchristian marked this pull request as ready for review February 13, 2025 16:48

add a unit test for 2022postEE data

36d2593

cmsbuild added code-checks-pending and removed code-checks-approved labels Feb 13, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Feb 13, 2025
