
Scale rates via datacard parser #1015

Closed
IzaakWN wants to merge 1 commit

Conversation

IzaakWN
Contributor

IzaakWN commented Nov 8, 2024

This PR allows users to scale rates in the datacard parser via the command line, for combineCards.py or text2workspace.py.

I ran into a use case where I want to combine cards with signal PDFs normalized to 1 pb in order to set limits on the cross section in units of pb. When the cards are combined, the signal yields/rates need to be reweighted by their relative fractions of the total cross section. With this PR, one can do

combineCards.py datacard_wh.txt datacard_zh.txt --scale-rate='wh_.*=0.6,zh_.*=0.4'

where 0.6 ≈ 1.37/(1.37+0.88) and 0.4 ≈ 0.88/(1.37+0.88) for σ(WH) = 1.37 pb and σ(ZH) = 0.88 pb.

The implementation supports regular expressions by default, and the bin can be specified as well, e.g.

--scale-rate='ch[12]/wh.*=0.6,ch[12]/wz.*=0.4'

Mathematical expressions native to Python are possible because the values are parsed with eval(), e.g.

--scale-rate='wh.*=1.37/(1.37+0.88),zh.*=1-1.37/(1.37+0.88)'
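
For illustration, here is a minimal sketch of how such an option string could be parsed; this is not the actual code in the PR, and parse_scale_rates/scale_rate are hypothetical names:

import re

def parse_scale_rates(arg):
    # Turn a --scale-rate string like 'ch[12]/wh.*=0.6,zh.*=1-1.37/(1.37+0.88)'
    # into a list of (bin_pattern, process_pattern, scale) tuples.
    rules = []
    for item in arg.split(','):  # caveat: breaks if an expression itself contains a comma
        pattern, _, expr = item.partition('=')
        binpat, _, procpat = pattern.rpartition('/')  # optional 'bin/' prefix
        scale = eval(expr)  # allows Python math such as '1.37/(1.37+0.88)'
        rules.append((binpat or '.*', procpat, scale))
    return rules

def scale_rate(rules, bin_name, process, rate):
    # Multiply the rate by the scale of the first rule matching (bin, process).
    for binpat, procpat, scale in rules:
        if re.fullmatch(binpat, bin_name) and re.fullmatch(procpat, process):
            return rate * scale
    return rate

For example, scale_rate(parse_scale_rates('wh_.*=0.6'), 'ch1', 'wh_sig', 10.0) would return 6.0.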

I hope this can be useful to others. I think it should work for simple counting datacards (just rates, no shapes) and for datacards with parametric PDFs, but not for datacards with shape histograms (see documentation).

@adewit
Collaborator

adewit commented Nov 8, 2024

Thanks for the initiative - I'm wondering why a physics model wouldn't work for this?

If a physics model wouldn't suffice, I think it's probably better to rescale in the input datacards, e.g. by parsing them with CombineHarvester and adding a rateParam frozen to the scaling value, or by adjusting the original datacards, if the code used to make those cards is still available.
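
For reference, a rough sketch of that CombineHarvester route, here scaling the rates directly with set_rate instead of adding a rateParam line; the file names and the 0.6 factor are placeholders:

import CombineHarvester.CombineTools.ch as ch

cb = ch.CombineHarvester()
cb.SetFlag('filters-use-regex', True)  # let process() patterns be regexes
cb.ParseDatacard('datacard_wh.txt')

# Scale every process whose name matches wh_.* by the desired fraction
cb.cp().process(['wh_.*']).ForEachProc(lambda p: p.set_rate(p.rate() * 0.6))

cb.WriteDatacard('datacard_wh_scaled.txt', 'datacard_wh_scaled.root')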

You seem to say that the option wouldn't work in all cases, and I'd be a bit worried about adding text2workspace runtime options that don't have a fully defined behaviour (we've already seen problems when the nuisance edit directive has been overused in situations for which it was not designed). Maybe I misunderstood what you said :-)

@IzaakWN
Contributor Author

IzaakWN commented Nov 8, 2024

> Thanks for the initiative - I'm wondering why a physics model wouldn't work for this?

Yes, it should not be too hard to create a physics model for this. Do you happen to know if one exists that allows you to scale specified processes?

> If a physics model wouldn't suffice, I think it's probably better to rescale in the input datacards, e.g. by parsing them with CombineHarvester and adding a rateParam frozen to the scaling value, or by adjusting the original datacards, if the code used to make those cards is still available.

Yes, that would also be a good solution. I encountered this use case when reviewing simple cards with signal PDFs, made with a custom datacard writer, for a SUS search. I thought this would be a very simple way for a user to scale the rates on the fly when combining individual cards with combineCards.py, while ensuring the total cross section remains 1 pb for computing limits.

> You seem to say that the option wouldn't work in all cases, and I'd be a bit worried about adding text2workspace runtime options that don't have a fully defined behaviour (we've already seen problems when the nuisance edit directive has been overused in situations for which it was not designed). Maybe I misunderstood what you said :-)

No, I think you are right and understood correctly. My understanding is that if a process's shape is taken from a ROOT histogram, the yield in the rate line must match the integral of the respective histogram, unless -1 is used. If the rate differs from the histogram integral, text2workspace.py will complain here:

if self.DC.exp[b][p] == -1:
    self.DC.exp[b][p] = norm
elif self.DC.exp[b][p] > 0 and abs(norm - self.DC.exp[b][p]) > 0.01 * max(1, self.DC.exp[b][p]):
    if not self.options.noCheckNorm:
        raise RuntimeError("Mismatch in normalizations for bin %s, process %s: rate %f, shape %f" % (b, p, self.DC.exp[b][p], norm))
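
(For reference, a hypothetical shape-datacard excerpt where rate -1 lets text2workspace.py take the yield from the histogram integral; ch1, wh_sig and shapes.root are placeholder names:)

shapes *        ch1   shapes.root   $CHANNEL/$PROCESS
bin             ch1
process         wh_sig
process         0
rate            -1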

I think the added runtime would be negligible if --scale-rate is not used, but I understand if you'd prefer not to add options without fully defined behavior. Otherwise, I could print a warning on the command line?

@IzaakWN
Contributor Author

IzaakWN commented Nov 8, 2024

I was overthinking it. Adding the lines

f_wh rateParam * wh_* 0.6
f_zh rateParam * zh_* 0.4

is a perfectly good solution, and more transparent and universal as well. It's easy to do on the fly:

combineCards.py datacard_wh.txt datacard_zh.txt > datacard_comb.txt
sed -i '$a f_wh rateParam * wh_* 0.6' datacard_comb.txt
sed -i '$a f_zh rateParam * zh_* 0.4' datacard_comb.txt
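
Note that, as adewit suggested, these rateParams should be kept frozen at their values so they are not left to float in the fit; with recent combine versions this can be done at runtime, e.g.

combine -M AsymptoticLimits datacard_comb.txt --freezeParameters f_wh,f_zh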

I am closing this PR.

IzaakWN closed this Nov 8, 2024