Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support reading v0.13 and v0.14 files #417

Draft
wants to merge 34 commits into
base: master
Choose a base branch
from
Draft

Conversation

felixhekhorn
Copy link
Contributor

@felixhekhorn felixhekhorn commented Oct 17, 2024

Okay, this will need some thinking, as we maybe want to come up with a more future-proof strategy (see e.g. #145 #328)

  • this drops the Legacy runner, i.e. closes Drop legacy runner #398 - this will break many things potentially again, i.e. this is surely another major break to master. eko.runner.managed.solve which is the new default does only accept new runcards, i.e. if you need to convert the user has to do that before calling the function (see also below)
  • this drops the old file converter which was for files before v0.11 (I think)
  • then continuing from Legacy to versions #175:
    • we still have eko.io.runcards.Legacy: 1. we should rename that to NnpdfDb or similar (and maybe move it?) and 2. I guess we need to maintain that also midterm (or even forever)
    • we haven't increased the __data_version__ since we introduced it, but I guess we should start doing so to disentangle a change in API (__version__) from a change in the file format (__data_version__). The latter also refers any file in the tar, i.e. including all .yaml files. We should increase it to 3 and rely on versions beyond v0.15 on that (1 ~ v0.13, 2 ~ v0.14 although the latter was never applied).
  • I don't know how to recover from Change manipulate to act on Operator #292 : old ekos might be rotated (on any of the four axes) and we would need to counteract this everywhere ... any ideas? It is true that the old format and the new format are technical the same, but the content might not

cc @scarlehoff


TODO:

  • unify loading of unit test assets; i.e. move https://github.com/NNPDF/eko/blob/master/crates/dekoder/tests/data/download.sh somewhere more global and use it for both Rust and Python unit tests
  • move all test assets to that script, i.e. to the server
  • test v0.13:
    • First create EKO object (the v0.13 test case) in v0.13, use LHA LO FFNS
      Do the following things for this EKO object in the newest EKO version from master branch:
    • reading of theory card: check it reads the theory card as a theory card
    • reading of operator card: check it reads the operator card as an operator card
    • reading of the 4D operator:
  • test v0.14 (same as for v0.13)
    • First create EKO object (the v0.13 test case) in v0.13, use LHA LO FFNS
      Do the following things for this EKO object in the newest EKO version from master branch:
    • reading of theory card: check it reads the theory card as a theory card
    • reading of operator card: check it reads the operator card as an operator card
    • reading of the 4D operator:
  • test v0.15 (same as for v0.13, but now reuse the existing v0.15 test case instead of making it again the existing one was not a sane object)
    Do the following things for this EKO object in the newest EKO version from master branch:
    • reading of theory card: check it reads the theory card as a theory card
    • reading of operator card: check it reads the operator card as an operator card
    • reading of the 4D operator:

@felixhekhorn felixhekhorn added refactor Refactor code output Output format and management labels Oct 17, 2024
@scarlehoff
Copy link
Member

(or even forever)

4ever

But I'm happy with moving it to the theory parser in NNPDF (i.e., having an attribute in the Theory file which is return_eko_theory or something).
But I would say let's do things incrementally, for now let's keep it here, let's make sure things are working and then we move it.

since we introduced it, but I guess we should start doing so to disentangle a change in API

I agree. The ones in 0.15 are surely a data version different from what we have

I don't know how to recover from

from the nnpdf side I don't mind not having this. In practice it would mean that the x-grid is completely fixed. If I understand correctly this feature has been more a convenience for people that could create more custom ekos and then got the 192x LHAPDF grid as the output? (did I understood it correctly @giacomomagni ?)

@giacomomagni
Copy link
Collaborator

  • this drops the Legacy runner, i.e. closes Drop legacy runner #398 - this will break many things potentially again, i.e. this is surely another major break to master. eko.runner.managed.solve which is the new default does only accept new runcards, i.e. if you need to convert the user has to do that before calling the function (see also below)

since we are updating evolven3fit we might consider to update as well pineko... but it will require a bit of work.

  • this drops the old file converter which was for files before v0.11 (I think)

here we are safe I think, it's already impossible to read 0.12 (?)

  • I don't know how to recover from Change manipulate to act on Operator #292 : old ekos might be rotated (on any of the four axes) and we would need to counteract this everywhere ... any ideas? It is true that the old format and the new format are technical the same, but the content might not

so in order to use the old ekos (0.13) you'd need to convert them, can't we add some checks while applying the conversion and warn the user?

@scarlehoff
Copy link
Member

so in order to use the old ekos (0.13) you'd need to convert them, can't we add some checks while applying the conversion and warn the user?

Wait, I thought the pdf ekos that we have were already ok without rotation? Or do you mean the fktables ekos?

@felixhekhorn
Copy link
Contributor Author

But I would say let's do things incrementally, for now let's keep it here, let's make sure things are working and then we move it.

definitely. I'm not sure, if that is better here or in NNPDF - we may want to move it e.g. into ekobox though

since we are updating evolven3fit we might consider to update as well pineko... but it will require a bit of work.

yes, but it will require anyway work since this is a major change already

here we are safe I think, it's already impossible to read 0.12 (?)

I think so - that was never requested and so with this PR I put a hard limit on v0.13

so in order to use the old ekos (0.13) you'd need to convert them, can't we add some checks while applying the conversion and warn the user?

just warning would be doable - but is that sufficient?

from the nnpdf side I don't mind not having this. In practice it would mean that the x-grid is completely fixed. If I understand correctly this feature has been more a convenience for people that could create more custom ekos and then got the 192x LHAPDF grid as the output? (did I understood it correctly @giacomomagni ?)

Wait, I thought the pdf ekos that we have were already ok without rotation? Or do you mean the fktables ekos?

that's the thing: eko can not know for what you use the it - and FK-ekos are rotated by default, even on multiple axes (I think). The main post-fit eko might be safe by chance. You can still rotate the eko in any way you like, but it is now always stored in the "canconical basis" (of course you can force your way)

@scarlehoff
Copy link
Member

Yes, they might be correct by chance... but if they are correct then we are happy I'd say, chance or not

@felixhekhorn
Copy link
Contributor Author

sorry @evagroenendijk for the delay but I added some todo items to the PR description above on which you can work as a warm-up; are the items sufficiently clear? Please ping me if they are not ...

@giacomomagni should we pin the test cases to LHA LO FFNS unpolarized? and check the actual numbers are fine? should we check for failure (i.e. "non-squared" or x rotated)?

@giacomomagni
Copy link
Collaborator

@giacomomagni should we pin the test cases to LHA LO FFNS unpolarized? and check the actual numbers are fine? should we check for failure (i.e. "non-squared" or x rotated)?

Yes that could be a good idea, I'd indeed try to check if the thing is not-squared, because it's not possible to translate it or use it...

@felixhekhorn
Copy link
Contributor Author

@evagroenendijk please remember to activate pre-commit

eko.solve(th_card,op_card,path)
evolve_pdfs(
[pdf], th_card, op_card, install=True, name="Evolved_pdf", store_path=path

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @felixhekhorn I wanted to ask you about this small part of my script. Because what I am trying to do here is to evolve my pdf using the xgrid that I defined myself. I found an example in the eko documentation which I'm approximately trying to follow. But this path.unlink gives an error that I'm struggling to solve. Because including line 44 you get an error that the path doesn't exist anymore (which makes sense), but excluding line 44 you get an error that the path already exists. I'm not really sure how to work around this here, do you perhaps have any advice? Or there is probably even a way to do this better altogether, but I thought my method should also work. Thanks in any case!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. see my comment above
  2. evolve_pdfs is doing way too much things, which we don't need here in the first place - so drop this line
  3. instead we want to use apply_pdf which was already imported - for an example on how to use it see https://eko.readthedocs.io/en/latest/overview/tutorials/pdf.html#Method-1:-Using-apply_pdf

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I'm doing it with apply_pdf now and seems to work fine. One question: do the pdfs have to be spot-on the same as in the LHA benchmark or is agreement up to some precision also alright?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, it has to be as good as is master currently, i.e.

$ poetry poe lha -m "lo and ffns"
 ─── 
  S  
 ─── 
               x       Q2         eko     eko_error         LHA  percent_error
0   1.000000e-07  10000.0  196.013420  2.070574e-04  196.021092      -0.003914
1   1.000000e-06  10000.0   98.182659  1.004108e-04   98.207532      -0.025327
2   1.000000e-05  10000.0   46.918471  6.991565e-05   46.923979      -0.011737
3   1.000000e-04  10000.0   21.203157  2.303485e-05   21.208365      -0.024556
4   1.000000e-03  10000.0    9.010651  8.842617e-06    9.012788      -0.023707
5   1.000000e-02  10000.0    3.646971  1.776247e-06    3.647790      -0.022450
6   1.000000e-01  10000.0    1.440969  2.043208e-06    1.440964       0.000377
7   3.000000e-01  10000.0    0.575928  1.937128e-08    0.575929      -0.000203
8   5.000000e-01  10000.0    0.173979  5.302622e-10    0.173983      -0.001839
9   7.000000e-01  10000.0    0.026810  2.180479e-08    0.026811      -0.000617
10  9.000000e-01  10000.0    0.000454  3.694197e-09    0.000457      -0.597290

 ─── 
  g  
 ─── 
               x       Q2  ...          LHA  percent_error
0   1.000000e-07  10000.0  ...  1316.200000      -0.000565
1   1.000000e-06  10000.0  ...   600.080000      -0.028675
2   1.000000e-05  10000.0  ...   254.190000      -0.016463
3   1.000000e-04  10000.0  ...    97.371000      -0.029621
4   1.000000e-03  10000.0  ...    32.078000      -0.030995
5   1.000000e-02  10000.0  ...     8.054600      -0.037837
6   1.000000e-01  10000.0  ...     0.887660      -0.001243
7   3.000000e-01  10000.0  ...     0.082676      -0.004348
8   5.000000e-01  10000.0  ...     0.007924      -0.004414
9   7.000000e-01  10000.0  ...     0.000373      -0.068088
10  9.000000e-01  10000.0  ...     0.000001       8.994289

[11 rows x 6 columns]

 ─── 
  V  
 ─── 
               x       Q2       eko     eko_error       LHA  percent_error
0   1.000000e-07  10000.0  0.000092  1.301617e-10  0.000092       0.000420
1   1.000000e-06  10000.0  0.000532  1.064871e-09  0.000532       0.000623
2   1.000000e-05  10000.0  0.002979  2.901840e-09  0.002979       0.002073
3   1.000000e-04  10000.0  0.015964  1.376862e-08  0.015965      -0.001269
4   1.000000e-03  10000.0  0.079689  6.163553e-08  0.079688       0.000638
5   1.000000e-02  10000.0  0.343877  1.564116e-07  0.343880      -0.000781
6   1.000000e-01  10000.0  0.856808  1.662353e-08  0.856800       0.000946
7   3.000000e-01  10000.0  0.521109 -4.859107e-10  0.521110      -0.000148
8   5.000000e-01  10000.0  0.170121  1.510332e-10  0.170124      -0.001738
9   7.000000e-01  10000.0  0.026717 -2.732362e-09  0.026717       0.000657
10  9.000000e-01  10000.0  0.000454 -1.011684e-09  0.000457      -0.606629

 ──── 
  V3  
 ──── 
               x       Q2       eko     eko_error       LHA  percent_error
0   1.000000e-07  10000.0  0.000023  3.255914e-11  0.000023       0.000771
1   1.000000e-06  10000.0  0.000136  2.663909e-10  0.000136      -0.002105
2   1.000000e-05  10000.0  0.000766  7.272131e-10  0.000766       0.003961
3   1.000000e-04  10000.0  0.004149  3.468569e-09  0.004149      -0.003267
4   1.000000e-03  10000.0  0.021096  1.573728e-08  0.021096       0.002286
5   1.000000e-02  10000.0  0.095220  4.562153e-08  0.095220      -0.000331
6   1.000000e-01  10000.0  0.288539  6.692407e-09  0.288540      -0.000499
7   3.000000e-01  10000.0  0.237386 -2.448215e-10  0.237390      -0.001621
8   5.000000e-01  10000.0  0.099393  1.083280e-10  0.099396      -0.003019
9   7.000000e-01  10000.0  0.019526 -2.042607e-09  0.019529      -0.011735
10  9.000000e-01  10000.0  0.000409 -9.033632e-10  0.000412      -0.814945

 ──── 
  T3  
 ──── 
               x       Q2       eko     eko_error       LHA  percent_error
0   1.000000e-07  10000.0  0.000022  3.077890e-11  0.000022       0.000794
1   1.000000e-06  10000.0  0.000126  2.490182e-10  0.000126      -0.002292
2   1.000000e-05  10000.0  0.000703  6.645188e-10  0.000703       0.004368
3   1.000000e-04  10000.0  0.003768  3.125672e-09  0.003768      -0.003604
4   1.000000e-03  10000.0  0.018973  1.394690e-08  0.018972       0.002227
5   1.000000e-02  10000.0  0.085273  4.117576e-08  0.085274      -0.000443
6   1.000000e-01  10000.0  0.267598  6.410300e-09  0.267600      -0.000744
7   3.000000e-01  10000.0  0.230780 -2.410765e-10  0.230784      -0.001672
8   5.000000e-01  10000.0  0.098537  1.081309e-10  0.098540      -0.003150
9   7.000000e-01  10000.0  0.019495 -2.040230e-09  0.019497      -0.011705
10  9.000000e-01  10000.0  0.000409 -9.034513e-10  0.000412      -0.818107

 ──── 
  T8  
 ──── 
               x       Q2       eko     eko_error       LHA  percent_error
0   1.000000e-07  10000.0  2.182090  3.413575e-06  2.181092       0.045772
1   1.000000e-06  10000.0  1.733373  4.860242e-06  1.733532      -0.009178
2   1.000000e-05  10000.0  1.379013  4.156562e-07  1.378979       0.002490
3   1.000000e-04  10000.0  1.105966  4.637348e-07  1.106165      -0.017972
4   1.000000e-03  10000.0  0.928381  2.681725e-07  0.928488      -0.011559
5   1.000000e-02  10000.0  0.930818  2.943862e-07  0.930900      -0.008810
6   1.000000e-01  10000.0  1.031162  1.766631e-08  1.031160       0.000198
7   3.000000e-01  10000.0  0.540243 -4.925727e-10  0.540243       0.000008
8   5.000000e-01  10000.0  0.171521  1.512387e-10  0.171524      -0.001742
9   7.000000e-01  10000.0  0.026751 -2.734662e-09  0.026751       0.000160
10  9.000000e-01  10000.0  0.000454 -1.011559e-09  0.000457      -0.603362

 ───── 
  T15  
 ───── 
               x       Q2       eko     eko_error       LHA  percent_error
0   1.000000e-07  10000.0  4.364089  6.827019e-06  4.365092      -0.022986
1   1.000000e-06  10000.0  3.466214  9.719419e-06  3.467532      -0.038016
2   1.000000e-05  10000.0  2.755048  8.284105e-07  2.755979      -0.033796
3   1.000000e-04  10000.0  2.195967  9.137010e-07  2.196365      -0.018093
4   1.000000e-03  10000.0  1.777073  4.747096e-07  1.777188      -0.006479
5   1.000000e-02  10000.0  1.517759  4.323608e-07  1.517910      -0.009970
6   1.000000e-01  10000.0  1.205516  1.870909e-08  1.205508       0.000661
7   3.000000e-01  10000.0  0.559377 -4.992348e-10  0.559377      -0.000080
8   5.000000e-01  10000.0  0.172920  1.514443e-10  0.172923      -0.001787
9   7.000000e-01  10000.0  0.026784 -2.736963e-09  0.026784      -0.000332
10  9.000000e-01  10000.0  0.000454 -1.011434e-09  0.000457      -0.600095

so I'd say we can start with some global settings, e.g. rel. error < 0.05% and abs. error < 5e-7 and then let's see if we need to make it slightly looser or we can make it sharper

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So at the moment e.g. for v0.14 all relative errors are <0.09, this is too big then or is it okay?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a factor 2 we can afford ... what is the abs error? I'm still a bit surprised why we don't get the same ... the only reason I can think it is a side effect #292

Comment on lines 17 to 29
for name in ["v0.14.tar"]:

with eko.EKO.read(TEST/name) as evolution_operator:
#evolved_pdfs, _integration_errors = apply_pdf(evolution_operator, pdf)
#import pdb; pdb.set_trace()
#print('test')
assert isinstance(evolution_operator.theory_card, eko.io.runcards.TheoryCard) No newline at end of file
assert isinstance(evolution_operator.theory_card, eko.io.runcards.TheoryCard) # Check that theory card is read as theory card
assert isinstance(evolution_operator.operator_card, eko.io.runcards.OperatorCard) # Check that operator card is read as operator card
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's split those four statements into a separate function: test_read_legacy_cards() and all the rest will go into test_read_legacy_pdf() as it is obvious now that the complexity is much different between the two

Comment on lines 43 to 45
path.unlink(missing_ok=True)
eko.solve(th_card,op_card,path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do this in the first place - we don't want to regenerate the eko on the fly, but we want to check the content. So just drop these lines

eko.solve(th_card,op_card,path)
evolve_pdfs(
[pdf], th_card, op_card, install=True, name="Evolved_pdf", store_path=path
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. see my comment above
  2. evolve_pdfs is doing way too much things, which we don't need here in the first place - so drop this line
  3. instead we want to use apply_pdf which was already imported - for an example on how to use it see https://eko.readthedocs.io/en/latest/overview/tutorials/pdf.html#Method-1:-Using-apply_pdf

Comment on lines +58 to +63
lha_path = pathlib.Path(__file__).parents[4] / 'eko/src/ekomark/benchmark/external/LHA.yaml'
with open(lha_path,'r') as file:
lha_benchmark = yaml.safe_load(file)

xpdf_benchmark = lha_benchmark["table2"]["part2"]["g"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use the ekomark interface to do all this - it knows and you don't need to hack here

"""
--> have to divide by x values to compare or not
Was the LHAPDF output xf(x,q2) or f(x,q2) again?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the LHAPDF function is xfxQ2 - however EKO returns just f so you need to do something

Comment on lines +75 to +80
for i in range(len(pdf_test)):
if pdf_test[i] != pdf_benchmark[i]:
print(name, False)
else:
print(name, True)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just use np.testing?

print(name, True)

return evolved_pdf
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to return anything here, this is expected to run under pytest which throws everything away

return evolved_pdf

test_read_legacy()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine for the moment, but keep in mind that you can do the same thing using the pytest interface


TEST = pathlib.Path(__file__).parents[2] / "data"
from math import nan
TEST = pathlib.Path(__file__).parents[2] / "data" # directory of the EKO object
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

Suggested change
TEST = pathlib.Path(__file__).parents[2] / "data" # directory of the EKO object
# path of the EKO objects
TEST_DATA_DIR = pathlib.Path(__file__).parents[2] / "data"

also ruff will complain that you have to do all the imports first

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
output Output format and management refactor Refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Drop legacy runner
4 participants