Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add flag for activating robust calculation of expand_derivatives #1353

Merged
merged 4 commits into from
Dec 22, 2024

Conversation

karlwessel
Copy link
Contributor

This provides a workaround for issues #1126 and #1262 by adding a flag robust that if set, forces expand_derivatives to always recalculate occurrences.

It also adds tests that check if the examples from the issues are actually solved. These tests also show that expand_derivatives can return a wrong result without throwing any error. Line 372 to 375 in tests/diff.jl:

expr = expr_gen(g(y))
@test_broken isequal(expand_derivatives(expr), expand_derivatives(expr; robust=true))
expr = expr_gen(h(y))
@test_broken isequal(expand_derivatives(expr), expand_derivatives(expr; robust=true))

I set the robust flag to false per default but since the result of the non-robust version is not reliable it should be considered if setting it to true should be the default.

@codecov-commenter
Copy link

codecov-commenter commented Nov 8, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 94.66667% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.01%. Comparing base (5af597a) to head (3bf685a).
Report is 18 commits behind head on master.

Files with missing lines Patch % Lines
src/diff.jl 94.66% 4 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1353       +/-   ##
===========================================
+ Coverage    3.98%   79.01%   +75.03%     
===========================================
  Files          50       51        +1     
  Lines        4771     4880      +109     
===========================================
+ Hits          190     3856     +3666     
+ Misses       4581     1024     -3557     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ChrisRackauckas
Copy link
Member

Why is a recalculation of occursin required? That seems like a bug that should just be addressed directly.

@karlwessel
Copy link
Contributor Author

karlwessel commented Nov 8, 2024

It definitely is a bug, but no one was able to fix it until now, maybe because of #1126 (comment).

I spent two nights trying to pin down the problem, but not being able to print intermediate values without affecting the process itself makes this really tedious/complicated. I decided implementing this workaround is much simpler and I think necessary, until some hero hunts this one down.

I think the Problem is, that there actually are cases where recalculation is necessary because the expression tree can change between the creation of occursin and its usage in a subtree. The call order in that case would be

occurrences = occursin_info(operation(O).x, arg)

t2 = expand_derivatives(D(inner_args[i]),false, occurrences=arguments(occurrences)[i])

arg = expand_derivatives(arg, false)

After that point, occurrences could already be outdated since the last call can change the subtree.

@shashi
Copy link
Member

shashi commented Nov 9, 2024

Have you tried debugging/reproducing this by using the sorted arguments instead of unsorted?

@karlwessel
Copy link
Contributor Author

Yes, using the sorted arguments seemingly fixes #1262. Since it however does not fix #1126, my guess is that it just changes the ordering of the terms per chance to an order where the error doesn't happen.

@shashi
Copy link
Member

shashi commented Nov 10, 2024

I think the occursin tree code can just be removed, but that changes the time complexity of this function...
Best approach is to try to debug the issues instead of robust mode. All mode must be robust imo...

If you want to find more cases, I'd look into running more tests using fuzz.jl and fuzzlib.jl right now I believe it just runs simplify tests..

@karlwessel
Copy link
Contributor Author

I agree that trying to debug the issue would be the best solution. But I don't know when somebody will do that and until then I would like to have a working way to calculate my derivatives, which currently is not possible.

@karlwessel
Copy link
Contributor Author

I did a take on actually solving the problem. I think the Problem is that subtrees of a differential are themself expanded multiple times, once before the differential is executed and then another time to execute the chain rule for the differential on the subtrees.

I removed the second expansion by moving the actual execution of a differential to a separate function executediff which shouldn't call expand again but only executediff on the subtrees.

This fixes the test cases and the results with and without the robust flag agree.

There are still two calls to expand_differentials in executediff left which I didn't know how to replace at the moment. Replacing them also shouldn't be to hard, however.

@karlwessel
Copy link
Contributor Author

I removed the robust flag and cleaned up the code.
The failing integration tests don't seem to be related to the changes in this PR.
Good to be merged from my side.

@ChrisRackauckas
Copy link
Member

@shashi can you review?

@orebas
Copy link

orebas commented Dec 16, 2024

Hello. I am not fully following the discussion here. Issue #1262 has made Julia Symbolics unusable for me. If there exists a "temporary" workaround so that expand_derivatives function works properly, I would appreciate that getting into prod. I spent a while trying to debug this issue, and it seems like others have, and the bug has been outstanding for months or possibly years. Thanks.

@ChrisRackauckas
Copy link
Member

I'm not sure why it's a given this is actually a fix or a workaround. This kind of change has some other implications and adds a lot of other complications. It fixes one case but this kind of recomputation complexity very easily hides much harder to debug issues and the exponential complexity can be considered a bug itself. So if someone really wants a fix now they should look into fixing the real bug. Shortcuts to solve one issue at the expense of many others will lead to much pain down the line. If @shashi thinks it's safe I'd merge but it is definitely risky

@karlwessel
Copy link
Contributor Author

karlwessel commented Dec 17, 2024

For clarification, in the current state of this PR there is no recalculation happening anymore. The complexity of expanding derivatives with this PR should be the same or lower than without this PR.

I used this PR for my working code during the last weeks. I hadn't any issues with wrong derivatives or trying to access an outdated occurences cache since then anymore. So I think that issue should be fixed.

However, I did encounter some stack overflows when expanding derivatives that involved complex symbolics. I hadn't the time to test if they also occur without this PR nor had a deeper look at the cause. But if they are caused by this PR my guess would be an infinite loop in

Symbolics.jl/src/diff.jl

Lines 196 to 211 in 3bf685a

elseif isa(op, Differential)
# The recursive expand_derivatives was not able to remove
# a nested Differential. We can attempt to differentiate the
# inner expression wrt to the outer iv. And leave the
# unexpandable Differential outside.
if isequal(op.x, D.x)
return D(arg)
else
inner = executediff(D, arguments(arg)[1], false)
# if the inner expression is not expandable either, return
if iscall(inner) && operation(inner) isa Differential
return D(arg)
else
# otherwise give the nested Differential another try
return executediff(op, inner, simplify)
end

And I have to admit that I neither understand those lines nor handling of complex numbers very well yet.

@orebas
Copy link

orebas commented Dec 22, 2024

I'm not sure why it's a given this is actually a fix or a workaround. This kind of change has some other implications and adds a lot of other complications. It fixes one case but this kind of recomputation complexity very easily hides much harder to debug issues and the exponential complexity can be considered a bug itself. So if someone really wants a fix now they should look into fixing the real bug. Shortcuts to solve one issue at the expense of many others will lead to much pain down the line. If @shashi thinks it's safe I'd merge but it is definitely risky

So, in terms of risk, IMHO the expand_derivatives() function is currently broken to the point where it should probably be disabled. (there are like 4 or 5 issues about it, granted 2 are mine). I've had to replace symbolic differentiation with taylor-based AD in a package because of that. For reference the issues are #1369 #1281 #1262 #1126 #634 (the last from Jun2022) . Based on my attempt to debug this, I believe but I am not certain that the whole problem comes from "occurences" or "occursin" which is an optimization. If it is possible to just remove that optimization and get a version of expand_derivatives which passes all existing tests and resolves the issues referenced it seems like a clear win.

I see above that this PR is failing some tests; I don't know how much they matter.

I apologize in advance if I'm misunderstanding the issue, Julia is not my main language. Thanks for looking in any case.

@ChrisRackauckas
Copy link
Member

However, I did encounter some stack overflows when expanding derivatives that involved complex symbolics. I hadn't the time to test if they also occur without this PR nor had a deeper look at the cause. But if they are caused by this PR my guess would be an infinite loop in

Can you do this test and add those to the test set? That would be a big factor here.

@ChrisRackauckas
Copy link
Member

I removed the robust flag and cleaned up the code.

I had missed this change, the code does look a lot more reasonable now. I think if you test the part with the stack overflows that would be helpful, but I don't see why this change would cause an infinite recursion where the other had not. In fact, executediff should block some cases from being possible, so it should only help in that sense. But it would be good to make an error for the other cases so it can get looked at.

@ChrisRackauckas ChrisRackauckas merged commit a119204 into JuliaSymbolics:master Dec 22, 2024
10 of 12 checks passed
@karlwessel
Copy link
Contributor Author

Thanks for merging! I can probably take a closer look at the stack overflow in February.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants