Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjoint equations #3713

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Conversation

jaustermann
Copy link
Contributor

This is a pull request for the implementation of the stokes adjoint equations. This is not meant to be merged into the developer branch but just serves to share the code to check the current implementation and think about how to reorganize this (i.e. what to keep, what to adjust, and what to move to a plugin). I've been talking with Wolfgang, Rene, and Menno about this.

I rebased this to the current master and resolved conflicts. The code compiles but has a few warnings (e.g. some things are deprecated) that I did not fix. It also runs, however, there are some updates to the parameter file that I'm now working through to check. While I'm looking through this I think this doesn't affect the core of the implementation so wanted to share this as is.

Happy to look through this together to think about how to implement / split this! Also if you want me to clean up parts before going through this further let me know.

@gassmoeller gassmoeller self-assigned this Aug 11, 2020
@gassmoeller
Copy link
Member

Let us know when you want us to take another look.

@jaustermann
Copy link
Contributor Author

Yes, would be great to talk again about the structure. Items for discussion / restructuring:
(1) Currently, there is a function 'compute_parameter_updates' in assembly.cc This function calculates the sensitivity kernels and updates the compositional field so that in a next iteration the density and viscosity have different values that should yield a better fit to the data (minimize objective functional). Even if we don't want to iterate, this function is useful because it updates the compositional field (sensitivity kernel), which can then be plotted / exported without having to write a separate postprocessor for the kernels. On the downside, using that function to calculate the kernels requires using a specific material model & compositional field structure. This is also where a conjugate gradient method (or other gradient optimization scheme) would be linked in. So the question is - where should this live? Should this be implemented differently? Etc.
(2) The parameter update iteration is currently done in core.cc and if we want to keep the adjoint solver as separate as possible that's probably not ideal(?). I think the iteration could also be done within the solver scheme and I forget now why I moved it to the core. Would be good to double-check the best place for this.
(3) The rest of the added files are postprocessing and material model. Most of this is already separate, but some might need separating out.

Copy link
Member

@gassmoeller gassmoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a first look, and in the current form I think this fits nicely, without too many conflicts. I marked up places you will want to change, but I did not see anything that seemed impossible to do in our exiting structure. We could even make the adjoint scheme an option available for all solver schemes, and only call the second half of your solver scheme function after normal solver schemes (for that to work the adjoint solver has to take care to restore the assembler structure after it has finished, instead of waiting until the next forward solve to restore the normal assemblers). But I would suggest to first clean things up with the solver scheme, and then revisit the topic. Let me know if you want help with any of the changes.

source/simulator/core.cc Outdated Show resolved Hide resolved
source/simulator/core.cc Outdated Show resolved Hide resolved
source/simulator/parameters.cc Outdated Show resolved Hide resolved
source/simulator/assembly.cc Outdated Show resolved Hide resolved
source/simulator/core.cc Outdated Show resolved Hide resolved
source/simulator/assemblers/stokes.cc Outdated Show resolved Hide resolved
source/simulator/core.cc Outdated Show resolved Hide resolved
source/simulator/simulator_access.cc Outdated Show resolved Hide resolved
source/simulator/solver_schemes.cc Outdated Show resolved Hide resolved
source/simulator/solver_schemes.cc Outdated Show resolved Hide resolved
@gassmoeller
Copy link
Member

To your questions:
(1) yes it's current spot is not ideal. However, I would think that moving it to helper_functions.cc, naming it compute_adjoint_parameter_update and calling it from the solver scheme just like you do at the moment seems fine to me. I see that it depends on the DynamicTopography postprocessor, so somewhere it should asserted that this postprocessor is active (probably when reading in the nonlinear solver scheme). But I do not see where you require a specific material model for this function. You evaluate the material like normal with in, out, and then you generate an additional in_adjoint, but that is never passed into the material model, you seem to only use it to store some values (extract them from adjoint_solution), so this should actually work with any material model? Of course some other places may require your specific material model.
(2) Where is this precisely? I see some code in assembly.cc and solver_schemes.cc, but I do not see where the update is done in core.cc.
(3) Yes I think the rest is nicely separate. Is the material model useful without the solver scheme? If not, you can assert that the solver scheme is actually selected and throw an exception if not.

@jaustermann
Copy link
Contributor Author

(1) Great, I'll move that to the helper_functions. And I think I sort of misspoke. It doesn't require a specific material model, but it does use two different compositional fields into which it writes the sensitivity kernels. In the material model these are then used to update the material properties. So yes, it doesn't need a specific material model, but it needs the compositional fields and if you want to use the kernels in an iteration it needs the material model. I'll set some descriptive exception for this.
(2) It is in core.cc for convenience because it allowed me to save different iterative steps as different timesteps. I moved this into the solver scheme now and need to figure out how to best do the postprocessing.
(3) Great! I'm not sure the material model is useful elsewhere so I'll put in exceptions that make sure the right setup is chosen.

I resolved the comments that I addressed and will push an updated version with those. Next on my list is moving the compute_update_parameters to helper_functions, figure out how to best store the output, set exceptions to make sure the right material model - compositional field combo is chosen, and add some brief documentation.

Copy link
Member

@bobmyhill bobmyhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to understand this PR, and in reading through I found a few trivial things you might want to change :)

benchmarks/adjoint/adjointkernel_degree4.prm Outdated Show resolved Hide resolved
include/aspect/simulator.h Outdated Show resolved Hide resolved
include/aspect/simulator/assemblers/adjoint.h Outdated Show resolved Hide resolved
source/material_model/additive.cc Outdated Show resolved Hide resolved
include/aspect/postprocess/adjoint_kernels.h Outdated Show resolved Hide resolved
source/postprocess/adjoint_kernels.cc Outdated Show resolved Hide resolved
source/simulator/assembly.cc Outdated Show resolved Hide resolved
source/simulator/assemblers/adjoint.cc Outdated Show resolved Hide resolved
source/postprocess/adjoint_kernels.cc Outdated Show resolved Hide resolved
source/postprocess/visualization.cc Outdated Show resolved Hide resolved
@jaustermann
Copy link
Contributor Author

These are updates to address the comments above. One thing that is still missing is (better) documentation. Apart from that - any feedback is welcome, especially on how the new code fits within aspect. I actually would also be interested to see if it breaks any tests. We've talked about moving things into a separate plugin and I'm still happy to work on that if we think that would be best.

@jaustermann jaustermann force-pushed the adjoint_hack20 branch 2 times, most recently from a8529e0 to acaeeda Compare July 8, 2021 22:00
Copy link
Member

@gassmoeller gassmoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, I think this integrates well into the general structure and we do not need to create another manager class. I am not sure I caught everything, but I left a number of comments. Most are straightforward, but let me know if you get stuck with some of the trickier ones.

Main point: It would be good if for the next review the documentation is updated that would help with the review process (and reviewing documentation is also important to make sure a non expert understands it). The easiest way to do this is for you to do a review of your own code, i.e. go through the pull request on the 'Files changed' tab (https://github.com/geodynamics/aspect/pull/3713/files) from top to bottom and immediately fix any comment you see that is outdated. Let us know when you want more feedback.

include/aspect/parameters.h Outdated Show resolved Hide resolved
include/aspect/simulator.h Outdated Show resolved Hide resolved
source/material_model/additive.cc Outdated Show resolved Hide resolved
source/material_model/additive.cc Show resolved Hide resolved
source/material_model/additive.cc Outdated Show resolved Hide resolved
source/simulator/helper_functions.cc Outdated Show resolved Hide resolved
source/simulator/helper_functions.cc Outdated Show resolved Hide resolved
source/simulator/helper_functions.cc Outdated Show resolved Hide resolved
source/simulator/helper_functions.cc Outdated Show resolved Hide resolved
source/simulator/helper_functions.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jaustermann , I read everything down to visualization.cc, but it's late and I have to postpone the rest for tomorrow. I'm sorry that there will be a bit of work left. I think the design is all there and correct, it's really just small cosmetic stuff to work through.

More tomorrow!

benchmarks/adjoint/adjointkernel_degree4.prm Outdated Show resolved Hide resolved
benchmarks/adjoint/adjointkernel_degree4.prm Outdated Show resolved Hide resolved
benchmarks/adjoint/adjointkernel_degree4.prm Outdated Show resolved Hide resolved
benchmarks/adjoint/adjointkernel_degree4.prm Outdated Show resolved Hide resolved
benchmarks/adjoint/adjointkernel_degree4.prm Outdated Show resolved Hide resolved
source/postprocess/adjoint_kernels.cc Outdated Show resolved Hide resolved
source/postprocess/adjoint_kernels.cc Outdated Show resolved Hide resolved
source/postprocess/adjoint_kernels.cc Outdated Show resolved Hide resolved
source/postprocess/visualization.cc Show resolved Hide resolved
source/postprocess/visualization.cc Show resolved Hide resolved
@jaustermann
Copy link
Contributor Author

Thanks so much for the comments Wolfgang! I've addressed the majority of Rene's comments but haven't pushed them yet (because I didn't quite get through all). I'll go through yours now. I'm not sure if it's better for you if I push my recent changes so that you see the most recent code of if I don't push them so that you can continue to work on the same code you started with. So ... let me know if you want me to push.

@bangerth
Copy link
Contributor

No need to review something that's not up to date :-) Let me know once you're through addressing all of @gassmoeller 's comments and I will take a look at the rest!

@jaustermann
Copy link
Contributor Author

Wolfgang and Rene, I've pushed new commits that I hope address your comments so far. This is ready to be looked at again. I want to test a couple of things but then I'll also do my own review as Rene suggested. One item that I still want to add is a test. The other item that I'm not sure about is the benchmark file. Currently, the benchmark works within the model domain but not at the surface. There's still something wrong that needs fixing. Do I keep the parameter file in the benchmark folder for now but wait with further documentation in the manual until this fully works? Or should I remove the parameter file until it works?

Copy link
Member

@gassmoeller gassmoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only part way through, but I need to leave it for now. Getting closer.

include/aspect/material_model/additive.h Outdated Show resolved Hide resolved
include/aspect/material_model/additive.h Show resolved Hide resolved
include/aspect/material_model/additive.h Show resolved Hide resolved
include/aspect/material_model/additive.h Outdated Show resolved Hide resolved
include/aspect/material_model/additive.h Outdated Show resolved Hide resolved
source/postprocess/adjoint_kernels.cc Show resolved Hide resolved
source/postprocess/dynamic_topography.cc Show resolved Hide resolved
source/postprocess/visualization.cc Show resolved Hide resolved
source/simulator/assemblers/adjoint.cc Outdated Show resolved Hide resolved
source/simulator/assemblers/adjoint.cc Outdated Show resolved Hide resolved
@bangerth
Copy link
Contributor

bangerth commented Jul 14, 2021 via email

Copy link
Contributor

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read through the rest, and I have primarily one general comment that I'd like us to address:

The adjoint equations are driven by a right hand side function that is typically the derivative of the objective function, i.e., misfit. In adjoint.cc and in the place where you compute the adjoint kernel, you make a very specific choice of how this functional looks like, namely the difference between some reference dynamic topography values and what your model predicts, but that is a case that very likely not a lot of people will actually be interested in. The way one would typically address this is by introducing a plugin system whereby people can implement different objective functions, and these may also include other kinds of observations. Have you given this some thought? I'm not saying that that needs to happen for this particular patch, but the code that is currently in the PR is oddly specific to the very particular case you are considering.

source/simulator/assemblers/adjoint.cc Show resolved Hide resolved
source/simulator/assemblers/adjoint.cc Outdated Show resolved Hide resolved
source/simulator/assemblers/adjoint.cc Outdated Show resolved Hide resolved
source/simulator/core.cc Outdated Show resolved Hide resolved
source/simulator/core.cc Outdated Show resolved Hide resolved
source/simulator/solver_schemes.cc Outdated Show resolved Hide resolved
source/simulator/solver_schemes.cc Outdated Show resolved Hide resolved
source/simulator/solver_schemes.cc Outdated Show resolved Hide resolved
source/simulator/solver_schemes.cc Outdated Show resolved Hide resolved
source/simulator/solver_schemes.cc Outdated Show resolved Hide resolved
@jaustermann
Copy link
Contributor Author

Thank you both for your comments! I’ll work through those today.

Thanks Wolfgang for your question about making this code more versatile and flexible by allowing for different RHS plugins. At the moment it really only does the bare bones. As you noted at a different point, I’m not even reading in data yet to calculate a misfit but just calculate the sensitivity to the produced topography. This is what is used in the benchmark. I had additional code in there to read in data but then thought it might be best to start with the simplest implementation and then add to it in a next step (next pull request) rather than add at this stage. If you want me to add this now though let me know.

It would be great to talk about how this could be made flexible to add other observations as plugins. Currently, I was mostly thinking of adding this in the same execute function (adjoint.cc) and choosing which observation to include based on parameter input, but I see that that’s probably not the most flexible way. Since I would like to add geoid observations as well it would be good to chat about how this should be done best. I’m thinking of saving the actual implementation of the geoid observations for a separate pull request but setting up the right structure already now could be good. Note that changing the observation may also affect the kernel calculation (this is in helper_functions -> compute_sensitivity_kernel).

I also wanted to respond to your comment on the benchmark because that’s a more general point that the code doesn’t yet reproduce the benchmark at the surface. I totally agree that only code that fully works (to the best of our knowledge) should be in the repository. I do think that means for this pull request that we still have to figure out how to properly deal with the surface term, but this should only affect a very small part of the code. So maybe the goal for the hackathon can be to get everything else finalized but then wait till later once the surface term aspect is figured out to merge. That would work for me and my plan is to continue working on this this summer and fall until it's done (hopefully with help from you Wolfgang :).

@bangerth
Copy link
Contributor

@jaustermann and I talked this through earlier today and the plan is to merge the PR in essentially the form it is right now (subject to addressing the comments other than the ones that pertain to the future generalization), given that the PR is big enough as it is now. We'd then come back to the correctness and extensibility question in future PRs.

@jaustermann jaustermann force-pushed the adjoint_hack20 branch 2 times, most recently from 5126a2d to 4bac043 Compare July 16, 2021 23:26
…w and dynamic topography surface observations
…embler. Add example parameterfile to benchmarks folder.
…to helper_functions. Move iteration from core.cc into the specific adjoint solver scheme.
@jaustermann
Copy link
Contributor Author

Thank you so much Wolfgang and Rene for your continued help with this! I addressed all the comments and added a test. I also combined some commits (but not all). Lastly, I rebased to master. I expect my added test to fail and I plan to update it with the test output. However, it seems like other things are failing as well (the manual? and it also cannot be built?). Any thoughts about what's going wrong? It does compile and run correctly on my desktop.

@jaustermann
Copy link
Contributor Author

jaustermann commented Jul 18, 2021

I noticed something weird in the iterations: The (forward) velocity field isn't changing from one iteration to the next even if I choose variable updates that lead to significantly different density or viscosity (and hence should lead to a different flow field). I checked that by only postprocessing the solution after the forward iteration - the density field changes but the velocity field doesn't. It's almost like it takes the initial condition of the compositional fields rather than the current compositional field. What is further weird though is that the # of stokes iterations (in the forward calculation) changes from one iteration to the next so it does seem to solve a different system. I'll continue to think through this but if anything comes to mind why the velocity in the forward solution wouldn't change even though the density does, let me know.

Copy link
Member

@gassmoeller gassmoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found your bug, and left some comments on how to circumvent the block0 workaround. I didnt go through all of the code, because I thought you want to try it first.

// start adjoint calculation
pcout << " Solving the adjoint problem ... " << std::endl;

// set the pressure rhs compatibility to true for the ajdoint problem.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// set the pressure rhs compatibility to true for the ajdoint problem.
// set the pressure rhs compatibility to true for the adjoint problem.

source/simulator/helper_functions.cc Show resolved Hide resolved
Comment on lines +2650 to +2651
solution.block(rho_comp_block) += delta.block(rho_comp_block);
solution.block(eta_comp_block) += delta.block(eta_comp_block);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You write your update into the solution vector here, but you do not update the current_linearization_point inside solve_adjoint_stokes afterwards (like you do for the stokes blocks). The assembly for the next forward problem uses the current linearization point, and that is why your update does not make it into the forward problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants