🚀 Add NonLinearProgram Support to DiffOpt.jl #260
Conversation
good job @andrewrosemberg! I appreciate the numerous unit tests you have shipped with this PR.
I will try to run your code locally on small NLP instances to assess how fast the current implementation is (my main concern is the total number of allocations). I will also try to test your code on parameterized OPF instances, to assess how far we can get in terms of size.
```julia
################################################
#=
From sIpopt paper: https://optimization-online.org/2011/04/3008/
```
very nice!
```julia
Filling the off-diagonal elements of a sparse matrix to make it symmetric.
"""
function fill_off_diagonal(H)
    ret = H + H'
```
Do we assume `H` is lower-triangular?
```julia
Filling the off-diagonal elements of a sparse matrix to make it symmetric.
"""
function fill_off_diagonal(H)
```
Maybe constrain the input type? `H::SparseMatrixCSC`
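For what it's worth, a minimal sketch of that signature, assuming `H` stores only the lower triangle (illustrative only, not necessarily what the PR does):

```julia
using SparseArrays, LinearAlgebra

# Illustrative sketch, assuming H stores only the lower triangle:
# H + H' double-counts the diagonal, so subtract it once.
function fill_off_diagonal(H::SparseMatrixCSC)
    return H + H' - spdiagm(0 => diag(H))
end
```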
```julia
)
    sense_multiplier = sense_mult(model)
    evaluator = model.cache.evaluator
    y = [model.y[model.model.nlp_index_2_constraint[row].value] for row in rows]
```
Is there a proper getter to access `model.model.nlp_index_2_constraint`?
Not that I know of. @joaquimg, do you?
```julia
# Partial derivative of the equality constraints w.r.t. parameters
∇ₚC = jacobian[:, params_idx]

# M matrix
```
Comment is appreciated!
```julia
# [V_U 0 0 0 (X_U - X)]
# ]
len_w = num_vars + num_ineq
M = spzeros(
```
I think there should be a cleaner way to build `M` and `N`, with fewer allocations. Let me try to build a MWE; we can assess later whether we should address this in this PR or later.
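For reference, one pattern that usually cuts allocations is to collect COO triplets per block and call `sparse` once, instead of writing entries into a preallocated `spzeros`. A rough sketch (the blocks `W`, `A` and the sizes are placeholders, not the PR's actual matrices):

```julia
using SparseArrays

# Append the nonzeros of block B into COO buffers at offset (roff, coff).
function add_block!(Is, Js, Vs, B::SparseMatrixCSC, roff::Int, coff::Int)
    r, c, v = findnz(B)
    append!(Is, r .+ roff)
    append!(Js, c .+ coff)
    append!(Vs, v)
    return nothing
end

# Usage sketch: place W at the top-left and A below it, then assemble M once.
Is, Js, Vs = Int[], Int[], Float64[]
W = sprand(4, 4, 0.3); A = sprand(2, 4, 0.5)   # placeholder blocks
add_block!(Is, Js, Vs, W, 0, 0)
add_block!(Is, Js, Vs, A, size(W, 1), 0)
M = sparse(Is, Js, Vs, size(W, 1) + size(A, 1), 4)
```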
""" | ||
inertia_corrector_factorization(M::SparseMatrixCSC, num_w, num_cons; st=1e-6, max_corrections=50) | ||
|
||
Inertia correction for the factorization of the KKT matrix. Sparse version. |
The theoretical question is: do we need to run inertia correction at the optimum? It could indeed be necessary if SOSC is not satisfied at the solution, but in that case I am not sure sIpopt's formula remains valid.
I think you are right, but couldn't there be numerical errors?
Moreover, I believe the rest of DiffOpt just assumes that the necessary conditions hold (even if they don't). The thing is that their method doesn't fail when the conditions don't hold, while the sIpopt method appears to throw a singular-matrix error either when the conditions don't hold or when we have numerical errors. Therefore, I tried to make it consistent.
Maybe we should log a warning if inertia correction is necessary (apologies if you're already doing this), or add an option `allow_inertia_correction` that defaults to `true`?
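A rough sketch of that suggestion (the function name and the use of `issuccess` instead of an explicit inertia count are simplifications, not the PR's actual code):

```julia
using LinearAlgebra, SparseArrays

# Illustrative sketch: warn when inertia correction kicks in, and allow
# disabling it via a keyword that defaults to true.
function corrected_factorization(M::SparseMatrixCSC; allow_inertia_correction::Bool = true,
                                 st::Float64 = 1e-6, max_corrections::Int = 50)
    K = ldlt(M; check = false)                 # try the plain factorization first
    issuccess(K) && return K
    allow_inertia_correction ||
        error("KKT system is singular and inertia correction is disabled")
    @warn "Inertia correction needed: the KKT system is singular at the solution"
    for i in 1:max_corrections
        K = ldlt(M; shift = st * 10.0^i, check = false)  # add a diagonal shift
        issuccess(K) && return K
    end
    return nothing
end
```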
@andrewrosemberg Playing with your branch right now, I must say I like DiffOpt's interface! I made a dummy mistake when solving the problem HS15 and retrieving the sensitivity. A MWE is:

```julia
model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p[1:2] ∈ MOI.Parameter.([100.0, 1.0]))
@variable(model, x[1:2])
set_upper_bound.(x[1], 0.5)
@objective(model, Min, p[1] * (x[2] - x[1]^2)^2 + (p[2] - x[1])^2)
@constraint(model, x[1] * x[2] >= 1.0)
@constraint(model, x[1] + x[2]^2 >= 0.0)
optimize!(model)
# set parameter perturbations
MOI.set(model, DiffOpt.ForwardParameter(), p[1], 1.0)
# forward differentiate
DiffOpt.forward_differentiate!(model)
Δx = [
    MOI.get(model, DiffOpt.ForwardVariablePrimal(), var) for
    var in x
]
```
I forgot to specify the sensitivity for `p[2]`.
Also, I get an unexpected result if I query the solution after calling `DiffOpt.forward_differentiate!`. The MWE is:

```julia
model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p[1:2] ∈ MOI.Parameter.([100.0, 1.0]))
@variable(model, x[1:2])
set_upper_bound.(x[1], 0.5)
@objective(model, Min, p[1] * (x[2] - x[1]^2)^2 + (p[2] - x[1])^2)
@constraint(model, x[1] * x[2] >= 1.0)
@constraint(model, x[1] + x[2]^2 >= 0.0)
optimize!(model)
# set parameter perturbations
MOI.set(model, DiffOpt.ForwardParameter(), p[1], 1.0)
MOI.set(model, DiffOpt.ForwardParameter(), p[2], 1.0)
# forward differentiate
DiffOpt.forward_differentiate!(model)
JuMP.value.(x)
```

Output:
@frapac good question! I am fine either way. @joaquimg, what would be the consistent approach given the rest of DiffOpt?
I don't think this is the desired outcome, haha. Not sure how to ask MOI to ignore the forward attributes, but I will look into it. @joaquimg, do you have an idea on how to avoid this?
@andrewrosemberg Another issue I noted: if we are using non-standard indexing in JuMP (e.g. `x[0:2]`), the sensitivity query does not work as expected. A MWE is:

```julia
model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p[1:2] ∈ MOI.Parameter.([100.0, 1.0]))
# N.B: use non-standard indexing
@variable(model, x[0:2])
set_upper_bound.(x[1], 0.5)
@objective(model, Min, p[1] * (x[2] - x[1]^2)^2 + (p[2] - x[1])^2)
@constraint(model, x[1] * x[2] >= 1.0)
@constraint(model, x[1] + x[2]^2 >= 0.0)
optimize!(model)
# set parameter perturbations
MOI.set(model, DiffOpt.ForwardParameter(), p[1], 1.0)
MOI.set(model, DiffOpt.ForwardParameter(), p[2], 1.0)
# forward differentiate
DiffOpt.forward_differentiate!(model)
Δx = [
    MOI.get(model, DiffOpt.ForwardVariablePrimal(), var) for
    var in x
]
```
Pushing it one step further, I have tried to differentiate an ACOPF instance using DiffOpt. I reused the code in rosetta-opf and ported it to DiffOpt. You can find a gist here. Two observations:
This PR introduces a new module, `NonLinearProgram`, to extend DiffOpt.jl's functionality for differentiating nonlinear optimization problems (NLPs). The implementation integrates with JuMP-based nonlinear models and supports advanced derivative computation through a custom evaluator and differentiation logic.

🆕 Features

Nonlinear Model Differentiation:
- Differentiates the primal variables (`focus_vars`) and dual variables (`focus_duals`) with respect to a given set of parameters.

Core Structures:
- `Cache`: stores primal variables, parameters, evaluator, and constraints for efficient reuse.
- `ForwCache`: holds results of forward differentiation, including sensitivities for specified variables.
- `ReverseCache`: holds results of reverse differentiation (implemented in this PR).

Integration with DiffOpt API:
- Implements `DiffOpt.AbstractModel` for seamless compatibility with DiffOpt's API.
- Provides `forward_differentiate!` and `reverse_differentiate!` functions for NLPs.

🔧 How It Works
Custom Sensitivity Calculations:
- Forward Differentiation: computes sensitivities of the variables of interest with respect to the parameters.
- Reverse Differentiation: propagates sensitivities from the variables of interest back to the parameters; results are stored in `ReverseCache`.

📜 Implementation Highlights

- Forward Differentiation: computes derivatives of the variables of interest w.r.t. `params`; results are stored in `ForwCache`.
- Reverse Differentiation: results are stored in `ReverseCache`.
- Custom Utilities: uses `create_evaluator`, `compute_sensitivity`, and other utilities from `nlp_utilities.jl` for efficient derivative computation.

📋 Example Usage
Forward Differentiation
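A minimal forward-mode sketch on a toy parameterized NLP, using the attribute names that appear in the discussion above (`DiffOpt.ForwardParameter`, `DiffOpt.ForwardVariablePrimal`); the model itself is just an illustration:

```julia
using JuMP, Ipopt, DiffOpt
import MathOptInterface as MOI

model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p in MOI.Parameter(1.0))   # parameter
@variable(model, x >= 0.1)
@objective(model, Min, (x - p)^2 + 1 / x)   # a simple parameterized NLP
optimize!(model)

# Seed the perturbation of p and differentiate forward.
MOI.set(model, DiffOpt.ForwardParameter(), p, 1.0)
DiffOpt.forward_differentiate!(model)

# Sensitivity of the optimal x with respect to p.
dx_dp = MOI.get(model, DiffOpt.ForwardVariablePrimal(), x)
```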
Reverse Differentiation
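A hypothetical reverse-mode sketch, continuing from the forward example above. The attribute names `DiffOpt.ReverseVariablePrimal` and `DiffOpt.ReverseParameter` are assumptions (guessed from DiffOpt's existing reverse API and the forward attributes above), not confirmed by this PR:

```julia
# Hypothetical sketch; attribute names are assumptions, see note above.
MOI.set(model, DiffOpt.ReverseVariablePrimal(), x, 1.0)  # seed the output sensitivity
DiffOpt.reverse_differentiate!(model)
dp = MOI.get(model, DiffOpt.ReverseParameter(), p)       # pullback onto the parameter
```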
🚧 TODO
🛠 Future Work