
Limits to missingness handled by learn_circuit_miss? #118

Open
robertfeldt opened this issue Mar 2, 2022 · 2 comments
@robertfeldt

I have a situation where I need to train models with lots of missing information, i.e. the training matrix is very sparse: more than 90% of the values are missing (within a sub-matrix of the full training matrix). There are many training instances overall, but for each one only a small subset of the covariates are active/non-missing.

Is there some known limit to the missingness that can be handled?
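To make the data shape concrete, here is a minimal sketch of how such a training set could be constructed (the dimensions and the 90% missingness rate are illustrative, not my actual data):

```julia
using DataFrames

n, d = 10_000, 100
# each entry is missing with probability 0.9, otherwise a random Bool
raw = [rand() < 0.9 ? missing : rand(Bool) for _ in 1:n, _ in 1:d]
df1 = DataFrame(raw, :auto)   # column eltypes are Union{Missing, Bool}
```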

When I try training with this data I get an AssertionError:

c1 = learn_circuit_miss(df1; maxiter = 50)

Iteration 0/50. Marginal LogLikelihood = -11.778915; nodes = 1481; edges =  2050; params = 910
ERROR: LoadError: AssertionError: Parameters do not sum to one locally: 1.137407764369371; [-0.2097739565374725, -1.1188958040892987]
Stacktrace:
  [1] (::ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64})(pn::StructSumNode)
    @ ProbabilisticCircuits ~/.julia/packages/ProbabilisticCircuits/1cjGx/src/parameter_learn/parameters.jl:137
  [2] (::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}})()
    @ DirectedAcyclicGraphs ~/.julia/packages/DirectedAcyclicGraphs/teMfW/src/dags.jl:82
  [3] get!(default::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}}, h::Dict{DirectedAcyclicGraphs.DAG, Nothing}, key::StructSumNode)
...

I can provide a fuller stack trace if it would be useful. It is 65 levels deep though. ;)

@khosravipasha
Contributor

Thanks for the report. In the upcoming version v0.4, both learn_circuit_miss and learn_circuit will be gone.

Our v0.4 is on the master branch and is fairly stable now; there are some example scripts there. We are planning on releasing soon, after some more testing and documentation.

There are major API changes, so some code changes would be needed (for example, not using DataFrames anymore and just using Matrix{Union{Missing, ....}} for queries).
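For example, a query input in the new layout would be a plain matrix rather than a DataFrame (a minimal sketch; the element type Bool is just an assumption for binary data):

```julia
# v0.4-style query data: a typed matrix literal with missing entries
data = Union{Missing, Bool}[true  missing  false;
                            missing  true  missing]
# size(data) == (2, 3); eltype(data) == Union{Missing, Bool}
```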


Comment for v0.3.3

In case you want to stay with v0.3.3 for now:
What learn_circuit_miss does differently from learn_circuit is that it uses imputation to generate the initial structure (using the Chow-Liu algorithm). After that, both perform greedy structure learning steps by doing splits and clones.

I think the bug is not in learning the initial structure; it seems more likely to come from parameter learning or bad parameter initialization. I will try to reproduce the bug. If you can provide minimal code that reproduces it, that would be nice.

https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/learner.jl#L50-L56

Alternative Structure (HCLT)

Both of these options give you deterministic and structured-decomposable circuits. If you don't need determinism, we suggest using HCLT structures instead (they are decomposable but not deterministic), as they usually perform better.

  1. Learn Structure using:
    https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/hclt.jl#L385-L390

  2. Learn Parameters using:
    https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/parameter_learn/parameters.jl#L456-L461
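Putting the two steps together, the workflow would look roughly like the sketch below. The function names `hclt` and `estimate_parameters_em` and the keyword arguments are my guesses at the code behind the linked lines; check those lines for the actual signatures:

```julia
# Hypothetical sketch -- names and keywords are assumptions, not the confirmed API
using ProbabilisticCircuits

# 1. Learn an HCLT structure from the training data (hclt.jl, linked above)
pc = hclt(df1; num_hidden_cats = 16)

# 2. Learn parameters with EM, which can handle missing values (parameters.jl, linked above)
estimate_parameters_em(pc, df1; maxiter = 50)
```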

The code is much simpler in v0.4 (currently in master), so it might be worth the switch, though there are major API changes.

@khosravipasha
Contributor

Also, I forgot to ask: were you using the CPU or GPU version?
