
Limits to missingness handled by learn_circuit_miss? #118

Open
robertfeldt opened this issue Mar 2, 2022 · 2 comments
@robertfeldt

I have a situation where I need to train models with lots of missing information, i.e. the training matrix is very sparse: more than 90% of the values are missing (within a sub-matrix of the full training matrix). There are many training instances overall, but for each one only a small subset of the covariates are active/non-missing.

Is there some known limit to the missingness that can be handled?
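To make the data shape concrete, here is a minimal sketch of how such a training set could be constructed (the dimensions and the 90% missingness rate are illustrative, not my actual data):

```julia
using DataFrames

n, d = 10_000, 100
# each entry is missing with probability 0.9, otherwise a random Bool
raw = [rand() < 0.9 ? missing : rand(Bool) for _ in 1:n, _ in 1:d]
df1 = DataFrame(raw, :auto)   # column eltypes are Union{Missing, Bool}
```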

When I try training with this data I get an AssertionError:

c1 = learn_circuit_miss(df1; maxiter = 50)

Iteration 0/50. Marginal LogLikelihood = -11.778915; nodes = 1481; edges =  2050; params = 910
ERROR: LoadError: AssertionError: Parameters do not sum to one locally: 1.137407764369371; [-0.2097739565374725, -1.1188958040892987]
Stacktrace:
  [1] (::ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64})(pn::StructSumNode)
    @ ProbabilisticCircuits ~/.julia/packages/ProbabilisticCircuits/1cjGx/src/parameter_learn/parameters.jl:137
  [2] (::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}})()
    @ DirectedAcyclicGraphs ~/.julia/packages/DirectedAcyclicGraphs/teMfW/src/dags.jl:82
  [3] get!(default::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}}, h::Dict{DirectedAcyclicGraphs.DAG, Nothing}, key::StructSumNode)
...

I can provide a fuller stack trace if it would be useful. It is 65 levels deep though. ;)

@khosravipasha
Contributor

Thanks for the report. In the upcoming version v0.4, both learn_circuit_miss and learn_circuit will be gone.

Our v0.4 is on the master branch and is fairly stable now; there are some example scripts there. We are planning on releasing soon, after some more testing and documentation.

There are major API changes, so some code changes would be needed (for example, not using DataFrames anymore and just using Matrix{Union{Missing, ....}} for queries).
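For example, a query input in the new layout would be a plain matrix rather than a DataFrame (a minimal sketch; the element type Bool is just an assumption for binary data):

```julia
# v0.4-style query data: a typed matrix literal with missing entries
data = Union{Missing, Bool}[true  missing  false;
                            missing  true  missing]
# size(data) == (2, 3); eltype(data) == Union{Missing, Bool}
```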


Comment for v0.3.3

In case you want to stay with v0.3.3 for now:
What learn_circuit_miss does differently from learn_circuit is that it uses imputation to generate the initial structure (using the Chow-Liu algorithm). After that, both perform greedy structure learning steps by doing splits and clones.

I think the bug is not in learning the initial structure; it seems more likely to come from parameter learning or bad parameter initialization. I will try to reproduce the bug. If you can provide minimal code that reproduces it, that would be nice.

https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/learner.jl#L50-L56

Alternative Structure (HCLT)

Both of these options give you deterministic and structured-decomposable circuits. If you don't need determinism, we suggest using HCLT structures instead (they are decomposable but not deterministic), as they usually perform better.

  1. Learn Structure using:
    https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/hclt.jl#L385-L390

  2. Learn Parameters using:
    https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/parameter_learn/parameters.jl#L456-L461
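Putting the two steps together, the workflow would look roughly like the sketch below. The function names `hclt` and `estimate_parameters_em` and the keyword arguments are my guesses at the code behind the linked lines; check those lines for the actual signatures:

```julia
# Hypothetical sketch -- names and keywords are assumptions, not the confirmed API
using ProbabilisticCircuits

# 1. Learn an HCLT structure from the training data (hclt.jl, linked above)
pc = hclt(df1; num_hidden_cats = 16)

# 2. Learn parameters with EM, which can handle missing values (parameters.jl, linked above)
estimate_parameters_em(pc, df1; maxiter = 50)
```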

The code is much simpler in v0.4 (currently in master), so it might be worth the switch, though there are major API changes.

@khosravipasha
Contributor

Also, I forgot to ask: were you using the CPU or GPU version?
