Confidence counting of a high-support rule (support 11,694,826) does not finish within five hours.
The problem is possibly inefficient memory usage: after five hours, the allocated memory (according to a server-side `top`) is 98.6% of the available memory (94 GB), while CPU use is only around 1% (with unlimited parallelism).
It is also noteworthy that the memory use reported by RDFRules does not exactly match the server-side metering (the client shows "Used memory: 74.81 GB / 90.00 GB").
This is not a bug, but a sampling strategy could possibly be used to compute approximate confidence.
taskAndRules.zip
High support alone does not seem to be the cause. Another rule in the same task, ( ?b <interacts_with> ?a ) => ( ?a <interacts_with> ?b ) | HeadCoverage: 0.9917529917281246, HeadSize: 11702183, Support: 11605675, has almost identical support (11605675), yet its confidence is computed in several seconds.
The problematic rules are ( ?b <provided_by> ?c ) ^ ( ?a <provided_by> ?c ) => ( ?a <interacts_with> ?b ) | HeadCoverage: 0.9993713138822047, HeadSize: 11702183, Support: 11694826 and ( ?a <category> ?c ) ^ ( ?b <category> ?c ) => ( ?a <interacts_with> ?b ) | HeadCoverage: 0.9918052042084797, HeadSize: 11702183, Support: 11606286.
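One possible way to see why the two-atom rules behave so differently from the single-atom rule is to estimate how many body bindings each has to enumerate. The sketch below is not RDFRules code; the function name and the toy data are hypothetical. It assumes the body joins two atoms with the same predicate on a shared object `?c`, so the number of (a, b, c) bindings is the sum of n_c squared over all objects c, where n_c is the number of distinct subjects pointing at c. This is an upper bound on the number of distinct (a, b) pairs.

```python
from collections import Counter


def join_body_upper_bound(triples, predicate):
    """Count (a, b, c) bindings of (?a <predicate> ?c) ^ (?b <predicate> ?c).

    The result is sum(n_c ** 2) over all objects c, where n_c is the number
    of distinct subjects sharing object c. It bounds the number of distinct
    (a, b) pairs from above and includes pairs with a == b.
    """
    subjects_per_object = Counter()
    seen = set()
    for s, p, o in triples:
        # Count every (subject, object) pair of the predicate only once.
        if p == predicate and (s, o) not in seen:
            seen.add((s, o))
            subjects_per_object[o] += 1
    return sum(n * n for n in subjects_per_object.values())


# Hypothetical toy data: 1000 items that all share one provider.
toy = [(f"item{i}", "provided_by", "source0") for i in range(1000)]

single_atom_bindings = len(toy)  # a one-atom body binds once per triple
join_bindings = join_body_upper_bound(toy, "provided_by")
```

On this toy input the single-atom body has 1000 bindings while the join has 1000000, so a predicate such as `provided_by` or `category` with only a few distinct objects can plausibly make the body far larger than the support suggests.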
It is a combinatorial explosion. One solution is an anytime approach with sampling and approximate results. For now, I have added better debugging of stuck rules and the possibility to interrupt mining or confidence computing tasks. Fortunately, during mining, the hardest rules are mined at the end of the queue of rules to be refined.
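As an illustration of what a sampling-based approximation could look like, here is a minimal sketch. It is an assumption about one possible approach, not how RDFRules computes confidence, and `approx_confidence` and its parameters are hypothetical names. It samples body bindings (a, b, c) uniformly by picking `?c` with probability proportional to n_c squared and then drawing `?a` and `?b` independently, and it reports the fraction of samples for which the head triple exists. It assumes the input triples are deduplicated, and it estimates confidence over bindings rather than over distinct (a, b) pairs, so the result can differ from the exact value when a pair shares several objects.

```python
import random
from collections import defaultdict


def approx_confidence(triples, body_pred, head_pred, n_samples=10_000, seed=0):
    """Estimate the confidence of
    (?a <body_pred> ?c) ^ (?b <body_pred> ?c) => (?a <head_pred> ?b)
    from n_samples uniformly sampled body bindings."""
    rng = random.Random(seed)
    subjects_by_object = defaultdict(list)
    head_pairs = set()
    for s, p, o in triples:
        if p == body_pred:
            subjects_by_object[o].append(s)
        if p == head_pred:
            head_pairs.add((s, o))
    objects = list(subjects_by_object)
    if not objects:
        return 0.0
    # An object with n subjects contributes n * n bindings to the body.
    weights = [len(subjects_by_object[c]) ** 2 for c in objects]
    hits = 0
    for c in rng.choices(objects, weights=weights, k=n_samples):
        a = rng.choice(subjects_by_object[c])
        b = rng.choice(subjects_by_object[c])
        if (a, b) in head_pairs:
            hits += 1
    return hits / n_samples
```

Because the estimate is a running fraction, the loop could be stopped after any number of samples, which fits the anytime idea; the sample count trades accuracy for time and memory stays proportional to the input rather than to the number of body bindings.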