
Mining rules for KG completion - confidence threshold within mining operator #71

Open
kliegr opened this issue Oct 11, 2021 · 0 comments


kliegr commented Oct 11, 2021

When mining rules for KG completion, it would be useful to have a "mode" for finding rules suited to predicting missing relations. It might suffice to add a feature that imposes an upper limit on confidence. For example, in some tasks the first 10,000 rules all have a confidence of 100%. A rule with 100% confidence yields no missing predictions, because every triple it would predict is already in the KG, so such rules are not suitable for prediction.
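To illustrate why a rule with 100% confidence predicts nothing new, here is a minimal sketch on a toy graph. It assumes standard confidence (support divided by the number of body matches) and single-atom rules of the form `(?a body ?b) => (?a head ?b)`. The triples and the `evaluate` helper are hypothetical and are not part of RDFRules.

```python
# Toy knowledge graph as a set of (subject, predicate, object) triples.
# All entities and predicates here are made up for illustration.
KG = {
    ("anna", "bornIn", "prague"),
    ("anna", "livesIn", "prague"),
    ("ben", "bornIn", "brno"),
    ("ben", "livesIn", "brno"),
    ("carl", "bornIn", "plzen"),
}


def evaluate(kg, body_pred, head_pred):
    """Evaluate the rule (?a body_pred ?b) => (?a head_pred ?b).

    Returns (confidence, missing), where confidence = support / body size
    and missing lists the predicted head triples that are not in the graph.
    """
    body = [(s, o) for (s, p, o) in kg if p == body_pred]
    support = sum((s, head_pred, o) in kg for (s, o) in body)
    confidence = support / len(body) if body else 0.0
    missing = sorted(
        (s, head_pred, o) for (s, o) in body if (s, head_pred, o) not in kg
    )
    return confidence, missing


# livesIn => bornIn holds for every body match, so nothing new is predicted.
conf_a, missing_a = evaluate(KG, "livesIn", "bornIn")
# bornIn => livesIn fails for carl, so it predicts one missing triple.
conf_b, missing_b = evaluate(KG, "bornIn", "livesIn")

print(conf_a, missing_a)
print(round(conf_b, 3), missing_b)
```

Only the second rule, whose confidence is below 1.0, proposes a triple that could be added to the graph.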

RDFRules allows filtering rules by confidence using the Filter operator. The problem is that in some KG completion tasks the vast majority of generated rules may have a confidence of 100%, and I suspect that storing such rules in memory could lead to an out-of-memory error.

For example, in the attached task, five minutes of mining produces a 250 MB ruleset cache file containing 3,474,110 rules. Without the time limit, the mining would eventually run out of memory. After applying the Filter operator on confidence ("<1.0"), only 449 rules remain. For KG completion, only the rules with confidence below 1.0 are useful.
Computing confidence and applying the threshold within the mining operator, so that rules at 100% confidence are never stored, could substantially reduce the memory footprint and thus allow larger datasets to be analyzed.
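The memory argument can be sketched as follows. This is not how RDFRules is implemented; `Rule`, `candidates`, `mine_then_filter`, and `mine_with_ceiling` are hypothetical names, and the rule stream is synthetic. The sketch only contrasts storing every rule before filtering with discarding rules as they are produced.

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Rule(NamedTuple):
    """Hypothetical mined rule with the counts needed for confidence."""
    name: str
    support: int
    body_size: int

    @property
    def confidence(self) -> float:
        return self.support / self.body_size


def candidates(n: int) -> Iterator[Rule]:
    """Synthetic rule stream: every 1000th rule has confidence below 1.0."""
    for i in range(n):
        support = 9 if i % 1000 == 0 else 10
        yield Rule(f"r{i}", support, 10)


def mine_then_filter(stream: Iterable[Rule], ceiling: float) -> Tuple[List[Rule], int]:
    """Store every rule first, then filter (mining followed by a filter step)."""
    stored = list(stream)  # all rules are held in memory at once
    kept = [r for r in stored if r.confidence < ceiling]
    return kept, len(stored)


def mine_with_ceiling(stream: Iterable[Rule], ceiling: float) -> Tuple[List[Rule], int]:
    """Apply the confidence ceiling while rules are produced."""
    kept = [r for r in stream if r.confidence < ceiling]  # rejected rules are never stored
    return kept, len(kept)


kept_after, held_after = mine_then_filter(candidates(100_000), 1.0)
kept_during, held_during = mine_with_ceiling(candidates(100_000), 1.0)

print(held_after, held_during)
```

Both variants return the same rules, but the second one only ever holds the rules that pass the ceiling, which is the saving the proposal is after.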

task (71).json.zip

kliegr changed the title from "Mining rules for KG completion - ceiling on confidence" to "Mining rules for KG completion - confidence threshold within mining operator" on Oct 12, 2021