Merge branch 'master' of https://github.com/OpenBioLink/SAFRAN
Showing 7 changed files with 366 additions and 7 deletions.
@@ -0,0 +1,107 @@
Evaluate predictions
====================

The following scripts can be found in the ``python`` folder.

Evaluate a single prediction file
---------------------------------

Use this script to evaluate a single prediction file.

Script name: eval.py

**Requires:**

.. code:: bash

   pip install scipy tqdm

**Usage:**

.. code:: bash

   python eval.py {path to file containing predictions} {path to testset file}
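
For example, with the directory layout shown further below, a call could look like this (the file and folder names are purely illustrative):

.. code:: bash

   python eval.py OBL/predictions/predfile1.txt OBL/data/test.txt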

**Example output:**

::

   MRR: 0.389
   Hits@1: 0.298
   Hits@3: 0.371
   Hits@10: 0.537

Evaluate an experiment
----------------------

Use this script to evaluate multiple datasets, each containing multiple prediction files, at once (multiple datasets -> multiple prediction files).

Script name: eval_experiment.py

**Requires:**

.. code:: bash

   pip install scipy tqdm

.. _usage-1:

**Usage:**

.. code:: bash

   python eval_experiment.py --datasets {list of datasets} --predictions {list of prediction file names}

**File structure:**

Each dataset should have its own folder. Evaluations are run as follows:

::

   for each {dataset} in {list of datasets}:
       for each {prediction file name} in {list of prediction file names}:
           Path to prediction file: f"./{dataset}/predictions/{prediction file name}"
           Path to testset file: f"./{dataset}/data/test.txt"

Example:

.. code:: bash

   python eval_experiment.py --datasets OBL WN18RR --predictions predfile1.txt predfile2.txt
.. code:: text

   ---- OBL
       |
       ---- predictions
       |    |
       |    ---- predfile1.txt
       |    |
       |    ---- predfile2.txt
       |
       ---- data
            |
            ---- test.txt
   ---- WN18RR
       |
       ---- predictions
       |    |
       |    ---- predfile1.txt
       |    |
       |    ---- predfile2.txt
       |
       ---- data
            |
            ---- test.txt
Output:

::

   OBL
   predfile1.txt    MRR: 0.389 Hits@1: 0.298 Hits@3: 0.371 Hits@10: 0.537
   predfile2.txt    MRR: 0.389 Hits@1: 0.298 Hits@3: 0.371 Hits@10: 0.537
   WN18RR
   predfile1.txt    MRR: 0.389 Hits@1: 0.298 Hits@3: 0.371 Hits@10: 0.537
   predfile2.txt    MRR: 0.389 Hits@1: 0.298 Hits@3: 0.371 Hits@10: 0.537
@@ -0,0 +1,96 @@
## eval.py

Script used to evaluate a single prediction file.

##### Requires:

```
pip install scipy tqdm
```

##### Usage:

```bash
python eval.py {path to file containing predictions} {path to testset file}
```

##### Example output:

```
MRR: 0.389
Hits@1: 0.298
Hits@3: 0.371
Hits@10: 0.537
```

## eval_experiment.py

Script used to evaluate an experiment (multiple datasets -> multiple prediction files).

##### Requires:

```
pip install scipy tqdm
```

##### Usage:

```bash
python eval_experiment.py --datasets {list of datasets} --predictions {list of prediction file names}
```

**File structure:**

Each dataset should have its own folder. Evaluations are run as follows:

```text
for each {dataset} in {list of datasets}:
    for each {prediction file name} in {list of prediction file names}:
        Path to prediction file: f"./{dataset}/predictions/{prediction file name}"
        Path to testset file: f"./{dataset}/data/test.txt"
```

Example:

```bash
python eval_experiment.py --datasets OBL WN18RR --predictions predfile1.txt predfile2.txt
```

```text
---- OBL
    |
    ---- predictions
    |    |
    |    ---- predfile1.txt
    |    |
    |    ---- predfile2.txt
    |
    ---- data
         |
         ---- test.txt
---- WN18RR
    |
    ---- predictions
    |    |
    |    ---- predfile1.txt
    |    |
    |    ---- predfile2.txt
    |
    ---- data
         |
         ---- test.txt
```

## amie_2_anyburl.py

Converts rules learned by AMIE+ to the format of AnyBURL rules.

**Usage:**

```
python amie_2_anyburl.py --from {path to amie rulefile} --to {path to file storing transformed rules, will be created}
```

Optionally, the flag `--pca` can be set to use PCA confidence instead of standard confidence.
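
For example, a conversion using PCA confidence could look like this (both file names are placeholders):

```
python amie_2_anyburl.py --from amie_rules.txt --to anyburl_rules.txt --pca
```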
File renamed without changes.
@@ -0,0 +1,66 @@
import os
import math
from scipy.stats import rankdata
from tqdm import tqdm
import sys

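# For reference, read_predictions below expects an AnyBURL-style prediction
# file with three lines per test triple (entity names and scores here are
# only illustrative):
#
#   head_entity relation tail_entity
#   Heads: cand1<TAB>0.87<TAB>cand2<TAB>0.55<TAB>...
#   Tails: cand3<TAB>0.91<TAB>cand4<TAB>0.40<TAB>...
#
# i.e. after the 7-character "Heads: "/"Tails: " prefix, candidate entities
# and confidences alternate, separated by tabs.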
def read_predictions(path):
    # Yields one (true entity, candidate list, confidence list) tuple for the
    # head prediction and one for the tail prediction of every test triple.
    with open(path, encoding="utf8") as infile:
        while True:
            triple = infile.readline().strip().split(" ")
            if not triple or triple[0] == "":
                break
            head, rel, tail = triple
            # Drop the 7-character "Heads: "/"Tails: " prefix; the remainder is a
            # tab-separated list alternating candidate entity and confidence.
            pred_heads = infile.readline().strip()[7:].split("\t")
            pred_tails = infile.readline().strip()[7:].split("\t")

            # Confidences are encoded as large fixed-width integers (right-padded
            # to 100 digits) so they can be ranked without float parsing; a
            # confidence starting with "1" is mapped to the maximum value.
            confidences_head = [int(x.replace("0.", "").replace("1.", "1").ljust(100, "0")) if (not x.startswith("1.") and not x.startswith("1")) else int("1".ljust(101, "0")) for x in pred_heads[1::2]]
            confidences_tail = [int(x.replace("0.", "").replace("1.", "1").ljust(100, "0")) if (not x.startswith("1.") and not x.startswith("1")) else int("1".ljust(101, "0")) for x in pred_tails[1::2]]

            yield (head, pred_heads[0::2], confidences_head)
            yield (tail, pred_tails[0::2], confidences_tail)


def get_n_test(path):
    # Number of triples in the test set (one triple per line).
    content = None
    with open(path, encoding="utf8") as infile:
        content = infile.readlines()
    content = [x.strip() for x in content]
    return len(content)


def evaluate_policy(path_predictions, n, policy):
    # Computes MRR and Hits@k over all head and tail prediction tasks, using
    # the given scipy.stats.rankdata tie-breaking method (e.g. "average").
    hits1 = 0
    hits3 = 0
    hits10 = 0
    mrr = 0.0
    mr = 0

    for true_entity, prediction, conf in read_predictions(path_predictions):
        ranking = rankdata([-x for x in conf], method=policy)
        try:
            idx = prediction.index(true_entity)
            rank = ranking[idx]

            if rank == 1.:
                hits1 = hits1 + 1
            if rank <= 3.:
                hits3 = hits3 + 1
            if rank <= 10.:
                hits10 = hits10 + 1
            mrr = mrr + (1 / rank)
        except ValueError:
            # The true entity is not among the candidates: it contributes 0.
            pass
    #return hits1/n, hits3/n, hits10/n, mrr/n
    return "MRR: %.3f" % (mrr/n), "Hits@1: %.3f" % (hits1/n), "Hits@3: %.3f" % (hits3/n), "Hits@10: %.3f" % (hits10/n)


def evaluate(path_predictions, path_test):
    # Every test triple yields two ranking tasks (head and tail prediction).
    n = get_n_test(path_test) * 2
    #["ordinal", "average", "min", "max", "dense"]
    result = evaluate_policy(path_predictions, n, "average")
    return "\n".join(result)


if __name__ == "__main__":
    res = evaluate(sys.argv[1], sys.argv[2])
    print(res)
@@ -0,0 +1,88 @@
import os
import math
from scipy.stats import rankdata
from tqdm import tqdm
import argparse

class ArgParser(argparse.ArgumentParser):
    def __init__(self):
        super(ArgParser, self).__init__()

        self.add_argument('--datasets', type=str, default=[""], nargs='+',
                          help='a list of datasets')
        self.add_argument('--predictions', type=str, default=[""], nargs='+',
                          help='a list of prediction file names')

    def parse_args(self):
        args = super().parse_args()
        return args

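# For illustration: the command shown in the README,
#   python eval_experiment.py --datasets OBL WN18RR --predictions predfile1.txt predfile2.txt
# parses into args.datasets == ["OBL", "WN18RR"] and
# args.predictions == ["predfile1.txt", "predfile2.txt"].
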
def read_predictions(path):
    # Yields one (true entity, candidate list, confidence list) tuple for the
    # head prediction and one for the tail prediction of every test triple.
    with open(path, encoding="utf8") as infile:
        while True:
            triple = infile.readline().strip().split(" ")
            if not triple or triple[0] == "":
                break
            head, rel, tail = triple
            # Drop the 7-character "Heads: "/"Tails: " prefix; the remainder is a
            # tab-separated list alternating candidate entity and confidence.
            pred_heads = infile.readline().strip()[7:].split("\t")
            pred_tails = infile.readline().strip()[7:].split("\t")

            # Confidences are encoded as large fixed-width integers (right-padded
            # to 100 digits) so they can be ranked; values in scientific notation
            # ("E") are expanded via float() first.
            confidences_head = [int(x.replace("0.", "0").replace("1.", "1").ljust(100, "0")) if ("E" not in x) else int(str(float(x)).replace("0.", "0").ljust(100, "0")) for x in pred_heads[1::2]]
            confidences_tail = [int(x.replace("0.", "").replace("1.", "1").ljust(100, "0")) if ("E" not in x) else int(str(float(x)).replace("0.", "0").ljust(100, "0")) for x in pred_tails[1::2]]

            yield (head, pred_heads[0::2], confidences_head)
            yield (tail, pred_tails[0::2], confidences_tail)


def get_n_test(path):
    # Number of triples in the test set (one triple per line).
    content = None
    with open(path, encoding="utf8") as infile:
        content = infile.readlines()
    content = [x.strip() for x in content]
    return len(content)


def evaluate_policy(path_predictions, n, policy):
    # Computes MRR and Hits@k over all head and tail prediction tasks, using
    # the given scipy.stats.rankdata tie-breaking method (e.g. "average").
    hits1 = 0
    hits3 = 0
    hits10 = 0
    mrr = 0.0
    mr = 0

    for true_entity, prediction, conf in read_predictions(path_predictions):
        ranking = rankdata([-x for x in conf], method=policy)
        try:
            idx = prediction.index(true_entity)
            rank = ranking[idx]

            if rank == 1.:
                hits1 = hits1 + 1
            if rank <= 3.:
                hits3 = hits3 + 1
            if rank <= 10.:
                hits10 = hits10 + 1
            mrr = mrr + (1 / rank)
        except ValueError:
            # The true entity is not among the candidates: it contributes 0.
            pass
    return "MRR: %.3f" % (mrr/n), "Hits@1: %.3f" % (hits1/n), "Hits@3: %.3f" % (hits3/n), "Hits@10: %.3f" % (hits10/n)


def evaluate(path_predictions, path_test):
    # Every test triple yields two ranking tasks (head and tail prediction).
    n = get_n_test(path_test) * 2
    #["ordinal", "average", "min", "max", "dense"]
    result = evaluate_policy(path_predictions, n, "average")
    return " ".join(result)


if __name__ == "__main__":
    args = ArgParser().parse_args()

    for dataset in args.datasets:
        print(dataset)
        # "prediction_file" avoids shadowing the built-in eval().
        for prediction_file in args.predictions:

            res = evaluate(f"./{dataset}/predictions/{prediction_file}", f"./{dataset}/data/test.txt")

            print(prediction_file.ljust(25) + res)

        print()