Commit
Showing 9 changed files with 183 additions and 62 deletions.
@@ -214,34 +214,17 @@ pre_trained_kge.predict_topk(r=[".."],t=[".."],topk=10)

</details>

## Using Large Pre-trained Embedding Models
## Downloading Pretrained Models

<details> <summary> To see a code snippet </summary>

**Stay tune for Keci with >10B parameters on DBpedia!**
```bash
# To download a pretrained ConEx on DBpedia 03-2022
mkdir ConEx && cd ConEx && wget -r -nd -np https://hobbitdata.informatik.uni-leipzig.de/KGE/DBpedia/ConEx/ && cd ..
```
```python
from dicee import KGE
# (1) Load a pretrained ConEx on DBpedia
pre_trained_kge = KGE(path='ConEx')
pre_trained_kge.triple_score(h=["http://dbpedia.org/resource/Albert_Einstein"],r=["http://dbpedia.org/ontology/birthPlace"],t=["http://dbpedia.org/resource/Ulm"]) # tensor([0.9309])
pre_trained_kge.triple_score(h=["http://dbpedia.org/resource/Albert_Einstein"],r=["http://dbpedia.org/ontology/birthPlace"],t=["http://dbpedia.org/resource/German_Empire"]) # tensor([0.9981])
pre_trained_kge.triple_score(h=["http://dbpedia.org/resource/Albert_Einstein"],r=["http://dbpedia.org/ontology/birthPlace"],t=["http://dbpedia.org/resource/Kingdom_of_Württemberg"]) # tensor([0.9994])
pre_trained_kge.triple_score(h=["http://dbpedia.org/resource/Albert_Einstein"],r=["http://dbpedia.org/ontology/birthPlace"],t=["http://dbpedia.org/resource/Germany"]) # tensor([0.9498])
pre_trained_kge.triple_score(h=["http://dbpedia.org/resource/Albert_Einstein"],r=["http://dbpedia.org/ontology/birthPlace"],t=["http://dbpedia.org/resource/France"]) # very low
pre_trained_kge.triple_score(h=["http://dbpedia.org/resource/Albert_Einstein"],r=["http://dbpedia.org/ontology/birthPlace"],t=["http://dbpedia.org/resource/Italy"]) # very low
model = KGE(url="https://files.dice-research.org/projects/DiceEmbeddings/KINSHIP-Keci-dim128-epoch256-KvsAll")
```

Please contact: ```[email protected] ``` or ```[email protected] ``` , if you lack hardware resources to obtain embeddings of a specific knowledge Graph.
- [DBpedia version: 06-2022 Embeddings](https://hobbitdata.informatik.uni-leipzig.de/KGE/DBpediaQMultEmbeddings_03_07):
- Models: ConEx, QMult
- [YAGO3-10 ConEx embeddings](https://hobbitdata.informatik.uni-leipzig.de/KGE/conex/YAGO3-10.zip)
- [FB15K-237 ConEx embeddings](https://hobbitdata.informatik.uni-leipzig.de/KGE/conex/FB15K-237.zip)
- [WN18RR ConEx embeddings](https://hobbitdata.informatik.uni-leipzig.de/KGE/conex/WN18RR.zip)
- For more please look at [Hobbit Data](https://files.dice-research.org/projects/DiceEmbeddings/)
- For more please look at [dice-research.org/projects/DiceEmbeddings/](https://files.dice-research.org/projects/DiceEmbeddings/)

</details>

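The `triple_score` calls in this hunk each score one candidate birthplace for the same head and relation. As a small illustration of how such scores could be compared, the sketch below ranks the four candidates using only the values quoted in the trailing comments. It does not call dicee, and the dictionary name `candidate_scores` is made up for this example. France and Italy are left out because the hunk only says "very low" and gives no numbers.

```python
# Scores copied from the comments in the README hunk; no model is loaded here.
candidate_scores = {
    "http://dbpedia.org/resource/Ulm": 0.9309,
    "http://dbpedia.org/resource/German_Empire": 0.9981,
    "http://dbpedia.org/resource/Kingdom_of_Württemberg": 0.9994,
    "http://dbpedia.org/resource/Germany": 0.9498,
}

# Sort candidates from the highest score to the lowest.
ranked = sorted(candidate_scores.items(), key=lambda item: item[1], reverse=True)

for position, (uri, score) in enumerate(ranked, start=1):
    print(f"{position}. {uri} ({score:.4f})")
```

All four quoted scores are above 0.93, so the ordering among them depends on small differences.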
@@ -0,0 +1,42 @@
# pip install dicee
from dicee import KGE
import pandas as pd
from dicee.static_funcs import get_er_vocab
from dicee.eval_static_funcs import evaluate_link_prediction_performance_with_reciprocals

# (1) Download a pre-trained model and store it in a newly created directory (KINSHIP-Keci-dim128-epoch256-KvsAll)
model = KGE(url="https://files.dice-research.org/projects/DiceEmbeddings/KINSHIP-Keci-dim128-epoch256-KvsAll")
# (2) Make a prediction
print(model.predict(h="person49", r="term12", t="person39", logits=False))
# Load the train, validation, and test datasets (raw-string separator avoids an invalid escape sequence)
train_triples = pd.read_csv("KGs/KINSHIP/train.txt",
                            sep=r"\s+",
                            header=None, usecols=[0, 1, 2],
                            names=['subject', 'relation', 'object'],
                            dtype=str).values.tolist()
valid_triples = pd.read_csv("KGs/KINSHIP/valid.txt",
                            sep=r"\s+",
                            header=None, usecols=[0, 1, 2],
                            names=['subject', 'relation', 'object'],
                            dtype=str).values.tolist()
test_triples = pd.read_csv("KGs/KINSHIP/test.txt",
                           sep=r"\s+",
                           header=None, usecols=[0, 1, 2],
                           names=['subject', 'relation', 'object'],
                           dtype=str).values.tolist()
# Compute the mapping from each unique (entity, relation) pair to all entities observed with it, i.e.,
# V_{e_i,r_j} = {x | x \in Entities s.t. (e_i, r_j, x) \in Train \cup Val \cup Test}
# This mapping is used to compute the filtered MRR and Hits@n
er_vocab = get_er_vocab(train_triples + valid_triples + test_triples)

result = model.get_eval_report()

print(result["Train"])
print(evaluate_link_prediction_performance_with_reciprocals(model, triples=train_triples,
                                                            er_vocab=er_vocab))
print(result["Val"])
print(evaluate_link_prediction_performance_with_reciprocals(model, triples=valid_triples,
                                                            er_vocab=er_vocab))
print(result["Test"])
print(evaluate_link_prediction_performance_with_reciprocals(model, triples=test_triples,
                                                            er_vocab=er_vocab))
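The comments in the script describe a mapping from each (entity, relation) pair to every entity that completes a known triple, and say it is used for the filtered MRR and Hits@n. The sketch below shows that idea on toy data with the standard library only. `build_er_vocab`, `filtered_rank`, `mrr_and_hits`, the toy triples, and the toy scores are all hypothetical stand-ins written for this illustration. They are not dicee's `get_er_vocab` or its evaluation code, which may differ in details such as tie handling and reciprocal triples.

```python
from collections import defaultdict


def build_er_vocab(triples):
    """Map each (head, relation) pair to the set of tails seen with it."""
    er_vocab = defaultdict(set)
    for head, relation, tail in triples:
        er_vocab[(head, relation)].add(tail)
    return er_vocab


def filtered_rank(scores, target, known_tails):
    """Rank of `target` among all entities, ignoring other known true tails.

    scores: dict mapping entity -> score, higher is better.
    known_tails: entities that also form a true triple with the query.
    """
    target_score = scores[target]
    better = sum(
        1
        for entity, score in scores.items()
        if score > target_score and entity not in known_tails
    )
    return better + 1


def mrr_and_hits(ranks, n=1):
    """Mean reciprocal rank and Hits@n over a list of ranks."""
    mrr = sum(1.0 / rank for rank in ranks) / len(ranks)
    hits = sum(1 for rank in ranks if rank <= n) / len(ranks)
    return mrr, hits


# Toy splits (assumed data, not KINSHIP).
train = [["a", "likes", "b"], ["a", "likes", "c"]]
valid = [["b", "likes", "c"]]
test = [["a", "likes", "d"]]
er_vocab = build_er_vocab(train + valid + test)

# Hypothetical model scores for the query (a, likes, ?).
toy_scores = {"a": 0.1, "b": 0.9, "c": 0.8, "d": 0.7}

# Without filtering, the true tails b and c outrank the test tail d.
raw_rank = filtered_rank(toy_scores, "d", known_tails=set())
# With filtering, b and c are discarded because they are known true answers.
filt_rank = filtered_rank(toy_scores, "d", known_tails=er_vocab[("a", "likes")])

print(raw_rank, filt_rank)
print(mrr_and_hits([raw_rank, filt_rank], n=1))
```

Filtering keeps a model from being penalized for ranking other correct answers above the one under test, which is why the vocabulary is built from the union of the train, validation, and test triples.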