Investigating Transfer Learning for Link Prediction in Graph Neural Networks

This repository contains the code for the paper "Investigating Transfer Learning for Link Prediction in Graph Neural Networks" by Thomas Mandelz, Jan Zwicky, Daniel Perruchoud and Stephan Heule.

Abstract

Graph Neural Networks (GNNs) have emerged as powerful tools for learning representations of graph-structured data and are increasingly applied across domains. Transfer learning has shown remarkable success in traditional deep learning tasks, enabling faster training and enhanced performance. Despite the growing popularity of GNNs, their transferability, particularly in link prediction, is not well studied.

This research investigates the applications of transfer learning in link prediction using GNNs, focusing on enhancing model performance as well as training efficiency through pre-training GNN models, followed by fine-tuning. Specifically, we train Graph Convolutional Network (GCN), GraphSAGE and Graph Isomorphism Network (GIN) architectures and investigate the benefits of transfer learning by pre-training and fine-tuning models on public data (i.e. on ogbn-papers100M and ogbn-arxiv datasets). Reference models, constructed with identical capacity and trained on the same datasets, ensure a fair comparison to the fine-tuned models. Jumpstart and asymptotic performance are used to determine the transferability between models, while training time ratios measure training efficiency.

Our findings show that transfer learning improves the performance of fine-tuned models: jumpstart scores relative to the reference models are 0.63 for GCN, 0.47 for GraphSAGE and 0.48 for GIN. Transfer learning also reduces training time by up to a factor of 15 for GraphSAGE.
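The transfer metrics named in the abstract can be illustrated with a short sketch. The definitions below are assumptions based on common usage in the transfer learning literature, not necessarily the exact formulas used in the paper: jumpstart is taken as the score difference at the first evaluation, asymptotic gain as the score difference at the last evaluation, and the training time ratio as the number of epochs the reference model needs to reach a target score divided by the number the fine-tuned model needs. All function names and the toy learning curves are hypothetical.

```python
from typing import Optional, Sequence


def jumpstart(finetuned: Sequence[float], reference: Sequence[float]) -> float:
    """Score difference at the first evaluation (assumed definition)."""
    return finetuned[0] - reference[0]


def asymptotic_gain(finetuned: Sequence[float], reference: Sequence[float]) -> float:
    """Score difference at the last evaluation (assumed definition)."""
    return finetuned[-1] - reference[-1]


def epochs_to_reach(curve: Sequence[float], target: float) -> Optional[int]:
    """Return the first 1-based epoch whose score meets the target, or None."""
    for epoch, score in enumerate(curve, start=1):
        if score >= target:
            return epoch
    return None


def training_time_ratio(
    finetuned: Sequence[float], reference: Sequence[float], target: float
) -> Optional[float]:
    """Epochs needed by the reference model divided by epochs needed by the
    fine-tuned model. Returns None if either model never reaches the target."""
    ft_epochs = epochs_to_reach(finetuned, target)
    ref_epochs = epochs_to_reach(reference, target)
    if ft_epochs is None or ref_epochs is None:
        return None
    return ref_epochs / ft_epochs


# Toy validation curves (made-up numbers, not results from the paper).
reference_curve = [0.10, 0.30, 0.50, 0.60, 0.70]
finetuned_curve = [0.50, 0.70, 0.75, 0.78, 0.80]

print(jumpstart(finetuned_curve, reference_curve))
print(asymptotic_gain(finetuned_curve, reference_curve))
print(training_time_ratio(finetuned_curve, reference_curve, target=0.70))
```

A ratio above 1 in this sketch means the fine-tuned model reaches the target score in fewer epochs than the reference model.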

Methods

Graph Neural Networks (GNN)
Graph Theory
Link Prediction
Transfer Learning
Deep Learning

Technologies

Python
PyTorch
PyTorch Geometric
NetworkX
Docker
Comet-ml

Datasets

ogbl-citation2: Used for reproduction and validation of the GNN pipeline.
ogbn-arxiv: Used to train reference models and to fine-tune pre-trained models.
ogbn-papers100M: Used to pre-train models for later fine-tuning.

Overview Folder Structure

Graph data folder: Destination for downloaded graph data.
EDA: Scripts for dataset selection and explorative data analysis.
evaluation: Scripts for qualitative evaluation and quantitative visualisation.
modelling: Source code for the GNN pipeline, heuristics, dataset split implementation, qualitative evaluation and models.
additional: Various additional supporting files.
temp: Temporary folder that caches the training split and other temporary files at execution time.

Featured Files

Main Explorative Data Analysis Notebook - Includes all explorative data analyses for the datasets.
Main Quantitative Evaluation Notebook - Shows all quantitative visualisations and result aggregations used in this research.
Main GNN Pipeline File - The main training pipeline for our GNN models.
Main Hyperparameter File - Contains all experiment hyperparameters.
Main Cosine Similarity Heuristic Pipeline File - Includes the source code for the cosine similarity heuristic.
Main Common Neighbor Heuristic File - Includes the source code for the common neighbor heuristic.
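To make the two heuristic baselines concrete, here is a minimal, dependency-free sketch of how common neighbor and cosine similarity scores for a candidate edge are typically computed. It is an illustration under assumptions, not the code from the repository files listed above: the function names, the dictionary-based adjacency format and the toy graph are all hypothetical.

```python
import math
from typing import Dict, Sequence, Set


def common_neighbors_score(adj: Dict[int, Set[int]], u: int, v: int) -> int:
    """Number of nodes adjacent to both u and v in an undirected graph."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def cosine_similarity_score(x_u: Sequence[float], x_v: Sequence[float]) -> float:
    """Cosine of the angle between two node feature vectors.

    Returns 0.0 if either vector has zero norm, to avoid division by zero.
    """
    dot = sum(a * b for a, b in zip(x_u, x_v))
    norm_u = math.sqrt(sum(a * a for a in x_u))
    norm_v = math.sqrt(sum(b * b for b in x_v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy undirected graph (hypothetical): edges 0-1, 0-2, 1-2, 1-3.
adjacency = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}

# Toy node features (hypothetical).
features = {0: [1.0, 0.0], 3: [1.0, 1.0]}

# Score the candidate edge (0, 3) with both heuristics.
print(common_neighbors_score(adjacency, 0, 3))
print(cosine_similarity_score(features[0], features[3]))
```

Higher scores suggest a more likely link. Neither heuristic requires training, which is what makes them useful baselines for learned models.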

Installation Pipenv Environment

Prerequisites

Pipenv must be installed in your local Python environment. If it is not, run pip install pipenv in your CLI.

First Installation of Pipenv Environment

Open your CLI
Run cd /your/local/github/repofolder/
Run pipenv install
Restart VS Code or your IDE
Choose the newly created "link-prediction-in-graphs" virtual environment as the Python interpreter

Environment already installed (Update dependencies)

Open your CLI
Run cd /your/local/github/repofolder/
Run pipenv sync

Usage

Before running, replace the comet-ml API key with your own and set your own project name and run name.

To reproduce our GraphSAGE reference model, run the following command.

gnn.py --project_name "your-comet-ml-project" --run_name "your-run-name" --epochs 2500 --dataset ogbn-arxiv --batch_size 35000 --lr 0.00085 --num_layers 2 --hidden_channels 384 --model_architecture SAGE --one_batch_training False --freeze_model False --save_model True --eval_n_hop_computational_graph 0 --epoch_checkpoints 50 --model_path ./modelling/gnn/sage_ref_long2500_model.pth --predictor_path ./modelling/gnn/sage_ref_long2500_predictor.pth
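The command above passes booleans as the literal strings True and False. As a hedged illustration of how such flags can be parsed safely, the sketch below builds an argparse parser that mirrors the flag names from the command. It is not the repository's gnn.py: the argument types and the str2bool helper are assumptions. The helper is needed because argparse's type=bool would treat the non-empty string "False" as true.

```python
import argparse
import shlex


def str2bool(value: str) -> bool:
    """Parse 'True'/'False' style strings; bool('False') would be True."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(description="GNN link prediction training (sketch)")
    parser.add_argument("--project_name", type=str)
    parser.add_argument("--run_name", type=str)
    parser.add_argument("--epochs", type=int)
    parser.add_argument("--dataset", type=str)
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--lr", type=float)
    parser.add_argument("--num_layers", type=int)
    parser.add_argument("--hidden_channels", type=int)
    parser.add_argument("--model_architecture", type=str)
    parser.add_argument("--one_batch_training", type=str2bool)
    parser.add_argument("--freeze_model", type=str2bool)
    parser.add_argument("--save_model", type=str2bool)
    parser.add_argument("--eval_n_hop_computational_graph", type=int)
    parser.add_argument("--epoch_checkpoints", type=int)
    parser.add_argument("--model_path", type=str)
    parser.add_argument("--predictor_path", type=str)
    return parser


# Parse a shortened version of the command shown above.
command = (
    '--project_name "your-comet-ml-project" --run_name "your-run-name" '
    "--epochs 2500 --dataset ogbn-arxiv --batch_size 35000 --lr 0.00085 "
    "--num_layers 2 --hidden_channels 384 --model_architecture SAGE "
    "--one_batch_training False --freeze_model False --save_model True"
)
args = build_parser().parse_args(shlex.split(command))
print(args.model_architecture, args.epochs, args.freeze_model, args.save_model)
```

With this approach, --freeze_model False yields the Python value False rather than a truthy string.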

Further Resources

Contributing Members

Thomas Mandelz, Jan Zwicky, Daniel Perruchoud, Stephan Heule
