Link Prediction in Graphs with Graph Neural Networks - Investigating Transfer Learning

tmandelz/link-prediction-in-graphs

Investigating Transfer Learning for Link Prediction in Graph Neural Networks

This repository contains the code for the paper "Investigating Transfer Learning for Link Prediction in Graph Neural Networks" by Thomas Mandelz, Jan Zwicky, Daniel Perruchoud and Stephan Heule.

Abstract

Graph Neural Networks (GNNs) have emerged as powerful tools for learning representations of graph-structured data and are increasingly applied across domains. Transfer learning has shown remarkable success in traditional deep learning tasks, enabling faster training and enhanced performance. Despite the growing popularity of GNNs, however, their transferability, particularly in link prediction, remains underexplored.

This research investigates the application of transfer learning to link prediction with GNNs, focusing on improving model performance and training efficiency by pre-training GNN models and then fine-tuning them. Specifically, we train Graph Convolutional Network (GCN), GraphSAGE and Graph Isomorphism Network (GIN) architectures and investigate the benefits of transfer learning by pre-training and fine-tuning models on public data (the ogbn-papers100M and ogbn-arxiv datasets). Reference models, constructed with identical capacity and trained on the same datasets, ensure a fair comparison to the fine-tuned models. Jumpstart and asymptotic performance are used to determine the transferability between models, while training time ratios measure training efficiency.

Our findings show that transfer learning improves fine-tuned model performance, raising jumpstart scores relative to the reference models (0.63 for GCN, 0.47 for GraphSAGE and 0.48 for GIN), while also reducing training time by up to 15 times for GraphSAGE.
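
The three transfer metrics named in the abstract can be illustrated with a small sketch. The definitions below follow common usage in the transfer learning literature and are assumptions, not the paper's exact formulas: jumpstart is taken as the score difference at the first evaluation, asymptotic gain as the difference in the mean of the last few evaluations, and the training time ratio as the number of evaluations the reference model needs to reach a threshold divided by the number the fine-tuned model needs. All function names and the example curves are hypothetical.

```python
from typing import Optional, Sequence


def jumpstart(finetuned: Sequence[float], reference: Sequence[float]) -> float:
    """Score difference at the first evaluation point (assumed definition)."""
    return finetuned[0] - reference[0]


def asymptotic_gain(
    finetuned: Sequence[float], reference: Sequence[float], last_k: int = 3
) -> float:
    """Difference in the mean score over the last `last_k` evaluations."""

    def tail_mean(curve: Sequence[float]) -> float:
        tail = curve[-last_k:]
        return sum(tail) / len(tail)

    return tail_mean(finetuned) - tail_mean(reference)


def evals_to_threshold(curve: Sequence[float], threshold: float) -> Optional[int]:
    """1-based index of the first evaluation that reaches `threshold`, else None."""
    for i, score in enumerate(curve, start=1):
        if score >= threshold:
            return i
    return None


def training_time_ratio(
    finetuned: Sequence[float], reference: Sequence[float], threshold: float
) -> Optional[float]:
    """Reference evaluations-to-threshold divided by fine-tuned evaluations-to-threshold.

    Evaluation counts stand in for wall-clock time here; that is an assumption.
    """
    ref = evals_to_threshold(reference, threshold)
    fin = evals_to_threshold(finetuned, threshold)
    if ref is None or fin is None:
        return None
    return ref / fin


# Made-up validation curves, for illustration only.
reference_curve = [0.10, 0.30, 0.50, 0.60, 0.62, 0.63]
finetuned_curve = [0.55, 0.62, 0.66, 0.67, 0.67, 0.67]

print("jumpstart:", jumpstart(finetuned_curve, reference_curve))
print("asymptotic gain:", asymptotic_gain(finetuned_curve, reference_curve))
print("time ratio:", training_time_ratio(finetuned_curve, reference_curve, 0.60))
```

A ratio above 1 means the fine-tuned model reached the threshold sooner than the reference model; `None` means at least one of the two never reached it.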

Methods

  • Graph Neural Networks (GNN)
  • Graph Theory
  • Link Prediction
  • Transfer Learning
  • Deep Learning

Technologies

  • Python
  • PyTorch
  • PyTorch Geometric
  • NetworkX
  • Docker
  • Comet-ml

Datasets

  • ogbl-citation2: used for reproduction and validation of the GNN pipeline.
  • ogbn-arxiv: used to train reference models and to fine-tune models.
  • ogbn-papers100M: used to pre-train models for later fine-tuning.
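
The ogbn datasets are published for node property prediction, so using them for link prediction requires holding out some edges as validation and test targets and sampling non-edges as negatives. The following is a minimal sketch of a random edge split in plain Python, assuming a uniform random split and uniform negative sampling. It is not the repository's split implementation, and `split_edges`, `sample_negatives` and their parameters are hypothetical names.

```python
import random
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def split_edges(
    edges: Iterable[Edge], valid_frac: float = 0.1, test_frac: float = 0.1, seed: int = 0
) -> Tuple[List[Edge], List[Edge], List[Edge]]:
    """Shuffle the edges and cut them into train/valid/test partitions."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_frac)
    n_test = int(len(shuffled) * test_frac)
    valid = shuffled[:n_valid]
    test = shuffled[n_valid:n_valid + n_test]
    train = shuffled[n_valid + n_test:]
    return train, valid, test


def sample_negatives(num_nodes: int, existing: Set[Edge], k: int, seed: int = 0) -> List[Edge]:
    """Draw k distinct directed node pairs that are not edges and not self-loops."""
    real_edges = {(u, v) for (u, v) in existing if u != v}
    if k > num_nodes * (num_nodes - 1) - len(real_edges):
        raise ValueError("not enough non-edges to sample from")
    rng = random.Random(seed)
    negatives: Set[Edge] = set()
    while len(negatives) < k:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if u != v and (u, v) not in existing:
            negatives.add((u, v))
    return sorted(negatives)


# Toy directed path graph with 21 nodes and 20 edges, for illustration only.
toy_edges = [(i, i + 1) for i in range(20)]
train, valid, test = split_edges(toy_edges)
negatives = sample_negatives(21, set(toy_edges), k=5)
print(len(train), len(valid), len(test), len(negatives))
```

At the scale of ogbn-papers100M a rejection loop like this would be replaced by a vectorised sampler; the sketch only shows the idea.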

Overview Folder Structure

  • Graph data is downloaded here
  • Scripts for dataset selection and exploratory data analysis are kept here
  • Scripts for qualitative evaluation and quantitative visualisation are kept here
  • Source code for the GNN pipeline, heuristics, dataset split implementation, qualitative evaluation and models is kept here
  • Various additional supporting files are kept here
  • Temporary folder that caches the training split and other temporary files at execution time

Featured Files

Installation Pipenv Environment

Prerequisites

  • Pipenv installed in your local Python environment; if it is missing, run pip install pipenv in your CLI

First Installation of Pipenv Environment

  • open your CLI
  • run cd /your/local/github/repofolder/
  • run pipenv install
  • Restart VS Code or your IDE
  • Select the newly created "link-prediction-in-graphs" virtual environment as the Python interpreter

Environment already installed (update dependencies)

  • open your CLI
  • run cd /your/local/github/repofolder/
  • run pipenv sync

Usage

Before running, replace the comet-ml API key with your own and set your own project name and run name.

To reproduce our GraphSAGE reference model, run the following command:

gnn.py --project_name "your-comet-ml-project" --run_name "your-run-name" --epochs 2500 --dataset ogbn-arxiv --batch_size 35000 --lr 0.00085 --num_layers 2 --hidden_channels 384 --model_architecture SAGE --one_batch_training False --freeze_model False --save_model True --eval_n_hop_computational_graph 0 --epoch_checkpoints 50 --model_path ./modelling/gnn/sage_ref_long2500_model.pth --predictor_path ./modelling/gnn/sage_ref_long2500_predictor.pth
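
The command passes boolean options as the literal strings True and False (for example --freeze_model False). How gnn.py parses these values is not shown here. The sketch below illustrates one common way to handle such flags with argparse, because `type=bool` would treat the string "False" as true. The parser covers only a hypothetical subset of the flags, and the defaults are assumptions copied from the command above.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert strings such as 'True' or 'False' into real booleans."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical subset of the flags used in the command above."""
    parser = argparse.ArgumentParser(description="Sketch of a gnn.py-style CLI")
    parser.add_argument("--dataset", type=str, default="ogbn-arxiv")
    parser.add_argument("--epochs", type=int, default=2500)
    parser.add_argument("--lr", type=float, default=0.00085)
    parser.add_argument("--model_architecture", type=str, default="SAGE")
    parser.add_argument("--freeze_model", type=str2bool, default=False)
    parser.add_argument("--save_model", type=str2bool, default=True)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--epochs", "2500", "--freeze_model", "False", "--save_model", "True"]
)
print(args.freeze_model, args.save_model, args.epochs)

# Why a converter is needed: any non-empty string is truthy in Python.
print(bool("False"))
```

With a converter like this, a pre-trained model could be fine-tuned with frozen layers by passing --freeze_model True, provided the script interprets the flag that way.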

Further Resources

Contributing Members

Thomas Mandelz, Jan Zwicky, Daniel Perruchoud, Stephan Heule
