Blue Brain Citation & Knowledge Graph implementation in Neo4j

KeremKurban/citation-graph

 
 


Blue Brain Citation Graph

3D Force Graph

Table of Contents

Introduction

Generating or Loading the Database

Gallery

Funding and Acknowledgement

Introduction

The Blue Brain Citation Graph uses Neo4j, together with features of Neo4j Bloom, to support the exploration and analysis of citation data. Key features include:

  • Perspectives: These are specialized views that focus on different aspects of the knowledge graph. In this repository, you will find several perspectives under src/citations/perspectives/:

    • BBP or Not Perspective: This perspective helps in distinguishing between articles published by the Blue Brain Project and those by other collaborators.
    • Timeview Perspective: This perspective provides a temporal view of the citation data, allowing users to analyze trends and changes over time.
    • Topics Perspective: This perspective focuses on thematic clustering, enabling users to explore articles based on their subject matter.
  • Search Phrases: These are predefined queries that facilitate quick access to specific data points within the graph. They are designed to help users efficiently navigate the vast amount of information stored in the database.

  • Scene Actions: These are interactive elements that allow users to manipulate and explore the graph dynamically. Scene actions can include filtering nodes, highlighting specific paths, or triggering animations to better visualize relationships and patterns within the data.

By utilizing these Neo4j technologies, the Blue Brain Citation Graph provides a robust framework for researchers and analysts to gain deeper insights into the citation landscape of the Blue Brain Project and its collaborators.

Generating or Loading the Database

When working with the Blue Brain Citation Graph, you have two primary options: generating the database from scratch or loading an existing database.

Generating the Database

To generate the database, you will need to follow a series of steps that involve gathering articles, authors, and citation data, embedding articles, clustering, and performing dimension reduction. This process requires access to external APIs, specifically the SERP API and the OpenAI API. Please be aware that using these APIs may incur costs, as they are not free services. The generation process is comprehensive and allows for a customized and up-to-date database tailored to your specific needs.
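Because both APIs are paid, it can help to confirm that the keys are available before starting a long generation run. The following is a minimal sketch of such a pre-flight check, not part of the repository. `OPENAI_API_KEY` is the variable the official OpenAI Python client reads by default, while `SERPAPI_API_KEY` is an assumed name; adjust both to whatever the tutorial expects.

```python
import os

# Assumed variable names: OPENAI_API_KEY is read by the official OpenAI
# Python client by default; SERPAPI_API_KEY is a hypothetical name.
REQUIRED_KEYS = ("SERPAPI_API_KEY", "OPENAI_API_KEY")


def missing_api_keys(env=None, required=REQUIRED_KEYS):
    """Return the names of required API keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_api_keys()` with no arguments inspects the current process environment; an empty list means both keys were found.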

For detailed instructions on creating the database, please refer to the step-by-step tutorial. First, install the necessary packages in a fresh virtual environment with:

pip install .

or

pip install -e .

The tutorial is a comprehensive guide to gathering articles and authors, fetching citations, embedding articles, clustering, dimension reduction, and integrating the data into Neo4j. It also includes additional steps for generating keywords and integrating them into the database.
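To illustrate the final integration step, here is a minimal sketch of how gathered articles and citations could be turned into parameterized Cypher statements. The schema is hypothetical: the `Article` label, the `CITES` relationship type, and the `uid` and `title` properties are illustrative names, and the tutorial's actual data model may differ.

```python
# Hypothetical schema: the Article label, CITES relationship type, and the
# uid/title property names are illustrative, not taken from the repository.
MERGE_ARTICLE = "MERGE (a:Article {uid: $uid}) SET a.title = $title"
MERGE_CITATION = (
    "MATCH (src:Article {uid: $source}), (dst:Article {uid: $target}) "
    "MERGE (src)-[:CITES]->(dst)"
)


def build_statements(articles, citations):
    """Turn article dicts and (source, target) pairs into (query, params) tuples."""
    statements = [
        (MERGE_ARTICLE, {"uid": a["uid"], "title": a["title"]}) for a in articles
    ]
    known = {a["uid"] for a in articles}
    for source, target in citations:
        # Skip edges whose endpoints were never gathered; the MATCH would
        # find no nodes for them and the statement would do nothing.
        if source in known and target in known:
            statements.append((MERGE_CITATION, {"source": source, "target": target}))
    return statements
```

Each resulting tuple can be passed to `session.run(query, params)` of the official Neo4j Python driver. Using `MERGE` instead of `CREATE` keeps the import idempotent, so re-running it does not duplicate nodes or relationships.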

Loading the Database

In Neo4j Desktop, create a new project, then import the neo4j.dump file by clicking add -> File. Once it is loaded, click on ... next to the imported file and select create new DBMS from dump.

After the loading process is complete, you can start your Neo4j database.
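Once the database is running, a quick sanity check is to count nodes per label and confirm the dump was actually restored. The sketch below is an assumption-laden illustration, not code from the repository. `summarize_labels` and its `run_query` argument are hypothetical helpers, and the function takes the query runner as a parameter so it does not depend on any particular driver.

```python
LABEL_COUNT_QUERY = "MATCH (n) RETURN labels(n) AS labels, count(*) AS total"


def summarize_labels(run_query):
    """Run the label-count query and return {label: node count}.

    run_query is any callable that takes a Cypher string and returns an
    iterable of dict-like rows with "labels" and "total" keys.
    """
    counts = {}
    for row in run_query(LABEL_COUNT_QUERY):
        # A node can carry several labels; count it once under each of them.
        for label in row["labels"]:
            counts[label] = counts.get(label, 0) + row["total"]
    return counts
```

With the official Neo4j Python driver, `run_query` could be something like `lambda q: [record.data() for record in session.run(q)]`, where `session` comes from a driver connected to your local DBMS. An empty result suggests the dump was not loaded into the database you are connected to.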

To use the perspectives in Neo4j Bloom, you can import them from here for an enhanced user experience.

Gallery

Below are some visualizations generated from the citation graph data:

Author Works on Keyword: Visualization of author works on specific keywords.

Author Collaboration Network: A network visualization of author collaborations extracted from the citation data.

Keyword Co-occurrence: Co-occurrence of keywords extracted from articles, highlighting thematic groupings.

Top 3 Keywords Per Year (Node Weighted): Top 3 keywords per year with node weighting.

Top 3 Keywords Per Year (Weighted): Top 3 keywords per year with weighting.

UMAP Cluster Louvain: UMAP clustering using the Louvain method.

These images provide a glimpse into the complex relationships and structures within the citation graph, offering insights into the research landscape.

Funding and Acknowledgement

The development of this software was supported by funding to the Blue Brain Project, a research center of the École polytechnique fédérale de Lausanne (EPFL), from the Swiss government’s ETH Board of the Swiss Federal Institutes of Technology.

Copyright (c) 2024 Blue Brain Project/EPFL
