In this document, you will find:
- how to use your own GraphML file as the input for GraphOptima.
- how to generate sample graphs similar to the default price network using graph-tool.
Note: To use your own graph, ensure that it has no parallel edges or self-loops. If it does, use the graph_cleaning.ipynb notebook to clean it. Afterwards, we recommend reformatting the cleaned CSV edge list into a TSV (tab-separated values) file, since GraphOptima has been thoroughly tested with TSV files.
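To check quickly whether an edge list needs cleaning at all, the following is a minimal sketch in plain Python. It is not part of GraphOptima or of graph_cleaning.ipynb; the function name `find_edge_problems` and the (source, target) tuple representation are assumptions made for illustration.

```python
from collections import Counter


def find_edge_problems(edges, directed=True):
    """Return (self_loops, parallel_edges) for (source, target) pairs.

    Hypothetical helper for illustration; not part of GraphOptima.
    """
    edges = list(edges)
    # A self-loop is an edge whose source and target are the same node.
    self_loops = [(s, t) for s, t in edges if s == t]
    # In an undirected graph, (a, b) and (b, a) describe the same edge,
    # so both are normalized to one sorted key before counting.
    keyed = [(s, t) if directed else tuple(sorted((s, t))) for s, t in edges]
    counts = Counter(keyed)
    # Any edge key seen more than once is a parallel edge.
    parallel_edges = {edge: n for edge, n in counts.items() if n > 1}
    return self_loops, parallel_edges
```

If both returned collections are empty, the edge list already satisfies the requirement above and the cleaning notebook can be skipped.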
- Before proceeding, ensure that GraphOptima has been installed (refer to README.md).
-
Move your GraphML input graph to the input_graphs folder
Make sure the GraphML file is located at $SCRATCH/netviz/readability_optimization/input_graphs/graph_file_name.graphml. Then change the GRAPHML_FILE field of readability_optimization/config.json to input_graphs/graph_file_name.graphml.
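The configuration change in this step can also be scripted. The sketch below assumes that config.json is a JSON object with GRAPHML_FILE as a top-level key; the actual layout of the file is not shown here, so check it before relying on this. The helper name `set_graphml_file` is hypothetical.

```python
import json
from pathlib import Path


def set_graphml_file(config_path, graphml_relpath):
    """Point the GRAPHML_FILE field of a JSON config at a new graph file.

    Assumes GRAPHML_FILE is a top-level key (an assumption, not confirmed
    by the GraphOptima documentation).
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())
    if "GRAPHML_FILE" not in config:
        # Fail loudly instead of silently adding a key the framework ignores.
        raise KeyError("GRAPHML_FILE is not a top-level key in this config")
    config["GRAPHML_FILE"] = graphml_relpath
    # All other fields are written back unchanged.
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

For example, `set_graphml_file("readability_optimization/config.json", "input_graphs/graph_file_name.graphml")` would apply the change described above.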
-
Remove redundant graph properties (Optional)
A large GraphML file may slow down the framework and crash the layout generator. To avoid this, remove large text properties and any properties that are not used during the layout calculation. See readability_optimization/input_graphs/graph_cleaning.ipynb, which shows an example of using graph-tool to remove all vertex and edge properties, parallel edges, and self-loops.
-
Filter graph by GraphTrimmer (Optional)
In the field of social network analysis, it is common to pre-filter the graph so that only the relationships of interest remain. A simpler graph is by nature easier to optimize and read. We provide a convenient tool, GraphTrimmer, to filter the graph by node degree. GraphTrimmer is a Python script located in the readability_optimization folder; it removes the nodes whose degree is less than the threshold. To use GraphTrimmer, run the following command:
python GraphTrimmer.py --input_graph input_graphs/graph_file_name.graphml --output_graph input_graphs/graph_file_name_trimmed.graphml --threshold 10 --filter_type reciprocal
GraphTrimmer provides four filtering modes: in, out, total, and reciprocal.
- in: remove the nodes with in-degree (a<-b) less than the threshold.
- out: remove the nodes with out-degree (a->b) less than the threshold.
- total: remove the nodes with total degree (in + out) less than the threshold.
- reciprocal: remove the nodes with reciprocal (mutual, meaning a<->b) degree less than the threshold.
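To make the four modes concrete, here is a minimal plain-Python sketch of the degree definitions listed above, applied to a directed edge list. It is an illustration only and not GraphTrimmer's implementation: the function names are hypothetical, and the single filtering pass (degrees are not recomputed after nodes are removed) is an assumption.

```python
from collections import defaultdict

MODES = ("in", "out", "total", "reciprocal")


def degrees(edges, filter_type):
    """Compute the per-node degree used by one filter mode."""
    if filter_type not in MODES:
        raise ValueError(f"unknown filter_type: {filter_type!r}")
    edge_set = set(edges)  # ignore parallel edges
    deg = defaultdict(int)
    for s, t in edge_set:
        deg[s] += 0  # register both endpoints, even with degree 0
        deg[t] += 0
        if filter_type == "in":
            deg[t] += 1
        elif filter_type == "out":
            deg[s] += 1
        elif filter_type == "total":
            deg[s] += 1
            deg[t] += 1
        elif s != t and (t, s) in edge_set:
            # reciprocal: count a neighbor only if both a->b and b->a exist
            deg[s] += 1
    return dict(deg)


def trim(edges, threshold, filter_type):
    """Keep only edges whose endpoints both reach the threshold."""
    deg = degrees(edges, filter_type)
    keep = {node for node, d in deg.items() if d >= threshold}
    return [(s, t) for s, t in edges if s in keep and t in keep]
```

With the edges a->b, b->a, a->c, and c->d, only a and b have a reciprocal degree of 1, so `trim(edges, 1, "reciprocal")` keeps just the two edges between a and b.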
NOTE: graph-tool is not available on Windows. To run GraphTrimmer, we recommend using either the Singularity environment outlined in the README or the following Docker environment (Singularity is not available on Windows): https://hub.docker.com/repository/docker/0xc00005/netviz-graham-v10/general
-
Enter the Singularity Environment
Navigate to the appropriate directory and start the Singularity environment with the following command:
module load singularity
cd $SCRATCH
singularity shell --nv -B $SCRATCH $SCRATCH/netviz/readability_optimization/singularity/netviz-graham-v10.sif
-
Init the Singularity Container
Within the Singularity environment, set up the PATH variable and enable the RAPIDS environment with the following command:
export MPLCONFIGDIR=$SCRATCH
source /opt/conda/etc/profile.d/conda.sh
conda activate base
-
Navigate to the input_graphs Folder
Proceed to the sample graph folder.
cd input_graphs
-
Run the graph-tool Network Generation Script
Execute the Python script generate_networks_with_gt.py. This generates various network types, such as Price Network, Complete Graph, Block Model, and others.
-
Run the csv2tsv.py Script
After the network generation script from the previous step finishes running, execute csv2tsv.py. This script first uses graphml2csv.py, an open-source script provided by the Amazon Graph Science Team, to convert GraphML files to CSV format. Then, it converts these CSV files to TSV.
Upon completion, each GraphML network file will yield three different CSV/TSV files with distinct prefixes. The files used in our paper are fixed_price_10000_nodes_edges.csv and price_10000.graphml.
-
Validate the Outputs
Ensure that a file named in the format "fixed_<graph_name>-edge.csv" is present in the sample_graphs directory. This file will be used as input for the GraphOptima framework.
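If you cleaned your own graph and have a CSV edge list produced outside this pipeline, the CSV-to-TSV reformatting recommended in the note at the top can be sketched with Python's standard csv module. This is not the repository's csv2tsv.py; it assumes a plain comma-delimited file, and the function name `csv_to_tsv` is hypothetical.

```python
import csv


def csv_to_tsv(csv_path, tsv_path):
    """Rewrite a comma-separated file as tab-separated; return the row count.

    Rows (including a header row, if any) are copied through unchanged;
    only the delimiter differs.
    """
    rows = 0
    with open(csv_path, newline="") as src, open(tsv_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in csv.reader(src):
            writer.writerow(row)
            rows += 1
    return rows
```

The returned row count gives a quick sanity check that the TSV file has as many lines as the original CSV edge list.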