Implemention of various heuristics and ILP formulations for best match graph (BMG) editing.
Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics as a representation of the pairwise most closely related genes among multiple species. An arc connects a gene x with a gene y from another species (encoded as the color of the nodes) Y whenever it is one of the phylogenetically closest relatives of x. This package contains various methods to edit an arbitrary vertex-colored digraph to a valid BMG, i.e., a graph that has a certain representation as (leaf-colored) tree.
bmg-edt
requires Python 3.7 or higher. It has the following dependencies:
In order to use the ILP versions for BMG editing, an installation of Gurobi Optimizer (9.0 or higher) or IBM ILOG CPLEX Optimization Studio (12.10 or higher) is required.
Moreover, the corresponding Python packages gurobipy
or docplex
, respectively, must be installed.
The functions in bmg-edit
require a NetworkX DiGraph
as input.
Moreover, all nodes must have an attribute 'color'.
The following classes for optimal BMG editing are available in the module ilp.GurobiBMG
(requires an installation of Gurobi Optimzizer):
- BMGEditor edits the input graph with an arbitrary number of colors to the closest BMG.
- BinaryBMGEditor edits the input graph with an arbitrary number of colors to the closest BMG that can be explained by a binary tree.
- TwoBMGEditor edits the input graph with at most two distinct colors to the closest (2-)BMG.
Example usage: (Click to expand)
solver = BMGEditor(input_graph)
solver.build_model()
# run the optimization with an optional time limit in seconds
solver.optimize(time_limit=None)
optimal_editing_cost, solution_graph = solver.get_solution()
The following classes for optimal BMG editing are available in the module ilp.CplexBMG
(requires an installation of IBM ILOG CPLEX Optimization Studio):
- BMGEditor edits the input graph with an arbitrary number of colors to the closest BMG.
Example usage: (Click to expand)
solver = BMGEditor(input_graph)
solver.build_model()
# run the optimization with an optional time limit in seconds
solver.optimize(time_limit=3)
solver.get_solution()
optimal_editing_cost, solution_graph = solver.get_solution()
The package also implements various heuristic approaches for BMG editing. Some of these methods are based on the unsatisfiable relation (UR) which are insertions or deletions of arcs that are associated with a certain inner node of the tree that explains the editing results. More precisely, the heuristics construct this tree in a top-down manner (i.e., starting with the root) and attempt, in each step, to minimize the UR (see refrenced paper below for details).
The class BMGEditor
in the module BMGEditor
manages the editing:
editor = BMGEditor(disturbed, binary=True)
editor.build('Mincut', objective='cost')
solution_graph = editor.get_bmg(extract_triples_first=False)
The following methods are available (first parameter of the `build` method): (Click to expand)
- 'Mincut'
- 'BPMF'
- 'Karger'
- 'Greedy'
- 'Gradient_Walk'
- 'Louvain'
- 'Louvain_Obj'
See the paper for an explanation of these methods.
If you use bmg-edit
in your project or code from it, please consider citing:
- Schaller, D., Geiß, M., Hellmuth, M., Stadler, P. F. (2021) Heuristic Algorithms for Best Match Graph Editing. Algorithms for Molecular Biology (in press). arXiv:2103.07280 [math.CO].