A declarative framework for efficient tree decomposition powered optimization and sampling

s-will/Infrared

[TOC]

Infrared

Infrared is a framework for efficient, tree decomposition based solving of declaratively modeled problems. Models can be solved by optimization or Boltzmann sampling. The latter allows targeting of features by multi-dimensional Boltzmann sampling and supports further stochastic optimization.


Main features

Infrared solves a broad class of sampling and optimization problems that can be expressed as feature networks by efficient tree decomposition based algorithms. Solving is thus performed with parameterized complexity depending on the treewidth of the network. The system was developed to target bioinformatics problems with potentially complex dependencies, originally RNA sequence design with multiple target structures. Such problems can be specified in a declarative, compositional style as (evaluated) constraint models using the Python high-level interface of Infrared.

Accessible through the Python interface, the framework provides a fast and flexible C++ engine that evaluates constraint networks consisting of variables, multi-ary functions, and multi-ary constraints. The generic evaluation algorithms are efficient as long as the treewidth of the network, which measures its complexity, is small.

Application specific functions and constraints can be directly defined in Python. The evaluation is performed efficiently using cluster tree elimination following a (hyper-)tree decomposition of the dependencies (due to functions and constraints). Interpreting the evaluations as partition functions, the system supports sampling of variable assignments from the corresponding Boltzmann distribution. Finally, Infrared implements a generic multi-dimensional Boltzmann sampling strategy to target specific feature values. Such functionality is made conveniently available via general Python classes. In particular, the interface was defined to allow the straightforward and declarative specification of the feature network model.
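To illustrate why evaluation along a tree decomposition pays off, here is a minimal pure-Python sketch. It does not use Infrared's API; the toy network (a chain of six variables with a hypothetical pair energy) and all function names are assumptions made for illustration. The chain has treewidth 1, so each bag holds two neighbouring variables, and eliminating one variable at a time yields the same partition function as enumerating all assignments.

```python
import itertools
import math

# Toy network (an assumption for illustration, not an Infrared model):
# n variables with domain {0, 1, 2, 3} and one function per neighbouring pair.
n = 6
domain = range(4)


def pair_weight(x, y):
    """Boltzmann weight exp(-E) of a pair; E = 1 if x == y, else 0."""
    return math.exp(-(1.0 if x == y else 0.0))


def partition_function_brute_force():
    """Sum the weights of all 4**n assignments."""
    total = 0.0
    for a in itertools.product(domain, repeat=n):
        total += math.prod(pair_weight(a[i], a[i + 1]) for i in range(n - 1))
    return total


def partition_function_elimination():
    """Eliminate variables from right to left, one two-variable bag at a time."""
    # message[x]: total weight of the already eliminated part, given value x
    # of the variable that connects it to the rest of the chain.
    message = {x: 1.0 for x in domain}
    for _ in range(n - 1):
        message = {
            x: sum(pair_weight(x, y) * message[y] for y in domain)
            for x in domain
        }
    return sum(message.values())


z_brute = partition_function_brute_force()
z_elim = partition_function_elimination()
print(z_brute, z_elim)
```

The elimination touches on the order of n * 4^2 value combinations instead of 4^n; for wider bags, the exponent grows with the bag size, which is why the cost is governed by the treewidth.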

Installation

Conda installation

Infrared is most easily installed using conda; it is deployed on the conda-forge channel. The first command below can be skipped if conda-forge has already been added to your channels.

conda config --add channels conda-forge 
conda install infrared

Pip installation from source

For users who don't want to use conda, Infrared can also be installed from its source with a standard pip install; we make the source freely available in Infrared's Gitlab repository. Compiling and installing requires a C++ / Python build environment including cmake, and installation of further dependencies, e.g. pybind11 and Treedecomp.

After installing dependencies, one compiles and installs Infrared from its base directory by

python3 -m pip install .

Treedecomp can be installed analogously; Infrared requires at least version 1.1.0.

Note that older Linux distributions, e.g. Ubuntu 18.04, install only outdated versions of pybind11 via their package managers; we require at least version 2.4. pybind11 can also be installed via pip by

PYBIND11_GLOBAL_SDIST=1 python3 -m pip install https://github.com/pybind/pybind11/archive/master.zip
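Before building from source, it can help to check whether the installed dependencies meet the minimum versions stated above. The following is a small sketch using only the Python standard library; the helper names are hypothetical, and the distribution names "pybind11" and "treedecomp" are assumptions about how the packages are registered in your environment.

```python
import itertools
from importlib import metadata


def version_tuple(version):
    """Turn '2.10.4' into (2, 10, 4); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    # Pad to three components so that '2.4' compares equal to '2.4.0'.
    return tuple(parts + [0] * (3 - len(parts)))


def meets_minimum(package, minimum):
    """True if `package` is installed with at least version `minimum`."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


# Minimum versions as stated in this README; distribution names are assumed.
for package, minimum in (("pybind11", "2.4"), ("treedecomp", "1.1.0")):
    status = "ok" if meets_minimum(package, minimum) else "missing or too old"
    print(f"{package} >= {minimum}: {status}")
```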

Documentation

We provide Infrared's documentation online. The documentation comprises general information, API reference and examples for the use of Infrared's high-level Python interface.

Jupyter notebooks with code examples are part of the online documentation and can also be downloaded from the subdirectory Doc of Infrared's Gitlab repository.

A further entry-point to using the library in novel sampling applications is provided by the code of RNARedPrint 2 and RNAPOND.

Infrared architecture and background

The system was originally built to separate the core "Infrared" from applications like RNARedPrint 2 or RNAPOND.

Python high level interface

We expose classes of the C++ library to Python, such that problems can be implemented (i.e. modeled) and solved by writing Python code. Python programs using the Infrared library typically import the module infrared, create an instance of Model, and populate it with variables (specifying their finite domains), constraints, and functions. The model automatically generates features from the functions; in order to generate and control several features, the functions can be assigned to function groups. Moreover, the user can define additional features.

The model is passed to Solvers, i.e. Optimizers or Samplers. Samplers generate samples from a specific Boltzmann distribution as well as samples targeted at specific feature values; the latter is performed by multi-dimensional Boltzmann sampling. The module provides access to the Infrared core and exports additional base classes.
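The idea behind targeting feature values can be sketched in plain Python. The example below is a simplified illustration and not Infrared's implementation or API: the toy state space (binary strings of length 8), the two features, the update rule, and all names and parameters are assumptions. Samples are drawn from a Boltzmann distribution with one weight per feature; after each batch, every weight is nudged towards its target, and samples whose features fall within the tolerance are kept.

```python
import itertools
import math
import random

# Toy state space (an assumption): all binary strings of length n.
n = 8
states = list(itertools.product((0, 1), repeat=n))


def features(s):
    """Two toy features: number of ones, number of equal neighbouring pairs."""
    ones = sum(s)
    equal_neighbours = sum(s[i] == s[i + 1] for i in range(n - 1))
    return (ones, equal_neighbours)


feats = [features(s) for s in states]


def targeted_samples(targets, tolerance, wanted, rng,
                     batch=200, step=0.05, max_rounds=500):
    """Collect `wanted` samples whose features are within `tolerance` of `targets`."""
    weights = [0.0] * len(targets)
    accepted = []
    for _ in range(max_rounds):
        # Boltzmann weight of each state under the current feature weights.
        boltzmann = [math.exp(sum(w * f for w, f in zip(weights, fs)))
                     for fs in feats]
        drawn = rng.choices(range(len(states)), weights=boltzmann, k=batch)
        for i in drawn:
            if all(abs(f - t) <= tolerance for f, t in zip(feats[i], targets)):
                accepted.append(states[i])
        if len(accepted) >= wanted:
            return accepted[:wanted]
        # Move each weight so that the feature mean drifts towards its target.
        means = [sum(feats[i][k] for i in drawn) / batch
                 for k in range(len(targets))]
        weights = [w + step * (t - m)
                   for w, t, m in zip(weights, targets, means)]
    return accepted


samples = targeted_samples(targets=(6, 3), tolerance=0, wanted=20,
                           rng=random.Random(1))
print(len(samples), samples[0])
```

In this sketch the distribution is sampled by brute-force enumeration of all states, which is feasible only for tiny examples; the point of a tree decomposition based sampler is to draw from the same kind of distribution without enumerating the state space.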

Internally, also on the Python side, cluster trees are constructed and populated with constraints and functions.

Infrared C++ core

The C++ component solves feature networks, i.e. optimizes or samples, based on cluster trees.

A cluster tree of a feature network (of variables, functions and constraints) corresponds to a (hyper-)tree decomposition of the dependency graph (induced by the functions/constraints on the variables). It consists of bags (aka clusters) of variables, functions, and constraints. The bags are connected by edges to form a tree (or forest), such that the following properties hold:

  1. For each variable in the constraint network (CN), there is at least one bag that contains the variable.

  2. For each variable, the bags that contain this variable form a subtree.

  3. For each function, there is exactly one bag that contains the function and its variables.

  4. For each constraint, there is at least one bag that contains the constraint and its variables. Constraints are only assigned to bags that contain all of their variables.
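The four properties can be checked mechanically. The following is a minimal sketch in plain Python; the data layout (dictionaries for bags, functions and constraints) and the function names are hypothetical and do not reflect Infrared's internal classes. It assumes the given edges already form a tree or forest, so that connectedness of the bags holding a variable is equivalent to forming a subtree.

```python
from collections import deque


def bags_form_subtree(bag_ids, edges):
    """True if the given bags are connected using only edges among them."""
    bag_ids = set(bag_ids)
    start = next(iter(bag_ids))
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        for u, v in edges:
            if u == current:
                neighbour = v
            elif v == current:
                neighbour = u
            else:
                continue
            if neighbour in bag_ids and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == bag_ids


def check_cluster_tree(variables, bags, edges, functions, constraints):
    """Check properties 1-4 for bags given as dicts of vars/functions/constraints."""
    for x in variables:
        holders = [b for b, bag in bags.items() if x in bag["vars"]]
        if not holders:                               # property 1
            return False
        if not bags_form_subtree(holders, edges):     # property 2
            return False
    for name, deps in functions.items():              # property 3
        holders = [bag for bag in bags.values() if name in bag["functions"]]
        if len(holders) != 1 or not deps <= holders[0]["vars"]:
            return False
    for name, deps in constraints.items():            # property 4
        holders = [bag for bag in bags.values() if name in bag["constraints"]]
        if not holders or any(not deps <= bag["vars"] for bag in holders):
            return False
    return True


# Toy chain network: functions on neighbouring pairs, one constraint on (1, 2).
variables = [0, 1, 2, 3]
functions = {"f01": {0, 1}, "f12": {1, 2}, "f23": {2, 3}}
constraints = {"c12": {1, 2}}
bags = {
    "A": {"vars": {0, 1}, "functions": {"f01"}, "constraints": set()},
    "B": {"vars": {1, 2}, "functions": {"f12"}, "constraints": {"c12"}},
    "C": {"vars": {2, 3}, "functions": {"f23"}, "constraints": set()},
}
edges = [("A", "B"), ("B", "C")]
print(check_cluster_tree(variables, bags, edges, functions, constraints))
```

Adding variable 0 to bag C without also adding it to bag B would violate property 2, since the bags containing variable 0 would no longer be connected.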

The core performs precomputation, i.e. calculation of partition functions, and Boltzmann sampling based on a given, populated cluster tree.

By design, the Infrared core is strictly low-level and domain-agnostic: it only knows finite-domain variables, constraints, and functions, and how to evaluate partition functions and sample based on cluster trees for constraint networks. This allows using the same core for evaluation (not necessarily of partition functions; e.g. optimizing energy or fitness instead) and sampling in very different domains, just by writing domain-specific Python code.
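That one evaluation scheme can serve both partition functions and optimization can be illustrated by swapping the algebra used during elimination. The sketch below is plain Python on a toy chain and does not show Infrared's core or API; the pair energy and all names are assumptions. With (+, *) the elimination computes a partition function, with (min, +) it computes the minimum energy.

```python
import math
import operator
from functools import reduce

# Toy chain (an assumption): n variables, domain {0, 1, 2, 3}.
n = 6
domain = [0, 1, 2, 3]


def energy(x, y):
    """Hypothetical pair energy that prefers x to be exactly y + 1."""
    return abs(x - y - 1)


def eliminate_chain(local_value, plus, times, one):
    """Evaluate the chain over a user-chosen algebra (plus, times, one)."""
    message = {x: one for x in domain}
    for _ in range(n - 1):
        message = {
            x: reduce(plus, (times(local_value(x, y), message[y])
                             for y in domain))
            for x in domain
        }
    return reduce(plus, message.values())


# Sum-product algebra: partition function of the Boltzmann weights exp(-E).
z = eliminate_chain(lambda x, y: math.exp(-energy(x, y)),
                    operator.add, operator.mul, 1.0)

# Min-plus algebra: minimum total energy over all assignments.
min_energy = eliminate_chain(energy, min, operator.add, 0)

print(z, min_energy)
```

Only the three algebra arguments and the local values change between the two calls; the traversal itself stays the same.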

Disclaimer and license

Infrared is free software. It was part of the project RNARedPrint and was then separated as a standalone project. Note that the system is in active development and is likely to still undergo changes. It is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details [http://www.gnu.org/licenses/].
