GSoC 2015 ZeroMQ

#Asynchronous communication layer (ØMQ) for the island model

PaGMO provides powerful parallelization built around the generalized island-model paradigm ("optimization in an archipelago of islands"). The optimization algorithms are automatically parallelized (asynchronously, coarse-grained approach) thus making efficient use of todays multi/many-core architectures. Islands may run in complete isolation in parallel but once candidate solutions start to migrate between islands the optimization process is speed up. The current implementation of the island model in PaGMO is mainly based on the two classes island_base and archipelago. The island_base class is a polymorphic class that takes care of starting computational tasks. Currently three tpyes of islands are available. A POSIX type of island (spawning a thread), a python type of island (spawning a process), and an MPI island.

This project will make use of a modern approach, namely ØMQ, an asynchronous communication layer that it very "light-weight" to define a new type of island (the ØMQ island) able to communicate to similar islands (anywhere in the network) and exchange information (regardless of their belonging or not to an archipelago object). This would allow an easy access to grid-computing architectures where different machines could be running isolated ØMQ island (on the same "topic") or have them embedded in some local archipelago and still have communication (migration) happening via the network.

Implementation Details:

We target a completely broker-less, peer-to-peer, asynchronous message passing solution for migration of solutions between islands.

Computer 1:

isl = island_zeromq(algo, prob, 20, topic="cool_stuff")
isl.evolve(100)

... in the meantime, somewhere else ....

Computer 2:

isl = island_zeromq(algo, prob, 20, topic="cool_stuff")
isl.evolve(100)

In this setup the two islands would be exchanging individuals as they are subscribed to and talk on the same topic (and only in case the exchange makes sense, i.e., same problem dimensions). The island_zeromq could also, on one or both computers, be embedded into a local archipelago containing other (non-ØMQ) islands. Here is an example of a more complex setup, connecting three local archipelagos (I, J, K) with three ØMQ connections (dashed edges):

References:

ØMQ
Redis

Mentors:

Architecture

This project will be divided in two phases. The first phase will use a Redis broker to simplify the discovery process, allowing us to focus on the effectiveness of the communication between islands. During this period we will also learn exactly which kinds of capabilities are needed for a completely brokerless approach, and whether that is actually practical.

During the second phase, we will design a replacement for the Redis broker scaffolding and decide whether to go for a brokerless solution (like Zeroconf, or a homegrown discovery mechanism) or write our own broker software using ZeroMQ. Once the design for that is complete, we will remove the Redis dependency and implement the native approach.

Phase One Design

User Interface (API)

The API must be very similar to the already existing MPI island interface. The proposed simple use case is:

#include "pagmo.h"

using namespace pagmo;

int main()
{
    problem::dejong problem(10);
    algorithm::monte_carlo algorithm(100);

    archipelago a;
    //a.set_topology(topology::ring());

    for (int i = 0; i < 10; ++i) {
        a.push_back(zmq_island(algorithm, prob, 1, "dejong_test", "127.0.0.1:6379"));
    }

    // Evolve the archipelago 10 times.
    a.evolve(10);
    a.join();
    return 0;
}

The ZeroMQ island requires two additional parameters: a "topic" or "channel string" which ensures that only islands working on a similar problem specification (as defined by the user) will communicate with each other. The second parameter is the broker address: in this case, the broker is a local Redis server running on the default port.

Protocol

When a ZeroMQ Island is instantiated, the following things happen:

Initialisation: The island opens a port (TODO: how to determine the port? Random?) and spawns a thread to receive data on this socket. Possibly using the [ZMQ Actor] framework. This socket is used to receive data from other islands; it is not used for outgoing data.
Bootstrap: On the main thread, the island connects to the broker address. First, it adds itself to a set of islands operating on the given topic. (TODO: determine which address to use if the node has several IP addresses) The address and port added to this ledger correspond to the socket created in the initialisation step. When the island adds itself to the ledger, it also sends a real-time, ephemeral message through a Redis channel.
Subscription: After the initial set of peers has been received, the island must listen for new islands that come online on the same topic. This is achieved with a Redis channel subscription.
Connection: The island will now connect to every address it received in the bootstrap phase.
Communication: The communication layer is fully established, and the exchange of solutions can begin.

Example

zmq_island(algorithm, prob, 1, "dejong_test", "127.0.0.1:6379") is instantiated.
A thread created using the ZMQ Actor framework is used to manage an incoming socket bound to 127.0.0.1:4494 (random port).
The main thread connects to the Redis broker, 127.0.0.1:6379, and asks for a list of peer islands already initialised and listening on the dejong_test topic. This is achieved using the SMEMBERS pagmo.islands.dejong_test Redis command. After it receives the list of peers, it adds itself to the ledger with the following Redis command: SADD pagmo.islands.dejong_test 127.0.0.1:4494. The address added to the ledger in this step is the same socket that is created in the first step. After that, the island sends a message to notify islands that are already initialised that we're about to join the topic. This is achieved with PUBLISH pagmo.islands.realtime.dejong_test 127.0.0.1:4494.
After the island has sent a message notifying others of its availability, it must now listen on that channel to be able to receive connection messages from other islands. The specifics of this are highly coupled with the C++ Redis library that we use for Phase One, but the command that would be issued would be similar to SUBSCRIBE pagmo.islands.realtime.dejong_test. Whenever a message is received on this realtime channel, the island will connect to the address contained in the payload of the message.
Once the island has obtained an initial list of peers and announced its presence, it must now connect with a socket that will be used only for outgoing communication (as opposed to the socket opened on the first step, which is used for inbound data).
PaGMO can now send data (individuals, solutions) to ZeroMQ islands that are connected on the same topic, with the assurance that even islands that started after the original island will receive the communication.

Provide feedback

Saved searches