pasqal-io · Yoric · Jan 8, 2025 · Jan 7, 2025
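
The notebook text touched by this diff relies on the disk-graph condition: nodes joined by an edge must lie within the radius of each other, and nodes not joined by an edge must lie farther apart. The sketch below illustrates that check in plain Python. It is not QEK's implementation of `is_disk_graph`; the function name `is_unit_disk_embedding`, its arguments, and the treatment of pairs exactly at the radius are assumptions made for illustration.

```python
from itertools import combinations
import math


def is_unit_disk_embedding(positions, edges, radius):
    """Hypothetical helper, not part of QEK.

    positions: dict mapping node -> (x, y) coordinates
    edges: iterable of (u, v) node pairs
    radius: disk radius, in the same unit as the coordinates
    """
    edge_set = {frozenset(edge) for edge in edges}
    for u, v in combinations(positions, 2):
        dist = math.dist(positions[u], positions[v])
        connected = frozenset((u, v)) in edge_set
        # Connected nodes must be within the radius (assumption: boundary counts as inside).
        if connected and dist > radius:
            return False
        # Disconnected nodes must be strictly farther apart than the radius.
        if not connected and dist <= radius:
            return False
    return True


# Three nodes on a line, 5 units apart, joined as a path 0-1-2.
positions = {0: (0.0, 0.0), 1: (5.0, 0.0), 2: (10.0, 0.0)}
edges = [(0, 1), (1, 2)]
print(is_unit_disk_embedding(positions, edges, radius=5.011))
```

The radius used here mirrors the `RADIUS + EPS` tolerance from the notebook cell; removing the edge `(1, 2)` while keeping the same positions makes the check fail, because nodes 1 and 2 would then be close without being connected.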
diff --git a/examples/pipeline.ipynb b/examples/pipeline.ipynb
@@ -5,7 +5,8 @@
    "metadata": {},
    "source": [
     "# QEK from A to Z\n",
-    "This notebook reproduces the results of [QEK](https://journals.aps.org/pra/abstract/10.1103/PhysRevA.107.042615) by running the jupyter notebook.\n",
+    "\n",
+    "This notebook reproduces the results of the [QEK paper](https://journals.aps.org/pra/abstract/10.1103/PhysRevA.107.042615).\n",
     "\n",
     "At the end, you will be able to:\n",
     "1. Find the embeddings of a molecular dataset\n",
@@ -15,26 +16,16 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 13,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "/tmp/ipykernel_160993/4114388664.py:9: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n",
-      "  from tqdm.autonotebook import tqdm\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "from __future__ import annotations\n",
     "\n",
     "from dataclasses import dataclass\n",
     "\n",
     "import numpy as np\n",
     "import pulser as pl\n",
-    "import torch_geometric.data as pyg_data\n",
     "import torch_geometric.datasets as pyg_dataset\n",
     "from tqdm.autonotebook import tqdm\n"
    ]
@@ -50,7 +41,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 14,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -60,12 +51,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 15,
    "metadata": {},
    "outputs": [],
    "source": [
-    "import qek.data.datatools as qek_datatools\n",
-    "from qek.utils import compute_register, is_disk_graph"
+    "import qek.data.datatools as qek_datatools"
    ]
   },
   {
@@ -74,24 +64,26 @@
    "source": [
     "# Graph embedding\n",
     "\n",
-    "A graph, molecular or otherwise, does not have coordinates in space. We therefore need to find a regular embedding as possible. \n",
-    "In addition to this regularity, we must be sure that the embedding obtained by the function `add_graph_coord` is an **unit-disk graph embedding**.\n",
-    "\n",
-    "Here, we want that the distance between two connected qubits is equal to `RADIUS=5.001` $\\mu m$.\n",
-    "The `is_disk_graph` function check if the found embedding is an embedding of unit-disk graph, i.e. the distance between two connected nodes should be less than `RADIUS` and the distance between two disconnected nodes should be greater than `RADIUS`.\n",
+    "QEK lets researchers embed _graphs_ on Quantum Devices. To do this, we need to give these graphs a geometry (positions in\n",
+    "space) and to confirm that the geometry is compatible with a Quantum Device. Here, our dataset consists of molecules (represented\n",
+    "as graphs). To simplify things, QEK comes with a dedicated class `qek_datatools.MoleculeGraph` that adds a geometry to the graphs.\n",
     "\n",
+    "One of the core ideas behind QEK is that each node (aka atom) in a graph (aka molecule) from the dataset is represented by one\n",
+    "cold atom on the Device and, if two nodes are joined by an edge, their cold atoms must be close to each other. In geometrical terms,\n",
+    "this means that the `MoleculeGraph` must be a _disk graph_, with a radius of 5.001 $\\mu m$. In this notebook, for the sake of\n",
+    "simplicity, we discard graphs that are not disk graphs.\n",
     " "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 16,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "application/vnd.jupyter.widget-view+json": {
-       "model_id": "fd00873529984a7ea9c20e87bc59b4c2",
+       "model_id": "9deae7c65fae4de9b557c916e3df32bc",
        "version_major": 2,
        "version_minor": 0
       },
@@ -104,13 +96,13 @@
     }
    ],
    "source": [
-    "list_of_graph = []\n",
+    "list_of_graphs = []\n",
     "RADIUS = 5.001\n",
     "EPS = 0.01\n",
-    "for graph in tqdm(og_ptcfm):\n",
-    "    graph_with_pos = qek_datatools.add_graph_coord(graph=graph, blockade_radius=RADIUS)\n",
-    "    if is_disk_graph(graph_with_pos, radius=RADIUS+EPS):\n",
-    "        list_of_graph.append((graph_with_pos, graph.y.item()))"
+    "for data in tqdm(og_ptcfm):\n",
+    "    graph = qek_datatools.MoleculeGraph(data=data, blockade_radius=RADIUS)\n",
+    "    if graph.is_disk_graph(radius=RADIUS+EPS):\n",
+    "        list_of_graphs.append((graph, graph.pyg.y.item()))"
    ]
   },
   {
@@ -119,87 +111,54 @@
    "source": [
     "## Create a Pulser sequence\n",
     "\n",
-    "Once the embedding is found, we will create a pulser sequence that can be interpreted by the QPU or a Pasqal emulator. A sequence consists of a **register**, which means the position of qubits in a device and a **pulse** sequence.\n",
-    "\n",
-    "The `create_sequence_from_graph` function is responsible for doing this. It checks if the positions of the qubits respect the constraints of the device (number of qubits, minimum and maximum distance between qubits, etc.) and create a register if the embedding pass all the tests. Finally, the pulse sequence is the same as that in the scientific paper, which is a constant pulse with values $\\Omega = 2\\pi$, $\\delta = 0$ and a duration of $660 ns$.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Create a sequence:\n",
-    "\n",
-    "def create_sequence_from_graph(graph:pyg_data.Data, device=pl.devices.Device)-> pl.Sequence:\n",
-    "    if not qek_datatools.check_compatibility_graph_device(graph, device):\n",
-    "        raise ValueError(f\"The graph is not compatible with {device}\")\n",
-    "    reg = compute_register(data_graph=graph)\n",
-    "    seq = pl.Sequence(register=reg, device=device)\n",
-    "    Omega_max = 1.0 * 2 * np.pi\n",
-    "    t_max = 660\n",
-    "    pulse = pl.Pulse.ConstantAmplitude(\n",
-    "        amplitude=Omega_max,\n",
-    "        detuning=pl.waveforms.RampWaveform(t_max, 0, 0),\n",
-    "        phase=0.0,\n",
-    "    )\n",
-    "    seq.declare_channel(\"ising\", \"rydberg_global\")\n",
-    "    seq.add(pulse, \"ising\")\n",
-    "    return seq"
+    "Once the embedding is found, we create a Pulser Sequence that can be interpreted by a Quantum Device. A Sequence consists of a **register** (i.e. a geometry of cold atoms on the device) and **pulse**s. Sequences need to be designed for a specific device, so our graph object offers a method `compute_sequence` that does exactly that."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 17,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "f3e5e9dc5b864843ba85a4281951072b",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/294 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "dataset_sequence = []\n",
     "\n",
-    "# In this tutorial, to make things faster, we'll only run the first compatible entry in the dataset.\n",
-    "# If you wish to run more entries, feel free to increase this value.\n",
-    "MAX_NUMBER_OF_DATASETS = 1\n",
-    "\n",
-    "for graph, target in list_of_graph:\n",
-    "    # Some graph are not compatible with the AnalogDevice device\n",
-    "    try:\n",
-    "        dataset_sequence.append((create_sequence_from_graph(graph, device=pl.AnalogDevice), target))\n",
-    "        if len(dataset_sequence) >= MAX_NUMBER_OF_DATASETS:\n",
-    "            break\n",
-    "    except ValueError as err:\n",
-    "        print(f\"{err}\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from pulser_simulation import QutipEmulator"
+    "for graph, target in tqdm(list_of_graphs):\n",
+    "    # Some graphs are not compatible with AnalogDevice\n",
+    "    if graph.is_embeddable(device=pl.AnalogDevice):\n",
+    "        dataset_sequence.append((graph.compute_sequence(device=pl.AnalogDevice), target))\n"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "A pulser sequence is all you need for a quantum calculation on a Pasqal QPU! Before submitting the calculation to an actual quantum computer, we must first verify that everything works by emulation. For this, Pasqal has developed `pulser_simulation`.\n",
-    "\n",
-    "The code below allows us to emulate the entire \"quantum compatible\" PTC-FM dataset (i.e., whose embeddings are unit-disk and compatible with the device). However, we advise against running it for time reasons.\n",
-    "\n",
-    "Fortunately, we have already emulated the entire PTC-FM compatible dataset. You just need to load it up.\n"
+    "A Pulser sequence is all you need for a quantum calculation on a Pasqal QPU! Before submitting the calculation to an actual quantum computer, let's verify that everything works on our machine. For this, Pasqal has developed several simulators, which you may find in `pulser_simulation`. Of course, quantum simulators are much slower than a real quantum computer, so we're not going to run all these embeddings on our simulator."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 18,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "application/vnd.jupyter.widget-view+json": {
-       "model_id": "479ba694fb0540718becf9971c5ef27e",
+       "model_id": "36483a40288c431eb0482e66325b375a",
        "version_major": 2,
        "version_minor": 0
       },
@@ -212,7 +171,13 @@
     }
    ],
    "source": [
-    "for seq, target in tqdm(dataset_sequence):\n",
+    "from pulser_simulation import QutipEmulator\n",
+    "\n",
+    "# In this tutorial, to make things faster, we'll only run the first compatible entry in the dataset.\n",
+    "# If you wish to run more entries, feel free to increase this value.\n",
+    "MAX_NUMBER_OF_DATASETS = 1\n",
+    "\n",
+    "for seq, target in tqdm(dataset_sequence[0:MAX_NUMBER_OF_DATASETS]):\n",
     "    simul = QutipEmulator.from_sequence(sequence=seq)\n",
     "    results = simul.run()\n"
    ]
@@ -221,28 +186,15 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Loading the already existing dataset"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 13,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "processed_dataset = qek_datatools.load_dataset(file_path=\"ptcfm_processed_dataset.json\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Some properties of the newly created dataset:"
+    "## Loading the already existing dataset\n",
+    "\n",
+    "For this notebook, instead of spending hours running the simulator on your computer, we're going to cheat\n",
+    "and load the results, which are conveniently stored in a file."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 19,
    "metadata": {},
    "outputs": [
     {
@@ -254,19 +206,20 @@
     }
    ],
    "source": [
+    "processed_dataset = qek_datatools.load_dataset(file_path=\"ptcfm_processed_dataset.json\")\n",
     "print(f\"Size of the quantum compatible dataset = {len(processed_dataset)}\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Cherry picked register and pulse sequence:"
+    "Let's look at the sequence and register for one of these samples:"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 20,
    "metadata": {},
    "outputs": [
     {
@@ -286,7 +239,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 21,
    "metadata": {},
    "outputs": [
     {
@@ -323,16 +276,16 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 22,
    "metadata": {},
    "outputs": [],
    "source": [
-    "from qek.kernel import Kernel\n"
+    "from qek.kernel import QuantumEvolutionKernel as QEK\n"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 23,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -415,11 +368,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 24,
    "metadata": {},
    "outputs": [],
    "source": [
-    "kernel = Kernel(mu=2.)"
+    "kernel = QEK(mu=2.)"
    ]
   },
   {
@@ -431,7 +384,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 25,
    "metadata": {},
    "outputs": [
     {
@@ -504,7 +457,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 26,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -514,7 +467,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 27,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -538,7 +491,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 28,
    "metadata": {},
    "outputs": [
     {