diff --git a/interactive_notebooks/lab10/lab10-empty.ipynb b/interactive_notebooks/lab10/lab10-empty.ipynb
new file mode 100644
index 00000000..290d10d7
--- /dev/null
+++ b/interactive_notebooks/lab10/lab10-empty.ipynb
@@ -0,0 +1,1445 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "202c75d1-a3ba-476f-aad8-7e119f11b2c7",
+ "metadata": {},
+ "source": [
+ "# Lab 10: Parallel computing\n",
+ "\n",
+ "In this lab we introduce the tools that Julia's ecosystem offers for different kinds of parallel computing. As an illustration of how capable Julia is, consider the fact that it has joined (alongside C, C++ and Fortran) the so-called \"PetaFlop club\"[^1], a list of languages capable of running at over 1 PFLOPS.\n",
+ "\n",
+ "\n",
+ "[^1]: https://juliacomputing.com/media/2017/09/julia-joins-petaflop-club/ \"Julia Joins Petaflop Club\"\n",
+ "\n",
+ "## Introduction\n",
+ "Nowadays there is no need to convince anyone of the advantages of having more cores available for a computation, be it on a laptop, a workstation or a cluster. The trend is nicely illustrated in the figure below:\n",
+ "\n",
+ "Image source[^2]\n",
+ "\n",
+ "[^2]: https://www.karlrupp.net/2018/02/42-years-of-microprocessor-trend-data/ \"Performance metrics trend of CPUs in the last 42 years\"\n",
+ "\n",
+ "However, there are some difficulties in moving away from sequential programming that we have to note:\n",
+ "- We don't think in parallel\n",
+ "- We learn to write and reason about programs serially\n",
+ "- The desire for parallelism often comes *after* you've written your algorithm (and found it too slow!)\n",
+ "- Parallel programs are harder to reason about and therefore harder to debug\n",
+ "- The number of cores is increasing, so knowing how a program scales is crucial (not just that it runs faster)\n",
+ "- Benchmarking parallel code that tries to exhaust the processor pool is much more affected by background processes\n",
+ "\n",
+ "\n",
+ "\n",
+ "Warning: \"Shortcomings of parallelism\": Parallel computing brings its own set of problems and a not insignificant overhead from data manipulation and communication, so always try to optimize your serial code as much as you can before turning to parallel acceleration.\n",
+ "\n",
+ "\n",
+ "\n",
+ "Warning: \"Disclaimer\":\n",
+ " With the increasing complexity of computer hardware, some statements may become outdated. Moreover, we won't cover as many tips as you would encounter in a course dedicated to parallel programming, which teaches you how to think in parallel; here we focus on the tools that you can use to put that knowledge into practice.\n",
+ "\n",
+ " \n",
+ "## Process based parallelism\n",
+ "As the name suggests, process based parallelism builds on the concept of running code on multiple processes, which can even run on multiple machines, thus allowing us to scale computing from a local machine to a whole network of machines - a major difference from the other parallel concept of threads. In Julia this concept is supported by the standard library `Distributed`, and scaling to a cluster can be realized with the 3rd party library [`ClusterManagers.jl`](https://github.com/JuliaParallel/ClusterManagers.jl).\n",
+ "\n",
+ "Let's start simply with knowing how to start up additional Julia processes. There are two ways:\n",
+ "- by adding processes using cmd line argument `-p ##`\n",
+ "```bash\n",
+ "julia -p 4\n",
+ "```\n",
+ "- by adding processes after startup using the `addprocs(##)` function from the standard library `Distributed`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "55dae624-9cf1-4e5c-836a-91df43c350ea",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "cd(\"Scientific-Programing-in-Julia\") # just for me"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c8346fcc-c8c9-4e66-95c3-1289faf282f6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "using Distributed\n",
+ "addprocs(4)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "11e8210b-a0ce-46ac-a8ec-3f584b6c432a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "nworkers() # returns number of workers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f72fb84d-2b3a-4f87-9ad2-8e54e6a19ba1",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "nprocs() # returns number of processes `nworkers() + 1`"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "75158b5e-cd0e-4d98-af58-6b2025d3087f",
+ "metadata": {},
+ "source": [
+ "The result shown in a process manager such as `htop`:\n",
+ "```bash\n",
+ ".../julia-1.6.2/bin/julia --project \n",
+ ".../julia-1.6.2/bin/julia -Cnative -J/home/honza/Apps/julia-1.6.2/lib/julia/sys.so -g1 --bind-to 127.0.0.1 --worker\n",
+ ".../julia-1.6.2/bin/julia -Cnative -J/home/honza/Apps/julia-1.6.2/lib/julia/sys.so -g1 --bind-to 127.0.0.1 --worker\n",
+ ".../julia-1.6.2/bin/julia -Cnative -J/home/honza/Apps/julia-1.6.2/lib/julia/sys.so -g1 --bind-to 127.0.0.1 --worker\n",
+ ".../julia-1.6.2/bin/julia -Cnative -J/home/honza/Apps/julia-1.6.2/lib/julia/sys.so -g1 --bind-to 127.0.0.1 --worker\n",
+ "```\n",
+ "\n",
+ "Both of these result in a total of 5 running processes - 1 controller, 4 workers - with their respective ids accessible via the `myid()` function call. Note that the controller process always has id 1 and the other processes are assigned subsequent integers. See for yourself with the `@everywhere` macro, which runs code on all processes or on a subset of them."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "22bcb19a-9ac3-4e0e-b6b5-fdc722663906",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@everywhere println(myid())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a851ee33-752c-4feb-9b64-5b8ec53df359",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "myid()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d2acdb1d-f83f-4a96-b67e-6ff1060cb9d6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@everywhere [2,3] println(myid()) # select a subset of workers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ce8fe801-d84d-43a3-b306-2be506710bf6",
+ "metadata": {},
+ "source": [
+ "In the same way that we have added processes, we can also remove them."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "745d5c29-328c-4cab-adc6-c07c9fd0cb0a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "workers()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6dfcc441-2583-44ae-8228-a285a3b8f674",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "rmprocs(2) # kills worker with id 2"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0830e2c0-603a-44d0-911d-0f4ac1d2bbee",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "workers()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dbfa8653-2700-4968-b23c-cb2a0c61cbd0",
+ "metadata": {},
+ "source": [
+ "As we have seen from the `htop/top` output, added processes start with specific cmd line arguments; however, they do not inherit any aliases that we may have defined, e.g. `julia` ~ `julia --project=.`. Therefore, in order to use an environment, we first have to activate it on all processes."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4b283197-8791-4ceb-b1af-2a39cfd101b1",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@everywhere begin\n",
+ " using Pkg; Pkg.activate(@__DIR__);Pkg.instantiate() # @__DIR__ equivalent to a call to pwd()\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cb1379d9-ef34-4e67-8c7f-f5579f265ff6",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "or we can load files containing this line on all processes with cmdline option `-L ###.jl` together with `-p ##`.\n",
+ "\n",
+ "#### There are generally two ways of working with multiple processes\n",
+ "- using low level functionality - we specify what/where is loaded, what/where is being run and when we fetch results\n",
+ " + `@everywhere` to run everywhere and wait for completion\n",
+ " + `@spawnat` and `remotecall` to run on a specific process and return a `Future` (a remote reference to a result that will become available later)\n",
+ " + `fetch` - fetching the value of a remote reference\n",
+ " + `pmap` - for easily mapping a function over a collection\n",
+ "\n",
+ "- using high level functionality - define only simple functions and apply them on collections\n",
+ " + [`DistributedArrays`](https://github.com/JuliaParallel/DistributedArrays.jl) with `DArray`s\n",
+ " + [`Transducers.jl`](https://github.com/JuliaFolds/Transducers.jl) pipelines\n",
+ "\n",
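+ "A minimal sketch of the low-level calls listed above, assuming a session in which workers may still need to be added (the worker count and the arguments `1`, `2` are chosen only for illustration):\n",
+ "```julia\n",
+ "using Distributed\n",
+ "nprocs() == 1 && addprocs(2)       # make sure at least one worker exists\n",
+ "\n",
+ "w = first(workers())               # id of some worker process\n",
+ "f = @spawnat w myid()              # returns a Future immediately\n",
+ "fetch(f) == w                      # blocks until the result arrives\n",
+ "\n",
+ "r = remotecall(+, w, 1, 2)         # call +(1, 2) on worker w\n",
+ "fetch(r)                           # 3\n",
+ "```\n",
+ "Both `@spawnat` and `remotecall` return immediately, so several futures can be launched before any of them is fetched.\n",
+ "\n",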
+ "### Sum with processes (First example)\n",
+ "Writing your own function for summing an array is a good way to show the potential problems you may encounter in parallel programming. For comparison, here is the naive version, which uses `zero` for initialization and `@inbounds` for removing bounds checks.\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "dc79108b-2b8d-4bbc-b070-921b97493183",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "function naive_sum(a)\n",
+ " r = zero(eltype(a))\n",
+ " for i in eachindex(a)\n",
+ " @inbounds r += a[i]\n",
+ " end\n",
+ " r\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "356efae4-3ec5-4c43-9148-80a06887408b",
+ "metadata": {},
+ "source": [
+ "Its performance will serve us as a sequential baseline."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "69e01f3a-2b3e-496b-ac63-227599f175d8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "using BenchmarkTools"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "41f44158-90c5-4037-8a7c-7dc2793b042d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "a = rand(10_000_000); # 10^7"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "72a603b9-73d9-4df5-b13a-255afe08ec7e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sum(a) ≈ naive_sum(a)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "05494f1e-61fd-4ee2-ab5e-d926b6e7eed6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@btime sum($a)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0cfe892c-e702-4064-a772-f8e4c5dd52aa",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@btime naive_sum($a)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c3e3391c-b4f0-485a-8637-46870f698aeb",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "rmprocs(workers()) # clean workers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c33d6d96-ad00-4f94-8f47-fdfd1108c341",
+ "metadata": {},
+ "source": [
+ "Note that the built-in `sum` exploits single core parallelism through single instruction, multiple data (SIMD) instructions and is thus faster.\n",
+ "\n",
+ "\n",
+ "\n",
+ "Exercise: \n",
+ "\n",
+ "Write a distributed/multiprocessing version of the `sum` function, `dist_sum(a, np=nworkers())`, without the help of `DistributedArrays`. Measure the speed up when doubling the number of workers (up to the number of logical cores - see the note on hyper threading in the Threading section).\n",
+ " \n",
+ "**HINTS**:\n",
+ "- map builtin `sum` over chunks of the array using `pmap`\n",
+ "- there are built in partition iterators `Iterators.partition(array, chunk_size)`\n",
+ "- `chunk_size` should relate to the number of available workers\n",
+ "- `pmap` has the option to pass the ids of workers as the second argument `pmap(f, WorkerPool([2,4]), collection)`\n",
+ "- `pmap` collects the partial results on the controller, where they can be combined with another `sum`\n",
+ " \n",
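+ "A small sketch of how `Iterators.partition` behaves, using a toy range and a hypothetical worker count `np = 3` (both are assumptions made only for illustration):\n",
+ "```julia\n",
+ "x = 1:10\n",
+ "np = 3                                  # pretend we have 3 workers\n",
+ "chunk_size = cld(length(x), np)         # ceiling division, here 4\n",
+ "chunks = collect(Iterators.partition(x, chunk_size))\n",
+ "# chunks == [1:4, 5:8, 9:10]; the last chunk may be shorter\n",
+ "```\n",
+ "\n",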
+ "\n",
+ " \n",
+ "####"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "793679a7-a5f4-4bd0-ab2b-a365e749fda4",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Solution:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0d52d226",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "46b443c1",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d381dce9-3c71-49e8-988c-549d0f1fc2bd",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "####"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9e904130-21c2-4855-b515-2a84ab91846f",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "###\n",
+ "As you can see, the built-in `pmap` already abstracts quite a lot of the process and handles all the data movement internally. However, to show how we can abstract even more, let's use the `DistributedArrays.jl` package."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0c53debe-e495-44a1-8f5f-e95d42bd9f8e",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Exercise: \n",
+ "Write a distributed/multiprocessing version of the `sum` function, `dist_sum_lib(a, np=nworkers())`, with the help of `DistributedArrays`. Measure the speed up when doubling the number of workers (up to the number of logical cores - see note on hyper threading).\n",
+ "\n",
+ "**HINTS**:\n",
+ "- chunking and distributing the data can be handled for us using the `distribute` function on an array (creates a `DArray`)\n",
+ "- `distribute` has an option to specify the workers across which the array should be distributed\n",
+ "- `sum` function has a method for `DArray`\n",
+ "- remember to run `using DistributedArrays` on every process\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 81,
+ "id": "cca63a27-05ce-43e2-bd99-88e909c8947c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "using DistributedArrays"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "548f0019-aaa1-45c7-82c8-a9c7958f2443",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "function dist_sum_lib(a, np=nworkers())\n",
+ " adist = distribute(a,procs=workers()[1:np])\n",
+ " sum(adist)\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "289071c8-bf9d-46ba-868c-d686235009e4",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Solution:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "533c6ae1-3654-4951-b993-7c9c00a56150",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "####"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1f34a26e",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "99a8ad42",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "77e88418-1e4e-4c3d-bb97-a0931844a701",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "###\n",
+ "In both previous examples we have included the time needed to transfer the data from the controller process. In practice, however, distributed computing is used in situations where the data may be stored on the individual machines. As a general rule of thumb, we should always send only the instructions on what to do and not the actual data to be processed. This will be demonstrated more clearly in the next, more practical example.\n",
+ "\n",
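+ "The rule of thumb can be sketched as follows, assuming the data can be produced (or loaded) directly on the workers; random numbers stand in for locally stored data here:\n",
+ "```julia\n",
+ "using Distributed\n",
+ "nprocs() == 1 && addprocs(2)            # make sure some workers exist\n",
+ "\n",
+ "# each worker creates and reduces its own data; only one number travels back\n",
+ "futures = [@spawnat w sum(rand(1_000_000)) for w in workers()]\n",
+ "total = sum(fetch.(futures))\n",
+ "```\n",
+ "\n",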
+ "### Distributed file processing\n",
+ "`Distributed` is often used for processing files, as in the commonly encountered `mapreduce` jobs with technologies like [`Hadoop`](https://hadoop.apache.org/) or [`Spark`](http://spark.apache.org/), where the files live on a distributed file system and a typical job requires us to map over all the files and gather some statistics such as histograms, sums and others. We will simulate this situation with Julia's package codebase, which on a typical user installation can contain up to hundreds of thousands of `.jl` files (depending on how extensively one uses Julia).\n",
+ "\n",
+ "\n",
+ "Exercise:\n",
+ "Write a distributed pipeline for computing a histogram of symbols found in the AST by parsing Julia source files in your `.julia/packages/` directory. We have already implemented most of the code that you will need (available as source code [`here`](https://github.com/JuliaTeachingCTU/Scientific-Programming-in-Julia/blob/master/docs/src/lecture_10/pkg_processing.jl)).\n",
+ " \n",
+ "Your task is to write a function that does the `map` and `reduce` steps, i.e. one that creates the dictionaries on different workers and gathers them. There are two ways to do the map:\n",
+ "- either over directories inside `.julia/packages/` - call it `distributed_histogram_pkgwise`\n",
+ "- or over all files obtained by concatenation of `filter_jl` outputs (*NOTE* that this might not be possible if the listing itself is expensive in terms of speed or memory) - call it `distributed_histogram_filewise`\n",
+ "\n",
+ "Measure whether the speed up scales linearly with the number of processes by restricting the number of workers inside `pmap`.\n",
+ "\n",
+ "**HINTS**:\n",
+ "- for each file path apply `tokenize` to extract symbols and follow it with an update of a local histogram\n",
+ "- try writing a sequential version first\n",
+ "- either load `./pkg_processing.jl` on startup with `-L` and `-p` options or `include(\"./pkg_processing.jl\")` inside `@everywhere`\n",
+ "- use `pmap` to easily iterate in parallel over a collection - the result should be an array of histograms, which has to be merged on the controller node (use the built-in `mergewith!` function in conjunction with `reduce`)\n",
+ "- `pmap` supports `do` syntax\n",
+ "```julia\n",
+ "pmap(collection) do item\n",
+ " do_something(item)\n",
+ "end\n",
+ "```\n",
+ "- pkg directory can be obtained with `joinpath(DEPOT_PATH[1], \"packages\")`\n",
+ "\n",
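+ "The reduction step from the hints can be tried on toy data first; the two dictionaries below are made up for illustration:\n",
+ "```julia\n",
+ "h1 = Dict(:x => 2, :y => 1)\n",
+ "h2 = Dict(:x => 3, :f => 5)\n",
+ "\n",
+ "# mergewith!(+) returns a function that merges its second argument into the first, adding counts\n",
+ "h = reduce(mergewith!(+), [h1, h2]; init = Dict{Symbol,Int}())\n",
+ "# h[:x] == 5, h[:y] == 1, h[:f] == 5\n",
+ "```\n",
+ "Passing an empty `init` dictionary keeps the inputs `h1` and `h2` unmodified.\n",
+ "\n",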
+ "**BONUS**:\n",
+ "What is the most frequent symbol in your codebase?\n",
+ " \n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ec6fa0f7-f646-4540-8c85-a934e3b78b4f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "include(\"lab10/pkg_processing.jl\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9b437c39-55c4-4583-8e2b-31a1bdb9467f",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Solution:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1086127c-21b0-46a3-bcec-638688740ec4",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "###"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9c397e26-0423-4889-b416-58015731c6e3",
+ "metadata": {},
+ "source": [
+ "Let's implement first a sequential version as it is much easier to debug."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a5f6ffc2",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0e043b49",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2735bf3e-6591-4743-878a-0bd03f9441be",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = joinpath(DEPOT_PATH[1], \"packages\") # usually the first entry\n",
+ "@time h = sequential_histogram(path)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "23b773a2-bd99-4fdb-8b75-36891090e742",
+ "metadata": {},
+ "source": [
+ "First we try to distribute over package folders. TODO add the ability to run it only on some workers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c1a50186-58a4-4570-8177-500d5393ceff",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "addprocs(8)\n",
+ "\n",
+ "@everywhere begin\n",
+ " cd(\"Scientific-Programing-in-Julia\")\n",
+ " using Pkg; Pkg.activate(@__DIR__); Pkg.instantiate()\n",
+ " using ProgressMeter\n",
+ " # workers only have access to the functions that we define on them, so the code has to be loaded everywhere\n",
+ " include(\"lab10/pkg_processing.jl\") \n",
+ "end\n",
+ "\n",
+ "\"\"\"\n",
+ " merge_with!(h1, h2)\n",
+ "\n",
+ "Merges count dictionary `h2` into `h1` by adding the counts. Equivalent to `Base.mergewith!(+)`.\n",
+ "\"\"\"\n",
+ "function merge_with!(h1, h2)\n",
+ " for s in keys(h2)\n",
+ " get!(h1, s, 0)\n",
+ " h1[s] += h2[s]\n",
+ " end\n",
+ " h1\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2a3897d6",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b18c2935",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ecb1a4bd-075f-4d8f-8bb1-d7bfac617ca6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = joinpath(DEPOT_PATH[1], \"packages\")\n",
+ "\n",
+ "@time h = distributed_histogram_pkgwise(path, 2); # 41.5s\n",
+ "@time h = distributed_histogram_pkgwise(path, 4); # 24.0s\n",
+ "@time h = distributed_histogram_pkgwise(path, 8); # 24.0s"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8ccafc49-3dea-4d30-8d9f-c8f553dd7c8d",
+ "metadata": {},
+ "source": [
+ "Second we try to distribute over all files."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b7b329d8",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0563ad72",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "dcf2a0ca-03f5-4c28-8230-69039ec845b5",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = joinpath(DEPOT_PATH[1], \"packages\")\n",
+ "@time h = distributed_histogram_filewise(path, 2); # 46.9s\n",
+ "@time h = distributed_histogram_filewise(path, 4); # 24.8s\n",
+ "@time h = distributed_histogram_filewise(path, 8); # 20.4s"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f09855f5-f09b-470e-842c-54d6deb721e1",
+ "metadata": {},
+ "source": [
+ "Here we can see that we have improved the timings a bit by increasing granularity of tasks.\n",
+ "\n",
+ "**BONUS**: You can do some analysis with `DataFrames`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5ead99b7-1038-4060-8eef-cdc2572a4e2e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "using DataFrames\n",
+ "df = DataFrame(:sym => collect(keys(h)), :count => collect(values(h)));\n",
+ "sort!(df, :count, rev=true);\n",
+ "df[1:50,:]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "acdd8ee6-ce2c-40b0-b28c-cb95efe160af",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "rmprocs(workers())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fcba0d07-9645-4867-91ed-aa7c351d1454",
+ "metadata": {},
+ "source": [
+ "## Threading\n",
+ "The number of threads for a Julia process can be set up in the environment variable `JULIA_NUM_THREADS` or directly on Julia startup with the cmd line option `-t ##` or `--threads ##`. If both are specified, the latter takes precedence.\n",
+ "```bash\n",
+ "julia -t 8\n",
+ "```\n",
+ "To find out how many threads are currently available, there is the `nthreads` function in the `Base.Threads` module. There is also an analog to the Distributed `myid` example, called `threadid`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 99,
+ "id": "edd07ccb-1c6b-492e-b0db-7fb0499aad50",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "using Base.Threads"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b583d927-ae80-44a2-bb43-f07d61c76528",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "nthreads()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "960efa67-f6be-442e-aab7-1d37510335d8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "threadid()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "60af3cc9-5f74-4f42-93f6-bb28e6dfc3f3",
+ "metadata": {},
+ "source": [
+ "As opposed to distributed/multiprocessing programming, threads have access to the whole memory of Julia's process, so we don't have to deal with separate environment manipulation, code loading and data transfers. However, we have to be aware that memory can be modified from two different places and that there may be performance penalties for accessing memory that is physically further from a given core (e.g. caches of a different core or different NUMA[^3] nodes). Another significant difference from distributed computing is that we cannot spawn additional threads on the fly in the same way that we were able to add processes with the `addprocs` function.\n",
+ "\n",
+ "[^3]: https://en.wikipedia.org/wiki/Non-uniform_memory_access \"NUMA\" \n",
+ "\n",
+ "\n",
+ "Info: \"Hyper threads\"\n",
+ " In most of today's CPUs the number of threads is larger than the number of physical cores. These additional threads are usually called hyper threads[^4] or, when talking about cores, logical cores. The technology relies on the fact that for a given \"instruction\" there may be underutilized parts of the CPU core's machinery (such as one of many arithmetic units), and if suitable work comes in, it can be run simultaneously. In practice this means that adding more threads than physical cores may not bring the expected speed up.\n",
+ " \n",
+ "[^4]: https://en.wikipedia.org/wiki/Hyper-threading \"Hyperthreading\"\n",
+ " \n",
+ "\n",
+ "\n",
+ "The easiest way (though one that does not always yield the correct result) to turn code into multi-threaded code is to put the `@threads` macro in front of a for loop, which instructs Julia to run the iterations of the loop on separate threads."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 102,
+ "id": "fffb5c2f-7536-468d-bf27-f6b08268d7a1",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "A = Array{Union{Int,Missing}}(missing, nthreads());"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 103,
+ "id": "c31a30ca-56ed-4c99-b5bd-d54c591d4189",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for i in 1:nthreads()\n",
+ " A[threadid()] = threadid()\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8c7d034f-1f91-4675-9aa8-b2fb5ed1c17d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "A # only the first element is filled"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 105,
+ "id": "31f940e8-7da4-4257-8dab-7c671532a318",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "A = Array{Union{Int,Missing}}(missing, nthreads());"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 106,
+ "id": "b9ecf382-a294-44fc-a2ab-87ebc092e46d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@threads for i in 1:nthreads()\n",
+ " A[threadid()] = threadid()\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d0177268-4a5a-46f0-94f1-74ab4f0e2425",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "A"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "450f33ce-24ec-406d-bdfa-09d0ba364c46",
+ "metadata": {},
+ "source": [
+ "### Multithreaded sum\n",
+ "Armed with this knowledge let's tackle the problem of the simple `sum`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a34eb5b0-b009-447e-8bc2-f3c387d60a7c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "function threaded_sum_naive(a)\n",
+ " r = zero(eltype(a))\n",
+ " @threads for i in eachindex(a)\n",
+ " @inbounds r += a[i]\n",
+ " end\n",
+ " return r\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0002f30c-d1fe-49a9-9939-7eb65ed0fbb5",
+ "metadata": {},
+ "source": [
+ "Comparing this with the built-in `sum`, we see a not insignificant discrepancy (one that cannot be explained by reordering of the computation), and moreover the timings show some ridiculous overhead."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 109,
+ "id": "7d424a00-e3c3-4278-a3d8-e1dae06e7438",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "a = rand(10_000_000); # 10^7"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d4b89afd-a9a5-4931-8dfe-1fea77386760",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sum(a), threaded_sum_naive(a)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e88b9729-168c-4d71-91ca-62ccfb824a02",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@btime sum($a)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ea398331-ea41-42d9-ac8a-87cee30d913b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@btime threaded_sum_naive($a)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "33a1454e-ce96-4071-bea9-3fcbb9def2e3",
+ "metadata": {},
+ "source": [
+ "Recalling what has been said above, we have to be aware that the data can be accessed from multiple threads at once. If this is not taken into account, each thread may read a possibly outdated value of `r` and overwrite it with its own updated state, losing the updates made by other threads.\n",
+ "\n",
+ "There are two solutions which we will tackle in the next two exercises. \n",
+ "\n",
+ "\n",
+ "Exercise:\n",
+ "Implement `threaded_sum_atom`, which uses an `Atomic` wrapper around the accumulator variable `r` in order to ensure correct locking of data access.\n",
+ "\n",
+ "**HINTS**:\n",
+ "- use `atomic_add!` as a replacement for `r += a[i]`\n",
+ "- \"collect\" the result by dereferencing variable `r` with the empty bracket operator `[]`\n",
+ "\n",
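+ "A minimal sketch of the `Atomic` API on a plain counter (deliberately not the sum itself); the loop length `1000` is arbitrary:\n",
+ "```julia\n",
+ "using Base.Threads\n",
+ "\n",
+ "counter = Atomic{Int}(0)          # atomic wrapper around an Int\n",
+ "@threads for i in 1:1000\n",
+ "    atomic_add!(counter, 1)       # safe concurrent increment\n",
+ "end\n",
+ "counter[]                         # dereference: always 1000\n",
+ "```\n",
+ "\n",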
+ "!!! info \"Side note on dereferencing\"\n",
+ " In Julia we can create references to data, which are guaranteed to point to a correct and allocated value in memory; as long as a reference exists, the memory is not garbage collected. These are constructed with `Ref(x)`, `Ref(a, 7)` or `Ref{T}()` for a reference to variable `x`, to the `7`th element of array `a` and an empty reference respectively. Dereferencing, i.e. asking for the underlying value, is done using the empty bracket operator `[]`.\n",
+ " ```julia\n",
+ " x = 1 # integer\n",
+ " rx = Ref(x) # reference to that particular integer `x`\n",
+ " x == rx[] # dereferencing yields the same value\n",
+ " ```\n",
+ " There also exist unsafe references/pointers `Ptr`; however, we should not really come into contact with those.\n",
+ "\n",
+ "**BONUS**:\n",
+ "Try chunking the array and calling sum on individual chunks to obtain some real speedup.\n",
+ "\n",
+ "\n",
+ " \n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "52cd10ee-21fa-4162-8527-75b6ba3488e4",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "71b26fb9-c03b-417d-856e-b87e5b921f87",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Solution:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dbf59510-2dab-4ac1-838a-269542db905f",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "#### 1 option"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d2b07f7c",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "bc5b1794",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1db17a78",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f610fe94",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4a457420-903d-48d3-840f-82af9fffba91",
+ "metadata": {
+ "jp-MarkdownHeadingCollapsed": true,
+ "tags": []
+ },
+ "source": [
+ "#### Bonus"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "725075b9-4e0c-429b-9142-074b01372831",
+ "metadata": {},
+ "source": [
+ "That's better but far from the performance we need. \n",
+ "\n",
+ "**BONUS**: There is a fancier and faster way to do this by chunking the array"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2f242796",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d5ee5a6d",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "12e52549",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0109425e-2df7-4cd8-a7c9-7116e71999eb",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "Finally we have beaten the \"sequential\" sum. The quotes are intentional, because Base's implementation of `sum` also uses single instruction, multiple data (SIMD) instructions, which allow it to process multiple elements at once."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f6324559-b958-4117-a479-981533d98704",
+ "metadata": {},
+ "source": [
+ "###"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1f119487-8006-4324-8164-72b8e6423511",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Exercise:\n",
+ "Implement `threaded_sum_buffer`, which uses an array of length `nthreads()` (we will call it the buffer) to aggregate the results of the individual threads locally.\n",
+ "\n",
+ "**HINTS**:\n",
+ "- use `threadid()` to index the buffer array\n",
+ "- sum the buffer array to obtain final result\n",
+ " \n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7a037801-d6a2-44bf-b92d-722a12567ffb",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "25ab1f4f-0a99-40bd-a0e5-37790387e612",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Solution:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "41ede830",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8875f81c",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1fa3aaf6",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f7041b34",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "97cceee9-0326-4f69-8a7d-6238a8900c4a",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "####"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f0af5ffe-100c-45c9-abf5-e89c42b98b9f",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Though this implementation is cleaner and faster, it has a possible drawback: the buffer `R` lives in a contiguous part of memory, so its elements share cache lines. Each thread that writes to its element pulls the whole cache line into its own cache and thereby invalidates the copies held by the other threads, which in turn do the same. This effect is known as false sharing."
+ ]
+ },
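+ {
+ "cell_type": "markdown",
+ "id": "sketch-padded-buffer",
+ "metadata": {},
+ "source": [
+ "One common way to reduce this effect is to pad the buffer so that the slots written by different threads land on different cache lines. The following is only a sketch under assumptions: the name `threaded_sum_padded`, the `nslots` keyword and the constant `PAD` are invented here, and a cache line size of 64 bytes is assumed. To stay independent of how thread ids are assigned, the slots are indexed by chunk number rather than by `threadid()`.\n",
+ "```julia\n",
+ "const PAD = 8  # assumption: a 64-byte cache line holds 8 Float64 values\n",
+ "\n",
+ "function threaded_sum_padded(A::Vector{Float64}; nslots = Threads.nthreads())\n",
+ "    # One column per slot; only row 1 is used, so neighbouring slots\n",
+ "    # are PAD * 8 = 64 bytes apart in memory (column-major layout).\n",
+ "    R = zeros(Float64, PAD, nslots)\n",
+ "    chunk_len = max(1, cld(length(A), nslots))\n",
+ "    Threads.@threads for s in 1:nslots\n",
+ "        lo = (s - 1) * chunk_len + 1\n",
+ "        hi = min(s * chunk_len, length(A))\n",
+ "        for i in lo:hi                 # empty range if the chunk is out of bounds\n",
+ "            R[1, s] += A[i]\n",
+ "        end\n",
+ "    end\n",
+ "    return sum(@view R[1, :])\n",
+ "end\n",
+ "```\n",
+ "Whether the padding pays off depends on the machine, so it is worth benchmarking it against the unpadded buffer. Accumulating into a local variable and writing to the buffer once per chunk would avoid the repeated writes altogether."
+ ]
+ },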
+ {
+ "cell_type": "markdown",
+ "id": "119d8c23-2be5-44de-8aa3-25dad1e89d09",
+ "metadata": {},
+ "source": [
+ "###\n",
+ "\n",
+ "Having seen how multithreading works on a simple example, let's apply it to the \"more practical\" case of the Symbol histogram from the exercise above."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c1488682-b285-42e7-9c2a-b23a126521ec",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Exercise:\n",
+ "\n",
+ "Write a multithreaded analog of the file processing pipeline from [exercise](@ref lab10_dist_file_p) above. Again, the task is to write the `map` and `reduce` steps that create and gather the dictionaries from the individual threads. There are two ways to map:\n",
+ "- either over directories inside `.julia/packages/` - `threaded_histogram_pkgwise`\n",
+ "- or over all files obtained by concatenation of `filter_jl` outputs - `threaded_histogram_filewise`\n",
+ "\n",
+ "Compare the speedup with the version using process-based parallelism.\n",
+ "\n",
+ "**HINTS**:\n",
+ "- create a separate dictionary for each thread in order to avoid the need for atomic operations\n",
+ "\n",
+ "**BONUS**:\n",
+ "In each of the cases count how many files/pkgs each thread processed. Would the dynamic scheduler help us in this situation?\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "389df79c-2301-4589-b463-9bb5f5267482",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1c42b9c4-d239-4298-8d3c-eb82f0b5b5b3",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Solution:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2ffbf1a2-f607-4e2c-af24-dae688e91985",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "####"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0673af7e-6fdb-4e32-b580-0c50f7bdf81c",
+ "metadata": {},
+ "source": [
+ "Setup is now much simpler.\n",
+ "```julia\n",
+ "using Base.Threads\n",
+ "include(\"./pkg_processing.jl\") \n",
+ "path = joinpath(DEPOT_PATH[1], \"packages\")\n",
+ "```\n",
+ "Firstly the version with folder-wise parallelism."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "aa7ae687-3468-4552-9415-463153b7a924",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = joinpath(DEPOT_PATH[1], \"packages\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "52a6c829",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7a47daee",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4cdd03ea-c0a4-412c-b866-6cac0f03d90a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@time h = threaded_histogram_pkgwise(path);"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bfe4bc84-6a96-416c-9004-a3d28e8521ff",
+ "metadata": {},
+ "source": [
+ "Secondly the version with file-wise parallelism."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b03cd97b",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1f34c84e",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9fae172f",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ce47174a-4f0b-480b-bf9b-afa932932c48",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@time h = threaded_histogram_filewise(path);"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "801a28d8-478e-4ed0-8558-2842186dd8aa",
+ "metadata": {},
+ "source": [
+ "## Task switching\n",
+ "There is a way to run \"multiple\" things at once that does not necessarily involve either threads or processes. In Julia this concept is called task switching or asynchronous programming: we fire off our requests in quick succession and let the CPU/OS/network handle the distribution. The example we will try today is querying a web API, which has some variable latency. In the usual sequential fashion we can always post queries one at a time. However, APIs can generally handle multiple requests at a time, so in order to utilize them better we can call them asynchronously and fetch all the results later. In some cases this will be faster.\n",
+ "\n",
+ "\n",
+ "Burst requests:\n",
+ " It is good practice to check whether an API supports some sort of batch request, because making a burst of single requests might lead to worse performance for others and to possible blocking of your IP/API key.\n",
+ "\n",
+ "\n",
+ "Consider the following functions:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c190f3bb-8ec6-4f48-88bb-dbe4ee8cab3b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "function aa()\n",
+ " for i in 1:10\n",
+ " sleep(1)\n",
+ " end\n",
+ "end\n",
+ "\n",
+ "function bb()\n",
+ " for i in 1:10\n",
+ " @async sleep(1)\n",
+ " end\n",
+ "end\n",
+ "\n",
+ "function cc()\n",
+ " @sync for i in 1:10\n",
+ " @async sleep(1)\n",
+ " end\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bffea71a-d15b-4713-a25c-e57db15f53ec",
+ "metadata": {},
+ "source": [
+ "How much time will the execution of each of them take?"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6c154602-7942-4099-a770-75feaa6a5383",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "@time aa()\n",
+ "@time bb() \n",
+ "@time cc()"
+ ]
+ },
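+ {
+ "cell_type": "markdown",
+ "id": "sketch-async-tasks",
+ "metadata": {},
+ "source": [
+ "One way to see what `@async` does is to keep the `Task` objects it returns. A minimal sketch, with the names `tasks` and `results` chosen only for this example: every `@async` call returns immediately, and `fetch` blocks until the corresponding task has finished and hands back its value.\n",
+ "```julia\n",
+ "tasks = map(1:5) do i\n",
+ "    @async begin\n",
+ "        sleep(0.2)   # stands in for waiting on I/O\n",
+ "        i^2\n",
+ "    end\n",
+ "end\n",
+ "\n",
+ "results = fetch.(tasks)   # blocks until every task is done\n",
+ "```\n",
+ "All five tasks sleep concurrently, so the block should take roughly as long as a single `sleep(0.2)` rather than five of them. Nothing here runs in parallel; the tasks merely take turns while the others are waiting."
+ ]
+ },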
+ {
+ "cell_type": "markdown",
+ "id": "a4368616-2a94-4892-ad98-7dc122c7f1e1",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Exercise:\n",
+ "Choose one of the free web APIs and query its endpoint using the `HTTP.jl` library. Implement both a sequential and an asynchronous version. Compare them on a burst of 10 requests.\n",
+ "\n",
+ "**HINTS**:\n",
+ "- use `HTTP.request` for `GET` requests on your chosen API, e.g. `r = HTTP.request(\"GET\", \"https://catfact.ninja/fact\")` for a random cat fact\n",
+ "- the body of a response can be converted simply by constructing a `String` out of it - `String(r.body)`\n",
+ "- in order to parse a JSON string use `JSON.jl`'s `parse` function\n",
+ "- Julia offers `asyncmap` - an asynchronous `map`\n",
+ "\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4c92113f-4387-4d63-8728-78c17aae0080",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Solution:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b799cf94-d292-43dd-85fb-8362dd4beb9c",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "####"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "54e4c8dd-07d8-4d95-89a2-efcbd52c8799",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "using HTTP, JSON\n",
+ "\n",
+ "function query_cat_fact()\n",
+ " r = HTTP.request(\"GET\", \"https://catfact.ninja/fact\")\n",
+ " j = String(r.body)\n",
+ " d = JSON.parse(j)\n",
+ " d[\"fact\"]\n",
+ "end\n",
+ "\n",
+ "# without asyncmap\n",
+ "function get_cat_facts_async(n)\n",
+ "    facts = Vector{String}(undef, n)\n",
+ "    @sync for i in 1:n # loop over all n requests instead of a hard-coded 10\n",
+ "        @async facts[i] = query_cat_fact()\n",
+ "    end\n",
+ "    facts\n",
+ "end\n",
+ "\n",
+ "# with asyncmap (this definition replaces the method above)\n",
+ "get_cat_facts_async(n) = asyncmap(x -> query_cat_fact(), Base.OneTo(n))\n",
+ "get_cat_facts(n) = map(x -> query_cat_fact(), Base.OneTo(n))\n",
+ "\n",
+ "@time get_cat_facts_async(10);\n",
+ "@time get_cat_facts(10);"
+ ]
+ },
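+ {
+ "cell_type": "markdown",
+ "id": "sketch-asyncmap-ntasks",
+ "metadata": {},
+ "source": [
+ "In view of the warning about burst requests, it may be useful to cap the number of requests in flight. `asyncmap` accepts an `ntasks` keyword for this purpose. The sketch below uses a hypothetical `fake_request` function that only sleeps, so that it can be run without network access.\n",
+ "```julia\n",
+ "# Hypothetical stand-in for a web request: waits a bit and returns its argument.\n",
+ "fake_request(i) = (sleep(0.1); i)\n",
+ "\n",
+ "# At most 2 calls run concurrently; the results keep the order of the input.\n",
+ "results = asyncmap(fake_request, 1:6; ntasks = 2)\n",
+ "```\n",
+ "Replacing `fake_request` with `x -> query_cat_fact()` would give a throttled variant of the asynchronous version above."
+ ]
+ },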
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2784b44e-1644-4f01-b83b-c4c9a8afb0ce",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Julia 1.7.3",
+ "language": "julia",
+ "name": "julia-1.7"
+ },
+ "language_info": {
+ "file_extension": ".jl",
+ "mimetype": "application/julia",
+ "name": "julia",
+ "version": "1.7.3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}