From ecb237194f971f24674c942b6db5a55a1a189c9b Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Thu, 7 Nov 2024 16:05:44 +0000 Subject: [PATCH] build based on 9ec30d1 --- dev/.documenter-siteinfo.json | 2 +- dev/index.html | 2 +- dev/installation/index.html | 2 +- dev/lecture_01/basics/index.html | 2 +- dev/lecture_01/demo/index.html | 2 +- dev/lecture_01/hw/591f558f.svg | 60 ----- dev/lecture_01/hw/873e3706.svg | 60 +++++ dev/lecture_01/hw/index.html | 2 +- dev/lecture_01/lab/index.html | 14 +- dev/lecture_01/motivation/index.html | 2 +- dev/lecture_01/outline/index.html | 2 +- dev/lecture_02/hw/index.html | 2 +- dev/lecture_02/lab/index.html | 2 +- dev/lecture_02/lecture/index.html | 2 +- dev/lecture_03/hw/index.html | 10 +- dev/lecture_03/lab/index.html | 16 +- dev/lecture_03/lecture/index.html | 2 +- dev/lecture_04/hw/index.html | 2 +- .../lab/{a27a9792.svg => 0420031e.svg} | 68 +++--- dev/lecture_04/lab/index.html | 34 ++- dev/lecture_04/lecture/index.html | 2 +- dev/lecture_05/hw/index.html | 2 +- dev/lecture_05/lab/index.html | 6 +- dev/lecture_05/lecture/index.html | 2 +- dev/lecture_06/hw/index.html | 2 +- .../lab/{8dbaade1.svg => 1fb0fc85.svg} | 90 ++++---- dev/lecture_06/lab/index.html | 14 +- dev/lecture_06/lecture/index.html | 2 +- dev/lecture_07/hw/index.html | 2 +- dev/lecture_07/lab/index.html | 2 +- dev/lecture_07/lecture/index.html | 2 +- dev/lecture_08/hw/index.html | 2 +- .../lab/{02c39ce9.svg => 61a32be0.svg} | 207 +++++++++--------- dev/lecture_08/lab/index.html | 30 +-- .../lecture/{b138b834.svg => 9d533ca5.svg} | 68 +++--- dev/lecture_08/lecture/index.html | 4 +- dev/lecture_09/lab/index.html | 42 ++-- dev/lecture_09/lecture/index.html | 2 +- dev/lecture_10/hw/index.html | 2 +- dev/lecture_10/lab/index.html | 2 +- dev/lecture_10/lecture/index.html | 2 +- dev/lecture_11/lab/index.html | 2 +- dev/lecture_11/lecture/index.html | 2 +- .../hw/{b27ff9c6.svg => d4abc442.svg} | 84 +++---- dev/lecture_12/hw/index.html | 2 +- .../lab/{5ea3d638.svg => 4f18fd49.svg} | 68 +++--- .../lab/{6729df7c.svg => a48eac4d.svg} | 80 +++---- .../lab/{dbee3586.svg => d072df33.svg} | 76 +++---- dev/lecture_12/lab/index.html | 6 +- dev/lecture_12/lecture/index.html | 2 +- dev/projects/index.html | 2 +- 51 files changed, 546 insertions(+), 553 deletions(-) delete mode 100644 dev/lecture_01/hw/591f558f.svg create mode 100644 dev/lecture_01/hw/873e3706.svg rename dev/lecture_04/lab/{a27a9792.svg => 0420031e.svg} (52%) rename dev/lecture_06/lab/{8dbaade1.svg => 1fb0fc85.svg} (80%) rename dev/lecture_08/lab/{02c39ce9.svg => 61a32be0.svg} (94%) rename dev/lecture_08/lecture/{b138b834.svg => 9d533ca5.svg} (86%) rename dev/lecture_12/hw/{b27ff9c6.svg => d4abc442.svg} (92%) rename dev/lecture_12/lab/{5ea3d638.svg => 4f18fd49.svg} (95%) rename dev/lecture_12/lab/{6729df7c.svg => a48eac4d.svg} (97%) rename dev/lecture_12/lab/{dbee3586.svg => d072df33.svg} (91%) diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 9cc88ae3..273b0bdb 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-10-31T15:38:01","documenter_version":"1.7.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-11-07T16:05:36","documenter_version":"1.7.0"}} \ No newline at end of file diff --git a/dev/index.html b/dev/index.html index 021039f4..05be2af2 100644 --- a/dev/index.html +++ b/dev/index.html @@ -6,4 +6,4 @@ Wield the power of abstraction. 
Example: The essence of forward mode automatic differentiation. -

Before joining the course, consider reading the following two blog posts to figure out if Julia is a language in which you want to invest your time.

What will you learn?

First and foremost you will learn how to think julia - meaning how to write fast, extensible, reusable, and easy-to-read code using things like optional typing, multiple dispatch, and functional programming concepts. The latter part of the course will teach you how to use more advanced concepts like language introspection, metaprogramming, and symbolic computing. Amongst other things you will implement your own automatic differentiation package (the backbone of modern machine learning) based on these advanced techniques that can transform intermediate representations of Julia code.

Organization

This course webpage contains all information about the course that you need, including lecture notes, lab instructions, and homeworks. The official format of the course is 2+2 (2h lectures/2h labs per week) for 4 credits.

The official course code is: B0M36SPJ and the timetable for the winter semester 2022 can be found here.

The course will be graded based on points from your homework (max. 20 points) and points from a final project (max. 30 points).

Below is a table that shows which lectures have homeworks (and their points).

Homework  1  2  3  4  5  6  7  8  9  10  11  12  13
Points    2  2  2  2  2  2  2  2  -  2   -   2   -

Hint: The first few homeworks are easier. Use them to fill up your points.

Final project

The final project will be individually agreed on for each student. Ideally you can use this project to solve a problem you have e.g. in your thesis, but don't worry - if you cannot come up with your own project idea, we will suggest one to you. More info and project suggestions can be found here. Check out the list of projects from the past to get a feeling for what we expect.

Grading

Your points from the homeworks and the final project are summed and graded by the standard grading scale below.

Grade   A      B      C      D      E      F
Points  45-50  40-44  35-39  30-34  25-29  0-25

Teachers

Name          E-mail                 Room      Role
Tomáš Pevný   pevnak@protonmail.ch   KN:E-406  Lecturer
Vašek Šmídl   smidlva1@fjfi.cvut.cz  KN:E-333  Lecturer
Matěj Zorek   zorekmat@fel.cvut.cz   KN:E-333  Lab Instructor
Niklas Heim   heimnikl@fel.cvut.cz   KN:E-333  Lab Instructor

Prerequisites

There are no hard requirements to take the course. We go through the basics of Julia, but we do it rather quickly, so some familiarity with the language is an advantage. If you are looking for an in-depth course on the basics of the language, we recommend checking out Julia for Optimization and Learning before enrolling in this course. The Functional Programming course also contains some helpful concepts for this course. Knowledge about computer hardware, namely the basics of how a CPU works, how it interacts with memory through caches, and the basics of multi-threading, certainly helps.

References

+

Before joining the course, consider reading the following two blog posts to figure out if Julia is a language in which you want to invest your time.

What will you learn?

First and foremost you will learn how to think julia - meaning how to write fast, extensible, reusable, and easy-to-read code using things like optional typing, multiple dispatch, and functional programming concepts. The latter part of the course will teach you how to use more advanced concepts like language introspection, metaprogramming, and symbolic computing. Amongst other things you will implement your own automatic differentiation package (the backbone of modern machine learning) based on these advanced techniques that can transform intermediate representations of Julia code.

Organization

This course webpage contains all information about the course that you need, including lecture notes, lab instructions, and homeworks. The official format of the course is 2+2 (2h lectures/2h labs per week) for 4 credits.

The official course code is: B0M36SPJ and the timetable for the winter semester 2022 can be found here.

The course will be graded based on points from your homework (max. 20 points) and points from a final project (max. 30 points).

Below is a table that shows which lectures have homeworks (and their points).

Homework  1  2  3  4  5  6  7  8  9  10  11  12  13
Points    2  2  2  2  2  2  2  2  -  2   -   2   -

Hint: The first few homeworks are easier. Use them to fill up your points.

Final project

The final project will be individually agreed on for each student. Ideally you can use this project to solve a problem you have e.g. in your thesis, but don't worry - if you cannot come up with your own project idea, we will suggest one to you. More info and project suggestions can be found here. Check out the list of projects from the past to get a feeling for what we expect.

Grading

Your points from the homeworks and the final project are summed and graded by the standard grading scale below.

Grade   A      B      C      D      E      F
Points  45-50  40-44  35-39  30-34  25-29  0-25

Teachers

Name          E-mail                 Room      Role
Tomáš Pevný   pevnak@protonmail.ch   KN:E-406  Lecturer
Vašek Šmídl   smidlva1@fjfi.cvut.cz  KN:E-333  Lecturer
Matěj Zorek   zorekmat@fel.cvut.cz   KN:E-333  Lab Instructor
Niklas Heim   heimnikl@fel.cvut.cz   KN:E-333  Lab Instructor

Prerequisites

There are no hard requirements to take the course. We go through the basics of Julia, but we do it rather quickly, so some familiarity with the language is an advantage. If you are looking for an in-depth course on the basics of the language, we recommend checking out Julia for Optimization and Learning before enrolling in this course. The Functional Programming course also contains some helpful concepts for this course. Knowledge about computer hardware, namely the basics of how a CPU works, how it interacts with memory through caches, and the basics of multi-threading, certainly helps.

References

diff --git a/dev/installation/index.html b/dev/installation/index.html index 91d21ee1..3247459b 100644 --- a/dev/installation/index.html +++ b/dev/installation/index.html @@ -14,4 +14,4 @@ _/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release |__/ | -julia>

Julia IDE

There is no single way to install, develop and run Julia, which may be strange for users coming from MATLAB, but for users of general purpose languages such as Python or C++ this is quite common. Most Julia programmers to date are using VS Code with the Julia extension.

This setup is described in a comprehensive step-by-step guide in our bachelor course Julia for Optimization & Learning.

Note that this setup is not a strict requirement for the lectures/labs and any other text editor with the option to send code to the terminal such as Vim (+Tmux), Emacs, or Sublime Text will suffice.

GitHub registration & Git setup

As one of the goals of the course is writing code that can be distributed to others, we recommend a GitHub account, which you can create here (unless you already have one). In order to interact with GitHub repositories, we will be using git. For installation instructions (Windows only) see the section in the bachelor course.

+julia>

Julia IDE

There is no single way to install, develop and run Julia, which may be strange for users coming from MATLAB, but for users of general purpose languages such as Python or C++ this is quite common. Most Julia programmers to date are using VS Code with the Julia extension.

This setup is described in a comprehensive step-by-step guide in our bachelor course Julia for Optimization & Learning.

Note that this setup is not a strict requirement for the lectures/labs and any other text editor with the option to send code to the terminal such as Vim (+Tmux), Emacs, or Sublime Text will suffice.

GitHub registration & Git setup

As one of the goals of the course is writing code that can be distributed to others, we recommend a GitHub account, which you can create here (unless you already have one). In order to interact with GitHub repositories, we will be using git. For installation instructions (Windows only) see the section in the bachelor course.

diff --git a/dev/lecture_01/basics/index.html b/dev/lecture_01/basics/index.html index 5d319944..78f12c85 100644 --- a/dev/lecture_01/basics/index.html +++ b/dev/lecture_01/basics/index.html @@ -16,4 +16,4 @@ out = fsum(varargin{1},varargin{2:end}) end

The need to build intuition for function composition.

Dispatch is easier to optimize by the compiler.

Operators are functions

operator           function name
[A B C ...]        hcat
[A; B; C; ...]     vcat
[A B; C D; ...]    hvcat
A'                 adjoint
A[i]               getindex
A[i] = x           setindex!
A.n                getproperty
A.n = x            setproperty!
struct Foo end
 
-Base.getproperty(a::Foo, x::Symbol) = x == :a ? 5 : error("does not have property $(x)")

Can be redefined and overloaded for different input types. The getproperty method can define access to the memory structure.

Broadcasting revisited

The a.+b syntax is syntactic sugar for broadcast(+,a,b).

The special meaning of the dot is that dotted calls will be fused into a single call:

The same logic works for lists, tuples, etc.

+Base.getproperty(a::Foo, x::Symbol) = x == :a ? 5 : error("does not have property $(x)")

Can be redefined and overloaded for different input types. The getproperty method can define access to the memory structure.
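With the toy definition of Foo above, we can check in the REPL that property access indeed goes through our overload:

julia> foo = Foo()
Foo()

julia> foo.a     # dispatches to our getproperty method
5

julia> foo.b
ERROR: does not have property b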

Broadcasting revisited

The a.+b syntax is syntactic sugar for broadcast(+,a,b).

The special meaning of the dot is that dotted calls will be fused into a single call:

The same logic works for lists, tuples, etc.
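For example (a small sketch), every dotted call in an expression fuses into one broadcast, so no intermediate arrays are allocated:

a, b = [1.0, 2.0], [3.0, 4.0]
a .+ b              # the same as broadcast(+, a, b)
sin.(a) .+ cos.(b)  # fuses into broadcast((x, y) -> sin(x) + cos(y), a, b)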

diff --git a/dev/lecture_01/demo/index.html b/dev/lecture_01/demo/index.html index abcd533d..9571aceb 100644 --- a/dev/lecture_01/demo/index.html +++ b/dev/lecture_01/demo/index.html @@ -24,4 +24,4 @@ prob = ODEProblem(lotka_volterra,u0,tspan,p) sol = solve(prob) -plot(sol,denseplot=false)

Integration with other toolkits

Flux: toolkit for modelling neural networks. A neural network is a function.

Turing: Probabilistic modelling toolkit

+plot(sol,denseplot=false)

Integration with other toolkits

Flux: toolkit for modelling neural networks. A neural network is a function.

Turing: Probabilistic modelling toolkit

diff --git a/dev/lecture_01/hw/591f558f.svg b/dev/lecture_01/hw/591f558f.svg
deleted file mode 100644
index 1767bbf7..00000000
diff --git a/dev/lecture_01/hw/873e3706.svg b/dev/lecture_01/hw/873e3706.svg
new file mode 100644
index 00000000..16b5ce18
[regenerated plot asset; stripped SVG markup omitted]
diff --git a/dev/lecture_01/hw/index.html b/dev/lecture_01/hw/index.html
index 5033939a..2a0b767d 100644
--- a/dev/lecture_01/hw/index.html
+++ b/dev/lecture_01/hw/index.html
@@ -18,4 +18,4 @@

This process downloads the pkgs and triggers some build steps, if for example some binary dependencies are needed. The process duration depends on the "freshness" of the Julia installation and the size of each pkg. With Plots being quite dependency heavy, expect a few minutes. After the installation is complete we can check the updated environment with the status command.

pkg> status

The plotting itself is as easy as calling the graphplot function on our adjacency matrix.

julia> using GraphRecipes, Plots
julia> graphplot(A)Plot{Plots.GRBackend() n=21} Captured extra kwargs: Series{21}: - markerstrokesize: 0
Example block output + markerstrokesize: 0Example block output diff --git a/dev/lecture_01/lab/index.html b/dev/lecture_01/lab/index.html index 5bc7d480..491329c4 100644 --- a/dev/lecture_01/lab/index.html +++ b/dev/lecture_01/lab/index.html @@ -141,8 +141,8 @@ @ Base operators.jl:596 *(::Real, ::Complex{Bool}) @ Base complex.jl:330 - *(::Graphs.LinAlg.Noop, ::Any) - @ Graphs ~/.julia/packages/Graphs/1ALGD/src/linalg/graphmatrices.jl:216 + *(::ChainRulesCore.NotImplemented, ::Any) + @ ChainRulesCore ~/.julia/packages/ChainRulesCore/6Pucz/src/tangent_arithmetic.jl:37 ...

In the stacktrace we can see the location of each function call. If we include the function polynomial from some file poly.jl using include("poly.jl"), we will see that the location changes from REPL[X]:10 to the actual file name.

By swapping square brackets for round in the array comprehension ac above, we have defined a so-called generator/iterator, which, as opposed to the original variable ac, does not allocate an array, only the structure that produces it.

ag = (2i^2 + 1 for i in -2:1)
 typeof(ag), eltype(ag)
(Base.Generator{UnitRange{Int64}, Main.var"#3#4"}, Any)

You may notice that the element type in this case is Any, which means that a function using this generator as an argument cannot specialize based on the type and has to infer it every time an element is generated/returned. We will touch on how this affects performance in one of the later lectures.

julia> polynomial(ag, x)ERROR: MethodError: no method matching getindex(::Base.Generator{UnitRange{Int64}, Main.var"#3#4"}, ::Int64)
 The function `getindex` exists, but no method is defined for this combination of argument types.

The problem that we face during evaluation is that the generator type is missing the getindex operation, as generators are made for situations where the size of the collection may be unknown and the only way of obtaining particular elements is through sequential iteration. Generators can be useful for example when creating batches of data for machine learning training. We can "fix" the situation using the collect function mentioned earlier, however that again allocates an array.
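For example, collecting the generator into an array restores getindex support, at the price of the allocation:

julia> polynomial(collect(ag), x)
108.0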

Extending/limiting the polynomial example

Following up on the polynomial example, let us expand it a little further in order to handle the arguments that have been throwing exceptions. The first direction, in which we will move, is providing the user with a more detailed error message when an incorrect type of coefficients has been provided.

Exercise

Design an if-else condition such that the array of Char example throws an error with a custom string message, telling the user what went wrong and printing the incorrect input alongside it. Confirm that we have not broken the functionality of the other examples from the previous exercise.

HINTS:

  • Throw an ArgumentError(msg) with the throw function and a string message msg. More details in help mode ? or at the end of this document.
  • Strings are defined like this s = "Hello!"
  • Use string interpolation to create the error message. It allows injecting an expression into a string with the $ syntax b = 1; s = "Hello number $(b)"
  • Compare the eltype of the coefficients with the Char type.
  • The syntax for if-else:
if condition
@@ -184,8 +184,8 @@
    @ Base operators.jl:596
   *(::Real, ::Complex{Bool})
    @ Base complex.jl:330
-  *(::Graphs.LinAlg.Noop, ::Any)
-   @ Graphs ~/.julia/packages/Graphs/1ALGD/src/linalg/graphmatrices.jl:216
+  *(::ChainRulesCore.NotImplemented, ::Any)
+   @ ChainRulesCore ~/.julia/packages/ChainRulesCore/6Pucz/src/tangent_arithmetic.jl:37
   ...
julia> polynomial(ac, x)108.0
julia> polynomial(ag, x)108.0

BONUS: You may have noticed that in the example above, the powers variable is allocating an additional, unnecessary vector. With the current, scalar x, this is not such a big deal. But in your homework you will generalize this function to matrix inputs of x, which means that powers becomes a vector of (potentially very large) matrices. This is a very natural use case for the mapreduce function:

polynomial(a, x) = mapreduce(+, enumerate(a), init=zero(x)) do (i, aᵢ)
     x^(i-1) * aᵢ
 end
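A quick sanity check of this version with a scalar x:

julia> polynomial([1, 2, 3], 2.0)  # 1 + 2x + 3x² evaluated at x = 2
17.0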
@@ -222,8 +222,8 @@
    @ Base operators.jl:596
   *(::Real, ::Complex{Bool})
    @ Base complex.jl:330
-  *(::Graphs.LinAlg.Noop, ::Any)
-   @ Graphs ~/.julia/packages/Graphs/1ALGD/src/linalg/graphmatrices.jl:216
+  *(::ChainRulesCore.NotImplemented, ::Any)
+   @ ChainRulesCore ~/.julia/packages/ChainRulesCore/6Pucz/src/tangent_arithmetic.jl:37
   ...
julia> getindex((i for i in 1:4), 3) # no candidatesERROR: MethodError: no method matching getindex(::Base.Generator{UnitRange{Int64}, typeof(identity)}, ::Int64) The function `getindex` exists, but no method is defined for this combination of argument types.

Both of these examples have a short stacktrace, showing that the execution failed on the topmost level in the REPL, however if this code is a part of some function in a separate file, the stacktrace will reflect it. What this error tells us is that the dispatch system could not find a method of the given function that would be suitable for the types of arguments it has been given. In the first case Julia also offers a list of candidate methods that match at least some of the arguments.

When dealing with basic Julia functions and types, this behavior can be treated as a given, and though one could locally add a method, for example for multiplication of Char and Int, there is usually a good reason why Julia does not support such functionality by default. On the other hand, when dealing with user defined code, this error may suggest to the developer that either the functions are too strictly typed or that another method definition is needed in order to satisfy the desired functionality.

InexactError

This type of error is most commonly thrown by the type conversion system (centered around the convert function), informing the user that a value of some type cannot be exactly converted to the type required by the function being called.

julia> Int(1.2)                      # root causeERROR: InexactError: Int64(1.2)
julia> append!([1,2,3], 1.2) # same as above but shows the root cause deeper in the stack traceERROR: InexactError: Int64(1.2)

In this case the function is Int and the value is a floating point number. The second example shows that an InexactError may be raised deeper inside an inconspicuous function call, where we want to extend an integer array by a value which cannot be converted exactly.
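If the loss of precision is intended, an explicit rounding (or a wider element type) side-steps the error:

julia> Int(round(1.2))            # rounding first converts exactly
1

julia> append!([1.0, 2, 3], 1.2)  # a Float64 array can represent 1.2
4-element Vector{Float64}:
 1.0
 2.0
 3.0
 1.2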

ArgumentError

As opposed to the previous two errors, ArgumentError can contain a user specified error message and thus can serve multiple purposes. It is however recommended to throw this type of error when the parameters of a function call do not match a valid signature, e.g. if factorial were given a negative or non-integer argument (note that in Julia this case is handled by multiple dispatch and the more specific DomainError).

The closely related DimensionMismatch error is thrown by the following example, which attempts to concatenate two 2d arrays of incompatible sizes 3x3 and 2x2.

julia> hcat(ones(3,3), zeros(2,2))ERROR: DimensionMismatch: number of rows of each array must match (got (3, 2))

KeyError

This error is specific to hash table based objects such as the Dict type and tells the user that an indexing operation into such a structure tried to access or delete a non-existent element.

julia> d = Dict(:a => [1,2,3], :b => [1,23])Dict{Symbol, Vector{Int64}} with 2 entries:
   :a => [1, 2, 3]
@@ -232,4 +232,4 @@
            return x + uno
        endplusone (generic function with 1 method)
julia> uno # defined only within plusoneERROR: UndefVarError: `uno` not defined in `Main` Suggestion: check for spelling errors or missing imports.

Unless there is a variable I_am_not_defined in the global scope, the following should throw an error.

julia> I_am_not_definedERROR: UndefVarError: `I_am_not_defined` not defined in `Main`
-Suggestion: check for spelling errors or missing imports.

Often these kinds of errors arise as a result of bad coding practices, such as long-running Julia sessions with long-forgotten global variables that do not exist upon a fresh execution (this one in particular has been addressed by the authors of the reactive Julia notebooks Pluto.jl).

For more details on code scoping we recommend particular places in the bachelor course lectures here and there.

ErrorException & error function

ErrorException is the most generic error, which can be thrown/raised just by calling the error function with a chosen string message. As a result, developers may be inclined to misuse it for any kind of unexpected behavior a user can run into, often providing out-of-context/uninformative messages.

+Suggestion: check for spelling errors or missing imports.

Often these kinds of errors arise as a result of bad coding practices, such as long-running Julia sessions with long-forgotten global variables that do not exist upon a fresh execution (this one in particular has been addressed by the authors of the reactive Julia notebooks Pluto.jl).

For more details on code scoping we recommend particular places in the bachelor course lectures here and there.

ErrorException & error function

ErrorException is the most generic error, which can be thrown/raised just by calling the error function with a chosen string message. As a result, developers may be inclined to misuse it for any kind of unexpected behavior a user can run into, often providing out-of-context/uninformative messages.
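For example:

julia> error("something went wrong")
ERROR: something went wrong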

diff --git a/dev/lecture_01/motivation/index.html b/dev/lecture_01/motivation/index.html index 38b08564..72687e4c 100644 --- a/dev/lecture_01/motivation/index.html +++ b/dev/lecture_01/motivation/index.html @@ -35,4 +35,4 @@ m ~ Normal(0, sqrt(s²)) x ~ Normal(m, sqrt(s²)) y ~ Normal(m, sqrt(s²)) -end

Such tools allow building a very convenient user experience on an abstract level, while reaching very efficient code.

Reproducible research

Think about code that was written some time ago. To run it, you often need the same version of the language it was written for.

Environment

Is an independent set of packages that can be local to an individual project or shared and selected by name.

Package

A package is a source tree with a standard layout providing functionality that can be reused by other Julia projects.

This allows Julia to be a rapidly evolving ecosystem with frequent changes due to:

Package manager

Julia from the user's point of view

  1. compilation of everything to code as specialized as possible

    • very fast code
    • slow interaction (caching...)
    • generating libraries is harder
      • think of fsum,
      • everything is ".h" (Eigen library)
    • debugging is different from MATLAB/Python
  2. extensibility, Multiple dispatch = multi-functions

    • allows great extensibility and code composition
    • not (yet) mainstream thinking
    • Julia is not Object-oriented
    • Julia is a (not purely) functional language
+end

Such tools allow building a very convenient user experience on an abstract level, while reaching very efficient code.

Reproducible research

Think about code that was written some time ago. To run it, you often need the same version of the language it was written for.

Environment

Is an independent set of packages that can be local to an individual project or shared and selected by name.
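For example, environments are switched in the Pkg REPL mode (entered by pressing ]); the name course below is just a placeholder:

pkg> activate .               # environment local to the current directory
pkg> activate --shared course # a shared environment selected by name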

Package

A package is a source tree with a standard layout providing functionality that can be reused by other Julia projects.

This allows Julia to be a rapidly evolving ecosystem with frequent changes due to:

Package manager

Julia from the user's point of view

  1. compilation of everything to code as specialized as possible

    • very fast code
    • slow interaction (caching...)
    • generating libraries is harder
      • think of fsum,
      • everything is ".h" (Eigen library)
    • debugging is different from MATLAB/Python
  2. extensibility, Multiple dispatch = multi-functions

    • allows great extensibility and code composition
    • not (yet) mainstream thinking
    • Julia is not Object-oriented
    • Julia is a (not purely) functional language
diff --git a/dev/lecture_01/outline/index.html b/dev/lecture_01/outline/index.html index 9d40f934..4a4a49d9 100644 --- a/dev/lecture_01/outline/index.html +++ b/dev/lecture_01/outline/index.html @@ -1,2 +1,2 @@ -Outline · Scientific Programming in Julia

Course outline

  1. Introduction

  2. Type system

    • user: tool for abstraction
    • compiler: tool for memory layout
  3. Design patterns (mental setup)

    • Julia is a type-based language
    • multiple-dispatch generalizes OOP and FP
  4. Packages

    • a way to organize code
    • code reuse (alternative to libraries)
    • experiment reproducibility
  5. Benchmarking

    • how to measure code efficiency
  6. Introspection

    • understand how the compiler processes the data
  7. Macros

    • automate writing of boring boilerplate code
    • good macros create cleaner code
  8. Automatic Differentiation

    • Theory: difference between the forward and backward mode
    • Implementation techniques
  9. Intermediate representation

    • how to use the internal representation of code
    • example in automatic differentiation
  10. Parallel computing

    • threads, processes
  11. Graphics card coding

    • types for GPU
    • specifics of architectures
  12. Ordinary Differential Equations

    • simple solvers
    • error propagation
  13. Data driven ODE

    • combine ODE with optimization
    • automatic differentiation (adjoints)
+Outline · Scientific Programming in Julia

Course outline

  1. Introduction

  2. Type system

    • user: tool for abstraction
    • compiler: tool for memory layout
  3. Design patterns (mental setup)

    • Julia is a type-based language
    • multiple-dispatch generalizes OOP and FP
  4. Packages

    • a way to organize code
    • code reuse (alternative to libraries)
    • experiment reproducibility
  5. Benchmarking

    • how to measure code efficiency
  6. Introspection

    • understand how the compiler processes the data
  7. Macros

    • automate writing of boring boilerplate code
    • good macros create cleaner code
  8. Automatic Differentiation

    • Theory: difference between the forward and backward mode
    • Implementation techniques
  9. Intermediate representation

    • how to use the internal representation of code
    • example in automatic differentiation
  10. Parallel computing

    • threads, processes
  11. Graphics card coding

    • types for GPU
    • specifics of architectures
  12. Ordinary Differential Equations

    • simple solvers
    • error propagation
  13. Data driven ODE

    • combine ODE with optimization
    • automatic differentiation (adjoints)
diff --git a/dev/lecture_02/hw/index.html b/dev/lecture_02/hw/index.html index 4a3f4950..98b2cd31 100644 --- a/dev/lecture_02/hw/index.html +++ b/dev/lecture_02/hw/index.html @@ -4,4 +4,4 @@
  1. Implement a function agent_count that can be called on a single Agent and returns a number between $(0,1)$ (i.e. always 1 for animals; and size(plant)/max_size(plant) for plants).

  2. Add a method for a vector of agents Vector{<:Agent} which sums all agent counts.

  3. Add a method for a World which returns a dictionary that contains pairs of Symbols and the agent count like below:

julia> grass1 = Grass(1,5,5);
julia> agent_count(grass1)1.0
julia> grass2 = Grass(2,1,5);
julia> agent_count([grass1,grass2]) # one grass is fully grown; the other only 20% => 1.21.2
julia> sheep = Sheep(3,10.0,5.0,1.0,1.0);
julia> wolf = Wolf(4,20.0,10.0,1.0,1.0);
julia> world = World([grass1, grass2, sheep, wolf]);
julia> agent_count(world)Dict{Symbol, Real} with 3 entries: :Wolf => 1 :Grass => 1.2 - :Sheep => 1

Hint: You can get the name of a type by using the nameof function:

julia> nameof(Grass):Grass

Use as much dispatch as you can! ;)

+ :Sheep => 1

Hint: You can get the name of a type by using the nameof function:

julia> nameof(Grass):Grass

Use as much dispatch as you can! ;)
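A minimal dispatch-based sketch of one possible solution (assuming the Agent/Animal/Plant types from the lab and the size/max_size accessors mentioned above):

agent_count(a::Animal) = 1.0
agent_count(p::Plant) = size(p) / max_size(p)
agent_count(as::Vector{<:Agent}) = sum(agent_count, as)

function agent_count(w::World)
    counts = Dict{Symbol,Real}()
    for a in w.agents
        n = nameof(typeof(a))  # e.g. :Grass, :Sheep, :Wolf
        counts[n] = get(counts, n, 0) + agent_count(a)
    end
    counts
end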

diff --git a/dev/lecture_02/lab/index.html b/dev/lecture_02/lab/index.html index c6fac607..dc9d2d31 100644 --- a/dev/lecture_02/lab/index.html +++ b/dev/lecture_02/lab/index.html @@ -109,4 +109,4 @@ end
s1, s2 = Sheep(1), Sheep(2)
 w = World([s1, s2])
 reproduce!(s1, w);
-w
+w diff --git a/dev/lecture_02/lecture/index.html b/dev/lecture_02/lecture/index.html index 0ba72c6b..7af0f8d4 100644 --- a/dev/lecture_02/lecture/index.html +++ b/dev/lecture_02/lecture/index.html @@ -143,4 +143,4 @@ move(a::Vector{T}, by::Vector{T}) where {T<:Position} = move.(a, by) move(a::Vector{<:Position}, by::Position) = move.(a, by)

Which method will the compiler select for

move(Position(1.0,2.0), Position(1.0,2.0))

The first three methods match the types of arguments, but the compiler will select the third one, since it is the most specific.

Which method will the compiler select for

move(Position(1,2), Position(1,2))

Again, the first and second method definitions match the arguments, but the second is the most specific.

Which method will the compiler select for

move([Position(1,2)], [Position(1,2)])

Again, the fourth and fifth method definitions match the arguments, but the fifth is the most specific.

move([Position(1,2), Position(1.0,2.0)], [Position(1,2), Position(1.0,2.0)])
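If in doubt, you can ask directly: the @which macro (from InteractiveUtils, loaded by default in the REPL) reports the method that a given call would dispatch to, e.g.

julia> @which move(Position(1,2), Position(1,2))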

A bizarre definition which you can encounter

The following definition of a one-hot matrix is taken from Flux.jl

struct OneHotArray{T<:Integer, L, N, var"N+1", I<:Union{T,AbstractArray{T, N}}} <: AbstractArray{Bool, var"N+1"}
   indices::I
-end

The parameters of the type carry information about the type used to encode the position of one in each column in T, the dimension of one-hot vectors in L, the dimension of the storage of indices in N (which is zero for OneHotVector and one for OneHotMatrix), number of dimensions of the OneHotArray in var"N+1" and the type of underlying storage of indices I.

+end

The parameters of the type carry information about the type used to encode the position of one in each column in T, the dimension of one-hot vectors in L, the dimension of the storage of indices in N (which is zero for OneHotVector and one for OneHotMatrix), number of dimensions of the OneHotArray in var"N+1" and the type of underlying storage of indices I.

diff --git a/dev/lecture_03/hw/index.html b/dev/lecture_03/hw/index.html index bc21ad00..9a8aa00d 100644 --- a/dev/lecture_03/hw/index.html +++ b/dev/lecture_03/hw/index.html @@ -1,12 +1,12 @@ Homework · Scientific Programming in Julia

Homework 3

In this homework we will implement a function find_food and practice the use of closures. The solution of lab 3 can be found here. You can use this file and add the code that you write for the homework to it.

How to submit?

Put all your code (including your own or the provided solution of lab 2) in a script named hw.jl. Zip only this file (not its parent folder) and upload it to BRUTE.

Agents looking for food

Homework:
-

Implement a method find_food(a::Animal, w::World) that returns one randomly chosen agent from all w.agents that can be eaten by a, or nothing if no food could be found. This means that if e.g. the animal is a Wolf you have to return one random Sheep, etc.

Hint: You can write a general find_food method for all animals and move the parts that are specific to the concrete animal types to a separate function. E.g. you could define a function eats(::Animal{Wolf}, ::Animal{Sheep}) = true, etc.

You can check your solution with the public test:

julia> sheep = Sheep(1,pf=1.0)🐑♂ #1 E=4.0 ΔE=0.2 pr=0.8 pf=1.0
julia> world = World([Grass(2), sheep])Main.World{Main.Agent} - 🌿 #2 20% grown - 🐑♂ #1 E=4.0 ΔE=0.2 pr=0.8 pf=1.0
julia> find_food(sheep, world) isa Plant{Grass}true

Callbacks & Closures

+

Implement a method find_food(a::Animal, w::World) that returns one randomly chosen agent from all w.agents that can be eaten by a, or nothing if no food could be found. This means that if e.g. the animal is a Wolf you have to return one random Sheep, etc.

Hint: You can write a general find_food method for all animals and move the parts that are specific to the concrete animal types to a separate function. E.g. you could define a function eats(::Animal{Wolf}, ::Animal{Sheep}) = true, etc.
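A possible sketch along the lines of this hint (assuming World stores its agents in the vector w.agents):

eats(::Animal{Wolf}, ::Animal{Sheep}) = true
eats(::Animal{Sheep}, ::Plant{Grass}) = true
eats(::Agent, ::Agent) = false  # fallback: everything else is inedible

function find_food(a::Animal, w::World)
    food = filter(x -> eats(a, x), w.agents)
    isempty(food) ? nothing : rand(food)
end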

You can check your solution with the public test:

julia> sheep = Sheep(1,pf=1.0)🐑♀ #1 E=4.0 ΔE=0.2 pr=0.8 pf=1.0
julia> world = World([Grass(2), sheep])Main.World{Main.Agent} + 🌿 #2 40% grown + 🐑♀ #1 E=4.0 ΔE=0.2 pr=0.8 pf=1.0
julia> find_food(sheep, world) isa Plant{Grass}true

Callbacks & Closures

Homework:

Implement a function every_nth(f::Function,n::Int) that takes an inner function f and uses a closure to construct an outer function g that only calls f every nth call to g. For example, if n=3 the inner function f be called at the 3rd, 6th, 9th ... call to g (not at the 1st, 2nd, 4th, 5th, 7th... call).

Hint: You can use splatting via ... to pass on an unknown number of arguments from the outer to the inner function.
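A minimal sketch of such a closure (one possible solution; the counter i is the captured state):

function every_nth(f::Function, n::Int)
    i = 0
    function g(args...)
        i += 1
        i % n == 0 && f(args...)
        nothing
    end
end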

You can use every_nth to log (or save) the agent count only every couple of steps of your simulation. Using every_nth will look like this:

julia> w = World([Sheep(1), Grass(2), Wolf(3)])
        # `@info agent_count(w)` is executed only every 3rd call to logcb(w)Main.World{Main.Agent}
-  🌿  #2 80% grown
+  🌿  #2 40% grown
   🐺♀ #3 E=10.0 ΔE=8.0 pr=0.1 pf=0.2
-  🐑♂ #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> logcb = every_nth(w->(@info agent_count(w)), 3);
julia> logcb(w); # x->(@info agent_count(w)) is not called
julia> logcb(w); # x->(@info agent_count(w)) is not called
julia> logcb(w); # x->(@info agent_count(w)) *is* called[ Info: Dict(:Wolf => 1.0, :Grass => 0.8, :Sheep => 1.0)
+ 🐑♀ #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> logcb = every_nth(w->(@info agent_count(w)), 3);
julia> logcb(w); # x->(@info agent_count(w)) is not called
julia> logcb(w); # x->(@info agent_count(w)) is not called
julia> logcb(w); # x->(@info agent_count(w)) *is* called[ Info: Dict(:Wolf => 1.0, :Grass => 0.4, :Sheep => 1.0) diff --git a/dev/lecture_03/lab/index.html b/dev/lecture_03/lab/index.html index be700a30..41a711d6 100644 --- a/dev/lecture_03/lab/index.html +++ b/dev/lecture_03/lab/index.html @@ -3,7 +3,7 @@ sheep::Sheep sex::Symbol end -⚥Sheep(id, e=4.0, Δe=0.2, pr=0.8, pf=0.6, sex=rand(Bool) ? :female : :male) = ⚥Sheep(Sheep(id,e,Δe,pr,pf),sex)
julia> sheep = ⚥Sheep(1)Main.⚥Sheep(🐑 #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6, :female)
julia> sheep.sheep🐑 #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> sheep.sex:female

Instead of littering the whole codebase with custom getters/setters, Julia allows us to overload the sheep.field behaviour by implementing custom getproperty/setproperty! methods.

+⚥Sheep(id, e=4.0, Δe=0.2, pr=0.8, pf=0.6, sex=rand(Bool) ? :female : :male) = ⚥Sheep(Sheep(id,e,Δe,pr,pf),sex)
julia> sheep = ⚥Sheep(1)Main.⚥Sheep(🐑 #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6, :male)
julia> sheep.sheep🐑 #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> sheep.sex:male

Instead of littering the whole codebase with custom getters/setters, Julia allows us to overload the sheep.field behaviour by implementing custom getproperty/setproperty! methods.

Exercise:

Implement custom getproperty/setproperty! methods which allow accessing the Sheep inside the ⚥Sheep as if we were not wrapping it.

@@ -24,7 +24,7 @@ setfield!(s,name,x) end end

You should be able to do the following with your overloads now

julia> sheep = ⚥Sheep(1)Main.⚥Sheep(🐑 #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6, :male)
julia> sheep.id1
julia> sheep.sex:male
julia> sheep.energy += 15.0
julia> sheepMain.⚥Sheep(🐑 #1 E=5.0 ΔE=0.2 pr=0.8 pf=0.6, :male)

In order to make the ⚥Sheep work with the rest of the code we only have to forward the eat! method

julia> eat!(s::⚥Sheep, food, world) = eat!(s.sheep, food, world);
julia> sheep = ⚥Sheep(1);
julia> grass = Grass(2);
julia> world = World([sheep,grass])Main.World{Main.Agent} - 🌿 #2 70% grown + 🌿 #2 100% grown Main.⚥Sheep(🐑 #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6, :female)
julia> eat!(sheep, grass, world)0

and implement a custom reproduce! method with the behaviour that we want.

However, the extension of Sheep to ⚥Sheep is a very object-oriented approach. With a little bit of rethinking, we can build a much more elegant solution that makes use of Julia's powerful parametric types.

Part II: A new, parametric type hierarchy

First, let us note that there are two fundamentally different types of agents in our world: animals and plants. All species such as grass, sheep, wolves, etc. can be categorized as one of those two. We can use Julia's powerful, parametric type system to define one large abstract type for all agents Agent{S}. The Agent will either be an Animal or a Plant with a type parameter S which will represent the specific animal/plant species we are dealing with.

This new type hierarchy can then look like this:

abstract type Species end
 
 abstract type PlantSpecies <: Species end
@@ -74,7 +74,7 @@
 # get the per species defaults back
 randsex() = rand(instances(Sex))
 Sheep(id; E=4.0, ΔE=0.2, pr=0.8, pf=0.6, s=randsex()) = Sheep(id, E, ΔE, pr, pf, s)
-Wolf(id; E=10.0, ΔE=8.0, pr=0.1, pf=0.2, s=randsex()) = Wolf(id, E, ΔE, pr, pf, s)

We have our convenient, high-level behaviour back!

julia> Sheep(1)🐑♂ #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> Wolf(2)🐺♂ #2 E=10.0 ΔE=8.0 pr=0.1 pf=0.2
+Wolf(id; E=10.0, ΔE=8.0, pr=0.1, pf=0.2, s=randsex()) = Wolf(id, E, ΔE, pr, pf, s)

We have our convenient, high-level behaviour back!

julia> Sheep(1)🐑♀ #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> Wolf(2)🐺♂ #2 E=10.0 ΔE=8.0 pr=0.1 pf=0.2
Exercise:

Check the methods for eat! and kill_agent! which involve Animals and update their type signatures such that they work for the new type hierarchy.

@@ -118,7 +118,7 @@ 🐑♂ #2 E=4.0 ΔE=0.2 pr=0.8 pf=0.6 🐑♀ #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> reproduce!(s1, w); wMain.World{Main.Animal{🐑}} 🐑♂ #2 E=4.0 ΔE=0.2 pr=0.8 pf=0.6 - 🐑♂ #3 E=2.0 ΔE=0.2 pr=0.8 pf=0.6 + 🐑♀ #3 E=2.0 ΔE=0.2 pr=0.8 pf=0.6 🐑♀ #1 E=2.0 ΔE=0.2 pr=0.8 pf=0.6
Exercise:

Implement the type hierarchy we designed for Plants as well.

@@ -147,8 +147,8 @@ sheep.energy += grass.size * sheep.Δenergy grass.size = 0 end -eats(::Animal{Sheep},g::Plant{Grass}) = g.size > 0

julia> g = Grass(2)🌿  #2 10% grown
julia> s = Sheep(3)🐑♂ #3 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> w = World([g,s])Main.World{Main.Agent} - 🌿 #2 10% grown - 🐑♂ #3 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> eat!(s,g,w); wMain.World{Main.Agent} +eats(::Animal{Sheep},g::Plant{Grass}) = g.size > 0

julia> g = Grass(2)🌿  #2 50% grown
julia> s = Sheep(3)🐑♀ #3 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> w = World([g,s])Main.World{Main.Agent} + 🌿 #2 50% grown + 🐑♀ #3 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> eat!(s,g,w); wMain.World{Main.Agent} 🌿 #2 0% grown - 🐑♂ #3 E=4.2 ΔE=0.2 pr=0.8 pf=0.6
+ 🐑♀ #3 E=5.0 ΔE=0.2 pr=0.8 pf=0.6 diff --git a/dev/lecture_03/lecture/index.html b/dev/lecture_03/lecture/index.html index 23f033c4..e7bdc63a 100644 --- a/dev/lecture_03/lecture/index.html +++ b/dev/lecture_03/lecture/index.html @@ -121,4 +121,4 @@ end end

Is this confusing? What can cb() do and what can it not?

Note that the function train! does not have many local variables. The important ones are its arguments, i.e. they exist in the scope from which the function was invoked.

loss(x,y)=mse(model(x),y)
 cb() = @info "training" loss(x,y)
-train!(loss, ps, data, opt; cb=cb)

Usage

Usage of closures:

Beware: Performance of captured variables

Inference of types may be difficult in closures: https://github.com/JuliaLang/julia/issues/15276

Additional materials

+train!(loss, ps, data, opt; cb=cb)

Usage

Usage of closures:

Beware: Performance of captured variables

Inference of types may be difficult in closures: https://github.com/JuliaLang/julia/issues/15276

Additional materials

diff --git a/dev/lecture_04/hw/index.html b/dev/lecture_04/hw/index.html index ea970861..aaeb6c18 100644 --- a/dev/lecture_04/hw/index.html +++ b/dev/lecture_04/hw/index.html @@ -13,4 +13,4 @@
Homework:
  1. Create a Sheep with food probability $p_f=1$
  2. Create fully grown Grass and a World with the two agents.
  3. Execute eat!(::Animal{Sheep}, ::Plant{Grass}, ::World)
  4. @test that the size of the Grass now has size == 0

Test Wolf

Homework:
-
  1. Create a Wolf with food probability $p_f=1$
  2. Create a Sheep and a World with the two agents.
  3. Execute eat!(::Animal{Wolf}, ::Animal{Sheep}, ::World)
  4. @test that the World only has one agent left in the agents dictionary
+
  1. Create a Wolf with food probability $p_f=1$
  2. Create a Sheep and a World with the two agents.
  3. Execute eat!(::Animal{Wolf}, ::Animal{Sheep}, ::World)
  4. @test that the World only has one agent left in the agents dictionary
diff --git a/dev/lecture_04/lab/a27a9792.svg b/dev/lecture_04/lab/0420031e.svg similarity index 52% rename from dev/lecture_04/lab/a27a9792.svg rename to dev/lecture_04/lab/0420031e.svg index bdb6b93b..2899d9f3 100644 --- a/dev/lecture_04/lab/a27a9792.svg +++ b/dev/lecture_04/lab/0420031e.svg @@ -1,48 +1,48 @@ - + - + - + - + - + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/dev/lecture_04/lab/index.html b/dev/lecture_04/lab/index.html index 94f9e609..901bd77c 100644 --- a/dev/lecture_04/lab/index.html +++ b/dev/lecture_04/lab/index.html @@ -21,25 +21,23 @@

Finally, let's implement a function world_step! which performs one agent_step! for each agent. Note that simply iterating over all agents could lead to problems because we are mutating the agent dictionary. One solution for this is to iterate over a copy of all agent IDs that are present when starting to iterate over agents. Additionally, it could happen that an agent is killed by another one before we apply agent_step! to it. To solve this you can check if a given ID is currently present in the World.

Solution:
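One possible sketch (assuming World stores agents in a Dict keyed by their IDs):

function world_step!(world::World)
    # iterate over a snapshot of the IDs because stepping agents mutates the dict
    for id in collect(keys(world.agents))
        # an agent may have been killed by another agent earlier in this step
        haskey(world.agents, id) || continue
        agent_step!(world.agents[id], world)
    end
end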

julia> w = World([Sheep(1), Sheep(2), Wolf(3)])Main.World{Main.Animal}
-  🐑♀ #2 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
+  🐑♂ #2 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
   🐺♂ #3 E=10.0 ΔE=8.0 pr=0.1 pf=0.2
-  🐑♂ #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> world_step!(w); wMain.World{Main.Animal} + 🐑♀ #1 E=4.0 ΔE=0.2 pr=0.8 pf=0.6
julia> world_step!(w); wMain.World{Main.Animal} + 🐑♀ #5 E=1.5 ΔE=0.2 pr=0.8 pf=0.6 🐑♂ #4 E=1.5 ΔE=0.2 pr=0.8 pf=0.6 - 🐑♀ #2 E=1.5 ΔE=0.2 pr=0.8 pf=0.6 + 🐑♂ #2 E=1.5 ΔE=0.2 pr=0.8 pf=0.6 🐺♂ #3 E=9.0 ΔE=8.0 pr=0.1 pf=0.2 - 🐑♂ #1 E=3.0 ΔE=0.2 pr=0.8 pf=0.6
julia> world_step!(w); wMain.World{Main.Animal} - 🐑♂ #5 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 + 🐑♀ #1 E=1.5 ΔE=0.2 pr=0.8 pf=0.6
julia> world_step!(w); wMain.World{Main.Animal} + 🐑♀ #5 E=0.5 ΔE=0.2 pr=0.8 pf=0.6 🐑♂ #4 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 - 🐑♀ #6 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 - 🐑♂ #7 E=1.0 ΔE=0.2 pr=0.8 pf=0.6 - 🐑♀ #2 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 + 🐑♂ #6 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 + 🐑♀ #7 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 + 🐑♂ #2 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 + 🐑♂ #8 E=0.25 ΔE=0.2 pr=0.8 pf=0.6 🐺♂ #3 E=8.0 ΔE=8.0 pr=0.1 pf=0.2 - 🐑♂ #1 E=1.0 ΔE=0.2 pr=0.8 pf=0.6
julia> world_step!(w); wMain.World{Main.Animal} - 🐑♂ #7 E=0.0 ΔE=0.2 pr=0.8 pf=0.6 - 🐑♂ #9 E=0.0 ΔE=0.2 pr=0.8 pf=0.6 - 🐑♀ #8 E=0.0 ΔE=0.2 pr=0.8 pf=0.6 - 🐺♂ #3 E=7.0 ΔE=8.0 pr=0.1 pf=0.2 - 🐑♂ #1 E=0.0 ΔE=0.2 pr=0.8 pf=0.6

Finally, let's run a few simulation steps and plot the solution

n_grass  = 1_000
+  🐑♀ #1 E=0.25 ΔE=0.2 pr=0.8 pf=0.6
julia> world_step!(w); wMain.World{Main.Animal} + 🐺♂ #3 E=7.0 ΔE=8.0 pr=0.1 pf=0.2

Finally, let's run a few simulation steps and plot the solution

n_grass  = 1_000
 n_sheep  = 40
 n_wolves = 4
 
@@ -61,7 +59,7 @@
 for (n,c) in counts
     plot!(plt, c, label=string(n), lw=2)
 end
-plt
Example block output

Package: Ecosystem.jl

In the main section of this lab you will create your own Ecosystem.jl package to organize and test (!) the code that we have written so far.

PkgTemplates.jl

+pltExample block output

Package: Ecosystem.jl

In the main section of this lab you will create your own Ecosystem.jl package to organize and test (!) the code that we have written so far.

PkgTemplates.jl

Exercise:

The simplest way to create a new package in Julia is to use PkgTemplates.jl. ]add PkgTemplates to your global julia env and create a new package by running:

using PkgTemplates
 Template(interactive=true)("Ecosystem")

to interactively specify various options for your new package or use the following snippet to generate it programmatically:

using PkgTemplates
@@ -164,7 +162,7 @@
            @test repr(s) == "🐑♂ #2 E=1.0 ΔE=1.0 pr=1.0 pf=1.0"
            @test repr(w) == "🐺♀ #3 E=1.0 ΔE=1.0 pr=1.0 pf=1.0"
        endTest Summary: | Pass  Total  Time
-Base.show     |    3      3  0.4s
-Test.DefaultTestSet("Base.show", Any[], 3, false, false, true, 1.730389029196625e9, 1.730389029585469e9, false, "REPL[2]")

Github CI

+Base.show | 3 3 0.3s +Test.DefaultTestSet("Base.show", Any[], 3, false, false, true, 1.730995488878833e9, 1.730995489221471e9, false, "REPL[2]")

Github CI

Exercise:
-

If you want, you can upload your package to GitHub and add the julia-runtest GitHub Action to automatically test your code for every new push you make to the repository.

+

If you want, you can upload your package to GitHub and add the julia-runtest GitHub Action to automatically test your code for every new push you make to the repository.

diff --git a/dev/lecture_04/lecture/index.html b/dev/lecture_04/lecture/index.html index f3903fbc..4651f4ac 100644 --- a/dev/lecture_04/lecture/index.html +++ b/dev/lecture_04/lecture/index.html @@ -103,4 +103,4 @@ precompile(fsum,(Float64,Float64,Float64)) end

Can be investigated using MethodAnalysis.

using MethodAnalysis
-mi =methodinstances(fsum)

Useful packages:

  • PackageCompiler.jl has three main purposes:

    • Creating custom sysimages for reduced latency when working locally with packages that has a high startup time.
    • Creating "apps" which are a bundle of files including an executable that can be sent and run on other machines without Julia being installed on that machine.
    • Creating a relocatable C library bundle form of Julia code.
  • AutoSysimages.jl allows easy generation of precompiles images - reduces package loading

Additional material

+mi =methodinstances(fsum)

Useful packages:

  • PackageCompiler.jl has three main purposes:

    • Creating custom sysimages for reduced latency when working locally with packages that has a high startup time.
    • Creating "apps" which are a bundle of files including an executable that can be sent and run on other machines without Julia being installed on that machine.
    • Creating a relocatable C library bundle form of Julia code.
  • AutoSysimages.jl allows easy generation of precompiles images - reduces package loading

Additional material

diff --git a/dev/lecture_05/hw/index.html b/dev/lecture_05/hw/index.html index 20b474ee..2f697d5d 100644 --- a/dev/lecture_05/hw/index.html +++ b/dev/lecture_05/hw/index.html @@ -11,4 +11,4 @@

Voluntary exercise

Voluntary exercise
-

Use Plots.jl to plot the polynomial $p$ on the interval $[-5, 5]$ and visualize the progress/convergence of each method, with a dotted vertical line and a dot on the x-axis for each subsequent root approximation.

HINTS:

  • plotting scalar function f - plot(r, f), where r is a range of x values at which we evaluate f
  • updating an existing plot - either plot!(plt, ...) or plot!(...), in the former case the plot lives in variable plt whereas in the latter we modify some implicit global variable
  • plotting dots - for example with scatter/scatter!
  • plot([(1.0,2.0), (1.0,3.0)], ls=:dot) will create a dotted line from position (x=1.0,y=2.0) to (x=1.0,y=3.0)
+

Use Plots.jl to plot the polynomial $p$ on the interval $[-5, 5]$ and visualize the progress/convergence of each method, with a dotted vertical line and a dot on the x-axis for each subsequent root approximation.

HINTS:

  • plotting scalar function f - plot(r, f), where r is a range of x values at which we evaluate f
  • updating an existing plot - either plot!(plt, ...) or plot!(...), in the former case the plot lives in variable plt whereas in the latter we modify some implicit global variable
  • plotting dots - for example with scatter/scatter!
  • plot([(1.0,2.0), (1.0,3.0)], ls=:dot) will create a dotted line from position (x=1.0,y=2.0) to (x=1.0,y=3.0)
diff --git a/dev/lecture_05/lab/index.html b/dev/lecture_05/lab/index.html index ecc66246..5063df27 100644 --- a/dev/lecture_05/lab/index.html +++ b/dev/lecture_05/lab/index.html @@ -178,12 +178,12 @@ end world = create_world();
w = Wolf(4000)
 find_food(w, world)
-@code_warntype find_food(w, world)
MethodInstance for Main.find_food(::Main.Animal{🐺, ♂}, ::Main.World{@NamedTuple{Grass::Dict{Int64, Main.Plant{🌿}}, SheepFemale::Dict{Int64, Main.Animal{🐑, ♀}}, SheepMale::Dict{Int64, Main.Animal{🐑, ♂}}, WolfMale::Dict{Int64, Main.Animal{🐺, ♂}}, WolfFemale::Dict{Int64, Main.Animal{🐺, ♀}}}})
+@code_warntype find_food(w, world)
MethodInstance for Main.find_food(::Main.Animal{🐺, ♂}, ::Main.World{@NamedTuple{Grass::Dict{Int64, Main.Plant{🌿}}, SheepFemale::Dict{Int64, Main.Animal{🐑, ♀}}, SheepMale::Dict{Int64, Main.Animal{🐑, ♂}}, WolfFemale::Dict{Int64, Main.Animal{🐺, ♀}}, WolfMale::Dict{Int64, Main.Animal{🐺, ♂}}}})
   from find_food(::Main.Animal{🐺}, w::Main.World) @ Main ~/work/Scientific-Programming-in-Julia/Scientific-Programming-in-Julia/docs/build/lecture_05/ecosystems/animal_ST_world_NamedTupleDict/Ecosystem.jl:158
 Arguments
   #self#::Core.Const(Main.find_food)
   _::Main.Animal{🐺, ♂}
-  w::Main.World{@NamedTuple{Grass::Dict{Int64, Main.Plant{🌿}}, SheepFemale::Dict{Int64, Main.Animal{🐑, ♀}}, SheepMale::Dict{Int64, Main.Animal{🐑, ♂}}, WolfMale::Dict{Int64, Main.Animal{🐺, ♂}}, WolfFemale::Dict{Int64, Main.Animal{🐺, ♀}}}}
+  w::Main.World{@NamedTuple{Grass::Dict{Int64, Main.Plant{🌿}}, SheepFemale::Dict{Int64, Main.Animal{🐑, ♀}}, SheepMale::Dict{Int64, Main.Animal{🐑, ♂}}, WolfFemale::Dict{Int64, Main.Animal{🐺, ♀}}, WolfMale::Dict{Int64, Main.Animal{🐺, ♂}}}}
 Body::Union{Nothing, Main.Animal{🐑, ♀}, Main.Animal{🐑, ♂}}
 1 ─ %1 = Main.find_agent(Main.Sheep, w)::Union{Nothing, Main.Animal{🐑, ♀}, Main.Animal{🐑, ♂}}
-└──      return %1

Useful resources

The problem we have been touching on in the latter part is quite pervasive in systems with many agents. One solution we have not used here is sum types. Julia does not have native support for them, but offers solutions through libraries like SumTypes.jl, Unityper.jl or Moshi.jl.

+└── return %1

Useful resources

The problem we have been touching on in the latter part is quite pervasive in systems with many agents. One solution we have not used here is sum types. Julia does not have native support for them, but offers solutions through libraries like SumTypes.jl, Unityper.jl or Moshi.jl.

diff --git a/dev/lecture_05/lecture/index.html b/dev/lecture_05/lecture/index.html index 4d67ba78..82ffff2e 100644 --- a/dev/lecture_05/lecture/index.html +++ b/dev/lecture_05/lecture/index.html @@ -584,4 +584,4 @@ ) => var"#7#8"{Base.RefValue{Int64}}

JET is also happy.


 julia> @report_opt ref_abmult(1)
 No errors !
-

So when you use closures, you should be careful of accidental boxing, since it can inhibit the speed of your code. This is a big deal in multithreading and in automatic differentiation, which both heavily use closures. You can track the discussion here.

+

So when you use closures, you should be careful of accidental boxing, since it can inhibit the speed of your code. This is a big deal in multithreading and in automatic differentiation, which both heavily use closures. You can track the discussion here.
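A common workaround (from the Julia performance tips) is to rebind the captured variable in a let block, so the closure captures a fresh binding that is never reassigned. Sketched here on a boxed variant abmult (the name mirrors ref_abmult above; treat it as illustrative):

function abmult(r::Int)
    if r < 0
        r = -r
    end
    let r = r  # fresh, constant binding: no Box needed
        x -> x * r
    end
end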

diff --git a/dev/lecture_06/hw/index.html b/dev/lecture_06/hw/index.html index 70db68ed..cdb4e026 100644 --- a/dev/lecture_06/hw/index.html +++ b/dev/lecture_06/hw/index.html @@ -7,4 +7,4 @@
Voluntary exercise

Create a function that replaces each of +, -, * and / with the respective checked operation, which checks for overflow. E.g. + should be replaced by Base.checked_add.
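A possible recursive sketch (the function name replace_ops is ours; checked_div is the checked counterpart of integer division):

const CHECKED_OPS = Dict(:+ => :(Base.checked_add), :- => :(Base.checked_sub),
                         :* => :(Base.checked_mul), :/ => :(Base.checked_div))

replace_ops(x) = x                       # literals etc. stay as they are
replace_ops(s::Symbol) = get(CHECKED_OPS, s, s)
replace_ops(e::Expr) = Expr(e.head, map(replace_ops, e.args)...)

# replace_ops(:(x + y * z)) == :(Base.checked_add(x, Base.checked_mul(y, z)))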

+Solution:

Not yet published.

diff --git a/dev/lecture_06/lab/8dbaade1.svg b/dev/lecture_06/lab/1fb0fc85.svg similarity index 80% rename from dev/lecture_06/lab/8dbaade1.svg rename to dev/lecture_06/lab/1fb0fc85.svg index 1d7f4f7a..95fcc1dc 100644 --- a/dev/lecture_06/lab/8dbaade1.svg +++ b/dev/lecture_06/lab/1fb0fc85.svg @@ -1,59 +1,59 @@ - + - + - + - + - + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/dev/lecture_06/lab/index.html b/dev/lecture_06/lab/index.html index cb414e69..5859ed9a 100644 --- a/dev/lecture_06/lab/index.html +++ b/dev/lecture_06/lab/index.html @@ -103,9 +103,9 @@ store ptr %"Memory{Int64}[]", ptr %gc_slot_addr_0, align 16 %ptls_field = getelementptr inbounds ptr, ptr %tls_pgcstack, i64 2 %ptls_load = load ptr, ptr %ptls_field, align 8 - %"new::Array" = call noalias nonnull align 8 dereferenceable(32) ptr @ijl_gc_pool_alloc_instrumented(ptr %ptls_load, i32 800, i32 32, i64 140553721025088) #8 + %"new::Array" = call noalias nonnull align 8 dereferenceable(32) ptr @ijl_gc_pool_alloc_instrumented(ptr %ptls_load, i32 800, i32 32, i64 140438448968256) #8 %"new::Array.tag_addr" = getelementptr inbounds i64, ptr %"new::Array", i64 -1 - store atomic i64 140553721025088, ptr %"new::Array.tag_addr" unordered, align 8 + store atomic i64 140438448968256, ptr %"new::Array.tag_addr" unordered, align 8 %1 = getelementptr inbounds ptr, ptr %"new::Array", i64 1 store ptr %0, ptr %"new::Array", align 8 store ptr %"Memory{Int64}[]", ptr %1, align 8 @@ -168,7 +168,7 @@ mov r12, qword ptr [rax + 8] mov qword ptr [rbp - 48], rax mov rbx, rax - movabs r14, 140553721025088 + movabs r14, 140438448968256 movabs rax, offset ijl_gc_pool_alloc_instrumented mov esi, 800 mov edx, 32 @@ -207,9 +207,9 @@ .quad 1 # 0x1 .size ".L_j_const#2", 8 -.set ".L+Core.Array#34954.jit", 140553721025088 +.set ".L+Core.Array#34954.jit", 140438448968256 .size ".L+Core.Array#34954.jit", 8 -.set ".L+Core.GenericMemory#34952.jit", 140553763301824 +.set ".L+Core.GenericMemory#34952.jit", 140438491244992 .size ".L+Core.GenericMemory#34952.jit", 8 .section ".note.GNU-stack","",@progbits

Let's see how these tools can help us understand some of Julia's internals on examples from previous labs and lectures.

Understanding runtime dispatch and type instabilities

We will start with a question: can we internally spot some difference between type-stable and type-unstable code?

Exercise

Inspect the following two functions using @code_lowered, @code_typed, @code_llvm and @code_native.

x = rand(10^5)
 function explicit_len(x)
@@ -304,7 +304,7 @@
       args: Array{Any}((3,))
         1: Symbol +
         2: Symbol x
        3: Symbol y

The type of both multiline and single-line expressions is Expr, with fields head and args. Notice that the Expr type is recursive in the args, which can store other expressions, resulting in a tree structure - an abstract syntax tree (AST) - that can be visualized, for example, with the combination of the GraphRecipes and Plots packages.

plot(code_expr_block, fontsize=12, shorten=0.01, axis_buffer=0.15, nodeshape=:rect)
Example block output

This recursive structure has some major performance drawbacks: because the args field is of type Any, modifications of this expression-level AST won't be type stable. The building blocks of expressions are Symbols and literal values (numbers).

A possible nuisance of working with multiline expressions is the presence of LineNumberNodes, which can be removed with the Base.remove_linenums! function.

julia> Base.remove_linenums!(code_parse_block)quote
     x = 2
     y = 3
     x + y
@@ -312,4 +312,4 @@
 replace_i(e::Expr) = Expr(e.head, map(replace_i, e.args)...)
 replace_i(u) = u

Given a function replace_i, which replaces occurrences of the variable i with k in an expression like the following

julia> ex = :(i + i*i + y*i - sin(z)):((i + i * i + y * i) - sin(z))
julia> @test replace_i(ex) == :(k + k*k + y*k - sin(z))Test Passed

write a different function sreplace_i(s), which does the same thing but, instead of a parsed expression (AST), manipulates a string, such as

julia> s = string(ex)"(i + i * i + y * i) - sin(z)"

HINTS:

  • Use Meta.parse in combination with replace_i ONLY for checking correctness.
  • You can use the replace function in combination with regular expressions.
  • Think of some corner cases that the method may not handle properly.

If the exercises so far did not feel very useful, let's focus on one that is similar to a part of the IntervalArithmetic.jl package.

Exercise

Write a function wrap!(ex::Expr) which wraps literal values (numbers) in a call to f(). You can test it on the following example

f = x -> convert(Float64, x)
 ex = :(x*x + 2*y*x + y*y)     # original expression
+rex = :(x*x + f(2)*y*x + y*y) # result expression

HINTS:

  • use recursion and multiple dispatch
  • dispatch on ::Number to detect numbers in an expression
  • for testing purposes, create a copy of ex before mutating

This kind of manipulation is at the core of some packages, such as the aforementioned IntervalArithmetic.jl, where every number is replaced with a narrow interval in order to find bounds on the result of a computation.
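To make the idea concrete, here is a minimal non-mutating sketch (an illustration only; the exercise asks for a mutating wrap! instead):

wrap(x) = x                                   # symbols and everything else pass through
wrap(x::Number) = Expr(:call, :f, x)          # wrap literal numbers in a call to f
wrap(e::Expr) = Expr(e.head, map(wrap, e.args)...)

julia> wrap(:(x*x + 2*y*x + y*y))
:(x * x + f(2) * y * x + y * y)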


Resources

diff --git a/dev/lecture_06/lecture/index.html b/dev/lecture_06/lecture/index.html index 7764570a..96eeb4f1 100644 --- a/dev/lecture_06/lecture/index.html +++ b/dev/lecture_06/lecture/index.html @@ -374,4 +374,4 @@ for f in [:setindex!, :getindex, :size, :length] @eval $(f)(A::MyMatrix, args...) = $(f)(A.x, args...) -end

+end

Notice that we have just hand-implemented parts of the @forward macro from MacroTools, which does exactly this.
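For comparison, a sketch of the same forwarding written with MacroTools (the wrapper type here just mirrors the MyMatrix above):

using MacroTools: @forward

struct MyMatrix{T} <: AbstractMatrix{T}
    x::Matrix{T}
end

# generates methods that forward the listed functions to the field A.x
@forward MyMatrix.x Base.setindex!, Base.getindex, Base.size, Base.length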


Resources

diff --git a/dev/lecture_07/hw/index.html b/dev/lecture_07/hw/index.html index 73f96c41..426b74bb 100644 --- a/dev/lecture_07/hw/index.html +++ b/dev/lecture_07/hw/index.html @@ -15,4 +15,4 @@ genex = _ecosystem(ex) world = eval(genex) +Solution:

Nothing to see here

diff --git a/dev/lecture_07/lab/index.html b/dev/lecture_07/lab/index.html index c3d5fe58..3c2032e6 100644 --- a/dev/lecture_07/lab/index.html +++ b/dev/lecture_07/lab/index.html @@ -268,4 +268,4 @@ species = :Rabbit foodlist = :([Grass => 0.5, Broccoli => 1.0]) -_eats(species, foodlist)


+_eats(species, foodlist)


Resources

Type{T} type selectors

We have used the ::Type{T} signature[2] in a few places in the Ecosystem family of packages (and it will be helpful in the HW as well), such as in the show methods

Base.show(io::IO,::Type{World}) = print(io,"🌍")

This particular example defines a method where the second argument is the World type itself and not an instance of a World type. As a result, we are able to dispatch on specific types as values.

Furthermore, we can use the subtyping operator to match all types in a hierarchy, e.g. ::Type{<:AnimalSpecies} matches all animal species
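A tiny self-contained illustration of this pattern (Sheep and Wolf here are stand-ins for the Ecosystem types):

abstract type AnimalSpecies end
struct Sheep <: AnimalSpecies end
struct Wolf <: AnimalSpecies end

# dispatches on the types themselves, i.e. on Sheep rather than Sheep()
Base.show(io::IO, ::Type{<:AnimalSpecies}) = print(io, "🐾")

After this definition, the REPL prints any of the species types as 🐾.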

diff --git a/dev/lecture_07/lecture/index.html b/dev/lecture_07/lecture/index.html index 898c19f1..2e81f21a 100644 --- a/dev/lecture_07/lecture/index.html +++ b/dev/lecture_07/lecture/index.html @@ -420,4 +420,4 @@ print(io, lb,r.left,",",r.right,rb) end

We can check that it does the job by trying Interval("[1,2)"). Finally, we define a string macro as

macro int_str(s)
 	Interval(s)
+end

which allows us to define an interval as int"[1,2)".

Sources

diff --git a/dev/lecture_08/hw/index.html b/dev/lecture_08/hw/index.html index caf01245..259a2adf 100644 --- a/dev/lecture_08/hw/index.html +++ b/dev/lecture_08/hw/index.html @@ -48,4 +48,4 @@ accum!.(ts) end
gradient (generic function with 1 method)

We will use it to compute the derivative of the Babylonian square root.

babysqrt(x, t=(1+x)/2, n=10) = n==0 ? t : babysqrt(x, (t+x/t)/2, n-1)

In order to differentiate through babysqrt you will need a reverse rule for /, i.e. Base.:/(TrackedReal,TrackedReal), as well as for the cases where constants are involved (e.g. Base.:/(TrackedReal,Real)).

Homework (2 points)
+

Write the reverse rules for / and the missing rules for + such that you can differentiate through division and addition with and without constants.

You can verify your solution with the gradient function.

julia> gradient(babysqrt, 2.0)(0.35355339059327373,)
julia> 1/(2babysqrt(2.0))0.3535533905932738
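For orientation, the general shape of such a reverse rule (a sketch assuming the TrackedReal type, the track constructor, and the children dictionary from the lab; not the official solution):

function Base.:/(x::TrackedReal, y::TrackedReal)
    z = track(x.data / y.data)
    x.children[z] = Δ -> Δ / y.data               # ∂(x/y)/∂x = 1/y
    y.children[z] = Δ -> -Δ * x.data / y.data^2   # ∂(x/y)/∂y = -x/y²
    z
end

The constant-involving variants look the same but only register a pullback for the tracked argument.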
diff --git a/dev/lecture_08/lab/02c39ce9.svg b/dev/lecture_08/lab/61a32be0.svg similarity index 94% rename from dev/lecture_08/lab/02c39ce9.svg rename to dev/lecture_08/lab/61a32be0.svg index f25477f6..8b5310f4 100644 (SVG plot; path data omitted) diff --git a/dev/lecture_08/lab/index.html b/dev/lecture_08/lab/index.html index 838180b2..e024f673 100644 --- a/dev/lecture_08/lab/index.html +++ b/dev/lecture_08/lab/index.html @@ -79,7 +79,7 @@ using Plots color_scheme = cgrad(:RdYlBu_5, rev=true) -contour(-4:0.1:4, -2:0.1:2, g, fill=true, c=color_scheme, xlabel="x", ylabel="y")Example block output

+contour(-4:0.1:4, -2:0.1:2, g, fill=true, c=color_scheme, xlabel="x", ylabel="y")Example block output

We can find a local minimum of $g$ by starting at an initial point $(x_0,y_0)$ and taking small steps in the opposite direction of the gradient

\[\begin{align}
x_{i+1} &= x_i - \lambda \frac{\partial g}{\partial x}(x_i, y_i) \\
y_{i+1} &= y_i - \lambda \frac{\partial g}{\partial y}(x_i, y_i),
\end{align}\]

where $\lambda$ is the learning rate that has to be tuned manually.
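In code, the iteration is just a few lines (a sketch assuming a function ∇g(x, y) that returns both partial derivatives; λ and the iteration count are arbitrary choices):

function descend(∇g, x, y; λ=0.01, niters=100)
    for _ in 1:niters
        dx, dy = ∇g(x, y)
        x -= λ * dx   # step against the gradient in x
        y -= λ * dy   # step against the gradient in y
    end
    (x, y)
end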

@@ -187,8 +187,8 @@ # pretty print TrackedArray Base.show(io::IO, x::TrackedArray) = print(io, "Tracked $(x.data)") Base.print_array(io::IO, x::TrackedArray) = Base.print_array(io, x.data)

Creating a TrackedArray should work like this:

julia> track(rand(2,2))2×2 Main.TrackedArray{Float64, 2, Matrix{Float64}}:
 0.391436  0.264661
 0.097583  0.542853
function accum!(x::Union{TrackedReal,TrackedArray})
     if isnothing(x.grad)
         x.grad = sum(λ(accum!(Δ)) for (Δ,λ) in x.children)
     end
@@ -204,19 +204,19 @@
     Y.children[Z] = Δ -> X.data' * Δ
     Z
 end

julia> X = rand(2,3) |> track2×3 Main.TrackedArray{Float64, 2, Matrix{Float64}}:
 0.298771   0.329327  0.558223
 0.0930047  0.678368  0.716352
julia> Y = rand(3,2) |> track3×2 Main.TrackedArray{Float64, 2, Matrix{Float64}}:
 0.987909  0.896507
 0.447637  0.412403
 0.748435  0.465515
julia> Z = X*Y2×2 Main.TrackedArray{Float64, 2, Matrix{Float64}}:
 0.860371  0.663527
 0.931686  0.696613
julia> f = X.children[Z]#3 (generic function with 1 method)
julia> Ω̄ = ones(size(Z)...)2×2 Matrix{Float64}:
 1.0  1.0
 1.0  1.0
julia> f(Ω̄)2×3 Matrix{Float64}:
 1.88442  0.86004  1.21395
 1.88442  0.86004  1.21395
julia> Ω̄*Y.data'2×3 Matrix{Float64}:
 1.88442  0.86004  1.21395
 1.88442  0.86004  1.21395
Exercise

Implement rules for sum, +, -, and abs2.
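As a sketch of the expected shape (assuming the TrackedArray machinery above; not the official solution), a rule for sum could look like this:

function Base.sum(x::TrackedArray)
    z = track(sum(x.data))
    x.children[z] = Δ -> Δ .* ones(size(x.data))  # every entry contributes with weight one
    z
end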

@@ -288,4 +288,4 @@ end end y -end

+end

You can see a full implementation of our tracing-based AD here and a simple implementation of a neural network that can learn an approximation to the function g here. Running the latter script will produce an animation that shows how the network is learning.

anim

This lab is heavily inspired by Rufflewind

diff --git a/dev/lecture_08/lecture/b138b834.svg b/dev/lecture_08/lecture/9d533ca5.svg similarity index 86% rename from dev/lecture_08/lecture/b138b834.svg rename to dev/lecture_08/lecture/9d533ca5.svg index e2aefe7d..4c931bdd 100644 (SVG plot; path data omitted) diff --git a/dev/lecture_08/lecture/index.html b/dev/lecture_08/lecture/index.html index cbbdc402..7bb3a747 100644 --- a/dev/lecture_08/lecture/index.html +++ b/dev/lecture_08/lecture/index.html @@ -58,7 +58,7 @@ - Run `import Pkg; Pkg.add("FiniteDifferences")` to install the FiniteDifferences package.
julia> forward_dsqrt(x) = forward_diff(babysqrt,x)forward_dsqrt (generic function with 1 method)
julia> analytc_dsqrt(x) = 1/(2babysqrt(x))analytc_dsqrt (generic function with 1 method)
julia> forward_dsqrt(2.0)0.35355339059327373
julia> analytc_dsqrt(2.0)0.3535533905932738
julia> central_fdm(5, 1)(babysqrt, 2.0)ERROR: UndefVarError: `central_fdm` not defined in `Main` Suggestion: check for spelling errors or missing imports.
plot(0.0:0.01:2, babysqrt, label="f(x) = babysqrt(x)", lw=3)
 plot!(0.1:0.01:2, analytc_dsqrt, label="Analytic f'", ls=:dot, lw=3)
-plot!(0.1:0.01:2, forward_dsqrt, label="Dual Forward Mode f'", lw=3, ls=:dash)
+plot!(0.1:0.01:2, forward_dsqrt, label="Dual Forward Mode f'", lw=3, ls=:dash)Example block output


Takeaways

  1. Forward mode $f'$ is obtained simply by pushing a Dual through babysqrt
  2. To make forward diff work in Julia, we only need to overload a few operators for forward mode AD to work on any function. Hence the approach is called operator overloading.
  3. For vector-valued functions we can use Hyperduals
  4. Forward diff can differentiate through setindex! (called each time an element is assigned to a place in an array, e.g. x = [1,2,3]; x[2] = 1)
  5. Forward mode AD is implemented in ForwardDiff.jl, which might appear to be neglected, but the truth is that it is a very stable and general implementation.
  6. Forward mode AD does not have to be implemented through Dual numbers. It can be implemented similarly to ReverseDiff through multiplication of Jacobians, which is what the community is working on now (in Diffractor, and in Zygote with rules defined in ChainRules).

Reverse mode

In reverse mode, the computation of the gradient follows the opposite order. We initialize the computation by setting $\mathbf{J}_0 = \frac{\partial y}{\partial y_0},$ which is again an identity matrix. Then we compute the Jacobians and their multiplications in the opposite order. The problem is that to calculate $J_i$ we need to know the value of $y_i^0$, which cannot be calculated in the reverse pass. The backward pass therefore needs to be preceded by the forward pass, where $\{y_i^0\}_{i=1}^n$ are calculated.

The complete reverse mode algorithm therefore proceeds as

  1. Forward pass: iterate i from n down to 1 as
    • calculate the next intermediate output as $y^0_{i-1} = f_i(y^0_i)$
  2. Backward pass: iterate i from 1 up to n as
    • calculate Jacobian $J_i = \left.\frac{f_i}{\partial y_i}\right|_{y_i^0}$ at point $y_i^0$
    • pull back the gradient as $\left.\frac{\partial f(x)}{\partial y_{i}}\right|_{y^0_i} = \left.\frac{\partial y_0}{\partial y_{i-1}}\right|_{y^0_{i-1}} \times J_i$

The need to store intermediate outputs has a huge impact on memory requirements, which is a big deal particularly on GPUs. Recall that a few lectures ago we discussed how excessive memory allocations can be damaging for performance; here we are given an algorithm where the excessive allocation is by design.

Tricks to decrease memory consumption

  • Define custom rules over large functional blocks. For example, while we can (in theory) auto-grad a matrix product, it is much more efficient to define matrix multiplication as one large function for which we define Jacobians (note that by doing so, we can dispatch to BLAS), e.g.

\[\begin{alignat*}{2} \mathbf{C} &= \mathbf{A} * \mathbf{B} \\ \frac{\partial{\mathbf{C}}}{\partial \mathbf{A}} &= \mathbf{B} \\ \frac{\partial{\mathbf{C}}}{\partial \mathbf{B}} &= \mathbf{A}^{\mathrm{T}} \\ @@ -218,4 +218,4 @@ │ %2 = Main.sin(x) │ %3 = %1 + %2 └── return %3 -)

+)

This form is particularly nice for automatic differentiation, as we always have a single variable on the left-hand side, which means the compiler has provided us with a form on which we know how to apply AD rules.

What if we were somehow able to talk to the compiler and get this form from it?

Sources for this lecture

  • 1Linnainmaa, S. (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2), 146-160.
  • 2Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533–536.
diff --git a/dev/lecture_09/lab/index.html b/dev/lecture_09/lab/index.html index 27fe1f6e..58f69e8c 100644 --- a/dev/lecture_09/lab/index.html +++ b/dev/lecture_09/lab/index.html @@ -5,7 +5,7 @@ acc = x*acc + p[i] end acc -end

+end

Julia has its own implementation of this function called evalpoly. If we compare the performance of our polynomial and Julia's evalpoly we can observe a pretty big difference:

julia> x = 2.02.0
julia> p = ntuple(float,20);
julia> @btime polynomial($x,$p) 9.015 ns (0 allocations: 0 bytes) 1.9922945e7
julia> @btime evalpoly($x,$p) 5.269 ns (0 allocations: 0 bytes) 1.9922945e7

Julia's implementation uses a generated function which specializes on different tuple lengths (i.e. it unrolls the loop) and eliminates the (small) overhead of looping over the tuple. This is possible because the length of the tuple is known at compile time. You can check the difference between polynomial and evalpoly yourself via the introspection tools you know - e.g. @code_lowered.

Exercise
@@ -17,7 +17,7 @@ ex = :(x*$ex + p[$i]) end ex -end

+end

You should get the same performance as evalpoly (and as @poly from Lab 7 with the added convenience of not having to spell out all the coefficients in your code like: p = @poly 1 2 3 ...).

julia> @btime genpoly($x,$p)  9.146 ns (0 allocations: 0 bytes)
 1.9922945e7
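For reference, the generated function could be completed along these lines (a sketch consistent with the fragment in the hunk above):

@generated function genpoly(x, p::NTuple{N}) where N
    ex = :(p[$N])              # start from the leading coefficient
    for i in N-1:-1:1
        ex = :(x*$ex + p[$i])  # Horner's scheme, unrolled at compile time
    end
    ex
end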

Fast, Static Matrices

Another great example that makes heavy use of generated functions is static arrays. A static array is an array of fixed size which can be implemented via an NTuple. This means that it will be allocated on the stack, which can buy us a lot of performance for smaller static arrays. We define a StaticMatrix{T,R,C,L} where the parametric types represent the matrix element type T (e.g. Float32), the number of rows R, the number of columns C, and the total length of the matrix L=C*R (which we need to set the size of the NTuple).

struct StaticMatrix{T,R,C,L} <: AbstractArray{T,2}
     data::NTuple{L,T}
 end
@@ -33,10 +33,10 @@
 Base.length(x::StaticMatrix{T,R,C,L}) where {T,R,C,L} = L
 Base.getindex(x::StaticMatrix, i::Int) = x.data[i]
 Base.getindex(x::StaticMatrix{T,R,C}, r::Int, c::Int) where {T,R,C} = x.data[R*(c-1) + r]

You can check if everything works correctly by comparing to a normal Matrix:

julia> x = rand(2,3)2×3 Matrix{Float64}:
 0.609657  0.118969  0.791168
 0.085512  0.400444  0.458046
julia> x[1,2]0.11896904334906755
julia> a = StaticMatrix(x)2×3 Main.StaticMatrix{Float64, 2, 3, 6}:
 0.609657  0.118969  0.791168
 0.085512  0.400444  0.458046
julia> a[1,2]0.11896904334906755
Exercise

Overload matrix multiplication between two static matrices

Base.:*(x::StaticMatrix{T,K,M},y::StaticMatrix{T,M,N})

with a generated function that creates an expression without loops. Below you can see an example for an expression that would be generated from multiplying two $2\times 2$ matrices.

:(StaticMatrix{T,2,2,4}((
     (x[1,1]*y[1,1] + x[1,2]*y[2,1]),
@@ -52,20 +52,20 @@
     z = Expr(:tuple, zs...)
     :(StaticMatrix{$T,$K,$N,$(K*N)}($z))
 end
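Putting it together, the whole generated function might look like this (a sketch built around the fragment above, assuming the StaticMatrix definition from this section):

@generated function Base.:*(x::StaticMatrix{T,K,M}, y::StaticMatrix{T,M,N}) where {T,K,M,N}
    zs = map(vec(collect(Iterators.product(1:K, 1:N)))) do (k, n)
        # entry (k,n) is a fully unrolled dot product of row k of x and column n of y
        terms = [:(x[$k,$m] * y[$m,$n]) for m in 1:M]
        :(+($(terms...)))
    end
    z = Expr(:tuple, zs...)
    :(StaticMatrix{$T,$K,$N,$(K*N)}($z))
end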

You can check that your matrix multiplication works by multiplying two random matrices. Which one is faster?

julia> a = rand(2,3)2×3 Matrix{Float64}:
 0.681892  0.21242   0.872419
 0.145039  0.871972  0.18327
julia> b = rand(3,4)3×4 Matrix{Float64}:
 0.425513  0.649138  0.424303  0.822735
 0.56624   0.380891  0.823813  0.766558
 0.744066  0.316778  0.600927  0.292829
julia> c = StaticMatrix(a)2×3 Main.StaticMatrix{Float64, 2, 3, 6}:
 0.681892  0.21242   0.872419
 0.145039  0.871972  0.18327
julia> d = StaticMatrix(b)3×4 Main.StaticMatrix{Float64, 3, 4, 12}:
 0.425513  0.649138  0.424303  0.822735
 0.56624   0.380891  0.823813  0.766558
 0.744066  0.316778  0.600927  0.292829
julia> a*b2×4 Matrix{Float64}:
 1.05957   0.799914  0.988583  0.979318
 0.691826  0.484332  0.890014  0.841412
julia> c*d2×4 Main.StaticMatrix{Float64, 2, 4, 8}:
 1.05957   0.799914  0.988583  0.979318
 0.691826  0.484332  0.890014  0.841412

OptionalArgChecks.jl

The package OptionalArgChecks.jl makes it possible to add checks to a function which can then be removed by calling the function with the @skip macro. For example, we can check if the input to a function f is an even number

function f(x::Number)
     iseven(x) || error("Input has to be an even number!")
     x
 end

If you are doing more involved argument checking, it can take quite some time to perform all your checks. However, if you want to be fast and are completely sure that you are always passing the correct inputs to your function, you might want to remove the checks in some cases. Hence, we would like to transform the IR of the function above

julia> using IRTools
julia> using IRTools: @code_ir
julia> @code_ir f(1)1: (%1, %2) @@ -356,4 +356,4 @@ Closest candidates are: ismarkbegin(::Expr) - @ Main REPL[1]:1

+ @ Main REPL[1]:1

References

diff --git a/dev/lecture_09/lecture/index.html b/dev/lecture_09/lecture/index.html index 423428ff..5c0a14b6 100644 --- a/dev/lecture_09/lecture/index.html +++ b/dev/lecture_09/lecture/index.html @@ -539,4 +539,4 @@ %17 = %12 + %16 %18 = Base.tuple(0, %15, %17) return %18

and it calculates the gradient with respect to the input as

julia> pb(1.0)
(0, 1.0, 1.5403023058681398)

where the first item is the gradient with respect to the parameters of the function itself.

Conclusion

The above examples served to demonstrate that @generated functions offer an extremely powerful paradigm, especially when coupled with manipulation of the intermediate representation. Within a few lines of code, we have implemented a reasonably powerful profiler and a reverse AD engine. Importantly, it has been done without any single-purpose engine or tooling.

diff --git a/dev/lecture_10/hw/index.html b/dev/lecture_10/hw/index.html index 59968c9f..4dfb8d9e 100644 --- a/dev/lecture_10/hw/index.html +++ b/dev/lecture_10/hw/index.html @@ -8,4 +8,4 @@ w = [1.0, 2.0, 4.0, 2.0, 1.0] @btime thread_conv1d($x, $w);

On your local machine you should be able to achieve a 0.6x reduction in execution time with two threads; however, the automatic eval system is a noisy environment and therefore we require only a 0.8x reduction there. This being said, please reach out to us if you encounter any issues.

HINTS:

  • start with single threaded implementation
  • don't forget to reverse the kernel
  • @threads macro should be all you need
  • for testing purposes create a simple script that you can run with julia -t 1 and julia -t 2
+Solution:

Nothing to see here
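Although the official solution is not shown here, the general shape could look like this (a rough sketch under the assumptions of the assignment; names follow the benchmark call above):

using Base.Threads: @threads

function thread_conv1d(x, w)
    nw = length(w)
    wr = reverse(w)                    # don't forget to reverse the kernel
    out = zeros(eltype(x), length(x) - nw + 1)
    @threads for i in eachindex(out)   # output entries are independent, so threading is safe
        s = zero(eltype(out))
        for j in 1:nw
            s += x[i + j - 1] * wr[j]
        end
        out[i] = s
    end
    return out
end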

diff --git a/dev/lecture_10/lab/index.html b/dev/lecture_10/lab/index.html index 57dae108..ab03a56d 100644 --- a/dev/lecture_10/lab/index.html +++ b/dev/lecture_10/lab/index.html @@ -401,4 +401,4 @@ get_cat_facts(n) = map(x -> query_cat_fact(), Base.OneTo(n)) @time get_cat_facts_async(10) # ~0.15s -@time get_cat_facts(10) # ~1.1s

+@time get_cat_facts(10) # ~1.1s

Resources

  • parallel computing course by Julia Computing
diff --git a/dev/lecture_10/lecture/index.html b/dev/lecture_10/lecture/index.html index b95ec9f9..a91f2eba 100644 --- a/dev/lecture_10/lecture/index.html +++ b/dev/lecture_10/lecture/index.html @@ -429,4 +429,4 @@ @elapsed foldxd(mergewith(+), files |> Map(histfile)) 86.44577969 @elapsed foldxt(mergewith(+), files |> Map(histfile)) -105.32969331


+105.32969331

is much better.

Locks / lock-free multithreading

Avoid locks.

Take away message

When deciding what kind of parallelism to employ, consider the following

  • for tightly coupled computation over shared data, multithreading is more suitable, since processes do not share memory
  • but if the computation requires frequent allocation and freeing of memory, or IO, separate processes are more suitable, since garbage collectors are independent between processes
  • Making all cores busy while achieving an ideally linear speedup is difficult and needs a lot of experience and knowledge. Tooling and profilers supporting debugging of parallel processes are not much developed.
  • Transducers strives to support thread- and process-based parallelism with (almost) the same code.

Materials

diff --git a/dev/lecture_11/lab/index.html b/dev/lecture_11/lab/index.html index a6f928af..31c888d7 100644 --- a/dev/lecture_11/lab/index.html +++ b/dev/lecture_11/lab/index.html @@ -408,4 +408,4 @@
Exercise

Rewrite the vadd kernel with KernelAbstractions.jl

+Solution:

Fill out.
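Until the solution is filled out, here is a sketch of what the KernelAbstractions.jl version could look like (the API calls follow the KernelAbstractions.jl quickstart; the wrapper function is our own):

using KernelAbstractions

@kernel function vadd_kernel!(c, a, b)
    i = @index(Global)
    @inbounds c[i] = a[i] + b[i]
end

function vadd!(c, a, b)
    backend = KernelAbstractions.get_backend(c)  # CPU or GPU, inferred from the array type
    kernel! = vadd_kernel!(backend)
    kernel!(c, a, b; ndrange = length(c))
    KernelAbstractions.synchronize(backend)
    return c
end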

  • 2Taken from Flux.jl documentation
  • 3There may be more of them, however these are the main ones.
  • 4This comparison is not fair to CUDA C, where memory management is left to the user and all the types have to be specified. However at the end of the day the choice of a high level language makes more sense as it offers the same functionality and is far more approachable.
  • 5The number of blocks to be run are given by the grid dimension. Image taken from http://tdesell.cs.und.edu/lectures/cuda_2.pdf
  • 6Taken from https://developer-blogs.nvidia.com/wp-content/uploads/2017/01/cuda_indexing-1024x463.png
diff --git a/dev/lecture_11/lecture/index.html b/dev/lecture_11/lecture/index.html index 9fd3e816..5557767d 100644 --- a/dev/lecture_11/lecture/index.html +++ b/dev/lecture_11/lecture/index.html @@ -293,4 +293,4 @@ @benchmark CUDA.@sync @cuda threads=1024 blocks=512 reduce_grid_localmem(+, $(CUDA.rand(1024,1024)), $(CUDA.zeros(1))) @benchmark CUDA.@sync @cuda threads=512 blocks=2048 reduce_warp(+, $(CUDA.rand(1024,1024)), $(CUDA.zeros(1)), 0f0) @benchmark sum($(CUDA.rand(1024,1024))) -@benchmark sum($(rand(Float32, 1024,1024)))
+@benchmark sum($(rand(Float32, 1024,1024)))
kernel version                            min time
single thread                             56.399 ms
multiple threads with atomic reduction    1.772 ms
parallel reduction                        33.381 μs
parallel reduction with local mem         34.261 μs
parallel reduction with warps             26.890 μs
default sum on GPU                        31.960 μs
default sum on CPU                        82.391 μs

What we have not optimized yet:

  • tune the launch configuration (for reduce_warp 128 threads seems mildly better with 25.790 μs)
  • avoid shared memory bank conflicts
  • analyse the access pattern to ensure all memory accesses are coalesced

The card we use is a GeForce RTX 2080 Ti, which has a peak memory bandwidth of 616.0 GB/s. If we take our best launch configuration with 128 threads, our throughput is 32*2^20 / (25 * 10^-6) ≈ 1.342 TB/s, which would put us roughly twice over the theoretical memory limit, which is weird

How Julia compiles CUDA kernels

CudaNative.jl (the predecessor of CUDA.jl)[bessard18] was a seminal work, as it was the first to demonstrate Julia producing native code for a platform (backend) other than that of the main processor. Specifically, while Julia produces code for x86 (or ARM), CUDA.jl makes it compile for the PTX ISA of NVidia GPUs. CudaNative.jl created a flexible toolchain, which was later generalized to other backends, namely AMD and Intel accelerators, and also the Berkeley packet filter[bpf].

An important but rarely appreciated fact is that CUDA.jl makes the development of kernels interactive, which speeds up the development a lot.

The following notes are based on [bessard18] and on inspection of CUDA.jl version 3.6.2. The bird's-eye view of the toolchain is as follows: cuda native toolchain

When we wish to launch a kernel using @cuda (config...) function(args...), Julia (roughly) performs the following steps.

  1. If the kernel is already compiled to PTX, go to step 4
  2. Use Julia's compiler to produce LLVM code for a given kernel using @code_llvm. This is the front-end part and its result is LLVM code in textual form (Julia's interface with its runtime part is textual).
  3. Take the LLVM code, perform optimization passes and adjustments, and compile the code of the kernel for CUDA backend. This is the back-end part.
  4. Use a runtime library provided by NVidia's CUDA to launch the compiled kernel.

Caveats of the front-end part

  • The authors state that for the integration, they require access to the AST, the lowered IR code, and the LLVM IR.
  • For seamless integration, the above is not sufficient, as the generated LLVM code will contain calls to the Julia runtime, such as exception handling, garbage collection, and dynamic allocation of memory. The authors introduced configuration parameters and hooks into type inference and code generation in Julia v0.6, which allow lowering code without these calls (the other option considered, removing these calls in post-processing, would lead to a messy, fragile, and error-prone implementation).
  • The above hooks allow reusing most of Julia's code generation. There is a lot of functionality you get for free, like parsing, macro expansion, and optimization passes.

Caveats of the back-end part

  • @cuda (config...) function(args...) is the main gateway to launch cuda kernels. It compiles the kernel function if needed and launches it on the GPU device with config.
  • The LLVM code produced in the front-end part is compiled using the LLVM NVPTX compiler, which is accessed through a wrapper, LLVM.jl. Funny enough, LLVM.jl just exposes C-functions of the LLVM compiler shipped with Julia.
  • The fact that LLVM can generate PTX code avoids shipping another compiler.
  • Before the LLVM code is passed to the LLVM compiler, certain optimizations are performed. For example, immutable parameters are always passed by value instead of by reference.

Sources

  • 1Spectre side channel attack on Wikipedia. [Url] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)
  • 2Taken from url
  • bpfhttps://ebpf.io/
  • bessard18Besard, Tim, Christophe Foket, and Bjorn De Sutter. "Effective extensible programming: unleashing Julia on GPUs." IEEE Transactions on Parallel and Distributed Systems 30.4 (2018): 827-841.
diff --git a/dev/lecture_12/hw/b27ff9c6.svg b/dev/lecture_12/hw/d4abc442.svg similarity index 92% rename from dev/lecture_12/hw/b27ff9c6.svg rename to dev/lecture_12/hw/d4abc442.svg index 1b42f5e3..425cd20a 100644 (SVG plot; path data omitted) diff --git a/dev/lecture_12/hw/index.html b/dev/lecture_12/hw/index.html index f61e371c..c229c19e 100644 --- a/dev/lecture_12/hw/index.html +++ b/dev/lecture_12/hw/index.html @@ -82,4 +82,4 @@ # RK2 solve (t,X) = solve(prob, RK2(0.2)) plot!(p1,t,X[1,:], color=1, lw=3, alpha=0.8, label="x RK2") -plot!(p1,t,X[2,:], color=2, lw=3, alpha=0.8, label="y RK2")Example block output +plot!(p1,t,X[2,:], color=2, lw=3, alpha=0.8, label="y RK2")Example block output diff --git a/dev/lecture_12/lab/5ea3d638.svg b/dev/lecture_12/lab/4f18fd49.svg similarity index 95% rename from dev/lecture_12/lab/5ea3d638.svg rename to dev/lecture_12/lab/4f18fd49.svg index c2a6b604..727432ac 100644 (SVG plot; path data omitted) diff --git a/dev/lecture_12/lab/6729df7c.svg b/dev/lecture_12/lab/a48eac4d.svg similarity index 97% rename from dev/lecture_12/lab/6729df7c.svg rename to dev/lecture_12/lab/a48eac4d.svg index fa58e832..59d6dc9b 100644 (SVG plot; path data omitted) diff --git a/dev/lecture_12/lab/dbee3586.svg b/dev/lecture_12/lab/d072df33.svg similarity index 91% rename from dev/lecture_12/lab/dbee3586.svg rename to dev/lecture_12/lab/d072df33.svg index 86be5e0f..d38dbd85 100644 (SVG plot; path data omitted) diff --git a/dev/lecture_12/lab/index.html b/dev/lecture_12/lab/index.html index 3d5cc4aa..6634cef9 100644 --- a/dev/lecture_12/lab/index.html +++ b/dev/lecture_12/lab/index.html @@ -55,7 +55,7 @@ (t,X) = solve(prob, Euler(0.2)) plot!(p1,t,X[1,:], color=1, lw=3, alpha=0.8, label="x Euler") -plot!(p1,t,X[2,:], color=2, lw=3, alpha=0.8, label="y Euler")Example block output

+plot!(p1,t,X[2,:], color=2, lw=3, alpha=0.8, label="y Euler")
Example block output

As you can see in the plot above, the Euler method quickly becomes quite inaccurate, because we make a step in the direction of the tangent, which inevitably leads us away from the exact solution, as shown in the plot below. euler

In the homework you will implement a Runge-Kutta solver to get a much better accuracy with the same step size.

Automating GaussNums

Next you will implement your own uncertainty propagation. In the lecture you have already seen the new number type that we need for this:

struct GaussNum{T<:Real} <: Real
     μ::T
     σ::T
 end
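To recall how propagation rules attach to this type, a sketch for addition under the usual independence assumption (not necessarily the lecture's exact implementation):

Base.:+(a::GaussNum, b::GaussNum) = GaussNum(a.μ + b.μ, sqrt(a.σ^2 + b.σ^2))
Base.:+(a::GaussNum, b::Real)     = GaussNum(a.μ + b, a.σ)
Base.:+(a::Real, b::GaussNum)     = b + a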
@@ -130,7 +130,7 @@ plot!(p, t, mu.(X[1,:])) return p - enduncertainplot (generic function with 1 method)

+       enduncertainplot (generic function with 1 method)

uncertainplot(t, X[1,:])
Example block output

Unfortunately, with this approach, we would have to define things like uncertainplot! and kwargs to the function by hand. To make plotting GaussNums more pleasant we can make use of the @recipe macro from RecipesBase.jl. It allows defining plot recipes for custom types (without having to depend on Plots.jl). Additionally, it makes it easier to support all the different ways of creating plots (e.g. via plot or plot!, and with support for all keyword args) without having to overload tons of functions manually. If you want to read more about plot recipes, have a look at the docs of RecipesBase.jl. An example of a recipe for vectors of GaussNums could look like this:

@recipe function plot(ts::AbstractVector, xs::AbstractVector{<:GaussNum})
     # you can set a default value for an attribute with `-->`
     # and force an argument with `:=`
     μs = [x.μ for x in xs]
@@ -153,4 +153,4 @@
 
 # now we can easily plot multiple things on top of each other
 p1 = plot(t, X[1,:], label="x", lw=3)
-plot!(p1, t, X[2,:], label="y", lw=3)
+plot!(p1, t, X[2,:], label="y", lw=3)Example block output

References

diff --git a/dev/lecture_12/lecture/index.html b/dev/lecture_12/lecture/index.html index d069d5dd..a3d0af79 100644 --- a/dev/lecture_12/lecture/index.html +++ b/dev/lecture_12/lecture/index.html @@ -72,4 +72,4 @@ setmean!(gop::GaussODEProblem,x::AbstractVector) = begin gop.mean.x0[gop.unc_in_u]=x[1:length(gop.unc_in_u)] gop.mean.θ[gop.unc_in_θ]=x[length(gop.unc_in_u).+[1:length(gop.unc_in_θ)]] -end

+end

The constructor accepts an ODEProblem with uncertain numbers and converts it to a GaussODEProblem:

  • goes through ODEProblem $x0$ and $θ$ fields and checks their types
  • replaces GaussNums in ODEProblem by ordinary numbers
  • remembers indices of GaussNum in $x0$ and $θ$
  • copies standard deviations in GaussNum to $sqΣ0$
diff --git a/dev/projects/index.html b/dev/projects/index.html index 30340405..4685684d 100644 --- a/dev/projects/index.html +++ b/dev/projects/index.html @@ -19,4 +19,4 @@ │ └── index.md ├── README.md # describes in short what the pkg does and how to install pkg (e.g. some external deps) and run the example ├── Project.toml # lists all the pkg dependencies -└── Manifest.toml # usually not committed to git as the requirements may be to restrictive

Make sure that

  • README.md is present and contains general information about the package. A small example is a nice to have.
  • The package can be installed trough the package manager as Pkg.add("url of the package") with all and correct dependencies. Do not register the package into an official registry if you are not willing to continue its development and maintainance.
  • Make sure that the package is covered by tests which are in the test folder. We will try to run them. There is no need for 100% percent test coverage. Tests testing the functionality are sufficient.
  • The package should have basic documentation. For small packages, it is sufficient to have documentation in readme. For larger pacakges, proper documentation with Documenter.jl is advised.

Only after all this we may look at the extent of the project and it's difficulty, which may help us in deciding between grades.

Nice to have things, which are not strictly required but obviously improves the score.

  • Ideally the project should be hosted on GitHub, which could have the continuous integration/testing set up.
  • Include some benchmark and profiling code in your examples, which can show us how well you have dealt with the question of performance.
  • Some parallelization attempts either by multi-processing, multi-threading, or CUDA. Do not forget to show the improvement.
  • Documentation with a webpage using Documenter.jl.

Former projects for your inspiration

The following is a list of great projects of past years.

+└── Manifest.toml # usually not committed to git as the requirements may be too restrictive

Make sure that

  • README.md is present and contains general information about the package. A small example is nice to have.
  • The package can be installed through the package manager as Pkg.add("url of the package") with all and correct dependencies. Do not register the package into an official registry if you are not willing to continue its development and maintenance.
  • Make sure that the package is covered by tests which are in the test folder. We will try to run them. There is no need for 100% test coverage. Tests testing the functionality are sufficient.
  • The package should have basic documentation. For small packages, it is sufficient to have the documentation in the readme. For larger packages, proper documentation with Documenter.jl is advised.

Only after all this we may look at the extent of the project and its difficulty, which may help us in deciding between grades.

Nice-to-have things, which are not strictly required but obviously improve the score.

  • Ideally the project should be hosted on GitHub, which could have the continuous integration/testing set up.
  • Include some benchmark and profiling code in your examples, which can show us how well you have dealt with the question of performance.
  • Some parallelization attempts either by multi-processing, multi-threading, or CUDA. Do not forget to show the improvement.
  • Documentation with a webpage using Documenter.jl.

Former projects for your inspiration

The following is a list of great projects of past years.