Henrique’s laboratory book

For now this is only a scratch, as I’m still reading up others Emacs configs and setting up my mobile Linux environment.

Journal

Here I’ll write my daily tasks and to-dos so I can keep track of what I’m doing.

To-do

Active items:

[-] fix the vector reduction code
- [X] free all allocated memory <2019-07-22 seg>
- [X] verify malloc returns (big allocs are possible) <2019-07-15 seg>
- [X] create verbose flag (compile time!) <2019-07-15 seg>
  - only essential data should be displayed (time, #elem, block size)
  - the rest should be turned on or off at compile time
- [ ] use starpu’s push and pop to tag tasks
- [ ] see why the dag is messy
  - check this out
[-] experiment with the vector reduction code
- [X] create a dedicated experiments folder in my home folder <2019-07-21 dom>
- [ ] do some preliminary experiments to set ideal variable sizes and etc
  - refer and build upon the experiment design R code
  - given some fixed variables, play around the free variables
- [ ] create a slurm script to launch the experiments
  - given that we’ll test a lot of machines, should i create a launcher script?
- [ ] collect data in $SCRATCH folder then move to experiments folder
[ ] analyze the obtained data
- [ ] our metric is time, look for optimized parameter choices

Secondary items:

[ ] learn about paje traces and files
[ ] learn more about r data analysis and plotting
[ ] read that one ondes3d paper

Fluff items (for now):

[ ] finish my config.org file
[ ] read more about openmp and mpi

Finished items:

[X] test the timestamp thingy <2019-07-14 dom>
[X] restock the snacks <2019-07-15 seg>
[X] push the config.org file when home <2019-07-15 seg>

Daily

Here will lie my daily thoughts and daily happenings.

2019-05-02

Today the day was basically dedicated to formatting and installing my distro in my new computer. It has a 4:3 screen, which will surely be kinda funny to work with. Also my computer only have one analog video entry and 2 displayports, for some reason.

Anyway, I also researched and learned a lot about ssh while I was trying to get my public key into portal.inf.ufrgs.br. With Pablo’s and Jean’s help I fixed the permissions of my home directory in the server (the $HOME directory needs the 700 permission to work with ssh! Probably someone messed up a few years back when they created my user).

Tomorrow I’ll finish the setting-up ordeal, I hope.

2019-05-03 StarPU Hello World Examples

Before trying anything with StarPU, I tried to run the first experiment in schnorr/par, which didn’t work. The job quits with the exit code 71, to which I’ve found no information online. I’m kinda tired today but next week I’ll make sure that I talk to either Nesi or Marcelo or Matheus about it.

Also I’ve tried installing StarPU using spack in the cluster, but there was no StarPU package available.

On the other hand, I did create some folder in my user to organize things up and I’ve also set up the ssh keys of my new computer in almost every relevant website.

UPDATE: So, when I got home I continued trying things out. I’ve tried to allocate some nodes to try the simplest experiment I’ve tried earlier and, after playing around and learning Slurm commands, I’ve noticed that I cant ssh into any node because my RSA key doesn’t match the one in the cluster (or doesn’t exist at all there). Maybe that’s the culprit for me not being able to even get the simplest example running through sbatch? I’ll contact Schnorr about this.

StarPU “Hello World”

Install preliminary software

spack

See https://github.com/spack/spack to do:

git clone https://github.com/spack/spack.git
source spack/share/spack/setup-env.sh
spack find

Then, add the solverstack from the INRIA GitLab:

git clone https://gitlab.inria.fr/solverstack/spack-repo.git solverstack
spack repo add solverstack/

starpu with spack

spack info starpu

Verify options, then:

spack install starpu@master~cuda~examples~fast+fortran+fxt+mlr~mpi~nmad~opencl~openmp+poti+shared~simgrid~simgridmc~verbose

This might take some time, do it in the cluster.

Confirm the location of where starpu has been installed

spack location -i starpu

StarPU client code of two examples

There are two examples:

programa.c (simple one-task hello world)
vector_scal.c (multiply a vector by a scalar in parallel)

See contents in ./experiments/starpu/hello-world/.

Please note that we are using CMake to find the StarPU libraries.

The, do the following steps (try to understand each one).

Make sure you have spack in your PATH variable before going forward.

cd src/starpu-hello-world
mkdir -p build
cd build
cmake -DSTARPU_DIR=$(spack location -i starpu) ..
make

You’ll have two binaries: programa and vector_scal.

Verify that they have the correct libraries linked with ldd.

Run both by launching these binaries in your CLI.

2019-05-06

Today I ran the hello.slurm file from the first experiment of schnorr/par. I had to do some modifications to the script so that it would actually find the executable (as it wasn’t finding inside the folder I was running sbatch from, even though it had no trouble compiling it).

Also I’ve added info about MPI in the External Resources section, which are really just some tutorials and introductions to the matter. I found the MPI interface to be rather cumbersome with its C-like functions and inits. Doesn’t a proper C++ wrapper exist somewhere? Maybe that takes away part of the complexity of the syntax choices. I’ll look around.

Also, I’m kinda becoming really attached to my Emacs development environment. I’ve gathered quite a few nice .org configs and I’m making my own now at this link.

2019-05-08

I studied a lot of database fundamentals, as I had it’s exam by afternoon.

2019-05-09

I started the day by reading about and learning tmux, which is, as it’s called, an “terminal multiplexer”. Knowing how to use tmux will help me to run commands and close the ssh connection, leaving the session open so I can easily come back and resume the operations and tasks I was performing.

Also, I read the LLNL’s tutorial on Linux clusters and gathered a lot of new resources to complement my External resources section (besides learning a lot, obviously).

2019-05-10

Today I started the day by fixing the multiple tmux sessions while ssh‘ing. So, the issue was that, when I ssh‘ed into the GPPD front-end, I’d make a check in the .bashrc to see if there was a session opened (named “ssh_s”) and attach to it. Thing is, all nodes share the .bashrc file, and this would happen when I ssh into the nodes as well.

# Start a tmux session automatically if coming in from ssh.
if [[ -z "$TMUX" ]] && [ "$SSH_CONNECTION" != "" ]; then
    tmux attach-session -t ssh_s || tmux new-session -s ssh_s
fi

To fix this, Matheus suggested that I should add an additional check to the if statement to see the name of the host and only open a new session if the host was gppd-hpc:

# Start a tmux session automatically if coming in from ssh.
if [[ -z "$TMUX" ]] && [ "$SSH_CONNECTION" != "" ] && [ `hostname` == "gppd-hpc" ]; then
    tmux attach-session -t ssh_s || tmux new-session -s ssh_s
fi

I also furthered the development of my org configuration file for Emacs, and very soon I’ll be able to test it, initially still with Prelude and then on pure Emacs.

Besides that, I talked with professor Erika about the roles of an IC and the research process and methodologies. She was very helpful, as always. After that, I talked to Schnorr and arranged a meeting next tuesday to talk about that and some other things. I shall make a new heading in the “Meetings” to put all the topics I wish to talk about there.

2019-05-13

As of lunch time, I’ve updated the resources directory and added a new heading for tomorrow’s meeting, in which I’ve added the topics I wish to discuss.

2019-05-14

I added a bunch of info on reproducible analysis using R and I’m currently watching a video on org-mode and reproducible research while I wait for the meeting.

2019-05-16

We decided in the last meeting that I should modify the StarPU vector example to do a reduction of the generated vectors. Also I’ve proposed an object-oriented approach to the problem using C++, so what I’ll do first is set up my Emacs environment and learn CMake.

Update: Yesterday I was so tired I forgot to push. Also, I had some issues with a short circuit in my desktop. Thankfully I solved it by removing the CD drive, which probably was grounding the motherboard.

2019-05-17

My Emacs configuration file has advanced a lot in the last few days. From yesterday until today I’ve been trying to get the cmake-ide package to work. Even though I’ve been failing pretty miserably, I’m getting close.

Here’s the link to my config file, by the way.

2019-05-20

I had to scramble in the morning to finish part of an assignment that one of my group colleagues couldn’t finish and which presentation was also today. For that I couldn’t contribute or work in my scholarship project.

2019-05-21

Today I researched a bunch about CMake and how to structure a project that uses it. CMake in itself is very powerful but with it you can use something like the Ninja build system, which greatly speeds up the build process as it is asynchronous in nature.

I did advanced somewhat in the making of my CMakeLists.txt, but not enough in my opinion. I’m taking too long in small details (such as this whole CMake thing). My primary focus should be to just get it working, as the whole ideal of creating wrapper classes for the StarPU concepts will already be enough of a challenge.

In other news, I’m kinda overwhelmed emotionally right now so it’s very hard to keep my focus on things. These are personal issues, I know, but I should be clear about it, as it impacts my abilities to be effective and to make progress in my scholarship goals.

2019-05-23

Changed the project structure, finished the CMake files and thought more about the wrapper classes and their possible solutions.

2019-05-24

Today I advanced somewhat on building the wrapper classes to StarPU, but, while I read the documentation, I noticed that the task isn’t even easy to begin with. After talking to Schnorr about some questions I had, we decided that if I focus into getting the vector reduction going I could more easily start working in more complex applications of StarPU.

So, we defined that next tuesday, 28/05, I should deliver the code so that we analyze it together.

2019-05-27

I’ve modified the ./experiments/starpu/vector-reduction/vector_scal.cc code and now it should do the reduction as expected. I couldn’t test it though, as I’ve failed to link properly the StarPU libraries. I’ll keep trying tomorrow.

2019-05-28

With Nesi’s help I was able to compile my vector testing. The whole fundamentals of how should each task perform its job and, if necessary, write its results to a memory handle (which are registered so there is sharing of data between tasks) I understood. To me, it isn’t very clear how you would partition an application to take advantage of said task-based parallelism (and I think this is the important part).

If I try for long enough, I can get a working version of this code going, but then what’s the point if I don’t know how to take advantage of my know-how (in terms of “I somewhat know how to build a simple StarPU application”)? Also, I tried looking for the slides from the PCAM class but I didn’t find them.

2019-06-07

Today I’ve talked to Schnorr about my interest in staying in the group and in the new theme of the internship project (2019 - 2020).

Also I’ve discussed with him the preparations for the SIC2019. I’ll write a summary about my internship so far and the themes it encompasses (the deadline is 21/06).

2019-06-08

So far the summary has a nice looking title and authors section. Anyway, I’ve talked to Valeria yesterday and she sent me her summary for last year’s SIC. I’ll use it as reference when I start making mine.

2019-06-09

I’ve reorganized the starpu-cpp repository, which for now stays private. I have no intentions of making it public any time soon, as the StarPU project uses a custom version of the GPL-v3 and its purpose is for my benefit only. When it’s working I’ll consider making it public.

I intend to write some more of the SIC summary today, but I’ll focus on trying to finish the vector reduction code.

2019-06-14

I’ve fixed the CMakeLists from the vector reduction code and now it works! Also I’ve made some helper functions and the code is now easier to read.

2019-06-15

The vector reduction code is now working! The development cycle was greatly diminished when I installed StarPU in my computer (go figure huh).

There are some not-that-great fixes to make the code work, but in my opinion it’s pretty good.

2019-06-16

There is a much simpler way to do the data partitioning between the tasks. Here follow some links to help me in the future:

get_sub_data
how to partition data
starpu memory pin

2019-06-17

The following link is really useful when you’re introducing loads of concepts of parallel computing: https://computing.llnl.gov/tutorials/parallel_comp/

2019-06-18

Today I’ve talked to Schnorr and defined that finishing the summary text for the SIC 2019 is the objective for now. We have defined some points of improvement in the text and what the last paragraph should talk about.

2019-06-19

With Nesi’s help I finished the summary text for SIC 2019. I think there’s not much else to add, but I suppose we could add some small executions of the code? Though talking about that would require more text space.

2019-06-22

I’ve made a working vector reduction using StarPU’s vector partition and unpartition (as in using sub-handles and such). Some preliminary testing has shown that it works kinda well.

2019-06-23

Small code fix and that’s it.

2019-07-02

In order to get myself back on track, I’ll do here a to-do list of what I think should be done next.

To-do:

Experiment with the vector reduction code
Talk more with people about writing that article to WSCAD
Make a vector or matrix multiplication version

2019-07-14

So, my semester has ended! I’m updating the to-do list and getting what I need to do under control. That being said, I should recap things with either Nesi or Schnorr.

2019-07-15

It’s one thing to check the malloc returns and to create a macro to print log messages, but it’s a whole different ordeal to free all mallocs with StarPU. I’ll look into the runtime’s own ways to do this.

Also (and kinda related to the previous point) I should check out the supported data reduction mechanism that StarPU provides. If I follow the rules of the game, the malloc freeing thing shouldn’t be an issue.

2019-07-16

Today there was not much progress today but I did some reading of papers!

2019-07-22

Today I advanced a little bit more. I’m doing a bit of a late shift here in the lab, as I prefer doing this to waking up early.

2019-07-24

Today I’ll hopefully finish the slurm script.

UPDATE: I did not.

2019-07-25

Again, a slow day. I’ve noticed my focus isn’t on point lately. I’ll try to work from home the next days.

2019-07-29

Today I made some great advances to run the script that issues sbatches. I’ll try to run it when I get home (of course, using just one node while I test it).

2019-08-09

Setting up a ssh-server on my lab machine: (first name it again, but this time be creative)

https://dev.to/zduey/how-to-set-up-an-ssh-server-on-a-home-computer
https://www.makeuseof.com/tag/beginners-guide-setting-ssh-linux-testing-setup/
https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys-on-ubuntu-1604

Meetings

This could stay inside its respective entry in the daily journal, but I think that separating meetings from the dailies is better.

2019-04-30 Tips for ORG-Mode

See the attached file in ./attachments/init.org, or follow the update instructions here that points to the learninglab.

2019-05-14 Meeting

Topic I want to talk about:

Current learning stack/path: as exposed in the learning path
Current progression: in terms of task completion rate
Organization and discipline: assiduity, compromise and hours completed

Goals:

[ ] Change starpu hello-world vector_scal.cc to have a new task with a new code to compute the reduction of the resulting vectors. The reduction has to be the sum operation.
[ ] Implement a new starpu program to compute the dot product as defined in https://pt.wikipedia.org/wiki/Produto_escalar

Think about:

[ ] Try to remember how the LU decomposition algorithm works, and think about how to implement using tasks.
[ ] How to implement the Mandelbrot with StarPU tasks?
- Promote discussing about scheduling algorithms
- Promote discussing about load imbalance

2019-05-28 Meeting

Fixed implementation of vector_scal

To-do:

Finish the fixed implementation
- Use valgrind to verify memory leaks
- Make sure all leaks are gone
  - All numbers reported by Valgrind should be zero
Do a multi-level reduction scheme using an additional parameter that will tell you how much aggregation is carried out in each level
Think about an application you are interested in
- It can be some simulation, whatever
- By default, we go to some linear algebra factorization
Perhaps change the vector_scal problem to a vector_multiplication
- The initial task cpu_func will have two implementations, one in CPU and another for GPU (in this case, use tupi1 with 2 GPUs)
Create a SLURM script to run all experiments
- Check ERAD/RS shortcourse https://gitlab.com/schnorr/erad19 (tutorial slurm) http://gppd-hpc.inf.ufrgs.br/ http://gppd-hpc.inf.ufrgs.br/#orga79ce5a (5.2 Jobs Não-Interativos (sbatch))

cmake -DSTARPU_DIR=$(spack location -i starpu) ..

Or use stow for a more amateur approach.

2019-06-11 Meeting

See ./documents/sic-2019/summary.org.

2019-06-18 Meeting

See ./documents/sic-2019/summary.org.

2019-07-03 Meeting

Bureaucracy with next scholarship
- Deadline 22/07 (Henrique resolve)
Discussion about the SIC 2019 Poster
- 15/08 a 16/09, according to http://www.ufrgs.br/propesq1/sic2019/wp-content/uploads/2019/05/Cronograma-DIVULGA%C3%87%C3%83O-SIC-2019.pdf
Data da semana acadêmica mudou para
- De 21 a 25/10/2019

vector_reduc

[ ] Valgrind check: make the run fully clean (all zeroes at the end)
[ ] Verify all malloc calls and exit cleanly if they return zero
[ ] Remove debug messages when in production
- Keep only fundamental statistics and messages about the run like
  - elapsed time
  - number of elements
  - block size
  - …
[ ] Use startpu iteration push and pop to automatically tag tasks against your main loop iteration which basically represents the level of the reduction
- http://starpu.gforge.inria.fr/doc/html/group__API__Codelet__And__Tasks.html#gad3adbc7185e231bf23c94c76b85c3047
[ ] Try to understand why the DAG is messy

Deal with trace files from vector_reduc

Take a look at https://github.com/schnorr/starvz/tree/master/src
- Copy fxt2paje and paje_sort

Usage example:

pushd ~/svn/henrique/ic/code/starpu/vector-reduction/build/
../bin/vector_reduc 1000 50 2
popd
source ~/spack/share/spack/setup-env.sh
export PATH=$(spack location -i starpu/l43k3yq)/bin/:$PATH
wget -nc https://raw.githubusercontent.com/schnorr/starvz/master/src/fxt2paje.sh
wget -nc https://raw.githubusercontent.com/schnorr/starvz/master/src/paje_sort.sh
chmod 755 fxt2paje.sh paje_sort.sh
export PATH=$(pwd):$PATH
mkdir -p /tmp/teste/
cp /tmp/prof_file_* /tmp/teste/
cd /tmp/teste/
fxt2paje.sh
twopi dag.dot -Tpng -o x.png
pj_dump --user-defined paje.sorted.trace > paje.sorted.csv
cat paje.sorted.csv | grep ^State | grep Worker\ State | grep Reduction | grep -v "0.000000, 0.000000" > rastro.csv
cat rastro.csv

[ ] Read about pj_dump (the CSV output)
- https://github.com/schnorr/pajeng/wiki/pj_dump
[ ] Learn about http://paje.sourceforge.net/
- https://github.com/schnorr/pajeng/raw/master/doc/lang-paje/lang-paje.pdf

Read rastro.csv in R.

suppressMessages(library(tidyverse))
read_csv("/tmp/teste/rastro.csv", col_names=FALSE, col_types=cols()) %>%
    select(-X1, -X3, -X7) %>%
    rename(Thread = X2,
           Start = X4,
           End = X5,
           Duration = X6,
           State = X8) %>%
    mutate(Thread = gsub("CPU", "", Thread) %>% as.integer) %>%
    mutate(End = End - min(Start),
           Start = Start - min(Start)) %>%
    print -> df;

# A tibble: 102 x 16
   Thread Start   End Duration State    X9 X10   X11   X12     X13   X14   X15
    <int> <dbl> <dbl>    <dbl> <chr> <dbl> <chr> <chr> <chr> <dbl> <dbl> <dbl>
 1      1 0.330 0.335  0.00546 Redu…    84 V20x… bc46… 0000…    55    55     0
 2      1 0.343 0.347  0.00348 Redu…    84 V20x… bc46… 0000…    61    61     0
 3      1 0.353 0.356  0.00329 Redu…    84 V20x… bc46… 0000…    65    65     0
 4      1 0.363 0.366  0.00321 Redu…    84 V20x… bc46… 0000…    69    69     0
 5      1 0.373 0.376  0.00316 Redu…    84 V20x… bc46… 0000…    73    73     0
 6      1 0.383 0.386  0.00328 Redu…    84 V20x… bc46… 0000…    77    77     0
 7      1 0.393 0.396  0.00335 Redu…    84 V20x… bc46… 0000…    79    79     0
 8      1 0.403 0.406  0.00333 Redu…    84 V20x… bc46… 0000…    81    81     0
 9      1 0.413 0.417  0.00356 Redu…    84 V20x… bc46… 0000…    85    85     0
10      1 0.423 0.426  0.00322 Redu…    84 V20x… bc46… 0000…    89    89     0
# … with 92 more rows, and 4 more variables: X16 <dbl>, X17 <dbl>, X18 <dbl>,
#   X19 <dbl>

df %>%
    ggplot(aes(xmin=Start, xmax=End, ymin=Thread, ymax=Thread+0.9, fill=State)) +
    geom_rect() +
    theme_bw(base_size=16) +
    theme(legend.position="top",
          legend.justification="left")

2019-07-19 Meeting

Talk about possible future paths:

Partial Differential Equations
1D CFD (rod as in 3blue1brown)
Ondes3D
Gaps in the DAG (aka gapness of scheduler decisions)

About the current objective (DAG and StarPU):

Full factorial design -> CSV -> Slurm script -> execute
- Check ERAD/RS 2019 mini course “Boas práticas”

library(DoE.base)
library(tidyverse)

size = c("P", "M", "G")
nb = c("P", "M", "G")
fr = c("P", "M", "G")

fac.design (
    nfactors=3,
    replications=10,
    repeat.only=FALSE,
    blocks=1,
    randomize=TRUE,
    seed=10373,
    factor.names=list(
        Size=size,
        NumberOfBlocks=nb,
        ReductionFactor=fr)) %>%
    as_tibble %>%
    select(-Blocks) %>%
    write_csv("exp1.csv")

:

creating full factorial with 27 runs ...

For WSPPD we established that this small case study should be our object. So, we analyze the execution times for the experiment above, given the defined variables, in a bunch of partitions in the cluster.

Experiments

Here I’ll list the experiments I have or am doing at the moment.

Vector Reduction

External resources

Here I’ll categorize useful resources I’ve found while “aggressively” googling and/or reading papers and other documents.

Linux

Any useful Linux knowledge relevant to my activities should stay here.

tmux

tmux is a terminal multiplexer for Unix-like operating systems. It allows multiple terminal sessions to be accessed simultaneously in a single window. It is useful for running more than one command-line program at the same time. It can also be used to detach processes from their controlling terminals, allowing SSH sessions to remain active without being visible.

Tutorials:

https://edricteo.com/tmux-tutorial/
https://hackernoon.com/a-gentle-introduction-to-tmux-8d784c404340
https://danielmiessler.com/study/tmux/

Servers

Here lies all knowledge I don’t possess about servers and cluster and so on and so forth.

Clusters

Definition

Tutorials:

IBM From 2002 but still explains a lot of the fundamental concepts.
LLNL Huge! Includes exercises, Slurm, GPU clusters, and much more.
Wikipedia Explains pretty well in layman terms what is a cluster.

Slurm

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.

Documentation

Tutorials:

Documentation tutorial
LLNL’s tutorial

Useful commands:

sacct: is used to report job or job step accounting information about active or completed jobs.
salloc: is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell.
sattach: is used to attach standard input, output, and error plus signal capabilities to a currently running job or job step. One can attach to and detach from jobs multiple times.
sbatch: is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
sbcast: is used to transfer a file from local disk to local disk on the nodes allocated to a job.
scancel: is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
sinfo: reports the state of partitions and nodes managed by Slurm. It has a wide variety of filtering, sorting, and formatting options.
smap: reports state information for jobs, partitions, and nodes managed by Slurm, but graphically displays the information to reflect network topology.
squeue: reports the state of jobs or job steps. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
srun: is used to submit a job for execution or initiate job steps in real time.
strigger: is used to set, get or view event triggers. Event triggers include things such as nodes going down or jobs approaching their time limit.
sview: is a graphical user interface to get and update state information for jobs, partitions, and nodes managed by Slurm.

All command’s manuals are in man, so no worries if this is to little info.

Spack

Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments. It was designed for large supercomputing centers, where many users and application teams share common installations of software on clusters with exotic architectures, using libraries that do not have a standard ABI.

GitHub page
Documentation
- Tutorial

PCAD

The GPPD manages the High Performance Computation Park (PCAD) and is the group I’m part of!

Presentation

Programming

Here lies all programming and HPC-related knowledge.

MPI

Message Passing Interface (MPI) is a standardized and portable message-passing standard designed by a group of researchers from academia and industry to function on a wide variety of parallel computing architectures.

Wikipedia
LLNL’s Tutorial

C++ wrappers

I’ve gathered some info about MPI wrappers for C++ (because I like both simplicity and C++).

2012 state of affairs

Examples:

boost.mpi
mxx

So it seems to me that either the community has no interest in bindings and simplicity or things move really slowly when it comes to standards proposed by scholars and academics.

CUDA

CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.It allows software developers and software engineers to use a CUDA-enabled graphics processing unit (GPU) for general purpose processing — an approach termed GPGPU (General-Purpose computing on Graphics Processing Units).

Tutorials:

NVIDIA slides
Oxford course
LLNL’s tutorial

CMake

CMake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native makefiles and workspaces that can be used in the compiler environment of your choice.

Tutorials:

A bunch of examples
With the Ninja build system
Somewhat extensive tutorial
Empirical approach to CMake

Useful links:

Wiki page
Useful variables
All variables

StarPU

StarPU is a software tool aiming to allow programmers to exploit the computing power of the available CPUs and GPUs, while relieving them from the need to specially adapt their programs to the target machine and processing units.

Documentation

Tutorials:

Huge tutorial!

Design of parallel applications

Parallel algorithm design is not easily reduced to simple recipes. Rather, it requires the sort of integrative thought that is commonly referred to as “creativity.” However, it can benefit from a methodical approach that maximizes the range of options considered, that provides mechanisms for evaluating alternatives, and that reduces the cost of backtracking from bad choices.

Slides:

Gordon Uni
Iowa Uni
USSC

Pages:

Argonne National Laboratory

Research methodology

Everything related from writing to research methodology should stay here.

Literate programming

Literate programming is a programming paradigm introduced by Donald Knuth in which a program is given as an explanation of the program logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which a compilable source code can be generated.

Literate programming can be easily achieved using .org files, as they provide text intertwined together with source code blocks, as well as providing a way to compile these code blocks into one or multiple source files and to execute that code natively.

Donald Knuth’s original paper is attached to this heading as a reference.

Reproducible analysis

The term reproducible research refers to the idea that the ultimate product of academic research is the paper along with the laboratory notebooks and full computational environment used to produce the results in the paper such as the code, data, etc. that can be used to reproduce the results and create new work based on the research.

Essential to research as a whole, reproducible analysis allows the researcher to establish trust, even years after arriving to results, in his conclusions. Using common methods comprising data, annotations and code such as a Jupyter notebook or a .org file using R script in code blocks, following the literate programming paradigm.

Tutorials about how this topic is dealt in the R realm:

R reproducible analysis

General culture about this sensitive topic: “The Irreproducibility Crisis of Modern Science: Causes, Consequences, and the Road to Reform” par Randall et Welser, 2018.

In French by Arnaud Legrand and colleagues: https://alegrand.github.io/bookrr/

Reproducible research

…

Documents and presentations

Here I’ll put everything related to creating quality presentations and documents overall.

Posters

O regramento é que o poster deve ser de 120cm de altura por 80cm de largura, equipados com madeira (na parte superior) e corda para fixação nos suportes. No pôster devem constar o título do trabalho, nomes dos autores e respectivas afiliações. É bem importante levar em conta que o papel do pôster é dar uma visão geral do trabalho, logo algumas dicas importantes podem ser levadas em conta, tais como pouco texto (como em uma apresentação de slides), uso de figuras para transmitir ideias, ser legível a pelo menos dois metros de distância.

Tutorials:

https://guides.nyu.edu/posters
http://hsp.berkeley.edu/sites/default/files/ScientificPosters.pdf

Project

Here’s everything about my scholarship planning and project as a whole.

Schedule

Here is the intended project schedule to me:

Activity	May	June	July
State of the art / StarPU	x	x
Experimentation	x	x
Performance analysis		x	x
Report writing			x

Learning path

ssh and systems programming
linux servers
clusters and cluster management
parallel programming
task-based programming and message passing interfaces
starpu
performance experiments
methodology of result-gathering
analysis of data
reproductible analysis
text structuring
writing of scientific reports

Files

lab-book.org

Latest commit

History

lab-book.org

File metadata and controls

Henrique’s laboratory book

Table of contents

Journal

To-do

Archive

Daily

2019-05-02

2019-05-03 StarPU Hello World Examples

StarPU “Hello World”

Install preliminary software

spack

starpu with spack

StarPU client code of two examples

2019-05-06

2019-05-08

2019-05-09

2019-05-10

2019-05-13

2019-05-14

2019-05-16

2019-05-17

2019-05-20

2019-05-21

2019-05-23

2019-05-24

2019-05-27

2019-05-28

2019-06-07

2019-06-08

2019-06-09

2019-06-14

2019-06-15

2019-06-16

2019-06-17

2019-06-18

2019-06-19

2019-06-22

2019-06-23

2019-07-02

2019-07-14

2019-07-15

2019-07-16

2019-07-22

2019-07-24

2019-07-25

2019-07-29

2019-08-09

Meetings

2019-04-30 Tips for ORG-Mode

2019-05-14 Meeting

2019-05-28 Meeting

2019-06-11 Meeting

2019-06-18 Meeting

2019-07-03 Meeting

2019-07-19 Meeting

Experiments

External resources

Linux

tmux

Servers

Clusters

Slurm

Useful commands:

Spack

PCAD

Programming

MPI

C++ wrappers

CUDA

CMake

StarPU

Design of parallel applications

Research methodology

Literate programming