Code is an increasingly important part of research, whether as R or MATLAB "snippets," integrated documents using Jupyter or RMarkdown, or more complicated workflows that draw on research databases and instrumental measurements.
This workshop is aimed at early career researchers who are working on projects and interested in improving their skills and learning new techniques. Participants will register for GitHub accounts, set up a repository, and learn how to write journal-quality documents that include all the code required to download data, run statistical tests, and produce publication-quality plots.
Participants will be introduced to concepts such as test-driven development and continuous integration to produce research-quality code. This workshop also includes an element meant to help build an Earth Science code cookbook. Participants will survey blog posts, code repositories, and other online resources to discuss strategies for credit and discoverability of the code they and others produce.
We will be using GitHub as our primary platform for tracking our code. GitHub is a free platform owned by Microsoft, and it is only one of a number of platforms available to host code online. You are welcome to use a different platform, but we cannot offer support for it during the workshop.
We will be using ORCID as a tool to provide you with a unique personal identifier as a researcher. ORCID provides a persistent digital identifier (an ORCID iD) that you own and control, and that distinguishes you from every other researcher. ORCID also provides tools for tracking your publications, grants and awards, and tools for linking to and discovering similar products by other researchers.
- Create a GitHub account (note: Register your account as a GitHub Student account to obtain extra benefits)
- Create an ORCID account because you should!
We will be using several pieces of software in this workshop. These are widely used software packages that will serve you for some time to come.
- R is a free statistical software environment that runs on all major platforms (Windows, Mac and Linux). If you already have R, please ensure your version is 4.0 or later.
- RStudio is an Integrated Development Environment (IDE) that is widely used and includes a number of tools to help with scripting.
- `git` is a widely used version control software package that keeps track of changes you make in any kind of file, across multiple projects. NOTE: Most installations only need the default settings.
- `pandoc` is a piece of software that helps users change document formats between Markdown, HTML, PDF, docx and others. Pandoc also supports embedded citations, graphics and other stuff that's cool.
Open up RStudio and, in the console, enter the following:
install.packages(c('revealjs', 'assertthat', 'jsonlite'))
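To confirm that the install step worked, a check along these lines can be run in the same console. This is a minimal sketch and not part of the workshop materials; the variable names `required`, `installed`, and `r_ok` are illustrative.

```r
# Packages named in the install step above
required <- c("revealjs", "assertthat", "jsonlite")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Compare the running R version against the 4.0 minimum asked for above
r_ok <- getRversion() >= "4.0.0"
print(r_ok)
```

Any package that shows `FALSE` can be installed again by name with `install.packages()`.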
Once you've installed your programs, navigate to this RMarkdown document and right-click to Save As. Save it somewhere you'll find it, and then open that file in RStudio. Once it's open, click the Knit button on the toolbar.
Successfully knitting the Rmd file to an HTML output will let us know that your installations of `R`, `RStudio` and `pandoc` are working.
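If the Knit button is not available, roughly the same test can be run from the console. The sketch below assumes the `rmarkdown` package is installed (it is not in the install list above, though RStudio normally offers to install it), and the file name in the final comment is hypothetical.

```r
# Check whether rmarkdown can find a pandoc installation.
# pandoc_ok stays NA if the rmarkdown package itself is missing.
pandoc_ok <- NA
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  pandoc_ok <- rmarkdown::pandoc_available()
  if (pandoc_ok) {
    print(rmarkdown::pandoc_version())
  }
} else {
  message("The rmarkdown package is not installed.")
}
print(pandoc_ok)

# Knitting from the console (hypothetical file name, adjust to your download):
# rmarkdown::render("workshop-test.Rmd")
```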
It Didn't Work: Sometimes the knit button doesn't appear if the file is saved with the wrong file extension. For example, Windows likes adding `.txt` to the end of text files. If this is the case, rename the file to make sure it has a `.Rmd` extension.
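The rename can also be done from the R console. The following is a small sketch under stated assumptions: `fix_rmd_extension()` is a hypothetical helper written for this example, and the demonstration file is created in a temporary directory so that nothing of yours is touched.

```r
# Hypothetical helper: drop a ".txt" that was appended after ".Rmd"
fix_rmd_extension <- function(path) {
  sub("\\.Rmd\\.txt$", ".Rmd", path, ignore.case = TRUE)
}

# Demonstrate on a throwaway file in the session's temporary directory
old_path <- file.path(tempdir(), "example.Rmd.txt")
file.create(old_path)

new_path <- fix_rmd_extension(old_path)
renamed <- file.rename(old_path, new_path)  # TRUE when the rename succeeds

print(basename(new_path))
print(file.exists(new_path))
```

For a real file, replace `old_path` with the location where you saved the download.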
To test whether or not `git` is working, open up a terminal. Linux and macOS users can search for `terminal` in their application launcher (for example, Spotlight on macOS). Windows users can open the Run dialog (Windows key + R) and type in `cmd`. Once that's done, type `git --version`. You should get a result that gives you the current version of `git` that you're running.
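The same check can be run from the R console if you prefer not to open a terminal. This is a sketch that assumes `git` is on the search path of your R session; the variable name `git_path` is illustrative.

```r
# Sys.which() returns the full path to an executable, or "" if it is not found
git_path <- unname(Sys.which("git"))

if (nzchar(git_path)) {
  # Equivalent to typing `git --version` in a terminal
  print(system2(git_path, "--version", stdout = TRUE))
} else {
  message("git was not found on the PATH of this R session.")
}
```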
If you're having problems, feel free to ask for help in the Thesis Is Software Slack channel. This is a public channel. If you aren't a member yet, please feel free to join.
Please note that we will be following both the Code of Conduct for this repository and the Geological Society of America's RISE Slide.
Time (PST) | Talks |
---|---|
10:00am | Land acknowledgment, Introductions |
10:05am | Introduce Throughput |
10:15am | Why build your thesis as software? (Why Software?) |
10:45am | An introduction to GitHub (Slides) |
11:45am | Short Nature break |
12:00pm | Git workflows & gitignore (Slides) |
12:15pm | Active work (http://bit.ly/githubrepos) |
12:45pm | Summarize & Questions |
1:00pm | Lunch-ish Break |
1:30pm | Moving to Markdown (Your Thesis is Code) |
3:00pm | Another Short Nature Break |
3:05pm | Tips & Tricks (Tips Document) |
3:45pm | Questions & Stuff |
4:00pm | End of Workshop, thanks everyone! |
This project is an open project, and contributions are welcome from any individual. All contributors to this project are bound by a code of conduct. Please review and follow this code of conduct as part of your contribution.