diff --git a/.github/workflows/static.yml b/.github/workflows/static.yml
index e4fbca6..71287e6 100644
--- a/.github/workflows/static.yml
+++ b/.github/workflows/static.yml
@@ -1,43 +1,57 @@
-# Simple workflow for deploying static content to GitHub Pages
 name: Deploy static content to Pages
 
 on:
-  # Runs on pushes targeting the default branch
   push:
     branches: ["main"]
-
-  # Allows you to run this workflow manually from the Actions tab
   workflow_dispatch:
 
-# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
 permissions:
   contents: read
   pages: write
   id-token: write
 
-# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
-# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
 concurrency:
   group: "pages"
   cancel-in-progress: false
 
 jobs:
-  # Single deploy job since we're just deploying
-  deploy:
+  build-and-deploy:
     environment:
       name: github-pages
       url: ${{ steps.deployment.outputs.page_url }}
-    runs-on: ubuntu-latest
+    runs-on: macos-latest
     steps:
       - name: Checkout
         uses: actions/checkout@v4
+
+      - name: Install pandoc and pandoc-crossref
+        run: |
+          brew install pandoc
+          brew install make
+          brew install pandoc-crossref
+
+      - name: Setup Pandoc
+        uses: pandoc/actions/setup@main
+        # with:
+        #   version: 2.19  # Uncomment and modify if you need a specific version
+
+      - name: Clean build directory
+        run: |
+          rm -rf build
+          mkdir -p build/html
+
+      - name: Build documentation
+        run: make
+
       - name: Setup Pages
         uses: actions/configure-pages@v5
+
       - name: Upload artifact
         uses: actions/upload-pages-artifact@v3
         with:
-          # Upload entire repository
           path: 'build/html/'
+
       - name: Deploy to GitHub Pages
         id: deployment
         uses: actions/deploy-pages@v4
+
diff --git a/.gitignore b/.gitignore
index bd902b2..89cf4d0 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@ build/**/*.png
 build/docx/
 build/epub/
 build/pdf/
+build/html/
 
 # Misc
 *.log
diff --git a/build/html/c/01-introduction.html b/build/html/c/01-introduction.html
deleted file mode 100644
index b27135c..0000000
--- a/build/html/c/01-introduction.html
+++ /dev/null
@@ -1,776 +0,0 @@
- - - - - - - - - - Reinforcement Learning from Human Feedback - - - - -
-

Reinforcement Learning from Human Feedback

-

Nathan Lambert

-

24 May 2024

-
-
Home
-

Chapter Contents

- -
-

Introduction

-

This is the first paragraph of the introduction chapter. This is a - test of citing [1].

-

First: Images

-

This is the first subsection. Please admire the gloriousness of - this seagull:

-
- A cool seagull. - -
-

A bigger seagull:

-
- A cool big seagull. - -
-

Second: Tables

-

This is the second subsection.

-

Please, check First: Images - subsection.

-

Please, check this subsection.

- - - - - - - - - - - - - - - - - - - - - - -
This is an example table.
Index | Name
0 | AAA
1 | BBB
-

Third: Equations

-

Formula example: \mu = \sum_{i=0}^{N} \frac{x_i}{N}

-

Now, full size:

-

\mu = \sum_{i=0}^{N} \frac{x_i}{N}

-

And a code sample:

-
def hello_world
-  puts "hello world!"
-end
-
-hello_world
-

Check these unicode characters: ǽߢð€đŋμ

-

Fourth: Cross references

-

These cross references are disabled by default. To enable them, - check the Cross - references section in the README.md file.

-

Here’s a list of cross references:

- -
- Figure 1: A cool seagull - -
-

y = mx + b \qquad{(1)}

-
- - - - - - - - - - - - - - - - - - - - - - -
Table 1: This is an example table.
Index | Name
0 | AAA
1 | BBB
-
-

Bibliography

-
-
-
[1]
N. Lambert, T. K. Gilbert, and T. Zick, - “Entangled preferences: The history and risks of reinforcement - learning and human feedback,” arXiv preprint - arXiv:2310.13595, 2023.
-
-
-
- - -
diff --git a/build/html/c/02-installation.html b/build/html/c/02-installation.html
deleted file mode 100644
index 4d983c2..0000000
--- a/build/html/c/02-installation.html
+++ /dev/null
@@ -1,604 +0,0 @@
- - - - - - - - - - Reinforcement Learning from Human Feedback - - - - -
-

Reinforcement Learning from Human Feedback

-

Nathan Lambert

-

24 May 2024

-
-
Home
-

Chapter Contents

- -
-

Installation

-

This is the installation chapter. We love the book [1].

-

For further information, check the [Introduction] chapter.

-

For further information, check this - chapter.

-

For further information, check this chapter’s - subsection section.

-

Bibliography

-
-
-
[1]
S. J. Russell and P. Norvig, Artificial - intelligence: A modern approach. Pearson, 2016.
-
-
-
- - -
diff --git a/build/html/c/03-usage.html b/build/html/c/03-usage.html
deleted file mode 100644
index c9f554f..0000000
--- a/build/html/c/03-usage.html
+++ /dev/null
@@ -1,566 +0,0 @@
- - - - - - - - - - Reinforcement Learning from Human Feedback - - - - -
-

Reinforcement Learning from Human Feedback

-

Nathan Lambert

-

24 May 2024

-
-
Home
-

Chapter Contents

- -
-

Usage

-

This is the usage chapter.

-
- - -
diff --git a/build/html/c/04-references.html b/build/html/c/04-references.html
deleted file mode 100644
index 70a1a66..0000000
--- a/build/html/c/04-references.html
+++ /dev/null
@@ -1,570 +0,0 @@
- - - - - - - - - - Reinforcement Learning from Human Feedback - - - - -
-

Reinforcement Learning from Human Feedback

-

Nathan Lambert

-

24 May 2024

-
-
Home
-

Chapter Contents

- -
-

References

- -
- - -
diff --git a/build/html/favicon.ico b/build/html/favicon.ico
deleted file mode 100644
index 6d2d5a9..0000000
Binary files a/build/html/favicon.ico and /dev/null differ
diff --git a/build/html/index.html b/build/html/index.html
deleted file mode 100644
index 1ef75f1..0000000
--- a/build/html/index.html
+++ /dev/null
@@ -1,656 +0,0 @@
- - - - - - - - - - Reinforcement Learning from Human Feedback - - - - -
-

Reinforcement Learning from Human Feedback

-

Nathan Lambert

-

24 May 2024

-
-

Abstract

-

An introduction to reinforcement learning from human feedback - (RLHF). Covering the core topics, from the history of preferences, to - early work in language models, to future frontiers with Direct - Preference Optimization (DPO) and everything that follows. Built - iteratively, web first, with purchase available when completed.

-

Contents

-
-
  1. Introduction
-
  2. TODO
-
  3. TODO
-
  4. TODO
-
-
-
- - - -