yegor256#532 integration puzzles
h1alexbel committed Jan 4, 2024
1 parent b5ce1be commit ac11559
Showing 5 changed files with 77 additions and 10 deletions.
26 changes: 26 additions & 0 deletions .github/workflows/plantuml.yml
@@ -0,0 +1,26 @@
name: plantuml
on:
  push:
    paths:
      - '**.puml'
    branches:
      - master
permissions:
  contents: write
jobs:
  plantuml:
    runs-on: ubuntu-22.04
    steps:
      - name: Checkout Source
        uses: actions/checkout@v4
      - name: Generate SVG Diagrams
        uses: holowinski/plantuml-github-action@main
        with:
          args: -v -tsvg doc/*.puml
      - name: Commit changes
        uses: EndBug/add-and-commit@v9
        with:
          author_name: ${{ secrets.USERNAME }}
          author_email: ${{ secrets.EMAIL }}
          message: 'Diagram generated'
          add: 'doc/*'
14 changes: 14 additions & 0 deletions doc/integration.puml
@@ -0,0 +1,14 @@
@startuml
title LinearModel 0pdd Integration
participant "Git Repo" as repo
participant 0pdd
participant LinearModel as lm

0pdd -> repo
repo --> 0pdd: .0pdd.yml
alt model: true
  0pdd -> lm: Puzzles
  lm --> 0pdd: Ranked puzzles
  0pdd --> repo: Ranked puzzles
end
@enduml
23 changes: 14 additions & 9 deletions model/README.md
@@ -1,18 +1,18 @@
Puzzle Ranking (Linear ML Model)

###### Note: This is an opt-in feature

The data for puzzles is pre-processed and available in `~/data/proper_pdd_data_regression.csv`. In the data, the first row is the column index, the first column is the id of the repo the puzzle belongs to, and the last column is the output variable (`y`).

### Internals

The ML model is a linear model with a PSO optimizer.
The optimizer is used to train the model on puzzle data;
the weights are stored and used to rank future puzzles.

Because training takes time, it runs as a non-blocking process,
and until weights are stored, puzzle prioritization uses a naive ranking
based on the puzzle estimate.
Subsequent events use the linear model for prioritization.

The linear model is the external API for the model.
It has one method, `predict(...)`, which accepts an array of puzzles in XML.
The output is an array of positional indices of the input puzzles:

```ruby
# usage
@@ -25,3 +25,8 @@ rank = LinearModel.new(repo_name, storage).predict(puzzles)
#
# rank -> array of positional index of ranked puzzles
```
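
To make "array of positional indices" concrete, here is a minimal sketch of a naive ranking by estimate. It is not the code in `model/linear.rb`: the name `naive_rank_sketch`, the use of plain hashes instead of XML puzzles, and the ascending sort order are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for the naive ranking; not the code in model/linear.rb.
# Puzzles are plain hashes here (an assumption), not XML.
def naive_rank_sketch(puzzles)
  # Pair each puzzle with its position, sort by estimate (ascending order is
  # an assumption), break ties by position, and keep only the positions.
  puzzles.each_with_index
         .sort_by { |puzzle, index| [puzzle[:estimate], index] }
         .map { |_puzzle, index| index }
end

puzzles = [
  { id: '532-a', estimate: 60 },
  { id: '532-b', estimate: 15 },
  { id: '532-c', estimate: 30 }
]
rank = naive_rank_sketch(puzzles)
# rank is [1, 2, 0]: the puzzle at position 1 comes first, then 2, then 0

# The caller reorders the original input using the returned positions.
ordered = rank.map { |index| puzzles[index] }
```

Returning positions rather than reordered puzzles leaves the input untouched, so the caller decides how to apply the ranking.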

### Integration

This diagram shows how this model can be integrated into the 0pdd workflow:
![integration.svg](../doc/integration.svg)
19 changes: 18 additions & 1 deletion model/linear.rb
@@ -27,6 +27,9 @@

#
# Linear Model
# @todo #532 Add unit-tests.
# We should add unit-tests for this class that check puzzle ranking.
# For now it's untested. Don't forget to remove this puzzle.
#
class LinearModel
  def initialize(repo, storage)
@@ -46,9 +49,23 @@ def initialize(repo, storage)
    end
  end

  # Ranks the puzzles using machine learning
  # @param puzzles XML puzzles
  # @return array of positional indices of the input puzzles
  # @todo #532 Implement ranked puzzles.
  # Let's implement a class that will use `LinearModel` to rank puzzles.
  # This class is needed in order to integrate the original 0pdd
  # with the model modules. It can probably be a decorator for `Puzzles`
  # that ranks XML puzzles, and then submits them into `Puzzles`.
  # Don't forget to remove this puzzle.
  def predict(puzzles)
    weights = @storage.load # load weights for repo from s3
    clf = Predictor.new(
      layers: [
        { name: 'w1', shape: [10, 1] },
        { name: 'w2', shape: [1, 1] }
      ]
    )
    if weights.nil?
      train(clf) # find weights for repo backlog of puzzles
      ranks = naive_rank(puzzles) # naive rank of puzzles in each repo
5 changes: 5 additions & 0 deletions objects/puzzles.rb
@@ -25,6 +25,11 @@

#
# Puzzles in XML/S3
# @todo #532 Implement a decorator for optional model configuration load.
# Let's implement a class that decorates `Puzzles` and,
# based on the presence of the `model: true` attribute in the YAML config,
# decides whether the puzzles should be ranked or not.
# Don't forget to remove this puzzle.
#
class Puzzles
  def initialize(repo, storage)
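
The two decorator puzzles above could be approached roughly as follows. This is only a sketch under stated assumptions, not the 0pdd implementation: `RankedPuzzlesSketch`, its `fetch` method, `FakePuzzles`, and `ReverseRanker` are hypothetical names, and the only detail taken from the commit is that ranking is switched on by `model: true` in the YAML config and that the ranker returns positional indices.

```ruby
require 'yaml'

# Hypothetical decorator: the class name, `fetch`, and the ranker interface
# are assumptions made for illustration, not part of 0pdd.
class RankedPuzzlesSketch
  def initialize(origin, config_yaml, ranker)
    @origin = origin # anything responding to `fetch` with an array
    @config = YAML.safe_load(config_yaml) || {}
    @ranker = ranker # anything responding to `predict(puzzles)`
  end

  # Returns puzzles in ranked order only when the config opts in.
  def fetch
    puzzles = @origin.fetch
    return puzzles unless @config['model'] == true
    @ranker.predict(puzzles).map { |index| puzzles[index] }
  end
end

# Stand-ins for `Puzzles` and `LinearModel`, used only to exercise the sketch.
class FakePuzzles
  def initialize(items)
    @items = items
  end

  def fetch
    @items
  end
end

class ReverseRanker
  # Returns positional indices in reverse order.
  def predict(puzzles)
    (0...puzzles.size).to_a.reverse
  end
end

origin = FakePuzzles.new(%w[a b c])
ranked = RankedPuzzlesSketch.new(origin, "model: true\n", ReverseRanker.new).fetch
plain = RankedPuzzlesSketch.new(origin, "model: false\n", ReverseRanker.new).fetch
```

Keeping the opt-in check inside the decorator means the decorated object needs no knowledge of the model, which matches the `alt model: true` branch in the sequence diagram.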