[RFC] New Architecture for Requirement Computation #139

SamChou19815 · 2020-09-17T00:57:46Z

Current setup and issues

Currently, we compute requirement fulfillment in a quite naive way. We first go through a list of all requirements, attach courses taken by the user to them, and then utilize these local information to decide the fulfillment progress.

The current setup can solve some simple cases, but it's not scalable when we consider these complications:

Some classes cannot be double-counted. If we use a local requirement -> class[] map, it would be hard to filter them out.
Some requirements have multiple fulfillment options.
We want to have the ability for user to choose between several different fulfillment options.

New Architecture

Insights

The main issue with the current implementation is that we are looking at the whole picture only from the perspective of requirements. In reality, the "viewpoint" from a class is also important. (e.g. filter away double counting) Therefore, we natually found an abstraction for a two-way view: GRAPH.

Mathematical definition

The graph is not a general-purposed graph. Instead, we add a few restricions to the graph. Formally, let's define CoursePlanRequirementFulfillmentGraph using the standard notation: (R \union C, E), where R means a set of requirements to satisfy, C means a set of courses that is taken by user, and (r, c) \in E is a tuple denoting the edge between a requirement node r and a course node c. This graph can also be thought as a mathematical relation between R and C. Thus, the problem of computing requirements can be thought as building and refining this graph.

Graph Interface

Programmatically, we can define an interface like:

interface CoursePlanRequirementFulfillmentGraph {
  // These are the minimal required methods.
  getAllRequirements(): readonly Requirement[];
  getAllCourses(): readonly Course[];
  getAllEdges(): readonly (readonly [Requirement, Course])[];
  addEdge(requirement: Requirement, course: Course): void;
  removeEdge(requirement: Requirement, course: Course): void;
  
  // Some helper methods can be added for efficiency or usability, like
  getConnectedCoursesFromRequirement(requirement: Requirement): readonly Course[];
  getConnectedRequirementsFromCourse(course: Course): readonly Requirement[];
  existsEdge(requirement: Requirement, course: Course): boolean;
  // ...
}

Algorithm

The algorithm has a simple interface:

const computeCoursePlanRequirementGraph = (
  requirementsNeedToSatisfy: readonly Requirement[],
  coursesTakenByUser: readonly Course[],
  userChoicesOfRequirementFulfillmentOptions: readonly (readonly [Course, Decision])[])
): CoursePlanRequirementGraph => {
  // ...
}

Internally, we can now describe the new proposed computation algorithm as a series of passes on building and refining the graph:

Phase 1: Building a rough graph

In this phase, we build a coarse graph directly from the pre-computed requirement json.

When we see user takes a class c, we find a subset of requirements R' \in R such that c can potentially make progress to satisfy r \in R'.

Phase 2: Respect user's choices

User may make some decision on how courses are used to satisfy requirements. For example, whether to use MATH 4710 as an elective or as a substitude of CHEM class. In this phase, we will consider the userChoicesOfRequirementFulfillmentOptions parameter.

For each (course, decision) tuple, we inspect the graph from the perspective of course. We can call the method graph.getConnectedRequirementsFromCourse(course) to get a list of requirements that are currently associated with the course. Then we remove the edges that are incompatible with the user's decision.

The representation of decision and the compatibility check are not important in the discussion of the architecture. We can specify it in another design document.

(Note: we can also make the decision keyed by requirement, the flow would be similar).

Phase 3: Making default choices

This is a followup of phase 2. The user might not make choices for every single requirement that need some choices. For example, the user might just added MATH 4710 and has not decided whether it can be used to fulfill elective or engineering's chem requirement.

In this phase, we will take a look at all requirements that have multiple fulfillment strategies and have not been covered by user's decision. For all these requirements, we use the default strategy, call graph.getConnectedCoursesFromRequirement(requirement) to get existing edges, and remove those that are incompatible with default strategies.

Phase 4: Eliminating double counting

In this phase, we inspect each course c. We call graph.getConnectedRequirementsFromCourse(course) to get a list of requirements that can be satisfied by the course without considering double-counting constraint. Let's call this list of requirement R.

Then for all r \in R, we check whether the requirement allow a course to be used for something else. If it does allow, then we will always keep this edge. Now we are left with a subset of R which we call R'. We can then use some simple rule to pick one or zero r' \in R' to keep and discard the rest edges.

Phase 5: Profit

🎉

Some important implementation details

Cross-listed classes

In order to correctly handle cross-listed course, we need to change the course type so that all cross listed classes maps to the same class. e.g. CS 2112 might be represented as:

{
  "id": 1234567,
  "codes": ["CS 2112", "ENGRD 2112"],
  // ...
}

The requirement checking step in phase 1 also need to resolve a course code to a course object and use that to consider all cross-listed possibilities.

Ensuring user never makes bad choices

Here, a bad choice means that a choice that is invalid according to Cornell's rules. For example, using MATH 4710 as both an external spec and a free elective. We can enforce this by giving user only a list of valid choices when we let user make decisions.

The easiest but inefficient way to find a list of valid choices would be considering all invalid choices. For each potential choice, try to add the choice into the decision list and run the algorithm. The algorithm should have a bunch of sanity checks to defend against invalid choices. Then, if the algorithm doesn't throw an exception, we know the choice is valid.

Post-processing

In the end, we got a graph, but the frontend UI actually need a list like

type Fulfillments = { 
  requirement: Requirement;
  courses: Course[];
  progress: number;
  total: number;
}[]

Therefore, we need a post-process pass that flattens the graph into this list.
The step needs to take account both the graph and user's choices, so the signature would look something like

const postProcessRequirementFulfillmentGraphIntoList = (
  graph: CoursePlanRequirementFulfillmentGraph,
  decisions: readonly (readonly [Course, Decision])[])
): Fulfillments => { /* ... */ }

The text was updated successfully, but these errors were encountered:

willespencer · 2020-09-18T04:03:31Z

Hey Sam! This looks really great! And thanks for being so thorough, it really helps with my understanding 😄! Got a question for you.

When you say, 'we remove the edges that are incompatible with the user's decision,' what does this mean? As in, what type of course <-> requirement edges would be incompatible? It is just like, if they picked it for a requirement and they cannot double count it, it would be removed from other requirements?

SamChou19815 · 2020-09-18T04:22:15Z

Hey Sam! This looks really great! And thanks for being so thorough, it really helps with my understanding 😄! Got a question for you.

When you say, 'we remove the edges that are incompatible with the user's decision,' what does this mean? As in, what type of course <-> requirement edges would be incompatible? It is just like, if they picked it for a requirement and they cannot double count it, it would be removed from other requirements?

That's a good question, we might need some revision of the phases. I think we will likely only let user make two types of decisions in the system:

When there is a requirement with multiple fulfillment strategies. In this case, we let user choose between several strategies.
When there is a class that can satisfies multiple requirements, but all these requirements do not allow double count. In this case, we let user choose between several requirements.

In case 1, we need to do the computation starting from requirement nodes, and purge req-course link that belongs to a fulfillment strategy that is not chosen by the user.
In case 2, we need to do the computation starting from course nodes, and purge all other req-course link except the one chosen by the user. In this case, it's indeed double-counting aware.

Therefore, the current division of phase 2-4 might not make many sense. Case 1-2 above look very different so that they should be separate phases, and case 2 is also double-counting aware, so it kinda overlaps with phase 4.

Thus, here comes my revised phases:

Phase 1: Building a rough graph
Phase 2-1: Respect user's choices of fulfillment strategies for courses with multiple fulfillment options
Phase 2-2: Make default choices for users for courses with multiple fulfillment options
Phase 3-1: Respect user's choices of using one course to satisfy one of the several requirements that don't allow double count
Phase 3-2: Make random choices for user for the kind of choices above
Phase 4: Profit

### Summary This PR starts to implement proposal in #139 by setup the `RequirementFulfillmentGraph` data structure. I made a generic `RequirementFulfillmentGraph` that doesn't depend on a specific representation of requirement or course object, so that the structure itself can be developed independently to avoid potential changes from requirement or course representation. It contains all the methods I described in #139. All the methods are tested in Jest. Inside my implementation of graph, we need to lookup all courses by a requirement and all requirements by a course, so I setup two hash maps for O(1) lookup. `addEdge` and `removeEdge` ensures that they are in sync. Unlike Java where every object has `equals` and `hashCode`, we don't have the same luxury in JS. Therefore, I also implemented `HashMap` and `HashSet` that takes in a function `getUniqueHash` for the key type to simulate the `hashCode` method in Java. The basic idea is that we call this function to get a string that can uniquely identify a requirement/course, so that we can use that unique identifier as a key for quick lookup using JavaScript's `Map`. (samlang uses the same approach: https://github.com/SamChou19815/samlang/blob/86ffed2edcb58573b34dae94bbea83e80c591d98/samlang-core-utils/index.ts#L54-L134) ### Test Plan Added unit tests for `HashMap`, `HashSet` and `RequirementFulfillmentGraph` Added a step for `npm run test` in CI.

SamChou19815 · 2020-10-14T20:47:35Z

Everything in this issue has been implemented. Closing.

SamChou19815 mentioned this issue Sep 24, 2020

Setup requirement graph data structure #140

Merged

SamChou19815 mentioned this issue Oct 1, 2020

Implement phase 1, 2-1, 3-1 for requirement graph builder #142

Merged

SamChou19815 closed this as completed Oct 14, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] New Architecture for Requirement Computation #139

[RFC] New Architecture for Requirement Computation #139

SamChou19815 commented Sep 17, 2020

willespencer commented Sep 18, 2020 •

edited

Loading

SamChou19815 commented Sep 18, 2020 •

edited

Loading

SamChou19815 commented Oct 14, 2020

[RFC] New Architecture for Requirement Computation #139

[RFC] New Architecture for Requirement Computation #139

Comments

SamChou19815 commented Sep 17, 2020

Current setup and issues

New Architecture

Insights

Mathematical definition

Graph Interface

Algorithm

Phase 1: Building a rough graph

Phase 2: Respect user's choices

Phase 3: Making default choices

Phase 4: Eliminating double counting

Phase 5: Profit

Some important implementation details

Cross-listed classes

Ensuring user never makes bad choices

Post-processing

willespencer commented Sep 18, 2020 • edited Loading

SamChou19815 commented Sep 18, 2020 • edited Loading

SamChou19815 commented Oct 14, 2020

willespencer commented Sep 18, 2020 •

edited

Loading

SamChou19815 commented Sep 18, 2020 •

edited

Loading