Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: remove Types collection from wdl-analysis. #277

Merged
merged 4 commits into from
Dec 27, 2024

Conversation

peterhuene
Copy link
Collaborator

@peterhuene peterhuene commented Dec 26, 2024

This commit removes the Types collection from wdl-analysis.

The collection served as an arena to allocate types into which also served as providing type recursion via arena identifiers. The hope was that this would prevent a lot of small allocations from occurring during analysis when calculating expression types.

However, each document had a different Types collection, requiring "importing" types between collections. Further complicating things, an evaluation engine had its own Types collection which needed importing of types from other document collections. This was a source of bugs in the implementation of the evaluation engine.

Worse, the types collection served as a bottleneck for sharing data across asynchronous tasks in the upcoming implementation of workflow evaluation, requiring a lock.

The alternative is to use an Arc in appropriate places where type recursion occurs. After running some simple benchmarks, the removal of the types arena doesn't appear to have any real world impact on performance.

Additionally, the removal of Types allows Type to implement PartialEq rather than a custom trait that uses the arena; likewise for Display.

Similarly in wdl-engine, Value can now directly implement serialization and deserialization traits.

Before submitting this PR, please make sure:

  • You have added a few sentences describing the PR here.
  • You have added yourself or the appropriate individual as the assignee.
  • You have added at least one relevant code reviewer to the PR.
  • Your code builds clean without any errors or warnings.
  • You have added tests (when appropriate).
  • You have updated the README or other documentation to account for these
    changes (when appropriate).
  • You have added an entry to the relevant CHANGELOG.md (see
    "keep a changelog" for more information).
  • Your commit messages follow the conventional commit style.

This commit removes the `Types` collection from `wdl-analysis`.

The collection served as an arena to allocate types into which also served as
providing type recursion via arena identifiers. The hope was that this would
prevent a lot of small allocations from occurring during analysis when
calculating expression types.

However, each document had a different `Types` collection, requiring
"importing" types between collections. Further complicating things, an
evaluation "engine" had its own `Types` collection which needed importing of
types from other document collections. This was a source of bugs in the
implementation of the evaluation engine.

Worse, the types collection served as a bottleneck for sharing data across
asynchronous tasks in the upcoming implementation of workflow evaluation,
requiring a lock.

The alternative is to use an `Arc` in appropriate places where type recursion
occurs. After running some simple benchmarks, the removal of the types arena
doesn't appear to have any real world impact on performance.

Additionally, the removal of `Types` allows `Type` to implement `PartialEq`
rather than a custom trait that uses the arena; likewise for `Display`.

Similarly in `wdl-engine`, `Value` can now directly implement serialization and
deserialization traits.
Currently, the `Task` variant of the `Value` enum causes `Value` to be about
200 bytes in size due to its numerous fields.

This change puts the immutable fields of a task value behind an indirection,
reducing the size of `Value` to 40 bytes.

It also puts the `left` and `right` members of `Pair` behind a single
allocation rather than two.
@peterhuene peterhuene self-assigned this Dec 26, 2024
@peterhuene peterhuene changed the title Remove types refactor: remove Types collection from wdl-analysis. Dec 26, 2024
@peterhuene
Copy link
Collaborator Author

peterhuene commented Dec 26, 2024

No rush on this review as I'm out this week and next, but the workflow evaluation implementation will be relying on these changes so that we can more easily spawn asynchronous tasks that do WDL evaluation internally (i.e. we no longer have to synchronize a singular Types collection for things to work; plus, it just makes the wdl-analysis and wdl-engine implementations cleaner).

@peterhuene
Copy link
Collaborator Author

Looks like gauntlet caught a bug for me to track down too.

@peterhuene
Copy link
Collaborator Author

peterhuene commented Dec 26, 2024

So the gauntlet diff is actually removing false positives fixed by this PR.

The actual bug was in the now-deleted import function responsible for importing a type between two different type collections (e.g. referencing an input for a task/workflow in an imported document). The bug was that the imported type accidentally dropped the optional qualifier for the type, resulting in type coercion failures in task/workflow call statements.

For example:

bar.wdl:

version 1.1

task foo {
    input {
        Array[String]? bar
    }

    command <<<>>>
}

foo.wdl:

version 1.1

import "bar.wdl"

workflow bar {
    input {
        Array[String]? bar
    }
    
    call bar.foo { input: bar = bar }
}

Results in this false positive diagnostic:

error: type mismatch: expected type `Array[String]`, but found type `Array[String]?`
   ┌─ foo.wdl:10:33
   │
10 │     call bar.foo { input: bar = bar }
   │                           ---   ^^^ this is type `Array[String]?`
   │                           │      
   │                           this expects type `Array[String]`

I'll push a bless of the gauntlet diagnostics to rid the false positives and add a test case to ensure we don't accidentally drop the optional qualifier like this in the future.

Copy link
Member

@claymcleod claymcleod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Also fix a comment for the `Coercible` trait in `wdl-engine`.
@peterhuene peterhuene merged commit 10eb4e5 into stjude-rust-labs:main Dec 27, 2024
16 checks passed
@peterhuene peterhuene deleted the remove-types branch December 27, 2024 05:12
@peterhuene peterhuene mentioned this pull request Jan 10, 2025
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants