diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 271451e..950485e 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -3,8 +3,8 @@ - [Version solving](./version_solving.md) - [Using the pubgrub crate](./pubgrub_crate/intro.md) - [Basic example with OfflineDependencyProvider](./pubgrub_crate/offline_dep_provider.md) - - [Writing your own dependency provider](./pubgrub_crate/custom_dep_provider.md) - - [Caching dependencies in a DependencyProvider](./pubgrub_crate/caching.md) + - [Implementing a dependency provider](./pubgrub_crate/dep_provider.md) + - [Caching dependencies](./pubgrub_crate/caching.md) - [Strategical decision making in a DependencyProvider](./pubgrub_crate/strategy.md) - [Solution and error reporting](./pubgrub_crate/solution.md) - [Writing your own error reporting logic](./pubgrub_crate/custom_report.md) @@ -12,7 +12,6 @@ - [Optional dependencies](./limitations/optional_deps.md) - [Allowing multiple versions of a package](./limitations/multiple_versions.md) - [Public and Private packages](./limitations/public_private.md) - - [Versions in a continuous space](./limitations/continuous_versions.md) - [Pre-release versions](./limitations/prerelease_versions.md) - [Internals of the PubGrub algorithm](./internals/intro.md) - [Overview of the algorithm](./internals/overview.md) diff --git a/src/internals/partial_solution.md b/src/internals/partial_solution.md index c6d491a..896c516 100644 --- a/src/internals/partial_solution.md +++ b/src/internals/partial_solution.md @@ -12,7 +12,8 @@ have already been taken (including that one if it is a decision). If we represent all assignments as a chronological vec, they would look like follows: ```txt -[ (0, root_derivation), +[ + (0, root_derivation), (1, root_decision), (1, derivation_1a), (1, derivation_1b), @@ -26,14 +27,3 @@ represent all assignments as a chronological vec, they would look like follows: The partial solution must also enable efficient evaluation of incompatibilities in the unit propagation loop. 
For this, we need to have efficient access to all assignments referring to the packages present in an incompatibility. - -To enable both efficient backtracking and efficient access to specific package -assignments, the current implementation holds a dual representation of the the -partial solution. One is called `history` and keeps dated (with decision levels) -assignments in an ordered growing vec. The other is called `memory` and -organizes assignments in a hashmap where they are regrouped by packages which -are the hashmap keys. It would be interresting to see how the partial solution -is stored in other implementations of PubGrub such as the one in [dart -pub][pub]. - -[pub]: https://github.com/dart-lang/pub diff --git a/src/limitations/continuous_versions.md b/src/limitations/continuous_versions.md deleted file mode 100644 index 1753705..0000000 --- a/src/limitations/continuous_versions.md +++ /dev/null @@ -1,95 +0,0 @@ -# Versions in a continuous space - -The current design of pubgrub exposes a `Version` trait demanding two -properties, (1) that there exists a lowest version, and (2) that versions live -in a discrete space where the successor of each version is known. So versions -are basically isomorph with N^n, where N is the set of natural numbers. - -## The successor design - -There is a good reason why we started with the successor design for the -`Version` trait. When building knowledge about the dependency system, pubgrub -needs to compare sets of versions, and to perform common set operations such as -intersection, union, inclusion, comparison for equality etc. In particular, -given two sets of versions S1 and S2, it needs to be able to answer if S1 is a -subset of S2 (S1 ⊂ S2). And we know that S1 ⊂ S2 if and only if S1 ∩ S2 == S1. -So checking for subsets can be done by checking for the equality between two -sets of versions. Therefore, **sets of versions need to have unique canonical -representations to be comparable**. 
- -We have the interesting property that we require `Version` to have a total -order. As a consequence, the most adequate way to represent sets of versions -with a total order, is to use a sequence of non intersecting segments, such as -`[0, 3] ∪ ]5, 9[ ∪ [42, +∞[`. - -> Notation: we note segments with close or open brackets depending on if the -> value at the frontier is included or excluded of the interval. It is also -> common to use a parenthesis for open brackets. So `[0, 14[` is equivalent to -> `[0, 14)` in that other notation. - -The previous set is for example composed of three segments, - -- the closed segment [0, 3] containing versions 0, 1, 2 and 3, -- the open segment ]5, 9[ containing versions 6, 7 and 8, -- the semi-open segment [42, +∞[ containing all numbers above 42. - -For the initial design, we did not want to have to deal with close or open -brackets on both interval bounds. Since we have a lowest version, the left -bracket of segments must be closed to be able to contain that lowest version. -And since `Version` does not impose any upper bound, we need to use open -brackets on the right side of segments. So our previous set thus becomes: -`[0, ?[ ∪ [?, 9[ ∪ [42, +∞[`. But the question now is what do we use in place of -the 3 in the first segment and in place of the 5 in the second segment. This is -the reason why we require the `bump()` method on the `Version` trait. If we know -the next version, we can replace 3 by bump(3) == 4, and 5 by bump(5) == 6. We -finally get the following representation `[0, 4[ ∪ [6, 9[ ∪ [42, +∞[`. And so -the `Range` type is defined as follows. - -```rust -pub struct Range { - segments: Vec>, -} -type Interval = (V, Option); -// set = [0, 4[ ∪ [6, 9[ ∪ [42, +∞[ -let set = vec![(0, Some(4)), (6, Some(9)), (42, None)]; -``` - -## The bounded interval design - -We may want however to have versions live in a continuous space. 
For example, if -we want to use fractions, we can always build a new fraction between two others. -As such it is impossible to define the successor of a fraction version. - -We are currently investigating the use of bounded intervals to enable continuous -spaces for versions. If it happens, this will only be in the next major release -of pubgrub, probably 0.3. The current experiments look like follows. - -```rust -/// New trait for versions. -/// Bound is core::ops::Bound. -pub trait Version: Clone + Ord + Debug + Display { - /// Returns the minimum version. - fn minimum() -> Bound; - /// Returns the maximum version. - fn maximum() -> Bound; -} - -/// An interval is a bounded domain containing all values -/// between its starting and ending bounds. -/// RangeBounds is core::ops::RangeBounds. -pub trait Interval: RangeBounds + Debug + Clone + Eq + PartialEq { - /// Create an interval from its starting and ending bounds. - /// It's the caller responsability to order them correctly. - fn new(start_bound: Bound, end_bound: Bound) -> Self; -} - -/// The new Range type is composed of bounded intervals. -pub struct Range { - segments: Vec, -} -``` - -It is certain though that the flexibility of enabling usage of continuous spaces -will come at a performance price. We just have to evaluate how much it costs and -if it is worth sharing a single implementation, or having both a discrete and a -continuous implementation. diff --git a/src/limitations/intro.md b/src/limitations/intro.md index d40a1cc..68db4cf 100644 --- a/src/limitations/intro.md +++ b/src/limitations/intro.md @@ -5,9 +5,8 @@ a dependency system with the following constraints: 1. Packages are uniquely identified. 2. Versions are in a discrete set, with a total order. -3. The successor of a given version is always uniquely defined. -4. Dependencies of a package version are fixed. -5. Exactly one version must be selected per package depended on. +3. Dependencies of a package version are fixed. +4. 
Exactly one version must be selected per package depended on. The fact that packages are uniquely identified (1) is perhaps the only constraint that makes sense for all common dependency systems. But for the rest @@ -15,17 +14,14 @@ of the constraints, they are all inadequate for some common real-world dependency systems. For example, it's possible to have dependency systems where order is not required for versions (2). In such systems, dependencies must be specified with exact sets of compatible versions, and bounded ranges make no -sense. Being able to uniquely define the successor of any version (3) is also a -constraint that is not a natural fit if versions have a system of pre-releases. -Indeed, what is the successor of `2.0.0-alpha`? We can't tell if that is `2.0.0` -or `2.0.0-beta` or `2.0.0-whatever`. Having fixed dependencies (4) is also not -followed in programming languages allowing optional dependencies. In Rust -packages, optional dependencies are called "features" for example. Finally, -restricting solutions to only one version per package (5) is also too -constraining for dependency systems allowing breaking changes. In cases where -packages A and B both depend on different ranges of package C, we sometimes want -to be able to have a solution where two versions of C are present, and let the -compiler decide if their usages of C in the code are compatible. +sense. Having fixed dependencies (3) is also not followed in programming +languages allowing optional dependencies. In Rust packages, optional +dependencies are called "features" for example. Finally, restricting solutions +to only one version per package (4) is also too constraining for dependency +systems allowing breaking changes. In cases where packages A and B both depend +on different ranges of package C, we sometimes want to be able to have a +solution where two versions of C are present, and let the compiler decide if +their usages of C in the code are compatible. 
In the following subsections, we try to show how we can circumvent those limitations with clever usage of dependency providers. diff --git a/src/limitations/multiple_versions.md b/src/limitations/multiple_versions.md index a728be7..af33609 100644 --- a/src/limitations/multiple_versions.md +++ b/src/limitations/multiple_versions.md @@ -250,7 +250,7 @@ fn get_dependencies( } }) .collect(); - Ok(Dependencies::Known(pkg_deps)) + Ok(Dependencies::Available(pkg_deps)) } Package::Proxy { source, target } => { // If this is a proxy package, it depends on a single bucket package, the target, @@ -266,7 +266,7 @@ fn get_dependencies( }), bucket_range.intersection(target_range), ); - Ok(Dependencies::Known(bucket_dep)) + Ok(Dependencies::Available(bucket_dep)) } } } diff --git a/src/limitations/optional_deps.md b/src/limitations/optional_deps.md index c3ae988..94b43d7 100644 --- a/src/limitations/optional_deps.md +++ b/src/limitations/optional_deps.md @@ -56,15 +56,13 @@ We define an `Index`, storing all dependencies (`Deps`) of every package version in a double map, first indexed by package, then by version. ```rust -// Use NumberVersion, which are simple u32 for the versions. -use pubgrub::version::NumberVersion as Version; /// Each package is identified by its name. pub type PackageName = String; /// Global registry of known packages. pub struct Index { /// Specify dependencies of each package version. - pub packages: Map>, + pub packages: Map>, } ``` @@ -127,22 +125,21 @@ pub enum Package { } ``` -Let's implement the first function required by a dependency provider, -`choose_package_version`. For that we defined the `base_pkg()` method on a -`Package` that returns the string of the base package. And we defined the -`available_versions()` method on an `Index` to list existing versions of a given -package. Then we simply called the `choose_package_with_fewest_versions` helper -function provided by pubgrub. +We'll ignore `prioritize` for this example. 
+ +Let's implement the second function required by a dependency provider, +`choose_version`. For that we defined the `base_pkg()` method on a `Package` +that returns the string of the base package, and the `available_versions()` +method on an `Index` to list existing versions of a given package in descending +order. ```rust -fn choose_package_version, U: Borrow>>( +fn choose_version( &self, - potential_packages: impl Iterator, -) -> Result<(T, Option), Box> { - Ok(pubgrub::solver::choose_package_with_fewest_versions( - |p| self.available_versions(p.base_pkg()).cloned(), - potential_packages, - )) + package: &Self::P, + range: &Self::VS, +) -> Result<Option<Self::V>, Self::Err> { + Ok(self.available_versions(package.base_pkg()).find(|version| range.contains(version)).cloned()) } ``` @@ -165,7 +162,7 @@ fn get_dependencies( match package { // If we asked for a base package, we simply return the mandatory dependencies. - Package::Base(_) => Ok(Dependencies::Known(from_deps(&deps.mandatory))), + Package::Base(_) => Ok(Dependencies::Available(from_deps(&deps.mandatory))), // Otherwise, we concatenate the feature deps with a dependency to the base package. Package::Feature { base, feature } => { let feature_deps = deps.optional.get(feature).unwrap(); @@ -174,7 +171,7 @@ fn get_dependencies( Package::Base(base.to_string()), Range::exact(version.clone()), ); - Ok(Dependencies::Known(all_deps)) + Ok(Dependencies::Available(all_deps)) }, } } diff --git a/src/limitations/prerelease_versions.md b/src/limitations/prerelease_versions.md index f4c3459..8b28d75 100644 --- a/src/limitations/prerelease_versions.md +++ b/src/limitations/prerelease_versions.md @@ -3,25 +3,19 @@ Pre-releasing is a very common pattern in the world of versioning. It is however one of the worst to take into account in a dependency system, and I highly recommend that if you can avoid introducing pre-releases in your package
In the context of pubgrub, pre-releases break two -fondamental properties of the solver. +manager, you should. -1. Pre-releases act similar to continuous spaces. -2. Pre-releases break the mathematical properties of subsets in a space with - total order. +In the context of pubgrub, pre-releases break the fundamental properties of the +solver that there is or isn't a version between two versions "x" and "x+1", that +there cannot be a version "(x+1).alpha.1" depending on whether an input version +had a pre-release specifier. -(1) Indeed, it is hard to answer what version comes after "1-alpha0". Is it -"1-alpha1", "1-beta0", "2"? In practice, we could say that the version that -comes after "1-alpha0" is "1-alpha0?" where the "?" character is chosen to be -the lowest character in the lexicographic order, but we clearly are on a stretch -here and it certainly isn't natural. - -(2) Pre-releases are often semantically linked to version constraints written by +Pre-releases are often semantically linked to version constraints written by humans, interpreted differently depending on context. For example, "2.0.0-beta" -is meant to exist previous to version "2.0.0". Yet, it is not supposed to be -contained in the set described by `1.0.0 <= v < 2.0.0`, and only within sets -where one of the bounds contains a pre-release marker such as -`2.0.0-alpha <= v < 2.0.0`. This poses a problem to the dependency solver +is meant to exist previous to version "2.0.0". Yet, in many versioning schemes +it is not supposed to be contained in the set described by `1.0.0 <= v < 2.0.0`, +and only within sets where one of the bounds contains a pre-release marker such +as `2.0.0-alpha <= v < 2.0.0`. This poses a problem to the dependency solver because of backtracking. Indeed, the PubGrub algorithm relies on knowledge accumulated all along the propagation of the solver front. And this knowledge is composed of facts, that are thus never removed even when backtracking happens. 
@@ -33,12 +27,6 @@ return nothing even without checking if a pre-release exists in that range. And this is one of the fundamental mechanisms of the algorithm, so we should not try to alter it. -Point (2) is probably the reason why some pubgrub implementations have issues -dealing with pre-releases when backtracking, as can be seen in [an issue of the -dart implementation][dart-prerelease-issue]. - -[dart-prerelease-issue]: https://github.com/dart-lang/pub/pull/3038 - ## Playing again with packages? In the light of the "bucket" and "proxies" scheme we introduced in the section @@ -71,10 +59,8 @@ exploring alternative API changes that could enable pre-releases. ## Multi-dimensional ranges -We are currently exploring new APIs where `Range` is transformed into a trait, -instead of a predefined struct with a single sequence of non-intersecting -intervals. For now, the new trait is called `RangeSet` and could be implemented -on structs with multiple dimensions for ranges. +Building on top of the `Ranges` API, we could implement a custom `VersionSet` of +multi-dimensional ranges: ```rust pub struct DoubleRange { @@ -90,23 +76,23 @@ matched to: ```rust DoubleRange { - normal_range: Range::none, - prerelease_range: Range::between("2.0.0-alpha", "2.0.0"), + normal_range: Ranges::empty(), + prerelease_range: Ranges::between("2.0.0-alpha", "2.0.0"), } ``` And the constraint `2.0.0-alpha <= v < 2.1.0` would have the same `prerelease_range` but would have `2.0.0 <= v < 2.1.0` for the normal range. -Those constraints could also be intrepreted differently since not all +Those constraints could also be interpreted differently since not all pre-release systems work the same. But the important property is that this -enable a separation of the dimensions that do not behave consistently with +enables a separation of the dimensions that do not behave consistently with regard to the mathematical properties of the sets manipulated. 
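
The separation of dimensions described above can be sketched without the pubgrub crate. This is a simplified illustration under stated assumptions, not the `VersionSet` implementation the guide has in mind: the `Version` struct, the single-interval `Interval` alias, and the `version` and `in_interval` helpers are hypothetical, and a real implementation would build on `Ranges` rather than on one half-open interval per dimension.

```rust
use std::cmp::Ordering;

/// Simplified version (hypothetical): `major.minor.patch` plus an optional pre-release number.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Version {
    core: (u32, u32, u32),
    pre: Option<u32>,
}

/// Hypothetical helper to keep the example short.
fn version(major: u32, minor: u32, patch: u32, pre: Option<u32>) -> Version {
    Version { core: (major, minor, patch), pre }
}

impl Ord for Version {
    fn cmp(&self, other: &Self) -> Ordering {
        self.core.cmp(&other.core).then_with(|| match (self.pre, other.pre) {
            (None, None) => Ordering::Equal,
            // A pre-release sorts before the release with the same core.
            (Some(_), None) => Ordering::Less,
            (None, Some(_)) => Ordering::Greater,
            (Some(a), Some(b)) => a.cmp(&b),
        })
    }
}

impl PartialOrd for Version {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// A single half-open interval `start <= v < end`, or `None` for the empty set.
/// (Assumption: one interval per dimension is enough for this illustration.)
type Interval = Option<(Version, Version)>;

fn in_interval(interval: &Interval, v: &Version) -> bool {
    matches!(interval, Some((start, end)) if start <= v && v < end)
}

/// Same field names as in the guide, but with the simplified `Interval` type.
struct DoubleRange {
    normal_range: Interval,
    prerelease_range: Interval,
}

impl DoubleRange {
    /// Pre-release versions are only looked up in the pre-release dimension,
    /// normal versions only in the normal dimension.
    fn contains(&self, v: &Version) -> bool {
        if v.pre.is_some() {
            in_interval(&self.prerelease_range, v)
        } else {
            in_interval(&self.normal_range, v)
        }
    }
}
```

With this split, `2.0.0-beta` lies between `1.0.0` and `2.0.0` in the total order, yet a `DoubleRange` for `1.0.0 <= v < 2.0.0` (empty `prerelease_range`) does not contain it, while the one for `2.0.0-alpha <= v < 2.0.0` does.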
-All this is under ongoing experimentations, to try reaching a sweet spot -API-wise and performance-wise. If you are eager to experiment with all the -extensions and limitations mentionned in this section of the guide for your -dependency provider, don't hesitate to reach out to us in our [zulip -stream][zulip] or in [GitHub issues][issues] to let us know how it went! +All this needs more experimentation, to try reaching a sweet spot API-wise and +performance-wise. If you are eager to experiment with all the extensions and +limitations mentioned in this section of the guide for your dependency provider, +don't hesitate to reach out to us in our [zulip stream][zulip] or in [GitHub +issues][issues] to let us know how it went! [zulip]: https://rust-lang.zulipchat.com/#narrow/stream/260232-t-cargo.2FPubGrub [issues]: https://github.com/pubgrub-rs/pubgrub/issues diff --git a/src/limitations/public_private.md b/src/limitations/public_private.md index ca96023..cab95b6 100644 --- a/src/limitations/public_private.md +++ b/src/limitations/public_private.md @@ -210,7 +210,7 @@ fn get_dependencies(&self, package: &Package, version: &SemVer) -> Result, ...> { match &package.seeds { // A Constraint variant does not have any dependency - PkgSeeds::Constraint(_) => Ok(Dependencies::Known(Map::default())), + PkgSeeds::Constraint(_) => Ok(Dependencies::Available(Map::default())), // A Markers variant has dependencies to: // - one Constraint variant per seed marker // - one Markers variant per original dependency @@ -219,7 +219,7 @@ fn get_dependencies(&self, package: &Package, version: &SemVer) let seed_constraints = ...; // Figure out if there are private dependencies. let has_private = ...; - Ok(Dependencies::Known( + Ok(Dependencies::Available( // Chain the seed constraints with actual dependencies. 
seed_constraints .chain(index_deps.iter().map(|(p, (privacy, r))| { diff --git a/src/pubgrub_crate/caching.md b/src/pubgrub_crate/caching.md index 4df774a..1a4dc85 100644 --- a/src/pubgrub_crate/caching.md +++ b/src/pubgrub_crate/caching.md @@ -41,7 +41,10 @@ pub struct CachingDependencyProvider { } impl DependencyProvider for CachingDependencyProvider { - fn choose_package_version<...>(...) -> ... { ... } + fn choose_version(&self, package: &DP::P, ranges: &DP::VS) -> Result, DP::Err> { + ... + } + fn get_dependencies( &self, package: &P, diff --git a/src/pubgrub_crate/custom_dep_provider.md b/src/pubgrub_crate/custom_dep_provider.md deleted file mode 100644 index 60c2b3b..0000000 --- a/src/pubgrub_crate/custom_dep_provider.md +++ /dev/null @@ -1,94 +0,0 @@ -# Writing your own dependency provider - -The `OfflineDependencyProvider` is very useful for testing and playing with the -API, but would not be usable in more complex settings like Cargo for example. In -such cases, a dependency provider may need to retrieve package information from -caches, from the disk or from network requests. Then, you might want to -implement `DependencyProvider` for your own type. The `DependencyProvider` trait -is defined as follows. - -```rust -/// Trait that allows the algorithm to retrieve available packages and their dependencies. -/// An implementor needs to be supplied to the [resolve] function. -pub trait DependencyProvider { - /// Decision making is the process of choosing the next package - /// and version that will be appended to the partial solution. - /// Every time such a decision must be made, - /// potential valid packages and version ranges are preselected by the resolver, - /// and the dependency provider must choose. - /// - /// Note: the type `T` ensures that this returns an item from the `packages` argument. 
- fn choose_package_version, U: Borrow>>( - &self, - potential_packages: impl Iterator, - ) -> Result<(T, Option), Box>; - - /// Retrieves the package dependencies. - /// Return [Dependencies::Unknown] if its dependencies are unknown. - fn get_dependencies( - &self, - package: &P, - version: &V, - ) -> Result, Box>; - - /// This is called fairly regularly during the resolution, - /// if it returns an Err then resolution will be terminated. - /// This is helpful if you want to add some form of early termination like a timeout, - /// or you want to add some form of user feedback if things are taking a while. - /// If not provided the resolver will run as long as needed. - fn should_cancel(&self) -> Result<(), Box> { - Ok(()) - } -} -``` - -As you can see, implementing the `DependencyProvider` trait requires you to -implement two functions, `choose_package_version` and `get_dependencies`. The -first one, `choose_package_version` is called by the resolver when a new package -has to be tried. At that point, the resolver call `choose_package_version` with -a list of potential packages and their associated acceptable version ranges. -It's then the role of the dependency retriever to pick a package and a suitable -version in that range. The simplest decision strategy would be to just pick the -first package, and first compatible version. Provided there exists a method -`fn available_versions(package: &P) -> impl Iterator` for your type, -it could be implemented as follows. We discuss advanced -[decision making strategies later](./strategy.md). - -```rust -fn choose_package_version, U: Borrow>>( - &self, - potential_packages: impl Iterator, -) -> Result<(T, Option), Box> { - let (package, range) = potential_packages.next().unwrap(); - let version = self - .available_versions(package.borrow()) - .filter(|v| range.borrow().contains(v)) - .next(); - Ok((package, version.cloned())) -} -``` - -The second required method is the `get_dependencies` method. 
For a given package -version, this method should return the corresponding dependencies. Retrieving -those dependencies may fail due to IO or other issues, and in such cases the -function should return an error. Even if it does not fail, we want to -distinguish the cases where the dependency provider does not know the answer and -the cases where the package has no dependencies. For this reason, the return -type in case of a success is the `Dependencies` enum, defined as follows. - -```rust -pub enum Dependencies { - Unknown, - Known(DependencyConstraints), -} - -pub type DependencyConstraints = Map>; -``` - -Finally, there is an optional `should_cancel` method. As described in its -documentation, this method is regularly called in the solver loop, and defaults -to doing nothing. But if needed, you can override it to provide custom behavior, -such as giving some feedback, or stopping early to prevent ddos. Any useful -behavior would require mutability of `self`, and that is possible thanks to -interior mutability. Read on the [next section](./caching.md) for more info on -that! diff --git a/src/pubgrub_crate/custom_report.md b/src/pubgrub_crate/custom_report.md index 433b14d..681872e 100644 --- a/src/pubgrub_crate/custom_report.md +++ b/src/pubgrub_crate/custom_report.md @@ -4,54 +4,54 @@ The `DerivationTree` is a custom binary tree where leaves are external incompatibilities, defined as follows, ```rust -pub enum External { +pub enum External { /// Initial incompatibility aiming at picking the root package for the first decision. - NotRoot(P, V), - /// No versions from range satisfy given constraints. - NoVersions(P, Range), - /// Dependencies of the package are unavailable for versions in that range. - UnavailableDependencies(P, Range), + NotRoot(P, VS::V), + /// There are no versions in the given set for this package. + NoVersions(P, VS), /// Incompatibility coming from the dependencies of a given package. 
- FromDependencyOf(P, Range, P, Range), + FromDependencyOf(P, VS, P, VS), + /// The package is unusable for reasons outside pubgrub. + Custom(P, VS, M), } ``` and nodes are derived incompatibilities, defined as follows. ```rust -pub struct Derived { +#[derive(Debug, Clone)] +pub struct Derived { /// Terms of the incompatibility. - pub terms: Map>, - /// Indicate if that incompatibility is present multiple times - /// in the derivation tree. - /// If that is the case, it has a unique id, provided in that option. - /// Then, we may want to only explain it once, - /// and refer to the explanation for the other times. + pub terms: Map>, + /// Indicate if the incompatibility is present multiple times in the derivation tree. + /// + /// If that is the case, the number is a unique id. We may want to only explain this + /// incompatibility once, then refer to the explanation for the other times. pub shared_id: Option, /// First cause. - pub cause1: Box>, + pub cause1: Arc>, /// Second cause. - pub cause2: Box>, + pub cause2: Arc>, } ``` -The `terms` hashmap contains the terms of the derived incompatibility. The rule -is that terms of an incompatibility are terms that cannot be all true at the -same time. So a dependency can for example be expressed with an incompatibility -containing a positive term, and a negative term. For example, `"root"` at -version 1 depends on `"a"` at version 4, can be expressed by the incompatibility -`{root: 1, a: not 4}`. A missing version can be expressed by an incompatibility -with a single term. So for example, if version 4 of package `"a"` is missing, it -can be expressed with the incompatibility `{a: 4}` which forbids that version. -If you want to write your own reporting logic, I'd highly suggest a good -understanding of incompatibilities by reading first the section of this book on -internals of the PubGrub algorithm. +The `terms` hashmap contains the version sets of the derived incompatibility. 
+The rule is that terms of an incompatibility are terms that cannot be all true +at the same time. So a dependency can for example be expressed with an +incompatibility containing a positive term, and a negative term. For example, +`"root"` at version 1 depends on `"a"` at version 4, can be expressed by the +incompatibility `{root: 1, a: not 4}`. A missing version can be expressed by an +incompatibility with a single term. So for example, if version 4 of package +`"a"` is missing, it can be expressed with the incompatibility `{a: 4}` which +forbids that version. If you want to write your own reporting logic, we +recommend reading first the section of this book on internals of the PubGrub +algorithm to understand the internals of incompatibilities. -The root of the derivation tree is usually a derived incompatibility containing -a single term such as `{ "root": 1 }` if we were trying to solve dependencies -for package `"root"` at version 1. Imagine we had one simple dependency on `"a"` -at version 4, but somehow that version does not exist. Then version solving -would fail and return a derivation tree looking like follows. +The root of the derivation tree is a derived incompatibility containing a single +term such as `{"root": 1}` if we were trying to solve dependencies for package +`"root"` at version 1. Imagine we had one simple dependency on `"a"` at version +4, but somehow that version does not exist. Then version solving would fail and +return a derivation tree looking like follows. ```txt Derived { root: 1 } @@ -91,4 +91,4 @@ For ease of processing, the `DerivationTree` duplicates such nodes in the tree, but their `shared_id` attribute will hold a `Some(id)` variant. In error reporting, you may want to check if you already gave an explanation for a shared derived incompatibility, and in such cases maybe use line references instead of -re-explaning the same thing. +re-explaining the same thing. 
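
The `shared_id` bookkeeping suggested above might look like the following minimal sketch. It deliberately does not use pubgrub's `DerivationTree`: the `Tree` enum, the `Reporter` struct, and the wording of the report lines are hypothetical stand-ins, chosen only to show how a reporter can explain a shared node once and refer back to that line afterwards.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a derivation tree (hypothetical, not the crate's types).
enum Tree {
    /// A leaf: an external incompatibility, already rendered as text.
    External(String),
    /// A node: a derived incompatibility with two causes.
    Derived {
        terms: String,
        shared_id: Option<usize>,
        cause1: Box<Tree>,
        cause2: Box<Tree>,
    },
}

/// Collects report lines and remembers where each shared node was explained.
#[derive(Default)]
struct Reporter {
    lines: Vec<String>,
    /// Maps a `shared_id` to the 1-based line number of its first explanation.
    explained: HashMap<usize, usize>,
}

impl Reporter {
    /// Explains `tree` and returns a short phrase the parent can use as a cause.
    fn explain(&mut self, tree: &Tree) -> String {
        match tree {
            Tree::External(text) => text.clone(),
            Tree::Derived { terms, shared_id, cause1, cause2 } => {
                // Already explained? Refer to the earlier line instead of repeating it.
                if let Some(line) = shared_id.and_then(|id| self.explained.get(&id).copied()) {
                    return format!("{terms} (see line {line})");
                }
                let c1 = self.explain(cause1);
                let c2 = self.explain(cause2);
                self.lines.push(format!("Because {c1} and {c2}, {terms}."));
                let line = self.lines.len();
                if let Some(id) = shared_id {
                    self.explained.insert(*id, line);
                }
                format!("{terms} ({line})")
            }
        }
    }
}
```

Calling `explain` on a tree in which the same shared node appears under both causes produces a single explanation line for that node, and the second occurrence is rendered as a "see line N" reference.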
diff --git a/src/pubgrub_crate/dep_provider.md b/src/pubgrub_crate/dep_provider.md new file mode 100644 index 0000000..2aa433e --- /dev/null +++ b/src/pubgrub_crate/dep_provider.md @@ -0,0 +1,138 @@ +# Implementing a dependency provider + +The `OfflineDependencyProvider` is very useful for testing and playing with the +API, but is not sufficient in complex settings such as Cargo. In those cases, a +dependency provider may need to retrieve package information from caches, from +the disk or from network requests. + +PubGrub is generic over all its internal types. You need: + +- A package type `P` that implements `Clone + Eq + Hash + Debug + Display`, for + example `String` +- A version type `V` that implements `Debug + Display + Clone + Ord`, for + example `SemanticVersion` +- A version set type `VS` that implements + `VersionSet + Debug + Display + Clone + Eq`, for example + `version_ranges::Ranges`. `VersionSet` is defined as: + + ```rust + pub trait VersionSet: Debug + Display + Clone + Eq { + type V: Debug + Display + Clone + Ord; + + // Constructors + + /// An empty set containing no version. + fn empty() -> Self; + + /// A set containing only the given version. + fn singleton(v: Self::V) -> Self; + + // Operations + + /// The set of all versions that are not in this set. + fn complement(&self) -> Self; + + /// The set of all versions that are in both sets. + fn intersection(&self, other: &Self) -> Self; + + /// Whether the version is part of this set. + fn contains(&self, v: &Self::V) -> bool; + } + ``` + +- A package priority `Priority` that implements `Ord + + Clone`, for example `usize` +- A type for custom incompatibilities `Incompatibility` that implements + `Eq + Clone + Debug + Display`, for example `String` +- The error type returned from the `DependencyProvider` implements + `Error + 'static`, for example `anyhow::Error` + +While PubGrub is generic to encourage bringing your own types tailored to your +use, it also provides some convenience types. 
For versions, we provide the +`SemanticVersion` and the `version_ranges::Ranges` types. `SemanticVersion` +implements `Version` for versions expressed as `Major.Minor.Patch`. `u32` also +fulfills all requirements of a version. `Ranges` represents multiple intervals +of a continuous range with inclusive or exclusive bounds, e.g. +`>=1.3.0,<2 || >4,<=6 || >=7.1`. + +`DependencyProvider` requires implementing three functions: + +```rust +pub trait DependencyProvider { + fn prioritize( + &self, + package: &Self::P, + range: &Self::VS, + package_conflicts_counts: &PackageResolutionStatistics, + ) -> Self::Priority; + + fn choose_version( + &self, + package: &Self::P, + range: &Self::VS, + ) -> Result<Option<Self::V>, Self::Err>; + + fn get_dependencies( + &self, + package: &Self::P, + version: &Self::V, + ) -> Result<Dependencies<Self::P, Self::VS, Self::M>, Self::Err>; + +} +``` + +`prioritize` determines the order in which versions are chosen for packages. + +Decisions are always made for the highest priority package first. The order of +decisions determines which solution is chosen and can drastically change the +performance of the solver. If there is a conflict between two package versions, +the package with the higher priority is preserved and the lower priority gets +discarded. Usually, you want to decide more certain packages (e.g. those with a +single version constraint) and packages with more conflicts first. This function +has potentially the highest impact on resolver performance. + +Example: + +- The root package depends on A and B +- A 2 depends on C 2 +- B 2 depends on C 1 +- A 1 has no dependencies +- B 1 has no dependencies + +If we always try the highest version first, prioritization determines the +solution: if A has higher priority, the solution will be A 2, B 1; if B has +higher priority, the solution will be A 1, B 2. + +The `package_conflicts_counts` argument provides access to some other heuristics +that production users have found useful. 
Note that the exact meaning and efficacy +of those arguments may change. + +The function is called once for each new package and then cached until we detect +a (potential) change to `range`, assuming that the +priority only depends on the arguments to this function. + +If two packages have the same priority, PubGrub will bias toward a breadth-first +search. + +Once the highest priority package has been determined, we want to make a +decision for it and need a version. `choose_version` returns the next version +to try for this package, which in the current partial solution is +constrained by `range`. If you are trying to solve for the latest version, +return the highest version of the package contained in `range`. + +Finally, the solver needs to know the dependencies of that version to determine +which packages to try next or if there are any conflicts. `get_dependencies` +returns the dependencies of a given package version. Usually, you return +`Ok(Dependencies::Available(rustc_hash::FxHashMap::from(...)))`, where the map +is the list of dependencies with their version ranges. You can also return +`Dependencies::Unavailable` if the dependencies could not be retrieved, but you +consider this error non-fatal, for example because the version uses an +unsupported metadata format. + +Aside from the required methods, there is an optional `should_cancel` method. +This method is regularly called in the solver loop, currently once per decision, +and defaults to doing nothing. If needed, you can override it to provide custom +behavior, such as giving some feedback, or stopping early to guard against denial of service. Any +useful behavior would require mutability of `self`, and that is possible thanks +to interior mutability. Read the [next section](./caching.md) for more info +on that! 
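As a minimal sketch of the interior mutability mentioned for `should_cancel`, the following counts how often the method is polled and aborts once a budget is exceeded. `BudgetedProvider`, `max_calls` and the `String` error are hypothetical names chosen for illustration, and the method is a standalone stand-in that only mirrors the `&self` shape of `should_cancel`; it does not implement the real `DependencyProvider` trait.

```rust
use std::cell::Cell;

/// Hypothetical provider state; a real provider would also hold its package index.
pub struct BudgetedProvider {
    /// Number of times `should_cancel` has been polled so far.
    calls: Cell<u64>,
    /// Maximum number of polls allowed before resolution is aborted.
    max_calls: u64,
}

impl BudgetedProvider {
    pub fn new(max_calls: u64) -> Self {
        Self { calls: Cell::new(0), max_calls }
    }

    /// Stand-in with the same `&self` receiver as the trait method (assumption:
    /// the real method returns `Result<(), Self::Err>`). The counter is updated
    /// through a `Cell`, so no `&mut self` is needed.
    pub fn should_cancel(&self) -> Result<(), String> {
        let calls = self.calls.get() + 1;
        self.calls.set(calls);
        if calls > self.max_calls {
            Err(format!("resolution cancelled after {} polls", self.max_calls))
        } else {
            Ok(())
        }
    }
}
```

In a real provider the same pattern would live in the trait's `should_cancel` and return your own error type. A `Cell` is enough here because the counter is only touched from one thread; an atomic counter would be the usual substitute otherwise.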
diff --git a/src/pubgrub_crate/intro.md b/src/pubgrub_crate/intro.md index b8e1778..1f8808b 100644 --- a/src/pubgrub_crate/intro.md +++ b/src/pubgrub_crate/intro.md @@ -1,28 +1,4 @@ # Using the pubgrub crate -PubGrub is generic over all its internal types. You need: - -- implementing `Clone + Eq + Hash + Debug + Display`, and any version type - implementing our `Version` trait, defined as follows. - -```rust -pub trait Version: Clone + Ord + Debug + Display { - fn lowest() -> Self; - fn bump(&self) -> Self; -} -``` - -The `lowest()` trait method should return the lowest version existing, and -`bump(&self)` should return the smallest version stricly higher than the current -one. - -For convenience, we already provide the `SemanticVersion` type, which implements -`Version` for versions expressed as `Major.Minor.Patch`. We also provide the -`NumberVersion` implementation of `Version`, which is basically just a newtype -for non-negative integers, 0, 1, 2, etc. - -> Note that the complete semver specification also involves pre-release and -> metadata tags, not handled in our `SemanticVersion` simple type. - -Now that we know the `Package` and `Version` trait requirements, let's explain -how to actually use `pubgrub` with a simple example. +We will start with a simple example using the `OfflineDependencyProvider`, then +show how to use the full `DependencyProvider` features step-by-step. diff --git a/src/pubgrub_crate/offline_dep_provider.md b/src/pubgrub_crate/offline_dep_provider.md index 5df9a02..7933e5f 100644 --- a/src/pubgrub_crate/offline_dep_provider.md +++ b/src/pubgrub_crate/offline_dep_provider.md @@ -13,42 +13,40 @@ of the interface. For this scenario our direct dependencies are `menu` and We can model that scenario as follows. ```rust -use pubgrub::solver::{OfflineDependencyProvider, resolve}; -use pubgrub::version::NumberVersion; -use pubgrub::range::Range; +use pubgrub::{resolve, OfflineDependencyProvider, Ranges}; // Initialize a dependency provider. 
-let mut dependency_provider = OfflineDependencyProvider::<&str, NumberVersion>::new(); +let mut dependency_provider = OfflineDependencyProvider::<&str, Ranges<u32>>::new(); // Add all known dependencies. dependency_provider.add_dependencies( - "user_interface", 1, [("menu", Range::any()), ("icons", Range::any())], + "user_interface", + 1u32, + [("menu", Ranges::full()), ("icons", Ranges::full())], ); -dependency_provider.add_dependencies("menu", 1, [("dropdown", Range::any())]); -dependency_provider.add_dependencies("dropdown", 1, [("icons", Range::any())]); -dependency_provider.add_dependencies("icons", 1, []); +dependency_provider.add_dependencies("menu", 1u32, [("dropdown", Ranges::full())]); +dependency_provider.add_dependencies("dropdown", 1u32, [("icons", Ranges::full())]); +dependency_provider.add_dependencies("icons", 1u32, []); // Run the algorithm. -let solution = resolve(&dependency_provider, "user_interface", 1).unwrap(); +let solution = resolve(&dependency_provider, "user_interface", 1u32).unwrap(); ``` -As we can see in the previous code example, the key function of PubGrub version -solver is `resolve`. It takes as arguments a dependency provider, as well as the -package and version for which we want to solve dependencies, here package -`"user_interface"` at version 1. - -The dependency provider must be an instance of a type implementing the -`DependencyProvider` trait defined in this crate. That trait defines methods -that the resolver can call when looking for packages and versions to try in the -solver loop. For convenience and for testing purposes, we already provide an -implementation of a dependency provider called `OfflineDependencyProvider`. As -the names suggest, it doesn't do anything fancy and you have to pre-register all -known dependencies with calls to -`add_dependencies(package, version, vec_of_dependencies)` before being able to -use it in the `resolve` function. 
- -Dependencies are specified with a `Range`, ranges being version constraints -defining sets of versions. In most cases, you would use `Range::between(v1, v2)` -which means any version higher or equal to `v1` and strictly lower than `v2`. In -the previous example, we just used `Range::any()` which basically means "any -version". +The key function of the PubGrub version solver is `resolve`. Besides the +dependency provider, it takes the root package and its version +(`"user_interface"` at version 1) as arguments. + +The `resolve` function gets package metadata from the `DependencyProvider` trait +defined in this crate. That trait defines methods that the resolver can call +when looking for packages and versions to try in the solver loop. For +convenience and for testing purposes, we already provide an implementation of a +dependency provider called `OfflineDependencyProvider`. As the name suggests, it +doesn't do anything fancy and you have to pre-register all known dependencies +with calls to `add_dependencies(package, version, vec_of_dependencies)` before +being able to use it in the `resolve` function. + +In the `OfflineDependencyProvider`, dependencies are specified with `Ranges`, +defining version constraints. In most cases, you would use +`Ranges::between(v1, v2)` which means any version higher than or equal to `v1` and +strictly lower than `v2`. In the previous example, we just used `Ranges::full()` +which means that any version is acceptable. diff --git a/src/pubgrub_crate/solution.md b/src/pubgrub_crate/solution.md index dc9536f..8beb66a 100644 --- a/src/pubgrub_crate/solution.md +++ b/src/pubgrub_crate/solution.md @@ -5,17 +5,16 @@ versions satisfying all the constraints of direct and indirect dependencies. Sometimes however, there is no solution because dependencies are incompatible. 
In such cases, the algorithm returns a `PubGrubError::NoSolution(derivation_tree)` where the provided derivation tree -is a custom binary tree containing the chain of reasons why there is no -solution. +is a binary tree containing the chain of reasons why there is no solution. -All the items in the tree are called incompatibilities and may be of two types, -either "external" or "derived". Leaves of the tree are external -incompatibilities, and nodes are derived. External incompatibilities express -facts that are independent of the way this algorithm is implemented such as - -- dependencies: package "a" at version 1 depends on package "b" at version 4 -- missing dependencies: dependencies of package "a" are unknown -- absence of version: there is no version of package "a" higher than version 5 +The items in the tree are called incompatibilities and may be external, custom, or derived. +Leaves of the tree are external and custom incompatibilities, and nodes are +derived. External incompatibilities express facts that are independent of the +way this algorithm is implemented, such as that package "a" at version 1 depends on +package "b" at version 4, or that there is no version of package "a" higher than +version 5. Custom incompatibilities are a user-provided generic type parameter +that can express missing versions, for example that the dependencies of package "a" are +not in the cache while the user requested an offline resolution. In contrast, derived incompatibilities are obtained during the algorithm execution by deduction, such as if "a" depends on "b" and "b" depends on "c", @@ -27,9 +26,7 @@ is human-friendly is not an easy task. For convenience, this crate provides a `String` explanation of the failure. You may use it as follows. 
```rust -use pubgrub::solver::resolve; -use pubgrub::report::{DefaultStringReporter, Reporter}; -use pubgrub::error::PubGrubError; +use pubgrub::{resolve, DefaultStringReporter, PubGrubError, Reporter}; match resolve(&dependency_provider, root_package, root_version) { Ok(solution) => println!("{:?}", solution), diff --git a/src/pubgrub_crate/strategy.md b/src/pubgrub_crate/strategy.md index 8ad8140..ee394b5 100644 --- a/src/pubgrub_crate/strategy.md +++ b/src/pubgrub_crate/strategy.md @@ -7,30 +7,14 @@ corresponding valid ranges of versions. Then, there is some freedom regarding which of those package versions to choose. The strategy employed to choose such package and version cannot change the -existence of a solution or not, but can drastically change the performances of -the solver, or the properties of the solution. The documentation of -[Pub](https://github.com/dart-lang/pub) -([Dart](https://github.com/dart-lang/language)'s package manager that uses -PubGrub under the hood) -[states the following](https://github.com/dart-lang/pub/blame/SDK-2.10.0-64.0.dev/doc/solver.md#L446-L449): - -> Pub chooses the latest matching version of the package with the fewest -> versions that match the outstanding constraint. This tends to find conflicts -> earlier if any exist, since these packages will run out of versions to try -> more quickly. But there's likely room for improvement in these heuristics. +existence of a solution or not, but can drastically change the performance of +the solver, or the properties of the solution. In our implementation of PubGrub, decision making responsibility is divided into two pieces. The resolver takes care of making a preselection for potential packages and corresponding ranges of versions. Then it's the dependency provider that has the freedom of employing the strategy of picking a single package -version within the `choose_package_version` method. 
- -```rust -fn choose_package_version<T: Borrow<P>, U: Borrow<Range<V>>>( - &self, - potential_packages: impl Iterator<Item = (T, U)>, -) -> Result<(T, Option<V>), Box<dyn Error>>; -``` ## Picking a package @@ -40,7 +24,7 @@ potential packages, it will continue to be proposed as such until we pick it, or a conflict shows up and the solution is backtracked before needing it. Imagine that one such potential package is limited to a range containing no -existing version, we are heading directly to a conflict! So we have better +existing version, we are heading directly to a conflict! So we are better off dealing with that conflict as soon as possible, instead of delaying it for later since we will have to backtrack anyway. Consequently, we always want to pick first a conflictual potential package with no valid version. Similarly, @@ -48,29 +32,15 @@ potential packages with only one valid version give us no choice and limit options going forward, so we might want to pick such potential packages before others with more version choices. Generalizing this strategy to picking the potential package with the lowest number of valid versions is a rather good -heuristic performance-wise. - -This strategy is the one employed by the `OfflineDependencyProvider`. For -convenience, we also provide a helper function -`choose_package_with_fewest_versions` directly embedding this strategy. It can -be used directly in `choose_package_version` if provided a helper function to -retrieve existing versions of a package -`list_available_versions: Fn(&P) -> Iterator<Item = V>`. +heuristic performance-wise. This strategy is the one employed by the +`OfflineDependencyProvider`. You can use the `PackageResolutionStatistics` +passed into `prioritize` as a heuristic for how conflict-prone a package is: the more conflicts +a package had, the higher its priority should be. 
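One possible way to combine the heuristics from this section is sketched below, under stated assumptions: `conflict_count` stands in for whatever number you derive from `PackageResolutionStatistics`, and `matching_versions` is the number of known versions inside the current range. The `Priority` tuple and the helper names are hypothetical and not part of the `pubgrub` API, and the relative weight of the two signals is a judgment call.

```rust
use std::cmp::Reverse;

/// Hypothetical priority type: a greater value means "decide this package first".
pub type Priority = (bool, u32, Reverse<usize>);

/// Builds a priority from two inputs (both are assumptions of this sketch):
/// - `conflict_count`: how often the package was involved in conflicts so far,
/// - `matching_versions`: how many known versions fall inside the current range.
pub fn priority(conflict_count: u32, matching_versions: usize) -> Priority {
    (
        // A package with no valid version is a guaranteed conflict: surface it first.
        matching_versions == 0,
        // Then prefer packages that have conflicted more often.
        conflict_count,
        // Then prefer packages with fewer candidate versions.
        Reverse(matching_versions),
    )
}

/// Picks the candidate with the highest priority from
/// `(name, conflict_count, matching_versions)` triples.
pub fn next_package<'a>(candidates: &[(&'a str, u32, usize)]) -> Option<&'a str> {
    candidates
        .iter()
        .max_by_key(|(_, conflicts, versions)| priority(*conflicts, *versions))
        .map(|(name, _, _)| *name)
}
```

Because tuples compare lexicographically, a package with no valid version always comes first, then packages with more conflicts, then packages with fewer candidate versions. Returning such a tuple from `prioritize` would only require that your `Priority` type is `Ord + Clone`.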
## Picking a version -By default, the version returned by the helper function -`choose_package_with_fewest_versions` is the first compatible one in the -iterator returned by `list_available_versions` for the chosen package. So you -can order the iterator with preferred versions first and they will be picked by -the solver. This is very convenient to easily switch between a dependency -provider that picks the most recent compatible packages and one that chooses -instead the oldest compatible versions. Such behavior may be desirable for -checking that dependencies lower bounds still pass the code tests for example. - -In general, letting the dependency provider choose a version in -`choose_package_version` provides a great deal of flexibility and enables things -like +In general, letting the dependency provider choose a version in `choose_version` +provides a great deal of flexibility and enables things like - choosing the newest versions, - choosing the oldest versions, diff --git a/src/testing/benchmarking.md b/src/testing/benchmarking.md index e2b9302..c82793f 100644 --- a/src/testing/benchmarking.md +++ b/src/testing/benchmarking.md @@ -1,20 +1,23 @@ # Benchmarking -Performance optimization is a tedious but very rewarding practice if done right. -It requires rigor and sometime arcane knowledge of lol level details. If you are -interested in performance optimization for pubgrub, we suggest reading first -[The Rust Performance Book][perf-book] +If you are interested in performance optimization for pubgrub, we suggest +reading [The Rust Performance Book][perf-book]. +[Microbenchmarks generally do not represent real world performance](https://youtu.be/eh3VME3opnE&t=315), +so it's best to start with a slow case in [uv](https://github.com/astral-sh/uv) +or [cargo](https://github.com/Eh2406/pubgrub-crates-benchmark) and look at a +profile, e.g. with [samply](https://github.com/mstange/samply). 
[perf-book]: https://nnethercote.github.io/perf-book/ -## Side note about code layout and performance +A first step is optimizing IO/network requests, caching and type size in the +downstream code, which is usually more of a bottleneck than the pubgrub algorithm itself. +Using `Arc` and splitting types into large and small variants often helps a lot. +The next step is to optimize prioritization, to reduce the amount of work that +pubgrub has to do; here, pubgrub should give better information to make this +easier for users. Both steps usually need to be done before pubgrub itself becomes a +bottleneck. -Changing your username has an impact on the performances of your code. This is -not clickbait I promess, but simply an effect of layout changes. Se before -making any assumption on performance improvement or regression try making sure -that measures are actually reflecting the intent of the code changes and not -something else. It was shown that code layout changes can produce ±40% in -performance changes. +## Side note about code layout and performance I highly recommend watching the talk ["Performance Matters" by Emery Berger][perf-talk] presented at strangeloop 2019 for more information on "sound diff --git a/src/testing/property.md b/src/testing/property.md index 3396379..9ddbb61 100644 --- a/src/testing/property.md +++ b/src/testing/property.md @@ -7,6 +7,9 @@ located in the `tests/` directory. Property tests are co-located both with unit tests and integration tests depending on required access to some private implementation details. +`version_ranges` additionally exposes `version_ranges::proptest_strategy` to +help test both pubgrub and user code. + ## Examples We have multiple example cases inside `tests/examples.rs`. Those mainly come