Add support for package type queries in both local and remote sources #5991

Open
wants to merge 7 commits into
base: dev

@glopesdev glopesdev commented Aug 25, 2024

Bug

Fixes: NuGet/Home#8915

Description

nuget.org allows filtering by package type using the packageType query parameter since SearchQueryService/3.5.0. This parameter has not been officially surfaced by the client API, although relevant infrastructure has already been introduced in the SearchFilter class:

public IEnumerable<string> PackageTypes { get; set; } = Enumerable.Empty<string>();

Inspecting the source code shows that the corresponding logic had already been introduced in PackageSearchResourceV3, but using the old query parameter name packageTypeFilter. The fix is simply to replace that query parameter, shown below, with packageType:

if (filters.PackageTypes != null
    && filters.PackageTypes.Any())
{
    var types = string.Join("&",
        filters.PackageTypes.Select(
            s => "packageTypeFilter=" + s));
    queryString += "&" + types;
}

For completeness, we also add package type filtering functionality to LocalPackageSearchResource by filtering over the package types available through package.Nuspec.GetPackageTypes().

PR Checklist

  • Meaningful title, helpful description and a linked NuGet/Home issue
  • Added tests
  • Link to an issue or pull request to update docs if this PR changes settings, environment variables, new feature, etc.

@glopesdev glopesdev requested a review from a team as a code owner August 25, 2024 19:25
@dotnet-policy-service dotnet-policy-service bot added the Community PRs created by someone not in the NuGet team label Aug 25, 2024
Comment on lines 143 to 146
      var types = string.Join("&",
          filters.PackageTypes.Select(
-             s => "packageTypeFilter=" + s));
+             s => "packageType=" + s));
      queryString += "&" + types;
Member

@joelverhagen are multiple package types defined as &packageType=type1&packageType=type2 or &packageType=type1+type2?

Additionally, are the semantics that the package must contain all package types, or just one? This is important, to make sure that local folder feed sources behave in the same way.

The docs are not sufficiently specific about multiple values.

Member

Only a single package type filter is supported. The package type parameter is plumbed through in our search service as a single string:
https://github.com/NuGet/NuGetGallery/blob/dc7abf2bd145d91596b032c19e1c9abe8b276956/src/NuGet.Services.AzureSearch/SearchService/SearchParametersBuilder.cs#L169-L172

PRs on the docs are welcome if they are unclear!
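
To illustrate the single-value behaviour described above, here is a minimal sketch of how a client could append one packageType parameter to a search query string. The class and method names are hypothetical and are not part of NuGet.Protocol; escaping the value with Uri.EscapeDataString is an assumption on my part rather than something the thread specifies.

```csharp
using System;

// Hypothetical helper, not part of NuGet.Protocol.
public static class SearchQuerySketch
{
    // Appends a single packageType parameter, since the search service
    // accepts only one package type filter per query.
    public static string AppendPackageType(string queryString, string packageType)
    {
        // No filter requested: leave the query string untouched.
        if (string.IsNullOrWhiteSpace(packageType))
        {
            return queryString;
        }

        // Escaping the value is an assumption made for this sketch.
        return queryString + "&packageType=" + Uri.EscapeDataString(packageType);
    }
}
```

With this shape there is no way to emit packageType more than once, which matches the single-string plumbing on the server side.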

Member

Regarding multiple package types on the package, the package type parameter matches the package if ANY of the package types match, per:

The packageType parameter is used to further filter the search results to only packages that have at least one package type matching the package type name.

Emphasis mine.
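
For a local folder feed to behave the same way, the filter would need the same "at least one match" semantics. A minimal sketch follows, assuming the declared package type names have already been read from the nuspec. The class and method names are hypothetical, and the case-insensitive comparison is an assumption that should be checked against the server's behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of NuGet.Protocol.
public static class LocalPackageTypeFilterSketch
{
    // Returns true when no filter is set, or when ANY declared package type
    // matches the single filter value.
    public static bool Matches(IEnumerable<string> declaredPackageTypes, string packageTypeFilter)
    {
        if (string.IsNullOrEmpty(packageTypeFilter))
        {
            return true;
        }

        // Case-insensitive comparison is an assumption made for this sketch.
        return (declaredPackageTypes ?? Enumerable.Empty<string>())
            .Any(t => string.Equals(t, packageTypeFilter, StringComparison.OrdinalIgnoreCase));
    }
}
```

A package declaring several types is therefore returned as soon as one of them matches; it does not need to match on all of them.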

Author

@glopesdev glopesdev Aug 26, 2024

@joelverhagen @zivkan Thank you both for the feedback. In that case, it seems that the API surface of SearchFilter might be slightly inconsistent with current SearchQueryService/3.5.0 capabilities, since SearchFilter allows for multiple package types to be specified.

How do you prefer to address this? Shall PackageSearchResourceV3 and LocalPackageSearchResource throw if more than one package type is provided? Or shall we deprecate the entire PackageTypes property and introduce a new PackageType property with a single value?

Personally I would prefer not touching the current surface API, since this feature is very much needed and breaking the surface API might delay adoption and confuse existing API consumers (having two properties vary their names by a single letter is usually very much discouraged).

@joelverhagen Is there any expectation that multiple package type filters will ever be supported? That might inform whether the deprecation route might be preferred.

Member

There is no expectation that NuGet.org should support multiple package type filters. If this is needed, we would need to design, implement, and document a protocol enhancement. I am not aware of any other package source that implements multiple package type filters.

IMHO, PackageTypes should be deprecated and PackageType should be introduced, to match the current state of the protocol. But I do not work on the client side much so I may be missing something.

Member

I don't know of any plans to add multiple package type support. It's not even clear what existing scenario could benefit from such a filter.

Member

Personally I would prefer not touching the current surface API

To be fair, this PR is breaking server APIs, even if it's not breaking the assembly's API. If there are any servers (that are not nuget.org) that use the current packageTypeFilter, this PR will break clients using those servers. There's evidence that (in 2018) neither nuget.org nor myget implemented packageTypeFilter, but there's no way to know if some other server uses it. On the other hand, the V2 protocol isn't documented, and packageTypeFilter was never part of the official V3 protocol.

I'm trying to rush reviewing a bunch of old PRs that I haven't reviewed in a long time, so I haven't yet formed an opinion about breaking APIs (about changing NuGet.Protocol to remove the plural PackageTypes and have a singular PackageType). There's a case to be made that NuGet.Client should implement the official protocol spec.

@dotnet-policy-service dotnet-policy-service bot added Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed and removed Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed labels Sep 3, 2024
@dotnet-policy-service dotnet-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Sep 11, 2024
@nkolev92 nkolev92 added the Merge next release PRs that should not be merged until the dev branch targets the next release label Sep 19, 2024
@dotnet-policy-service dotnet-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Sep 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Sep 27, 2024
@nkolev92 nkolev92 removed the Merge next release PRs that should not be merged until the dev branch targets the next release label Sep 27, 2024
@dotnet-policy-service dotnet-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Sep 27, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Oct 4, 2024
@fforjan

fforjan commented Nov 18, 2024

Any update on this?
@zivkan or @nkolev92?

@microsoft-github-policy-service microsoft-github-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Nov 18, 2024
@fforjan

fforjan commented Nov 18, 2024

The reason is that I could not find a way to filter on a package type for a search; I'm not aware of any workaround.

@microsoft-github-policy-service microsoft-github-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Nov 25, 2024
@zivkan
Member

zivkan commented Nov 27, 2024

Sorry again for the delay. The lead up to .NET 9 GA, and the days following, were very busy.

Anyway, if someone in the NuGet team was doing this, I would ask them for the following:

  1. The protocol docs say that the server needs to publish the SearchQueryService/3.5.0 resource in the service index for package type filtering to be available, so I think PackageSearchResourceV3 should throw a NotSupportedException when a package type is supplied, and the server does not have the SearchQueryService/3.5.0 resource.
  2. Clients need to know when a server supports package type filtering, so add a boolean property so that clients can make decisions and avoid unexpected exceptions.
  3. As stated here, I think we should break the existing API because the server protocol doesn't support multiple package type filters. I don't see the point in maintaining an API which never worked, and which is therefore very unlikely to be used by anyone. I used to be much more conservative about breaking public APIs, but NuGet.Client has accumulated too much tech debt, and public APIs are one contributing factor. Alternatively, throw an exception when more than one package type is provided, but this doesn't make me feel good. I don't see multi-package type filtering being added in the future.

I don't consider myself "good" at open source. After all, look at how much time passes between my comments here. I also don't know if it's reasonable to expect the same quality of contribution from external people as team members who work in the code every day. What I am confident about is that when something doesn't work as expected, NuGet will get the support requests, not the original contributor, which is why I typically don't approve PRs from external contributions unless they're close to the quality that I'd expect of a team member. And I'm working on higher priority changes, so I don't have the time to take over this PR and implement the 3 things I listed above.

@microsoft-github-policy-service microsoft-github-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Nov 27, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Dec 4, 2024
@glopesdev
Author

I think we should break the existing API because the server protocol doesn't support multiple package type filters.

@zivkan I was trying to do this and started getting the below analyzer error when declaring a single string property:

public string PackageType { get; set; }

RS0016 All public types and members should be declared in PublicAPI.txt. This draws attention to API changes in the code reviews and source control history, and helps prevent breaking changes.

Upon investigation there are actually three different PublicAPI.Shipped.txt files (net472, netcoreapp5.0, and netstandard2.0) where I found the current API declared with the syntax:

~NuGet.Protocol.Core.Types.SearchFilter.PackageTypes.get -> System.Collections.Generic.IEnumerable<string>
~NuGet.Protocol.Core.Types.SearchFilter.PackageTypes.set -> void

Trying to follow the pattern for other properties I changed it to:

~NuGet.Protocol.Core.Types.SearchFilter.PackageType.get -> string
~NuGet.Protocol.Core.Types.SearchFilter.PackageType.set -> void

As expected this now raises the same RS0016 error for the existing PackageTypes property, but still does not let me specify the new PackageType property.

Is there anything obvious I might be missing? Also, should I be adding the new API to the Unshipped.txt files instead? This made me wonder whether we want to mark the old property PackageTypes as obsolete instead of simply removing it. Is there a convention or guidelines somewhere on the recommended approach to introducing such breaking changes?

Apologies in advance for not being very clear on the use and syntax of the PublicAPI.txt files.

@microsoft-github-policy-service microsoft-github-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Dec 17, 2024
@zivkan
Member

zivkan commented Dec 18, 2024

I also don't know the syntax for the public API files. When using an IDE, the analyzer will have a codefix (available from the "lightbulb" button), which will automatically add a syntactically correct line into the Unshipped file.

Unfortunately the analyzer (or Roslyn itself) only works on one target framework at a time, so you need to repeat the codefix for each target framework, as we documented here: https://github.com/NuGet/NuGet.Client/blob/dev/docs/nuget-sdk.md#development

@microsoft-github-policy-service microsoft-github-policy-service bot added Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed and removed Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed labels Dec 26, 2024
@glopesdev glopesdev force-pushed the dev-glopesdev-packagetype-queries branch from 287cdc3 to cf99bb6 Compare December 28, 2024 15:25
@glopesdev
Author

glopesdev commented Dec 28, 2024

@zivkan Thanks for the feedback, no worries about any delay.

  1. The protocol docs say that the server needs to publish the SearchQueryService/3.5.0 resource in the service index for package type filtering to be available, so I think PackageSearchResourceV3 should throw a NotSupportedException when a package type is supplied, and the server does not have the SearchQueryService/3.5.0 resource.

This is a good point, and made me go back to double-check whether SearchQueryService/3.5.0 was actually requested as an endpoint. It turns out that it had been left out entirely from the list of versions in ServiceTypes. I have now fixed this in 1e4abdf.

However, I am a bit at a loss at how to implement the check you requested, since the way that search query service endpoints are accessed right now is not via the "resource" abstraction. Instead, the API requests a list of URI endpoints via the GetServiceEntryUris extension method:

var endpoints = serviceIndex.GetServiceEntryUris(ServiceTypes.SearchQueryService);
var httpSourceResource = await source.GetResourceAsync<HttpSourceResource>(token);

Because PackageSearchResourceV3 works directly on the list of searchEndpoints, there is no information about versions that can be accessed in the current version of the code.

My feeling is perhaps we should refactor the entire approach to use GetServiceEntries directly, which returns a list of ServiceIndexEntry objects. Inside this class there is a ClientVersion property which I hope corresponds to the version of the SearchQueryService (can you confirm this is true?).

  2. Clients need to know when a server supports package type filtering, so add a boolean property so that clients can make decisions and avoid unexpected exceptions.

Given the above, I am going to interpret this as exposing a new boolean property in PackageSearchResourceV3, which will be true iff at least one of the endpoints is 3.5.0 compatible. We would then throw the exception only if none of the provided server endpoints supports 3.5.0.

Does this sound acceptable?
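
To make the proposal concrete, a rough sketch of the intended shape follows. Every name here is hypothetical and the class is a stand-in, not the real PackageSearchResourceV3. The sketch assumes the provider has already worked out whether any endpoint advertises SearchQueryService/3.5.0 and passes that in through the constructor.

```csharp
using System;

// Hypothetical stand-in for the search resource; not the real NuGet.Protocol type.
public sealed class PackageSearchSketch
{
    public PackageSearchSketch(bool supportsPackageTypeFilter)
    {
        SupportsPackageTypeFilter = supportsPackageTypeFilter;
    }

    // True when at least one endpoint supports package type filtering.
    public bool SupportsPackageTypeFilter { get; }

    // Builds the query string, throwing only when a package type is supplied
    // and no endpoint supports the filter.
    public string BuildQuery(string searchTerm, string packageType)
    {
        var query = "?q=" + Uri.EscapeDataString(searchTerm ?? string.Empty);

        if (string.IsNullOrEmpty(packageType))
        {
            return query;
        }

        if (!SupportsPackageTypeFilter)
        {
            throw new NotSupportedException(
                "This source does not advertise SearchQueryService/3.5.0, so package type filtering is unavailable.");
        }

        return query + "&packageType=" + Uri.EscapeDataString(packageType);
    }
}
```

Searches without a package type keep working against every source; only filtered searches against unsupported sources throw.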

  3. As stated here, I think we should break the existing API because the server protocol doesn't support multiple package type filters. I don't see the point in maintaining an API which never worked, and which is therefore very unlikely to be used by anyone. I used to be much more conservative about breaking public APIs, but NuGet.Client has accumulated too much tech debt, and public APIs are one contributing factor. Alternatively, throw an exception when more than one package type is provided, but this doesn't make me feel good. I don't see multi-package type filtering being added in the future.

Done, see cf99bb6.

@zivkan
Member

zivkan commented Dec 28, 2024

However, I am a bit at a loss at how to implement the check you requested, since the way that search query service endpoints are accessed right now is not via the "resource" abstraction. Instead, the API requests a list of URI endpoints via the GetServiceEntryUris extension method:

var endpoints = serviceIndex.GetServiceEntryUris(ServiceTypes.SearchQueryService);
var httpSourceResource = await source.GetResourceAsync<HttpSourceResource>(token);

Since PackageSearchResourceV3Provider gets the service entries, it can check the versions. Then PackageSearchResourceV3 can have a constructor parameter, or init property.

My feeling is perhaps we should refactor the entire approach to use GetServiceEntries directly, which returns a list of ServiceIndexEntry objects. Inside this class there is a ClientVersion property which I hope corresponds to the version of the SearchQueryService (can you confirm this is true?).

Something I don't like about the version strings we use is that it implies that newer client versions support all past features (making it "impossible" to deprecate features), and it implies that if a server wants to support a particular feature, it must also implement all past versions as well, even if they have features they don't want. This makes sense for some resources and features like package metadata (registration) where older versions needed the response to be uncompressed, and newer clients support gzip and deflate. I mean, that really should have been done via the Accept-Encoding header, but nuget.org's server architecture has a limitation that prevents that from being technically viable and therefore the people working on NuGet at the time decided to bake it into the service index resources instead.

Anyway, I'm getting way off topic. What I'm trying to get at is that, going forward, we probably need to treat the service index entries as feature flags, not as a client version: bool supportsPackageTypeFilter = entries.Contains("3.5.0"), bool supportsSomeFutureFeature = entries.Contains("9.8.7"). The max resource version should not imply that older features are supported.
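
As a rough illustration of the feature-flag idea, here is a sketch that checks for the exact resource type string rather than comparing version numbers. It assumes the service index has already been reduced to a plain list of resource type strings; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of NuGet.Protocol.
public static class SearchFeatureFlagsSketch
{
    public const string PackageTypeFilterResource = "SearchQueryService/3.5.0";

    // Treats the resource type as a feature flag: only an exact match counts,
    // so a newer version does not imply that the older feature is supported.
    public static bool SupportsPackageTypeFilter(IEnumerable<string> resourceTypes)
    {
        return resourceTypes.Any(t =>
            string.Equals(t, PackageTypeFilterResource, StringComparison.OrdinalIgnoreCase));
    }
}
```

Under this reading, a server that publishes only a hypothetical SearchQueryService/9.8.7 would not be treated as supporting the package type filter.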

Given the above, I am going to interpret this as exposing a new boolean property in PackageSearchResourceV3
...

Does this sound acceptable?

Yes, that's exactly what I was thinking.

@glopesdev
Author

glopesdev commented Dec 29, 2024

Something I don't like about the version strings we use is that it implies that newer client versions support all past features (making it "impossible" to deprecate features), and it implies that if a server wants to support a particular feature, it must also implement all past versions as well, even if they have features they don't want.

The protocol docs say that the server needs to publish the SearchQueryService/3.5.0 resource in the service index for package type filtering to be available, so I think PackageSearchResourceV3 should throw a NotSupportedException when a package type is supplied, and the server does not have the SearchQueryService/3.5.0 resource.

@zivkan I've been thinking more about these two points and I admit I am now conflicted on this. My personal use case is reproducing a NuGet client much like the VS code package manager. In these types of front-end, the user may select among a large number of package sources for search, including local folders, old v3 and v2 clients, or the "All" option which works as a mixed package source which can run search across all of the above simultaneously.

At that point, what should happen if the user runs a filtered search on "All" and by chance it happens that one of the sources does not support package type filtering? Should an unhandled exception bubble up to the user in this case? And what should the expected resolution be from a user point of view? Should we really disallow running package type filtering on the "All" source entirely whenever the user adds a single package source which does not support it? This would mean that to fix the situation the user would have to understand pretty deeply the NuGet service index protocol and modify by hand the package source list to get things working again.

I am really fearful this could create confusion and further error reports. Also it makes the abstraction of the SearchFilter class start to leak a bit. This class represents filtering options over all possible package sources for both v2 and v3 protocols. If suddenly the user of SearchFilter needs to worry about the exact contents of the service index entry list I feel the API will become even more cumbersome and complicated to use.

Not sure about others, but I know for example that we wouldn't use it in this case, and would simply default to having no filter and instead use our current workaround of wrapping the resource and modifying query strings directly, so that we retain the simplicity.

What might be the negative impact of simply ignoring the filter if the resource does not support it? Maybe @joelverhagen has some thoughts on this since related decisions were impacting the SDK, e.g. dotnet/sdk#12038?

@glopesdev
Author

glopesdev commented Dec 29, 2024

At that point, what should happen if the user runs a filtered search on "All" and by chance it happens that one of the sources does not support package type filtering?

I guess answering myself, one option might be we leave it to the client front-end to adjust its own implementation of the "All" feed to avoid launching queries to package sources which do not support package type filtering.

@zivkan
Member

zivkan commented Jan 2, 2025

one option might be we leave it to the client front-end to adjust its own implementation of the "All" feed to avoid launching queries to package sources which do not support package type filtering.

Yes, this was exactly my intention.

If suddenly the user of SearchFilter needs to worry about the exact contents of the service index entry list I feel the API will become even more cumbersome and complicated to use.

That's the point of the boolean property. The front ends check it before setting the type filter.
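
A front end's "All" feed could use the property roughly like this. It is a sketch only: SourceSketch and AllFeedSketch are hypothetical types standing in for whatever the client uses to represent its package sources.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of a package source as seen by a front end.
public sealed class SourceSketch
{
    public SourceSketch(string name, bool supportsPackageTypeFilter)
    {
        Name = name;
        SupportsPackageTypeFilter = supportsPackageTypeFilter;
    }

    public string Name { get; }
    public bool SupportsPackageTypeFilter { get; }
}

public static class AllFeedSketch
{
    // Picks the sources to query: all of them when no type filter is set,
    // otherwise only those that report support for the filter.
    public static IReadOnlyList<SourceSketch> SelectSources(
        IEnumerable<SourceSketch> sources, string packageType)
    {
        if (string.IsNullOrEmpty(packageType))
        {
            return sources.ToList();
        }

        return sources.Where(s => s.SupportsPackageTypeFilter).ToList();
    }
}
```

Sources that are skipped never receive the filtered query, so the exception is not raised and the results never contain unfiltered packages.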

What might be the negative impact of simply ignoring the filter if the resource does not support it?

Customers expecting the results to only contain packages matching the type filter, but getting packages that do not match the type filter, and then getting confused, frustrated, and/or reporting bugs to us.

Labels
Community PRs created by someone not in the NuGet team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature ask client side -- Surface PackageType query parameter in the NuGet.org search API
5 participants