Subsystem: Search 🔎
This job updates the search auxiliary files used by the search service. It also updates the downloads and owners data in the Azure Search "search" index.
Note that the "excluded packages" list is currently not updated by this job. For more information, see the ExcludedPackages.v1.json documentation.
You can run this job using:
NuGet.Jobs.Auxiliary2AzureSearch.exe -Configuration path\to\your\settings.json
This job is a singleton. Only a single instance of the job should be running per Azure Search resource.
The easiest way to run the tool if you are on the nuget.org team is to use the DEV environment resources:
- Install the certificate used to authenticate as our client AAD app registration into your CurrentUser certificate store.
- Clone our internal NuGetDeployment repository.
- Update your cloned copy of the DEV Auxiliary2AzureSearch appsettings.json file to authenticate using the certificate you installed:
{
  ...
  "KeyVault_VaultName": "PLACEHOLDER",
  "KeyVault_ClientId": "PLACEHOLDER",
  "KeyVault_CertificateThumbprint": "PLACEHOLDER",
  "KeyVault_ValidateCertificate": true,
  "KeyVault_StoreName": "My",
  "KeyVault_StoreLocation": "CurrentUser"
  ...
}
- Update the -Configuration CLI option to point to the DEV Azure Search settings: NuGetDeployment/src/Jobs/NuGet.Jobs.Cloud/Jobs/Auxiliary2AzureSearch/DEV/northcentralus/a/appsettings.json
As an alternative to using nuget.org's DEV resources, you can also run this tool using your personal Azure resources.
Run the Db2AzureSearch tool.
Once you've created your Azure resources, you can create your settings.json file. There are a few PLACEHOLDER values you will need to fill in yourself:
- The GalleryDb:ConnectionString setting is the connection string to your Gallery DB.
- The SearchServiceName setting is the name of your Azure Search resource. For example, use the name foo-bar for the Azure Search service with URL https://foo-bar.search.windows.net.
- The SearchServiceApiKey setting is an admin key that has write permissions to the Azure Search resource. Make sure the Azure Search resource you're connecting to has API keys enabled (either in parallel with managed identities "RBAC" access or with managed identities authentication disabled).
- The AuxiliaryDataStorageContainer and StorageConnectionString settings are the connection strings to your Azure Blob Storage account.
{
  "GalleryDb": {
    "ConnectionString": "PLACEHOLDER"
  },
  "Auxiliary2AzureSearch": {
    "AzureSearchBatchSize": 1000,
    "MaxConcurrentBatches": 1,
    "MaxConcurrentVersionListWriters": 32,
    "SearchServiceName": "PLACEHOLDER",
    "SearchServiceApiKey": "PLACEHOLDER",
    "SearchIndexName": "search-000",
    "HijackIndexName": "hijack-000",
    "StorageConnectionString": "PLACEHOLDER",
    "StorageContainer": "v3-azuresearch-000",
    "StoragePath": "",
    "DownloadsV1JsonUrl": "PLACEHOLDER",
    "MinPushPeriod": "00:00:10",
    "MaxDownloadCountDecreases": 30000,
    "EnablePopularityTransfers": true,
    "Scoring": {
      "PopularityTransfer": 0.99
    }
  }
}
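Before running the job, it can help to confirm that no PLACEHOLDER values remain in your settings.json. The following is a minimal Python sketch and is not part of the NuGet.Jobs tooling: the helper names find_placeholders and search_service_url are hypothetical, the settings fragment is illustrative, and the URL format is simply the one from the foo-bar example above.

```python
import json


def find_placeholders(node, path=""):
    """Return colon-separated paths of all settings still set to "PLACEHOLDER"."""

    def join(segment):
        # Build paths like "GalleryDb:ConnectionString".
        return f"{path}:{segment}" if path else str(segment)

    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            found.extend(find_placeholders(value, join(key)))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_placeholders(value, join(index)))
    elif node == "PLACEHOLDER":
        found.append(path)
    return found


def search_service_url(service_name):
    """Build the Azure Search URL for a service name such as foo-bar."""
    return f"https://{service_name}.search.windows.net"


# Illustrative fragment; in practice, load your own file with json.load().
settings = json.loads("""
{
  "GalleryDb": {"ConnectionString": "PLACEHOLDER"},
  "Auxiliary2AzureSearch": {
    "SearchServiceName": "foo-bar",
    "SearchServiceApiKey": "PLACEHOLDER",
    "SearchIndexName": "search-000"
  }
}
""")

missing = find_placeholders(settings)
print(missing)
print(search_service_url(settings["Auxiliary2AzureSearch"]["SearchServiceName"]))
```

An empty list from find_placeholders means every value has been filled in. The check says nothing about whether the values are valid.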
At a high level, here's how Auxiliary2AzureSearch works:
- Update verified packages
  - Get the "old" list of verified package IDs from search auxiliary storage
  - Get the "new" list of verified package IDs from Gallery DB
  - Replace the verified package auxiliary file if needed
- Update downloads
  - Get the "old" downloads data from search auxiliary storage
  - Get the "new" downloads data from statistics auxiliary storage via URL
  - Determine which packages have download changes
  - Get the "old" popularity transfers data from search auxiliary storage
  - Get the "new" popularity transfers data from Gallery DB
  - Determine which packages have popularity transfer changes
  - Update Azure Search documents in the "search" index to reflect the latest downloads and popularity transfers
  - Save the "new" downloads data to the search auxiliary storage
  - Save the "new" popularity transfers data to search auxiliary storage
- Update owners
  - Get the "old" owners data from search auxiliary storage
  - Get the "new" owners data from Gallery DB
  - Update Azure Search documents in the "search" index to reflect the ownership changes, if any
  - Track ownership changes in search auxiliary storage
  - Save the "new" owners data to the search auxiliary storage
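The "old" versus "new" comparison that recurs in the steps above can be sketched as a snapshot diff. This is an illustrative Python sketch only, not the job's actual implementation: the function name changed_keys, the sample package IDs, and the in-memory data shapes (a package ID mapped to a download count or to a set of owners) are all assumptions made for the example.

```python
_MISSING = object()  # Sentinel for a key absent from one snapshot.


def changed_keys(old, new):
    """Return the sorted keys whose value differs between two snapshots.

    A key that is present in only one snapshot counts as changed.
    """
    return sorted(
        key
        for key in old.keys() | new.keys()
        if old.get(key, _MISSING) != new.get(key, _MISSING)
    )


# Verified packages: replace the auxiliary file only if the ID set changed.
old_verified = {"PackageA", "PackageB"}
new_verified = {"PackageA", "PackageB"}
replace_verified_file = old_verified != new_verified
print(replace_verified_file)  # False

# Downloads: only packages whose count changed need a document update.
old_downloads = {"PackageA": 100, "PackageB": 50}
new_downloads = {"PackageA": 120, "PackageB": 50, "PackageC": 1}
download_changes = changed_keys(old_downloads, new_downloads)
print(download_changes)  # ['PackageA', 'PackageC']

# Owners: compare owner sets per package ID.
old_owners = {"PackageA": {"alice"}, "PackageB": {"bob"}}
new_owners = {"PackageA": {"alice", "carol"}, "PackageB": {"bob"}}
owner_changes = changed_keys(old_owners, new_owners)
print(owner_changes)  # ['PackageA']
```

Computing the diff first means only the changed packages need to be written to the "search" index. The "new" snapshot is saved afterwards so that it becomes the "old" data for the next run.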