
NVD download not setting read timeout for HTTP5 Client #7418

Open

danshome opened this issue Feb 17, 2025 · 28 comments

@danshome

danshome commented Feb 17, 2025

[x] I checked the issues list for existing open or closed reports of the same problem.

Describe the bug
When updating the NVD CVE feed, the download process hangs indefinitely after successfully downloading a portion of the records (e.g., at 120,000/281,554). Despite repeated warnings such as “Retrying request /rest/json/cves/2.0?resultsPerPage=2000&startIndex=130000 : 3rd time”, the update never times out or fails gracefully, leaving the build stuck. If you kill the process you have to start over from scratch; it would be nice if it could resume after the last batch it downloaded.

Version of dependency-check used
The problem occurs using version 12.1.0 of the dependency-check CLI.

Log file
Please see the full log output here:

09:38:35 [INFO] Checking for updates
09:38:57 [INFO] NVD API has 281,554 records in this update
09:40:11 [INFO] Downloaded 10,000/281,554 (4%)
09:40:56 [INFO] Downloaded 20,000/281,554 (7%)
09:41:48 [INFO] Downloaded 30,000/281,554 (11%)
09:42:27 [WARNING] Retrying request /rest/json/cves/2.0?resultsPerPage=2000&startIndex=34000 : 3rd time
09:43:20 [INFO] Downloaded 40,000/281,554 (14%)
09:43:58 [WARNING] Retrying request /rest/json/cves/2.0?resultsPerPage=2000&startIndex=42000 : 3rd time
09:44:31 [INFO] Downloaded 50,000/281,554 (18%)
09:45:41 [INFO] Downloaded 60,000/281,554 (21%)
09:47:00 [INFO] Downloaded 70,000/281,554 (25%)
09:48:16 [INFO] Downloaded 80,000/281,554 (28%)
09:49:09 [INFO] Downloaded 90,000/281,554 (32%)
09:50:41 [INFO] Downloaded 100,000/281,554 (36%)
09:51:06 [WARNING] Retrying request /rest/json/cves/2.0?resultsPerPage=2000&startIndex=130000 : 3rd time
09:52:33 [INFO] Downloaded 110,000/281,554 (39%)
09:53:41 [INFO] Downloaded 120,000/281,554 (43%)  <<--- It's been hung here for 5 hours.

To Reproduce
Steps to reproduce the behavior:

Run a dependency-check scan that triggers an update of the NVD API feed.
Observe that the update process begins and successfully downloads a portion of the records.
Notice repeated warnings about retrying requests after a certain record count.
The process hangs indefinitely (e.g., stuck at 120,000/281,554 records) without triggering a timeout.

Expected behavior
The update process should respect a configured read timeout so that if the NVD API stops sending data, the process fails gracefully with an appropriate error message rather than hanging indefinitely. If the read times out, it should retry; no sleep is necessary since it has already waited long enough to hit the read timeout.

Additional context
It appears that while there is a CONNECTION_READ_TIMEOUT setting in URLConnectionFactory, the NvdApiDataSource and related code use Apache Http5 Client without setting the response timeout (via setResponseTimeout). As a result, if the API stops sending data, the connection never times out. A fix would involve configuring a proper response timeout when using the Http5 client and retry logic.
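For illustration, here is a minimal sketch (not dependency-check's actual code; the class name and timeout values are just examples) of setting a response timeout on an Apache HttpClient 5 client so that a stalled read eventually fails instead of blocking forever:

import org.apache.hc.client5.http.config.RequestConfig;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.util.Timeout;

public class ResponseTimeoutSketch {
    public static void main(String[] args) throws Exception {
        // Without an explicit response timeout, a stalled NVD response can block the read indefinitely.
        RequestConfig requestConfig = RequestConfig.custom()
                .setConnectionRequestTimeout(Timeout.ofSeconds(30)) // max wait for a pooled connection
                .setResponseTimeout(Timeout.ofSeconds(60))          // max wait for response data on the socket
                .build();

        try (CloseableHttpClient client = HttpClients.custom()
                .setDefaultRequestConfig(requestConfig)
                .build()) {
            // client.execute(...) calls will now fail with a timeout instead of hanging
        }
    }
}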

@danshome danshome added the bug label Feb 17, 2025
@danshome
Author

Note that there are several comments in #7406 where others have reported the same hang.

@devsecops-pe

The same thing happens to us too; we are unable to get beyond 36% of the update.

@costas80

Same here. Running v12.1.0 via Maven and it keeps getting stuck:

...
[INFO] Downloaded 210,000/281,633 (75%)

@pyropunk

pyropunk commented Feb 18, 2025

using 12.1. And it only gets to 11%
INFO: Finished configuration in 51 ms.
[INFO] Checking for updates
[INFO] NVD API has 281,634 records in this update
[INFO] Downloaded 10,000/281,634 (4%)
[INFO] Downloaded 20,000/281,634 (7%)
[INFO] Downloaded 30,000/281,634 (11%)

@donlouie

I encountered the same issue, with progress halting at 43%, even after using a new NVD API Key.

[Image: screenshot of the download stalled at 43%]

@Benjamin-deToy

Is there a solution to this yet?

@costas80

As a workaround I followed the DB caching guide to avoid full DB downloads. As long as there are no DB format changes requiring a full download again, this avoids the issue.

@jeremylong
Collaborator

@costas80 there are many database changes - but as long as you are using the H2 database, most upgrades can update the schema without requiring a purge.

@mjanczykowski

I have caching but I'm unable to finish the initial download. Max 60%...

@jeremylong
Collaborator

To prevent issues like this in the future, consider setting up a mirror for the NVD data. For instance, this one is used to build the dependency-check GitHub Action.

There is an issue at the NVD side due to the volume of requests. The more people that use mirrors or use some other mechanism to keep the data directory persisted between executions of dependency-check, the better.

@kwin
Contributor

kwin commented Feb 19, 2025

Is there any example of how to set up caching for GitHub Actions and the Maven plugin?

@jeremylong
Collaborator

Here is an action that creates a cache in the repository's gh-pages: https://github.com/dependency-check/DependencyCheck_Builder/blob/main/.github/workflows/cache.yml

Then you just point the datafeed URL to the pages URL.

@ADegele

ADegele commented Feb 19, 2025

@jeremylong Is there a way to make the NVD update be saved in chunks? If the updater committed changes to the local H2 DB every 10,000 records, we could just restart the update when it fails and wouldn't need to start from zero after having downloaded half of the whole DB.

@kwin
Contributor

kwin commented Feb 19, 2025

Is it OK to reference https://dependency-check.github.io/DependencyCheck_Builder/nvd_cache/ or should everyone set up their own cache?

@jeremylong
Collaborator

I know people do reference the builder cache. It is currently maintained.

@danshome
Author

danshome commented Feb 19, 2025

To prevent issues like this in the future, consider setting up a mirror for the NVD data. For instance, this one is used to build the dependency-check GitHub Action.

There is an issue at the NVD side due to the volume of requests. The more people that use mirrors or use some other mechanism to keep the data directory persisted between executions of dependency-check, the better.

@jeremylong There is a behavioral difference now because the Apache HTTP/5 client is being used. I believe you had a read timeout before, but since the newer version uses the HTTP/5 client and does not set the response timeout, it defaults to 0, which means it will hang indefinitely. I bet this is exhausting the NVD server-side connections because they don't seem to be timing out the connection either.

@Benjamin-deToy

I know people do reference the builder cache. It is currently maintained.

I am attempting a switch to the builder cache... and am getting the following error with the download. The only configuration setting I changed was the NVD_API_DATAFEED_URL. What am I missing here?

| [2025-02-19 15:41:45.774] [ERROR] org.owasp.dependencycheck.data.update.nvd.api.DownloadTask - Error downloading NVD CVE - https://dependency-check.github.io/DependencyCheck_Builder/nvd_cache/nvdcve-2020.json.gz Reason: Cannot invoke "org.owasp.dependencycheck.utils.Settings.getString(String)" because "this.settings" is null
| [2025-02-19 15:41:45.774] [ERROR] org.owasp.dependencycheck.data.update.nvd.api.DownloadTask - Error downloading NVD CVE - https://dependency-check.github.io/DependencyCheck_Builder/nvd_cache/nvdcve-2019.json.gz Reason: Cannot invoke "org.owasp.dependencycheck.utils.Settings.getString(String)" because "this.settings" is null
| [2025-02-19 15:41:45.775] [ERROR] org.owasp.dependencycheck.Engine - The execution of the download was interrupted
| org.owasp.dependencycheck.data.update.exception.UpdateException: The execution of the download was interrupted
| 	at org.owasp.dependencycheck.data.update.NvdApiDataSource.processDownload(NvdApiDataSource.java:283)
| 	at org.owasp.dependencycheck.data.update.NvdApiDataSource.processDatafeed(NvdApiDataSource.java:172)
| 	at org.owasp.dependencycheck.data.update.NvdApiDataSource.update(NvdApiDataSource.java:115)
| 	at org.owasp.dependencycheck.Engine.doUpdates(Engine.java:903)
| 	at org.owasp.dependencycheck.Engine.initializeAndUpdateDatabase(Engine.java:708)
| 	at org.owasp.dependencycheck.Engine.analyzeDependencies(Engine.java:634)
| 	at nvd.task.check$scan_and_analyze$fn__793.invoke(check.clj:52)
| 	at nvd.task.check$scan_and_analyze.invokeStatic(check.clj:51)
| 	at nvd.task.check$scan_and_analyze.invoke(check.clj:46)
| 	at nvd.task.check$impl.invokeStatic(check.clj:89)
| 	at nvd.task.check$impl.invoke(check.clj:81)
| 	at nvd.task.check$_main.invokeStatic(check.clj:148)
| 	at nvd.task.check$_main.doInvoke(check.clj:100)

@aikebah
Collaborator

aikebah commented Feb 19, 2025

@Benjamin-deToy Looks like a missing initialisation... using the Gradle plugin?

@aikebah
Collaborator

aikebah commented Feb 19, 2025

@Benjamin-deToy it's another piece of code that uses this project's engine I think (check.clj)... At the point where that code initializes the DependencyCheck Engine with settings, it should also initialize the Downloader used by the DownloadTask with the same settings (this is a one-off initialisation, once per JVM, as the Downloader is a singleton used by various HTTP communication tasks):

Downloader.getInstance().configure(settings);

How such a static initializer call translates to Clojure I'll have to leave to the Clojure specialists over at nvd-clojure.
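For other tools that embed the engine, a minimal sketch of doing both initialisations together (assuming the usual package locations of Engine, Settings, and Downloader; this is illustrative, not code from nvd-clojure):

import org.owasp.dependencycheck.Engine;
import org.owasp.dependencycheck.utils.Downloader;
import org.owasp.dependencycheck.utils.Settings;

public class EngineBootstrapSketch {
    public static void main(String[] args) throws Exception {
        Settings settings = new Settings();
        // One-off per JVM: configure the Downloader singleton with the same settings
        // the Engine uses, otherwise DownloadTask fails because "this.settings" is null.
        Downloader.getInstance().configure(settings);
        try (Engine engine = new Engine(settings)) {
            engine.analyzeDependencies(); // triggers the NVD update as part of the scan
        }
    }
}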

@donlouie

Hi all, is there any example of how to set up caching for dependency check in an Azure DevOps Pipeline?

@eerik-te

Hi all, is there any example of how to set up caching for dependency check in an Azure DevOps Pipeline?

I would also appreciate this information. Currently the initial download gets stuck for me, so I assume the cache never gets built.

@aikebah
Collaborator

aikebah commented Feb 20, 2025

@donlouie @eerik-te Best to check within the azuredevops repo https://github.com/dependency-check/azuredevops/ for any clues on that. dependency-check/azuredevops#142 popped up for me on a quick look; I did not go through the thread itself.

@alejandro-isar

I made the change to use the builder cache and it worked fine for me yesterday, but today I am running some more tests and I am getting errors like this one when trying to download the database from the builder cache. Any idea of what could be happening? @jeremylong @aikebah

[ERROR] Failed to process CVE-2013-6910
org.owasp.dependencycheck.data.nvdcve.DatabaseException: Unable to retrieve id for new vulnerability for 'CVE-2013-6910'
	at org.owasp.dependencycheck.data.nvdcve.CveDB.updateOrInsertVulnerability(CveDB.java:1399)
	at org.owasp.dependencycheck.data.nvdcve.CveDB.updateVulnerability(CveDB.java:1098)
	at org.owasp.dependencycheck.data.update.nvd.api.NvdApiProcessor.updateCveDb(NvdApiProcessor.java:119)
	at org.owasp.dependencycheck.data.update.nvd.api.NvdApiProcessor.call(NvdApiProcessor.java:102)
	at org.owasp.dependencycheck.data.update.nvd.api.NvdApiProcessor.call(NvdApiProcessor.java:40)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-232]
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:690)
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:489)
	at org.h2.message.DbException.get(DbException.java:212)
	at org.h2.engine.SessionLocal.getTransaction(SessionLocal.java:1616)
	at org.h2.engine.SessionLocal.startStatementWithinTransaction(SessionLocal.java:163

@aikebah
Collaborator

aikebah commented Feb 20, 2025

The database has been closed [90098-232]

makes me suspect there is an earlier error reported as well. But typically this type of error is likely to indicate a corrupted datafile.

@danshome
Author

@jeremylong @aikebah With a debugger attached, I can't reproduce the problem, but the issue seems to be in the processApi method. I think I'm getting close to figuring it out.

@danshome
Author

danshome commented Feb 20, 2025

@jeremylong @aikebah I tried testing #7437. It didn't help the situation.

I attached a debugger and could see the problem was happening somewhere in processApi in NvdApiDataSource.

I asked ChatGPT to see if it could find any areas where hangs might occur, and with the new code it created, I was finally able to get the database downloaded very smoothly with no timeouts or issues.

I've attached a patch file that allowed me to get my first successful download. I've wiped out the H2 multiple times and can download with no problems. As soon as the patch is reverted, the download issues return.

I don't know the codebase well enough to submit a pull request where the new configuration parameters are properly integrated, but here is a patch file that works; note that it has hardcoded values for some configuration parameters that really should be plugin configuration parameters. A minimal sketch of the timeout-and-retry pattern is included after the summary below.

Patch details...

processApi.patch

Introduced Retry Logic and Overall Timeout:

  • Wrapped the API update process in a retry loop (up to 5 attempts) to prevent indefinite hangs.
  • Defined an overall update timeout of 30 minutes to ensure the process does not block forever.
  • Added exponential backoff between retries (starting at 5 seconds and capped at 30 seconds) to reduce pressure on the API when issues persist.

Added Per-Batch Processing Timeouts:

  • Set a default processing timeout of 60 seconds for each batch (using a constant instead of the unresolved NVD_API_PROCESSING_TIMEOUT).
  • If a batch times out, the corresponding Future is canceled and a TimeoutException is thrown to trigger a retry.

Improved Executor and Thread Management:

  • Created a dedicated single-thread executor (updateExecutor) to run the update process and a fixed thread pool executor for processing tasks (using PROCESSING_THREAD_POOL_SIZE defined externally).
  • Declared the Future future variable outside the try block so that it can be referenced in catch clauses for proper cancellation.
  • Ensured both executors are shut down gracefully using try-finally blocks with shutdown() and awaitTermination(), and falling back to shutdownNow() if necessary.

Enhanced Exception Handling and Logging:

  • Logged detailed error messages on timeouts and execution exceptions.
  • Propagated InterruptedExceptions properly by restoring the interrupt flag and wrapping them in an UpdateException.
  • Improved clarity in exception reporting by distinguishing between API-specific errors (such as 403/404 errors) and other execution failures.

Refactored Pre-Update and API Builder Configuration:

  • Moved pre-update checks (e.g., last update timestamp and valid duration) inside the retry loop so that each attempt re-evaluates whether an update is necessary.
  • Consolidated API builder configuration, including setting the endpoint, API key (with different delays if the key is missing), results-per-page, max retry count, and delay settings.

General Code Clean-Up:

  • Removed obsolete or problematic references (e.g., the unresolved symbol for processing timeout) and replaced them with well-defined constants.
  • Updated import statements to include required classes (Callable, TimeoutException, TimeUnit).
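
To illustrate the per-batch timeout, cancellation, and exponential backoff described above, here is a minimal, self-contained sketch; the constant values and method names are hypothetical stand-ins for the hardcoded values in the patch, not the patch itself:

import java.util.concurrent.*;

public class RetryWithTimeoutSketch {

    // Hypothetical constants standing in for the values hardcoded in the patch.
    private static final int MAX_ATTEMPTS = 5;
    private static final long BATCH_TIMEOUT_SECONDS = 60;
    private static final long INITIAL_BACKOFF_MS = 5_000;
    private static final long MAX_BACKOFF_MS = 30_000;

    /** Runs one batch with a hard timeout; retries with exponential backoff on failure. */
    static void processBatchWithRetry(ExecutorService executor, Callable<Void> batch)
            throws InterruptedException {
        long backoff = INITIAL_BACKOFF_MS;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            Future<Void> future = executor.submit(batch);
            try {
                // Bounds the wait so a stalled download no longer hangs forever.
                future.get(BATCH_TIMEOUT_SECONDS, TimeUnit.SECONDS);
                return; // batch completed
            } catch (TimeoutException | ExecutionException e) {
                future.cancel(true); // interrupt the stalled batch
                if (attempt == MAX_ATTEMPTS) {
                    throw new IllegalStateException("Batch failed after " + attempt + " attempts", e);
                }
                Thread.sleep(backoff);
                backoff = Math.min(backoff * 2, MAX_BACKOFF_MS); // exponential backoff, capped
            }
        }
    }
}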

@devsecops-pe

Hi @danshome, how did you install this patch?

@danshome
Author

danshome commented Feb 21, 2025

Hi @danshome, how did you install this patch?

@devsecops-pe It's a patch to the source code; it's not something you install. You'd have to download the source code, apply the patch, and build it before deploying it to an internal repository like Nexus. I've only tested it locally, and it seems to be more stable, but it could just be a red herring, since when I reverted to 12.1.0 later yesterday I was able to get a download. I think the best option, until @jeremylong or someone who knows the codebase better than I do has time to look at it and integrate it, is to check out #7446 or refer to @jeremylong's post from two days ago.

Here is an action that creates a cache in the repository's gh-pages: https://github.com/dependency-check/DependencyCheck_Builder/blob/main/.github/workflows/cache.yml

Then you just point the datafeed URL to the pages URL.
