Fetching Error with Custom Index #2099
Thanks! So, looking at the error, I think it's printing the entire fragment. Is the fragment |
I'll try spinning this up myself. |
I'm having trouble getting |
I do not see a # in the URL in the error message. It starts with 11eb, not #11eb. Copying just the last part of the error exactly, apart from clipping at the end: Caused by: Unexpected fragment (expected If there's an extra verbose mode or log file that captures the response, I could try to share a cleaned version of it. I can also try any other steps that may reveal more info.
Oh yes, sorry -- my question is about what's actually present in the HTML file that the index is using. Like... if you go to https://pypi.org/simple/requests/, and view source, you'll see that every entry looks like:
Notice how the end of the URL is |
I'm not too familiar with Rust, but I am familiar with pip/Python, so I added breakpoints inside pip to see where and how it processes the index HTML file. That logic lives around here. The response content looks like
If there are any other pip intermediate variables that'd be helpful, I can add those too. edit: I think its URL is relative, unlike the PyPI case. pip ends up joining the href value onto the index URL to form the full URL.
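For anyone following along, here is a minimal sketch of how a relative href resolves against an index page URL, using Python's standard `urllib.parse.urljoin`. The host name is hypothetical (the real internal index is not shown in this thread), and the file name and clipped fragment are placeholders shaped like the ones quoted here. This is not a copy of pip's code.

```python
from urllib.parse import urljoin

# Hypothetical index page URL; the real internal index is not shown in the thread.
index_url = "https://index.example.com/simple/internal-lib/"

# Placeholder relative href; the fragment is clipped, as in the error report.
href = "0.2.997/internal_lib-0.2.997-py3-none-any.whl#11eb"

# urljoin resolves the relative path against the page URL and keeps the fragment.
full_url = urljoin(index_url, href)
print(full_url)
```

Note that the fragment survives the join, so whatever follows the `#` in the HTML is still attached to the resolved URL afterwards.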
That's perfect, thanks. Does |
It returns None.
Okay, cool. (Relative URL is totally fine, by the way.) So pip is effectively ignoring the fragment, while we're raising an error. It feels like a bug in pypiserver or in its configuration, since it looks like a hash, but there should be a hash algorithm at the start, like: <a href="0.2.997/internal_lib-0.2.997-py3-none-any.whl#sha256=9599d3a5a9cbb4203f91395c93a06004eba3c78e0af0dc653b8cbed3eef77176" data-requires-python=">=3.7">internal_lib-0.2.997-py3-none-any.whl</a> The code in pypiserver seems to be roughly here: https://github.com/pypiserver/pypiserver/blob/50c7a78f4f4e288d023d667873b4cbac44e0915c/pypiserver/backend.py#L267. But I imagine that's much harder for you to debug. |
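To make the difference concrete, here is a rough Python sketch of a lenient fragment parser: it returns an (algorithm, digest) pair when the fragment has the `algorithm=digest` shape and `None` otherwise, instead of raising. The function name `hash_from_href` and the `KNOWN_ALGORITHMS` set are made up for illustration; this is neither pip's nor uv's actual code.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Assumed set of accepted algorithm names, for illustration only.
KNOWN_ALGORITHMS = {"md5", "sha256", "sha384", "sha512"}


def hash_from_href(href: str) -> Optional[Tuple[str, str]]:
    """Return (algorithm, hex_digest) from an `#algorithm=digest` fragment, else None."""
    fragment = urlsplit(href).fragment
    algorithm, sep, digest = fragment.partition("=")
    # A bare digest such as "#11eb..." has no "=", so it is ignored rather than rejected.
    if not sep or algorithm not in KNOWN_ALGORITHMS:
        return None
    return algorithm, digest


good = ("0.2.997/internal_lib-0.2.997-py3-none-any.whl"
        "#sha256=9599d3a5a9cbb4203f91395c93a06004eba3c78e0af0dc653b8cbed3eef77176")
bare = "0.2.997/internal_lib-0.2.997-py3-none-any.whl#" + "a" * 64  # placeholder digest

print(hash_from_href(good))
print(hash_from_href(bare))
```

A stricter client would raise on the second case, which is the behavior reported at the top of this issue.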
If it's a common thing, we could just loosen the validation and turn a missing hash prefix into a warning.
Mostly just trying to figure out how the server got into this state, e.g., whether there's an older pypiserver version that did this consistently.
Yeah, it's easy for me to get any pip/response info, but I'm unsure I have the access (or the knowledge) to debug our internal index server. Hopefully I can get one of the security engineers who works on our index server to check whether they know why the prefix is missing. edit: My personal guess is that our setup has an internal proxy server, and I have a feeling we only ever use one hash type (say sha256), so including the type was considered unnecessary. The proxy might also already be doing validation. That's a gut guess though, as I don't have full knowledge of the index setup.
If you can get any more info I'd really appreciate it!
Hmm, I found the code for the index server where it constructs that URL It's up to you whether uv should be as permissive as pip or whether to close this as expected. I'm unsure whether pip's choice was intentionally permissive (maybe other custom index servers have similar issues) or accidental.
The good news is that fixing the fragment fixes the original error. The bad news is that it still fails to install, with a different, later error message. As before I'm comparing Here's the new stack trace,
It does look like uv makes more progress and some requests work out. Like before, if there are any pip intermediate values that'd be helpful for debugging, I can find them. I'll try to find the exact URLs that pip requests. I notice uv requests the URL with the fragment; maybe pip is requesting the URL without the fragment?
Nice! To confirm, pip is hitting a URL like |
https://github.com/pypa/pip/blob/0ad4c94be74cc24874c6feb5bb3c2152c398a18e/src/pip/_internal/network/download.py#L116 Yup. pip sends the request without the fragment. I suspect that will fix this error, and fingers crossed uv will work for me then.
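As an illustration of the idea (not pip's actual code), stripping the fragment before issuing the download request can be sketched with Python's standard `urllib.parse.urldefrag`. The helper name `download_url` and the example host are hypothetical.

```python
from urllib.parse import urldefrag


def download_url(link: str) -> str:
    """Return the link without its #fragment, i.e. the URL to actually request."""
    url, _fragment = urldefrag(link)
    return url


# Hypothetical link of the same shape as the index entries discussed above.
link = "https://index.example.com/0.2.997/internal_lib-0.2.997-py3-none-any.whl#sha256=abc123"
print(download_url(link))
```

The fragment is only meaningful to the client (for hash checking), so the request can be sent without it.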
Oh wow, I had no idea. Okay, that's a bug on our side then.
I'll take a look now (but need to log off in a few, so may not resolve it in the next few mins).
Continuing the discussion here: my current hypothesis is inconsistent authentication headers between different requests. Logging the request headers would help, or you could point me to where the key requests (index and wheel fetches) are sent, and I'll try to learn enough Rust to do some print debugging.
@hmc-cs-mdrissi - Here's an example branch that should log at least the headers: #2175
Thank you. Here are the new logs, Error Logs
Highlighting header logs,
is successful request (index). The unsuccessful request (wheel),
I'm very unsure now, as both requests have an authorization header and the failed request's URL is a perfect match with pip's request. The only other noticeable difference I see is that the method type is different (HEAD vs. GET). Could authorization differ between HTTP request types? Or maybe our index doesn't support HEAD requests?
Perhaps the index is returning 404 on HEAD request? (You're seeing a 404 now, right?)
Yes, the error is 404. I'm asking security whether our index server supports HEAD requests. My confidence that this is the issue is low; it's just the most noticeable difference I see between pip's and uv's requests.
OK, I've confirmed this is likely the issue. If I use curl to send a GET request, it works. The same request (same headers) sent as HEAD returns 404. So the issue seems to be that our index server doesn't support HEAD.
Okay, let me see if I can add fallback.
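For readers landing here later, the general shape of such a fallback can be sketched as follows. This is a Python illustration under stated assumptions, not uv's implementation (uv is written in Rust, and the actual change is not reproduced in this thread). The names `probe`, `Transport`, and `fake_index`, as well as the set of status codes that trigger the retry, are all hypothetical.

```python
from typing import Callable, Tuple

# A transport is any callable taking (method, url) and returning (status, body).
Transport = Callable[[str, str], Tuple[int, bytes]]

# Assumed statuses that trigger the retry; the thread only reports a 404 on HEAD.
FALLBACK_STATUSES = {404, 405}


def probe(send: Transport, url: str) -> Tuple[str, int, bytes]:
    """Try HEAD first; if the server rejects it, retry the same URL with GET."""
    status, body = send("HEAD", url)
    if status in FALLBACK_STATUSES:
        status, body = send("GET", url)
        return "GET", status, body
    return "HEAD", status, body


def fake_index(method: str, url: str) -> Tuple[int, bytes]:
    # Stand-in for an index that serves GET but answers 404 to HEAD.
    return (200, b"wheel-bytes") if method == "GET" else (404, b"")


print(probe(fake_index, "https://index.example.com/internal_lib-0.2.997-py3-none-any.whl"))
```

Passing the transport in as a callable keeps the sketch testable without network access; a real client would plug in its HTTP library there.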
@hmc-cs-mdrissi - Could you try #2186?
It succeeded. I am now able to install internal-lib. I'll try a more complex command next (installing several internal libraries), but this is very promising.
Hmm, I tested installing our full environment of ~80 dependencies (~400 including transitive ones) using your PR. Some installs work; some still give 404 errors. Failed install logs; this one uses an actual public package. Some other public packages work, so it's something specific about the path/requests it takes. Failed Install
Good Install Run
What do you mean by "public package"? Is the index a proxy, and should the "failed" example be proxying to PyPI?
The index server supports both public and private (internal) packages. For public ones it proxies. I've gotten some public packages and some private ones to install successfully. Some fail. I still don't know what distinguishes a failing package from one that works fine. I also double-checked, and the package that failed is technically internal (it's public, but it was patched and re-uploaded internally, so it looks internal). Although maybe the fact that it's internal and its name also exists on PyPI has some relevance (or it's just luck; I need to try more packages). I need to review the logs first for any ideas. edit: Both runs (good and bad) have,
edit: I found another purely internal package that failed to install while other purely internal ones succeed.
I made a mistake. We have two index servers, staging and prod. Staging only contains some internal packages, and it's what I was using to test (because of the earlier sha256= fix). The failed installs happened because some packages were missing. After swapping to the prod index server, I think it mostly works. It got pretty far, and eventually failed here
This looks quite similar to a different open issue, and I think that package is several hundred megabytes. I'm unsure why pip doesn't time out, though, and whether some large-wheel optimization or laxer timeout is needed.
## Summary We have at least one reported case of this happening. It's preferable IMO to move on rather than fail hard despite sub-par registry behavior. Closes #2099.
Oh sorry, this just got closed since I merged the change that fixed the HEAD requests for you. Do you want to file anything else as separate issues?
I think closing it is fine. You've resolved the main issue, and the one leftover problem already has an open issue. My guess is that any massive wheel (500 MB) may run into timeouts. Thank you very much for debugging all of this with me. I understand that company-internal indices are a challenge to reproduce and that it's hard to see where the issue lies.
Thank you! It's really hard to debug internal indices but I'm grateful when people can be so actively involved in the process. The way I think of it is: if you experienced this, odds are someone else will too, so it's a great use of time whenever we can find a source of unreliability like the 404-on-HEAD thing here.
Hi, I'm looking forward to trying uv. At work we use a custom index server for security, built on pypicloud. When I try to install an internal library with --index-url, I receive this error message:
I clipped the fragment, but it's exactly a 64-character ASCII string. My exact command was,
The same command using pip works. The verbose logs are (clipping out some company specific names),
I'll try to ask the security team that maintains the custom index if there's any more information I can provide. Hopefully this reproduces with any pypicloud index. Unsure if relevant, but I also know files are hosted in GCS. The error does seem to imply it's able to fetch some HTML response.
I installed uv today; uv --version shows uv 0.1.13 (9ce5170e6 2024-02-29).