Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove constraint that zip64 field values be greater than 0 #319

Open
wants to merge 1 commit into
base: development
Choose a base branch
from

Conversation

JetForMe
Copy link

Fixes #318

Changes proposed in this PR

In ZIP64 archives, certain standard values (like compressed and uncompressed file sizes, and offsets within the archive) are set to 0xFFFFFFFF, and actual values are provided in the ZIP64 extra fields. The code was requiring these values be > 0. If a value was 0, the non-ZIP64 value was returned (0xFFFFFFFF), and thus almost guaranteed to be incorrect.

This patch removes that test for compressedSize, uncompressedSize, and relativeOffsetOfLocalHeader. While it’s unlikely for a file to have zero length, it’s not impossible, and it’s very likely for a local header to be at offset 0.

Tests performed

I tested this by seeing if I could iterate over the entries of the file in the referenced bug. Before, the iterator would immediately return nothing, now it returns the expected three entries. I didn’t want to add the referenced file to the repo, and wasn’t sure of the best way to add this test.

Further info for the reviewer

None.

Open Issues

None.

In ZIP64 archives, certain standard field values (like file sizes and offsets within the archive) are set to 0xFFFFFFFF, and actual values are provided in the ZIP64 extra fields. The code was requiring these values be > 0. If a value was 0, the non-ZIP64 value was returned (0xFFFFFFFF), and thus almost guaranteed to be incorrect.

This patch removes that test for `compressedSize`, `uncompressedSize`, and `relativeOffsetOfLocalHeader`. While it’s unlikely for a file to have zero length, it’s not impossible, and it’s very likely for a local header to be at offset 0.
@weichsel
Copy link
Owner

weichsel commented May 2, 2024

Thanks for looking into this issue.

There seem to be archives that require those size > 0 safe guards though (e.g. testExtractCompressedZIP64Entries fails for me). I haven't looked into the details, but simply removing those constraints won't work.

@JetForMe
Copy link
Author

JetForMe commented May 12, 2024

There seem to be archives that require those size > 0 safe guards though (e.g. testExtractCompressedZIP64Entries fails for me). I haven't looked into the details, but simply removing those constraints won't work.

How do you create testExtractCompressedZIP64Entries.zip? It doesn't look like a valid ZIP64 file. macOS can unzip it, but testing the archive reveals issues:

% unzip -vT testExtractCompressedZIP64Entries.zip

note:  didn't find end-of-central-dir signature at end of central dir.

Looking at it in HexEdit, I see the EOCD (PKWare App Note section 4.3.16):

Field Value Notes
Signature 50 4B 05 06 Correct
Number of this disk 00 00 Should be 0xFFFF for ZIP64
Number of disk with start of CD 00 00 Should be 0xFFFF for ZIP64
Total number of entries in CD on this disk 01 00 Should be 0xFFFF for ZIP64
Total number of entries in CD 01 00 Should be 0xFFFF for ZIP64
Size of CD 77 00 00 00 Should be 0xFFFFFFFF for ZIP64
Offset of CD FF FF FF FF Correct
Comment length 00 00 Correct

The Zip64 end of central directory locator (4.3.15) appears just before this:

Field Value Notes
Signature 50 4B 06 07 Correct
Disk with start of EOCD64 00 00 00 00 Correct
Start of EOCD64 DA 01 00 00 00 00 00 00 (474) Correct
Total number of disks 01 00 00 00 Correct

The EOCDR64 (4.3.14) is:

Field Value Notes
Signature 50 4B 06 06 Correct
Size of EOCDR64 2C 00 00 00 00 00 00 00 (44) Correct
Version made by 1E 03
Version required 2D 00
Number of this disk 00 00 00 00 Correct
Number of disk with start of CD 00 00 00 00 Correct
Total number of entries this disk 01 00 00 00 00 00 00 00 Correct
Total number of entries 01 00 00 00 00 00 00 00 Correct
Size of the CD 77 00 00 00 00 00 00 00 (119) Correct
Offset to start of CD 63 01 00 00 00 00 00 00 (355) Correct
Extensible data None

The CD (4.3.12) is:

Field Value Notes
Signature 50 4B 01 02
Version made by 1E 03
Version required 2D 00
Flags 00 00
Compression Method 08 00
Last mod time 10 75
Last mod date A6 52
CRC-32 B7 9B 29 DC
Compressed size F0 00 00 00 (240) Should be 0xFFFFFFFF for ZIP64
Uncompressed size FF FF FF FF Correct
File name length 25 00 (37)
Extra field length 24 00 (36)
File comment length 00 00
Disk number start 00 00
Internal file attrs 00 00
External file attrs 00 00 A4 81
Local header rel offset 00 00 00 00
Filename 74 65 73 74 45 78 74 72 61 63 74 43 6F 6D 70 72 65 73 73 65 64 5A 49 50 36 34 45 6E 74 72 69 65 73 2E 70 6E 67
Extra field 55 54 05 00 03 4F 81 93 60 75 78 0B 00 01 04 F6 01 00 00 04 14 00 00 00 01 00 08 00 D0 03 00 00 00 00 00 00
File comment None

The extra field breaks down as:

Field Value Notes
Header ID 55 54 Extended timestamp
Data Size 05 00 (5)
Data 1 03 4F 81 93 60
Header ID 2 75 78 Unix UID/GID info
Data size 2 0B 00 (11)
Data 2 01 04 F6 01 00 00 04 14 00 00 00
Header ID 3 01 00 Zip64 extra info
Data Size 3 08 00 (8)
Uncompressed size D0 03 00 00 00 00 00 00 (976) Correct. But there should be more fields for this extra field (compressed size, offset, disk number)

I'm going to use this information to see if I can't refine how my patch gets the values.

@JetForMe JetForMe closed this May 12, 2024
@JetForMe JetForMe reopened this May 12, 2024
@weichsel
Copy link
Owner

weichsel commented Jan 6, 2025

How do you create testExtractCompressedZIP64Entries.zip? It doesn't look like a valid ZIP64 file. macOS can unzip it, but testing the archive reveals issues:

The ZIP64 support was provided by @Ckitakishi . It seems that the sheer existence of ZIP64 extended information is enough to treat the archive as ZIP64 (taking precedence over formally setting versionNeededToExtract).
@Ckitakishi:

  • Was this relaxation needed for archives created by other libraries/apps? Is this also the reason for the additional size check?
  • Do you recall how the test asset was created?

@JetForMe:
One thing I noticed with the archive you attached to the referenced issue:
When printing ZIP contents via zipinfo -v GO-M8010-6-P1.3mf, I get the following output for the last 2 entries:
There are an extra -8 bytes preceding this file. Very likely that we can't deal with that (and not sure if we should 🙈).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When I try to iterate over the entries of the attached file, nothing happens
2 participants