SparseML Compression Pt 1: saving w/compression configs #2177

Satrat · 2024-03-12T20:02:54Z

This initial PR for safetensors compression sets up the CompressionConfig and ModelCompressor registries and implements bitmask compression on save. See the corresponding internal docs PR for design details.
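As a rough illustration of the registry idea described above, the sketch below pairs a config class with a compressor class under a shared string key. It is not the code from this PR: the `Registry` helper, the `*Sketch` class names, and the `"sparse_bitmask"` key (borrowed from the file names in this PR) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict


class Registry:
    """Hypothetical name -> class registry; not the SparseML implementation."""

    def __init__(self) -> None:
        self._entries: Dict[str, type] = {}

    def register(self, name: str):
        """Return a class decorator that stores the class under `name`."""

        def decorator(cls: type) -> type:
            if name in self._entries:
                raise ValueError(f"{name!r} is already registered")
            self._entries[name] = cls
            return cls

        return decorator

    def create(self, name: str, **kwargs):
        """Instantiate the class registered under `name`."""
        if name not in self._entries:
            raise ValueError(f"unknown registry entry: {name!r}")
        return self._entries[name](**kwargs)


# One registry for configs, one for compressors (both hypothetical).
CONFIGS = Registry()
COMPRESSORS = Registry()


@CONFIGS.register("sparse_bitmask")
@dataclass
class BitmaskConfigSketch:
    # The format string doubles as the lookup key for the compressor.
    format: str = "sparse_bitmask"


@COMPRESSORS.register("sparse_bitmask")
class BitmaskCompressorSketch:
    def __init__(self, config: BitmaskConfigSketch) -> None:
        self.config = config


# Look up the compressor that matches a given config.
config = CONFIGS.create("sparse_bitmask")
compressor = COMPRESSORS.create(config.format, config=config)
```

Keying both registries on the same format string means a saved config alone would be enough to pick the matching compressor, which is presumably what the follow-up loading PRs rely on.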

To be implemented in follow-up PRs:

bitmask decompression
saving/loading/inferring sparsity config from model
SparseAutoModel load/save interface

Example

from sparseml.transformers import SparseAutoModelForCausalLM
from sparseml.transformers.compression import BitmaskConfig, BitmaskCompressor
from safetensors import safe_open
import os

MODEL_PATH = "zoo:llama2-7b-gsm8k_llama2_pretrain-pruned50.oneshot"
OUTPUT_PATH = "./test_compress_output"

model = SparseAutoModelForCausalLM.from_pretrained(MODEL_PATH)

sparsity_config = BitmaskConfig()
compressor = BitmaskCompressor(config=sparsity_config)

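# Compress the dense state dict into its bitmask representation
model_state_dict = model.state_dict()
sparse_state_dict = compressor.compress(model_state_dict)

# Save the compressed tensors in safetensors format
model.save_pretrained(OUTPUT_PATH, safe_serialization=True, state_dict=sparse_state_dict)

# Inspect the compressed entries stored for a single parameter
safetensors_path = os.path.join(OUTPUT_PATH, "model-00001-of-00002.safetensors")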
with safe_open(safetensors_path, framework="pt", device=0) as f:
    test_name = "model.layers.4.self_attn.k_proj.weight"
    bitmask = f.get_tensor(test_name + ".bitmask")
    shape = f.get_tensor(test_name + ".shape")
    values = f.get_tensor(test_name + ".compressed")
    row_offsets = f.get_tensor(test_name + ".row_offsets")
    print(f"bitmask: {bitmask}")
    print(f"shape: {shape}")
    print(f"values: {values}")
    print(f"row offsets: {row_offsets}")
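The example above reads four entries per parameter: `.bitmask`, `.shape`, `.compressed`, and `.row_offsets`. The sketch below shows one plausible way such entries could be produced and inverted for a 2-D weight. It uses NumPy rather than PyTorch to stay self-contained, and the function names, the little-endian bit order, and the exact meaning of `row_offsets` (index of each row's first nonzero within the value array) are assumptions, not details confirmed by this PR.

```python
import numpy as np


def bitmask_compress(dense: np.ndarray) -> dict:
    """Sketch: split a 2-D array into nonzero values plus a packed bitmask."""
    mask = dense != 0
    row_counts = mask.sum(axis=-1)
    # Offset of each row's first nonzero inside the flat value array.
    row_offsets = np.cumsum(row_counts) - row_counts
    return {
        "compressed": dense[mask],  # nonzero values in row-major order
        "bitmask": np.packbits(mask, axis=-1, bitorder="little"),
        "row_offsets": row_offsets,
        "shape": np.array(dense.shape, dtype=np.int64),
    }


def bitmask_decompress(parts: dict) -> np.ndarray:
    """Sketch: rebuild the dense array from the entries produced above."""
    rows, cols = (int(d) for d in parts["shape"])
    # `count=cols` drops the padding bits added when packing each row.
    mask = np.unpackbits(
        parts["bitmask"], axis=-1, count=cols, bitorder="little"
    ).astype(bool)
    dense = np.zeros((rows, cols), dtype=parts["compressed"].dtype)
    dense[mask] = parts["compressed"]
    return dense


weight = np.array(
    [[0.0, 1.5, 0.0, 0.0],
     [2.0, 0.0, 0.0, -3.0]],
    dtype=np.float32,
)
parts = bitmask_compress(weight)
restored = bitmask_decompress(parts)
print(parts["bitmask"].tolist())      # [[2], [9]]
print(parts["row_offsets"].tolist())  # [0, 1]
print(np.array_equal(restored, weight))  # True
```

With one mask bit per element, this layout only saves space once enough entries are zero to offset the mask itself, so it suits pruned checkpoints like the 50% sparse model in the example.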

dbogunowicz

A couple of nitpicks. All in all, great job; definitely well done on the bite-sized scope of this PR.

setup.py

src/sparseml/transformers/compression/compressors/sparse_bitmask.py

src/sparseml/transformers/compression/config/sparse_bitmask.py

tests/sparseml/transformers/compression/test_bitmask.py

src/sparseml/transformers/compression/config/sparse_bitmask.py

mgoin

Nice job Sara, this looks good to me for saving. Just some comments on state dict conflicts and tensor devices.

src/sparseml/transformers/compression/compressors/sparse_bitmask.py

src/sparseml/utils/pytorch/utils.py

src/sparseml/pytorch/model_load/helpers.py

src/sparseml/transformers/__init__.py

src/sparseml/transformers/compression/README.md

dbogunowicz

🎅

Sara Adkins added 7 commits March 12, 2024 14:57

initial classes

45a16ed

WIP

a7cee23

compression working

e1549e8

unit tests and README

92ba386

docstrings

f061d78

README and fix test

40a75a9

add bitmask source

522813c

Satrat requested review from rahul-tuli, dbogunowicz, mgoin, horheynm, bfineran and dsikka March 12, 2024 20:02

Merge branch 'main' into tensor_compression

c6d0b4d

dbogunowicz previously approved these changes Mar 13, 2024

setup.py (resolved)

src/sparseml/transformers/compression/compressors/sparse_bitmask.py (outdated, resolved)

src/sparseml/transformers/compression/config/sparse_bitmask.py (outdated, resolved)

mgoin reviewed Mar 13, 2024

tests/sparseml/transformers/compression/test_bitmask.py (resolved)

src/sparseml/transformers/compression/config/sparse_bitmask.py (outdated, resolved)

cleanup

d2a8a78

Satrat dismissed dbogunowicz’s stale review via d2a8a78 March 15, 2024 14:41

Sara Adkins added 6 commits March 15, 2024 14:43

dtype tests

1096700

Merge branch 'main' into tensor_compression

1749b28

oops fix test

013d17b

Merge branch 'tensor_compression' of github.com:neuralmagic/sparseml into tensor_compression

813c8e7

tests

41223bb

add bfloat16

2c6eeba

Satrat requested review from mgoin and dbogunowicz March 15, 2024 15:08

Satrat mentioned this pull request Mar 15, 2024

SparseML Compression Pt 2: Load compressed weights #2184

Merged

bfineran previously approved these changes Mar 18, 2024

mgoin reviewed Mar 18, 2024

src/sparseml/transformers/compression/compressors/sparse_bitmask.py (outdated, resolved)

src/sparseml/transformers/compression/compressors/sparse_bitmask.py (resolved)

warn on conflicts, store device

e369710

Satrat dismissed bfineran’s stale review via e369710 March 19, 2024 14:49

Merge branch 'main' into tensor_compression

e7fb048

Satrat requested review from bfineran and mgoin March 19, 2024 14:50

mgoin reviewed Mar 19, 2024

src/sparseml/utils/pytorch/utils.py (outdated, resolved)

remove unneeded file

aee4575

Satrat requested a review from mgoin March 20, 2024 01:55

Satrat mentioned this pull request Mar 20, 2024

SparseML Compression Pt 3: SparseAutoModel interface & inferring params #2190

Merged

dbogunowicz reviewed Mar 20, 2024

src/sparseml/pytorch/model_load/helpers.py (outdated, resolved)

dbogunowicz reviewed Mar 20, 2024

src/sparseml/transformers/__init__.py (resolved)

dbogunowicz reviewed Mar 20, 2024

src/sparseml/transformers/compression/README.md (outdated, resolved)

dbogunowicz previously approved these changes Mar 20, 2024

Satrat dismissed dbogunowicz’s stale review via 1ba94ed March 20, 2024 15:31

Sara Adkins and others added 2 commits March 20, 2024 11:31

Update src/sparseml/pytorch/model_load/helpers.py

1ba94ed

Co-authored-by: dbogunowicz <[email protected]>

Update README.md

e5e1215

Satrat requested a review from dbogunowicz March 20, 2024 15:32

Merge branch 'main' into tensor_compression

3958525

dbogunowicz approved these changes Mar 20, 2024

mgoin approved these changes Mar 20, 2024

Merge branch 'main' into tensor_compression

6a303a1

bfineran approved these changes Mar 20, 2024

Satrat merged commit dead8b5 into main Mar 20, 2024
13 of 14 checks passed

Satrat deleted the tensor_compression branch March 20, 2024 17:13
