LLM KV cache compression made easy

Python 384 26 Updated Feb 13, 2025

Gate-level simulator for efficient hardware-software co-design.

Rust 7 2 Updated Jan 28, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 9,476 904 Updated Feb 14, 2025

Fast Matrix Multiplications for Lookup Table-Quantized LLMs

C++ 227 8 Updated Feb 13, 2025

Recipes to scale inference-time compute of open models

Python 992 94 Updated Jan 16, 2025

Fast Hadamard transform in CUDA, with a PyTorch interface

C 141 21 Updated May 24, 2024

The Fastest Deep Reinforcement Learning Library

C++ 745 28 Updated Dec 20, 2024

Multi-platform nightly builds of open source digital design and verification tools

Shell 950 87 Updated Feb 13, 2025

The PULP Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 1.0, working as a coprocessor to CORE-V's CVA6 core

C 395 134 Updated Feb 13, 2025

PyTorch implementation of Bayesian neural networks [torchbnn]

Python 509 80 Updated Jul 25, 2024

Model zoo for the Quantized ONNX (QONNX) model format

11 3 Updated Feb 9, 2025

A framework for few-shot evaluation of language models.

Python 7,777 2,090 Updated Feb 13, 2025

ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference

C++ 88 19 Updated Feb 10, 2025

LLM training in simple, raw C/CUDA

Cuda 25,523 2,931 Updated Oct 2, 2024

An Open Workflow to Build Custom SoCs and run Deep Models at the Edge

SystemVerilog 72 10 Updated Feb 7, 2025

An Industrial Grade Federated Learning Framework

Python 5,802 1,559 Updated Nov 19, 2024

Tools for diffing and merging of Jupyter notebooks.

TypeScript 2,697 161 Updated Sep 21, 2024

Brevitas: neural network quantization in PyTorch

Python 1 Updated Jul 10, 2024

LLVM-Based Pipeline Compiler

C++ 174 115 Updated Feb 7, 2025

Algorithmic solutions to optimize inference for convolution-based image upsampling. Coded for clarity, not speed.

Jupyter Notebook 10 2 Updated Aug 26, 2022
Python 1,023 95 Updated Jan 4, 2024
C++ 307 87 Updated Dec 20, 2024
Python 284 27 Updated Feb 13, 2025

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 2,983 658 Updated Feb 13, 2025

PyTorch implementation of data structures and modules from InstantNGP

Python 27 3 Updated Jul 26, 2022

Training PyTorch models with differential privacy

Jupyter Notebook 1,750 358 Updated Feb 12, 2025

Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"

Python 444 103 Updated Feb 6, 2025

Voltron: Language-Driven Representation Learning for Robotics

Python 217 23 Updated Jul 9, 2023

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX

Python 135 40 Updated Feb 13, 2025