Stars
Gate-level simulator for efficient hardware-software co-design.
SGLang is a fast serving framework for large language models and vision language models.
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
Recipes to scale inference-time compute of open models
Fast Hadamard transform in CUDA, with a PyTorch interface
Multi-platform nightly builds of open source digital design and verification tools
The PULP Ara is a 64-bit vector unit compatible with the RISC-V Vector Extension Version 1.0, working as a coprocessor to CORE-V's CVA6 core
PyTorch implementation of Bayesian neural networks [torchbnn]
Model zoo for the Quantized ONNX (QONNX) model format
A framework for few-shot evaluation of language models.
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
An Open Workflow to Build Custom SoCs and run Deep Models at the Edge
An Industrial Grade Federated Learning Framework
Tools for diffing and merging of Jupyter notebooks.
fabianandresgrob / brevitas
Forked from Xilinx/brevitas. Brevitas: neural network quantization in PyTorch
Algorithmic solutions to optimize inference for convolution-based image upsampling. Coded for clarity, not speed.
A retargetable MLIR-based machine learning compiler and runtime toolkit.
PyTorch implementation of data structures and modules from InstantNGP
Training PyTorch models with differential privacy
Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
Voltron: Language-Driven Representation Learning for Robotics
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX