[NEWS.2024/04/29] Our paper is released!
[NEWS.2024/05/02] 🎉🎉🎉 Congratulations to Vision Mamba on being accepted to ICML 2024.
📢NOTE: If you have any questions, please don't hesitate to contact us at any of the following emails: [email protected], [email protected], [email protected].
Mamba, a novel state space model, has gained recognition across diverse domains for its strong performance and its computational efficiency (its cost scales linearly with sequence length). By addressing limitations of traditional visual foundation architectures, Mamba is a promising candidate for advancing computer vision.
⭐ This repository hosts a curated collection of literature associated with Mamba models in computer vision. Feel free to star and fork. For further details, refer to the following paper:
A Survey on Vision Mamba: Models, Applications and Challenges
Rui Xu, Shu Yang, Yihui Wang, Bo Du, Hao Chen
SMART Lab, The Hong Kong University of Science and Technology
If you find this repository useful, please cite our paper:
@misc{2024vision_mamba,
title={A Survey on Vision Mamba: Models, Applications and Challenges},
author={Rui Xu and Shu Yang and Yihui Wang and Bo Du and Hao Chen},
year={2024},
eprint={2404.18861},
archivePrefix={arXiv},
primaryClass={}
}
![image](https://private-user-images.githubusercontent.com/57466105/333389971-5d8ad736-d978-4bc7-a714-779f65bba661.png)
Detailed Performance Comparison
Date | Paper | Figure | Link | Code |
---|---|---|---|---|
Arxiv 24.01.17 (ICML24) | Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | ![]() | Link | Code |
Arxiv 24.01.18 | VMamba: Visual State Space Model | ![]() ![]() | Link | Code |
Arxiv 24.02.08 | Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data | ![]() | Link | Code |
Arxiv 24.03.14 | LocalMamba: Visual State Space Model with Windowed Selective Scan | ![]() | Link | Code |
Arxiv 24.03.15 | EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba | ![]() | Link | Code |
Arxiv 24.03.22 | SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series | ![]() | Link | Code |
Arxiv 24.03.26 | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | ![]() | Link | Code |
Arxiv 24.05.23 | Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model | ![]() ![]() | Link | Code |
Arxiv 24.05.23 | Scalable Visual State Space Model with Fractal Scanning | ![]() | Link | |
Arxiv 24.05.23 | Mamba-R: Vision Mamba ALSO Needs Registers | ![]() | Link | Code |
Arxiv 24.05.29 | Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain | ![]() | Link | Code |
Arxiv 24.06.11 | Autoregressive Pretraining with Mamba in Vision | ![]() ![]() | Link | Code |
Date | Paper | Link |
---|---|---|
Arxiv 24.04.15 | State Space Model for New-Generation Network Alternative to Transformers: A Survey | Link |
Arxiv 24.04.24 | A Survey on Visual Mamba | Link |
Arxiv 24.04.24 | Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges | Link |
Arxiv 24.05.07 | Vision Mamba: A Comprehensive Survey and Taxonomy | Link |
Arxiv 24.06.05 | Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis | Link |
Date | Paper | Figure | Link | Code | Task |
---|---|---|---|---|---|
Arxiv 24.02.06 | U-shaped Vision Mamba for Single Image Dehazing | ![]() | Link | Code | Dehazing/Low Light Enhancement/Deraining |
Arxiv 24.02.08 | Scalable Diffusion Models with State Space Backbone | ![]() | Link | Code | Image Generation |
Arxiv 24.02.23 | MambaIR: A Simple Baseline for Image Restoration with State-Space Model | ![]() | Link | Code | Super-resolution/Denoising |
Arxiv 24.03.04 | MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection | ![]() | Link | Code | Infrared Image Segmentation |
Arxiv 24.03.13 | Activating Wider Areas in Image Super-Resolution | ![]() | Link | | Super-resolution |
Arxiv 24.03.18 | VmambaIR: Visual State Space Model for Image Restoration | ![]() | Link | Code | Image Restoration |
Arxiv 24.03.20 | ZigMa: A DiT-style Zigzag Mamba Diffusion Model | ![]() | Link | Code | Generation |
Arxiv 24.03.27 | Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction | ![]() | Link | | 3D Reconstruction |
Arxiv 24.03.29 | Learning Enriched Features via Selective State Spaces Model for Efficient Image Deblurring | ![]() | Link | | Image Deblurring |
Arxiv 24.04.04 | InsectMamba: Insect Pest Classification with State Space Model | ![]() | Link | | Image Classification |
Arxiv 24.04.09 | MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection | ![]() | Link | Code | Anomaly Detection |
Arxiv 24.04.11 | DGMamba: Domain Generalization via Generalized State Space Model | ![]() | Link | Code | Domain Generalization |
Arxiv 24.04.15 | FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining | ![]() | Link | | Deraining |
Arxiv 24.04.17 | CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration | ![]() | Link | | Denoising/Deblurring |
Arxiv 24.04.22 | MambaUIE: Unraveling the Ocean's Secrets with Only 2.8 FLOPs | ![]() | Link | Code | Image Enhancement |
Arxiv 24.05.03 | FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space | ![]() | Link | Code | Emotion recognition & Facial Expression Recognition & Detection |
Arxiv 24.05.05 | DVMSR: Distillated Vision Mamba for Efficient Super-Resolution | ![]() | Link | Code | Super-Resolution |
Arxiv 24.05.05 | SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion | ![]() | Link | | Motion Style Transfer |
Arxiv 24.05.06 | Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement | ![]() | Link | Code | Image Enhancement |
Arxiv 24.05.07 | VMambaCC: A Visual State Space Model for Crowd Counting | ![]() | Link | | Crowd Counting |
Arxiv 24.05.14 | WaterMamba: Visual State Space Model for Underwater Image Enhancement | ![]() | Link | | Image Enhancement |
Arxiv 24.05.16 | IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model | ![]() | Link | Code | Infrared Image Super-resolution |
Arxiv 24.05.23 | Efficient Visual State Space Model for Image Deblurring | ![]() | Link | Code | Image Deblurring |
Arxiv 24.05.23 | DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis | ![]() | Link | Code | Generation |
Arxiv 24.05.25 | Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation | ![]() | Link | | Generation |
Arxiv 24.05.26 | Image Deraining with Frequency-Enhanced State Space Model | ![]() | Link | | Image Deraining |
Arxiv 24.05.28 | MambaVC: Learned Visual Compression with Selective State Spaces | ![]() | Link | Code | Visual Compression |
Arxiv 24.05.29 | FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining | ![]() | Link | | Image Deraining |
Arxiv 24.06.03 | LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network | ![]() | Link | | Low-Light Enhancement |
Arxiv 24.06.06 | MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation | ![]() | Link | | Depth Estimation |
Arxiv 24.06.09 | Mamba YOLO: SSMs-Based YOLO For Object Detection | ![]() | Link | Code | Object Detection |
Date | Paper | Figure | Link | Code | Task |
---|---|---|---|---|---|
Arxiv 24.02.19 | Pan-Mamba: Effective pan-sharpening with State Space Model | ![]() | Link | Code | Pan-sharpening |
Arxiv 24.03.28 | RSMamba: Remote Sensing Image Classification with State Space Model | ![]() | Link | Code | Remote Sensing Images Classification |
Arxiv 24.04.02 | Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model | ![]() | Link | Code | Semantic Segmentation |
Arxiv 24.04.03 | RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation | ![]() | Link | Code | Semantic Segmentation |
Arxiv 24.04.03 | RS-Mamba for Large Remote Sensing Image Dense Prediction | ![]() | Link | Code | Semantic Segmentation/Change Detection |
Arxiv 24.04.04 | ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model | ![]() | Link | Code | Change Detection/Building Damage Assessment |
Arxiv 24.04.12 | SpectralMamba: Efficient Mamba for Hyperspectral Image Classification | ![]() | Link | Code | Hyperspectral Image Classification |
Arxiv 24.04.15 | HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising | ![]() | Link | | Hyperspectral Denoising |
Arxiv 24.04.28 | S2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification | ![]() | Link | Code | Hyperspectral Image Classification |
Arxiv 24.04.29 | Spectral-Spatial Mamba for Hyperspectral Image Classification | ![]() | Link | | Hyperspectral Image Classification |
Arxiv 24.05.02 | SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients | ![]() | Link | Code | Detection |
Arxiv 24.05.02 | SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising | ![]() | Link | Code | Hyperspectral Image Denoising |
Arxiv 24.05.08 | Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution | ![]() | Link | | Super Resolution |
Arxiv 24.05.13 | GMSR:Gradient-Guided Mamba for Spectral Reconstruction from RGB Images | ![]() | Link | Code | Spectral Reconstruction from RGB Images |
Arxiv 24.05.14 | Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study | ![]() | Link | | Semantic Segmentation |
Arxiv 24.05.16 | RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing | ![]() | Link | | Dehazing |
Arxiv 24.05.17 | CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation | ![]() | Link | Code | Semantic Segmentation |
Arxiv 24.05.20 | Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification | ![]() | Link | Code | Hyperspectral Image Classification |
Arxiv 24.05.21 | 3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification | ![]() | Link | | Hyperspectral Image Classification |
Arxiv 24.06.06 | CDMamba: Remote Sensing Image Change Detection with Mamba | ![]() | Link | Code | Change Detection |
Arxiv 24.06.09 | HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model | ![]() | Link | Code | Dehazing |
Arxiv 24.06.11 | DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification | ![]() | Link | | Hyperspectral Image Classification |
Arxiv 24.06.01 | Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging | ![]() | Link | Code | Spectral Compressive Imaging |
Date | Paper | Figure | Link | Code | Task |
---|---|---|---|---|---|
Arxiv 24.01.09 | U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation/ 3D Medical Segmentation |
Arxiv 24.01.24 | SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation | ![]() | Link | Code | 3D Medical Segmentation |
Arxiv 24.02.04 | VM-UNet: Vision Mamba UNet for Medical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.02.05 | nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model | ![]() | Link | Code | 3D Medical Segmentation |
Arxiv 24.02.05 | Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.02.07 | Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.02.09 | FD-Vision Mamba for Endoscopic Exposure Correction | ![]() | Link | Code | Endoscopic Exposure Correction |
Arxiv 24.02.11 | Semi-Mamba-UNet: Pixel-Level Contrastive Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.02.13 | P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation | ![]() | Link | | 2D Medical Segmentation |
Arxiv 24.02.16 | Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.02.28 | MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation | ![]() | Link | Code | Medical Image Reconstruction/Uncertainty Estimation |
Arxiv 24.03.06 | MedMamba: Vision Mamba for Medical Image Classification | ![]() | Link | Code | 2D Medical Classification |
Arxiv 24.03.08 | LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation/ 3D Medical Segmentation |
Arxiv 24.03.08 | MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models | ![]() | Link | | Cancer Subtyping |
Arxiv 24.03.11 (MICCAI24) | MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology | ![]() | Link | Code | Cancer Subtyping/ Survival Prediction |
Arxiv 24.03.12 | Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention | ![]() | Link | Code | 2D Medical Segmentation/ 3D Medical Segmentation |
Arxiv 24.03.13 | MD-Dose: A Diffusion Model based on the Mamba for Radiotherapy Dose Prediction | ![]() | Link | Code | Radiation Dose Prediction (Segmentation) |
Arxiv 24.03.14 | VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.03.20 | H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.03.20 | ProMamba: Prompt-Mamba for polyp segmentation | ![]() | Link | | 2D Medical Segmentation |
Arxiv 24.03.25 | CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification | ![]() | Link | | Alzheimer’s disease Classification (CT/MRI) |
Arxiv 24.03.26 | Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion | ![]() | Link | | 2D Medical Segmentation (2D MRI) |
Arxiv 24.03.26 | Serpent: Scalable and Efficient Image Restoration via Multi-scale Structured State Space Models | ![]() | Link | | Image Restoration |
Arxiv 24.03.26 | Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation | ![]() | Link | | 2D Medical Segmentation |
Arxiv 24.03.29 | UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation | ![]() | Link | Code | 2D Medical Segmentation |
Arxiv 24.04.01 | T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation | ![]() | Link | Code | 3D Medical Segmentation (Tooth) |
Arxiv 24.04.10 | ViM-UNet: Vision Mamba for Biomedical Segmentation | ![]() | Link | Code | 2D Medical Segmentation (Cell/Neurite) |
Arxiv 24.04.19 | Vim4Path: Self-Supervised Vision Mamba for Histopathology Images | ![]() | Link | Code | Cancer Subtyping |
Arxiv 24.04.26 | Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment | ![]() | Link | | Universal Lesion Segmentation |
Arxiv 24.04.26 | Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model | ![]() | Link | | ODT Sparse Reconstruction |
Arxiv 24.05.05 | AC-MAMBASEG: An adaptive convolution and Mamba-based architecture for enhanced skin lesion segmentation | ![]() | Link | Code | Skin Lesion Segmentation |
Arxiv 24.05.08 | HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation | ![]() | Link | | 2D Medical Segmentation |
Arxiv 24.05.09 | VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis | ![]() | Link | | Medical Image Generation |
Arxiv 24.05.24 | MUCM-Net: A Mamba Powered UCM-Net for Skin Lesion Segmentation | ![]() | Link | Code | Medical Image Segmentation |
Arxiv 24.05.25 | UU-Mamba: Uncertainty-aware U-Mamba for Cardiac Image Segmentation | ![]() | Link | | Medical Image Segmentation |
Arxiv 24.05.27 | TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction | ![]() | Link | Code | Pre-training/Medical Image Segmentation |
Arxiv 24.05.27 | Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba | ![]() | Link | | Medical Image Reconstruction |
Arxiv 24.05.28 | Cardiovascular Disease Detection from Multi-View Chest X-rays with BI-Mamba | ![]() | Link | Code | CVD Risk Prediction |
Arxiv 24.06.01 | SAM-VMNet: Deep Neural Networks For Coronary Angiography Vessel Segmentation | ![]() | Link | | Medical Image Segmentation |
Arxiv 24.06.05 | Combining Graph Neural Network and Mamba to Capture Local and Global Tissue Spatial Relationships in Whole Slide Images | ![]() | Link | Code | Cancer Subtyping/Survival Prediction |
Arxiv 24.06.09 | Vision Mamba: Cutting-Edge Classification of Alzheimer's Disease with 3D MRI Scans | ![]() | Link | | 3D Medical Classification |
Arxiv 24.06.09 | Convolution and Attention-Free Mamba-based Cardiac Image Segmentation | ![]() | Link | Code | Medical Image Segmentation |
Arxiv 24.06.10 | MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba | ![]() | Link | Code | Medical Image Segmentation |
Date | Paper | Figure | Link | Code | Task |
---|---|---|---|---|---|
Arxiv 24.01.25 | Vivim: a Video Vision Mamba for Medical Video Object Segmentation | ![]() | Link | Code | Medical Video Segmentation |
Arxiv 24.03.11 | VideoMamba: State Space Model for Efficient Video Understanding | ![]() | Link | Code | Action Recognition/Video Understanding/Text-to-video Retrieval |
Arxiv 24.03.12 | SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces | ![]() | Link | Code | Video Generation |
Arxiv 24.03.14 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding | ![]() | Link | Code | Action Recognition/Action Localization/... |
Arxiv 24.04.09 | RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos | ![]() | Link | Code | Remote photoplethysmography Prediction |
Arxiv 24.04.11 | Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos | ![]() | Link | | Skeleton Action Recognition |
Arxiv 24.05.05 | Matten: Video Generation with Mamba-Attention | ![]() | Link | | Video Generation |
Arxiv 24.05.30 | DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark | ![]() | Link | Code | AI-Generated Video Detection |
Date | Paper | Figure | Link | Code | Task |
---|---|---|---|---|---|
Arxiv 24.02.16 | PointMamba: A Simple State Space Model for Point Cloud Analysis | ![]() | Link | Code | Classification, Part Segmentation |
Arxiv 24.03.01 | Point Cloud Mamba: Point Cloud Learning via State Space Model | ![]() | Link | Code | Classification, Part Segmentation, Semantic Segmentation |
Arxiv 24.03.11 | Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy | ![]() | Link | Code | Classification, Semantic Segmentation |
Arxiv 24.04.08 | 3DMambaIPF: A State Space Model for Iterative Point Cloud Filtering via Differentiable Rendering | ![]() | Link | | Point Cloud Filtering |
Arxiv 24.04.10 | 3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion | ![]() | Link | | Point Cloud Completion |
Arxiv 24.04.23 | Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model | ![]() | Link | | Classification, Part Segmentation |
Arxiv 24.05.09 | Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba | ![]() | Link | | Classification, Regression |
Arxiv 24.05.13 | OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition | ![]() | Link | Code | LiDAR Place Recognition |
Arxiv 24.05.23 | MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models | ![]() | Link | | Point Cloud Video Understanding |
Arxiv 24.05.24 | PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis | ![]() | Link | Code | Classification, Part Segmentation |
Arxiv 24.05.27 | LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling | ![]() | Link | | Classification, Part Segmentation, Object Detection |
Arxiv 24.06.07 | Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs | ![]() | Link | Code | Generation |
Arxiv 24.06.10 | PointABM: Integrating Bidirectional State Space Model with Multi-Head Self-Attention for Point Cloud Analysis | ![]() | Link | | Classification |
Date | Paper | Figure | Link | Code | Task | Modality |
---|---|---|---|---|---|---|
Arxiv 24.01.25 | MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration | ![]() | Link | Code | Registration | MRI & CT |
Arxiv 24.03.07 | InstructGIE: Towards Generalizable Image Editing | ![]() | Link | | Image Editing | Image & Text |
Arxiv 24.03.12 | Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM | ![]() | Link | Code | Text-to-Motion Generation | Motion & Text |
Arxiv 24.03.20 | VL-Mamba: Exploring State Space Models for Multimodal Learning | ![]() | Link | Code | MLLM tasks | Image & Text |
Arxiv 24.03.21 | Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | ![]() | Link | Code | MLLM tasks | Image & Text |
Arxiv 24.03.26 | ReMamber: Referring Image Segmentation with Mamba Twister | ![]() | Link | | Referring Image Segmentation | Image & Text |
Arxiv 24.04.01 | SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding | ![]() | Link | | Temporal Video Grounding | Video & Text |
Arxiv 24.04.05 | Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | ![]() | Link | Code | Semantic Segmentation | RGB Images & Depth/Thermal Images |
Arxiv 24.04.07 | VMambaMorph: a Multi-Modality Deformable Image Registration Framework based on Visual State Space Model with Cross-Scan Module | ![]() | Link | Code | Registration | MRI & CT |
Arxiv 24.04.11 | SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction | ![]() | Link | | Cancer Subtyping/Survival Prediction | WSIs & Gene |
Arxiv 24.04.11 | FusionMamba: Efficient Image Fusion with State Space Model | ![]() | Link | | Pansharpening | HISR Images & LRMS Images |
Arxiv 24.04.12 | MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion | ![]() | Link | | Multi-modality Image Fusion | RGB & Thermal Images, MRI & CT/PET/SPECT |
Arxiv 24.04.14 | Fusion-Mamba for Cross-modality Object Detection | ![]() | Link | | Visible-infrared Images Fusion | RGB Images & Infrared Images |
Arxiv 24.04.14 | A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion | ![]() | Link | | Pansharpening | HISR Images & LRMS Images |
Arxiv 24.04.15 | FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba | ![]() | Link | Code | Image Fusion | RGB & Infrared Images, MRI & CT/PET/SPECT, PC & GFP |
Arxiv 24.04.17 | Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion | ![]() | Link | | Temporal Grounding | Motion & Text |
Arxiv 24.04.25 | CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions | ![]() | Link | Code | Visible-infrared Images Fusion | RGB Images & Infrared Images |
Arxiv 24.04.27 | Revisiting Multi-modal Emotion Learning with Broad State Space Models and Probability-guidance Fusion | ![]() | Link | | Multi-modal Emotion Recognition | Text & Video & Audio |
Arxiv 24.04.28 | Mamba-FETrack: Frame-Event Tracking via State Space Model | ![]() | Link | Code | RGB-Event Tracking | RGB Frames & Event |
Arxiv 24.04.29 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model | ![]() | Link | Code | Image Captioning | Remote Sensing Image & Text |
Arxiv 24.04.30 | CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation | ![]() | Link | Code | OOD | Image & Text |
Arxiv 24.05.22 | I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling | ![]() | Link | Code | Medical Image Generation | MRI/CT |
Arxiv 24.05.24 | Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models | ![]() | Link | Code | Large Language and Vision Model | Image & Text (Question/Rationale) |
Arxiv 24.05.29 | Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model | ![]() | Link | | Multi-modal Sentiment Analysis | Text & Video & Audio |
Arxiv 24.05.31 | S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion | ![]() | Link | | Image Fusion | RGB Images & Infrared Images |
Arxiv 24.06.02 | MGI: Multimodal Contrastive Pre-training of Genomic and Medical Imaging | ![]() | Link | | Multimodal Contrastive Pre-training | Medical Image & Genomic |
Arxiv 24.06.03 | Dimba: Transformer-Mamba Diffusion Models | ![]() | Link | Code | Text to Image Generation | Image & Text |
Arxiv 24.06.06 | RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation | ![]() | Link | Code | Robot Reasoning and Manipulation | Image & Text |
Date | Paper | Figure | Link | Code | Task |
---|---|---|---|---|---|
Arxiv 24.02.24 | Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning | ![]() | Link | Code | Food Classification |
Arxiv 24.03.08 | Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy | ![]() | Link | Code | Endoscope Tip Tracking |
Arxiv 24.03.14 | MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | ![]() | Link | | Gesture Synthesis |
Arxiv 24.03.15 | On the low-shot transferability of [V]-Mamba? | ![]() | Link | | Few-shot Learning |
Arxiv 24.03.22 | Music to Dance as Language Translation using Sequence Models | ![]() | Link | Code | Music-to-Dance |
Arxiv 24.05.08 | Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models | ![]() | Link | | Trajectory Prediction with LLM |
Date | Paper | Link |
---|---|---|
Arxiv 24.03.03 | The Hidden Attention of Mamba Models | Link |
Arxiv 24.03.16 | Understanding Robustness of Visual State Space Models for Image Classification | Link |
Arxiv 24.05.13 | MambaOut: Do We Really Need Mamba for Vision? | Link |
Arxiv 24.05.26 | Demystify Mamba in Vision: A Linear Attention Perspective | Link |
Date | Paper | Figure | Link | Code |
---|---|---|---|---|
Arxiv 24.05.20 | Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | ![]() | Link | |
Arxiv 24.05.31 | Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling | ![]() | Link | |
Arxiv 24.06.04 | Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning | ![]() | Link | Code |
Arxiv 24.06.08 | Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL | ![]() | Link | |
Date | Paper | Figure | Link | Code |
---|---|---|---|---|
Arxiv 24.05.22 | HeteGraph-Mamba: Heterogeneous Graph Learning via Selective State Space Model | ![]() | Link | |
Date | Paper | Figure | Link | Code |
---|---|---|---|---|
Arxiv 24.04.02 | SPMamba: State-space model is all you need in speech separation | ![]() | Link | |
Arxiv 24.05.02 | TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms | ![]() | Link | |
Arxiv 24.05.10 | An Investigation of Incorporating Mamba for Speech Enhancement | ![]() | Link | |
Arxiv 24.05.20 | SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | ![]() | Link | Code |
Arxiv 24.05.21 | Mamba in Speech: Towards an Alternative to Self-Attention | ![]() | Link | |
Arxiv 24.05.22 | Audio Mamba: Pretrained Audio State Space Model For Audio Tagging | | Link | Code |
Arxiv 24.06.04 | Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations | ![]() | Link | Code |
Arxiv 24.06.05 | Audio Mamba: Bidirectional State Space Model for Audio Representation Learning | ![]() | Link | Code |
Arxiv 24.06.10 | RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection | ![]() | Link | Code |
Date | Paper | Figure | Link | Code |
---|---|---|---|---|
Arxiv 24.04.23 | Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting | ![]() | Link | |
Arxiv 24.04.24 | Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting | ![]() | Link | |
Arxiv 24.05.11 | DTMamba : Dual Twin Mamba for Time Series Forecasting | ![]() | Link | |
Arxiv 24.05.25 | Time-SSM: Simplifying and Unifying State Space Models for Time Series Forecasting | ![]() | Link | |
Arxiv 24.05.26 | MambaTS: Improved Selective State Space Models for Long-term Time Series Forecasting | ![]() | Link | |
Arxiv 24.06.06 | Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models | ![]() | Link | |
Arxiv 24.06.06 | TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification | ![]() | Link | |
Arxiv 24.06.08 | C-Mamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting | ![]() | Link | Code |