code refactor
voidful committed Feb 15, 2024
1 parent a2d44b1 commit 8ca1154
Showing 72 changed files with 78 additions and 174 deletions.
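Most of the changed files differ only in the package name: `AudCodec` becomes `SoundCodec` in import paths, resource paths, and `setup.py`. A minimal sketch of how such a mechanical rename could be scripted is shown below. `rename_package` is a hypothetical helper, not part of this repository, and it only rewrites text in memory.

```python
import re


def rename_package(source: str, old: str = "AudCodec", new: str = "SoundCodec") -> str:
    """Replace whole-word occurrences of `old` with `new` in a source string.

    Hypothetical helper for illustration; the commit itself may have been
    produced by an IDE refactor or by hand.
    """
    # \b keeps identifiers that merely contain `old` (e.g. MyAudCodecX) intact.
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return pattern.sub(new, source)


before = "from AudCodec.base_codec.general import save_audio, ExtractedUnit"
after = rename_package(before)
print(after)
```

Matching on word boundaries rather than on the bare substring is the design choice that keeps unrelated identifiers untouched.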
22 changes: 11 additions & 11 deletions .gitignore
@@ -1,14 +1,14 @@
*.wav
-/AudCodec/test/**/
-!/AudCodec/test/sample1_16k.wav
-!/AudCodec/test/sample2_22k.wav
-!/AudCodec/test/sample3_48k.wav
-!/AudCodec/test/sample4_16k.wav
-!/AudCodec/test/sample5_16k.wav
-!/AudCodec/test/sample6_48k.wav
-!/AudCodec/test/sample7_16k.wav
-!/AudCodec/test/sample8_16k.wav
-!/AudCodec/test/sample9_48k.wav
-!/AudCodec/test/sample10_16k.wav
+/SoundCodec/test/**/
+!/SoundCodec/test/sample1_16k.wav
+!/SoundCodec/test/sample2_22k.wav
+!/SoundCodec/test/sample3_48k.wav
+!/SoundCodec/test/sample4_16k.wav
+!/SoundCodec/test/sample5_16k.wav
+!/SoundCodec/test/sample6_48k.wav
+!/SoundCodec/test/sample7_16k.wav
+!/SoundCodec/test/sample8_16k.wav
+!/SoundCodec/test/sample9_48k.wav
+!/SoundCodec/test/sample10_16k.wav

__pycache__/
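
The order of the `.gitignore` patterns matters: `*.wav` ignores every WAV file, and the later `!` lines re-include the bundled samples. A simplified model of this last-match-wins behaviour is sketched below. It assumes plain `fnmatch` globbing and skips directory-only patterns such as `/SoundCodec/test/**/`; real gitignore matching has more rules, and `is_ignored` is a hypothetical helper.

```python
from fnmatch import fnmatch


def is_ignored(path: str, rules: list[str]) -> bool:
    """Simplified last-match-wins check; not a full gitignore implementation."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        # Drop the "!" marker and the leading "/" that anchors a rule to the repo root.
        pattern = rule.lstrip("!").lstrip("/")
        if fnmatch(path, pattern):
            ignored = not negated
    return ignored


rules = ["*.wav", "!/SoundCodec/test/sample1_16k.wav"]
print(is_ignored("SoundCodec/test/sample1_16k.wav", rules))
print(is_ignored("SoundCodec/test/scratch.wav", rules))
```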
40 changes: 30 additions & 10 deletions README.md
@@ -1,19 +1,17 @@
# Codec-SUPERB: Audio Codec Speech Processing Universal Performance Benchmark

-![Overview](AudCodec/img/Overview.png)
+![Overview](SoundCodec/img/Overview.png)

Codec-SUPERB is a comprehensive benchmark designed to evaluate audio codec models across a variety of speech tasks. Our
goal is to facilitate community collaboration and accelerate advancements in the field of speech processing by
preserving and enhancing speech information quality.

-
## Table of Contents

- [Introduction](#introduction)
- [Key Features](#key-features)
- [Installation](#installation)
- [Usage](#usage)
-- [Benchmarking](#benchmarking)
- [Contribution](#contribution)
- [License](#license)

@@ -26,17 +24,22 @@ in audio quality and processing efficiency.
## Key Features

### Out-of-the-Box Codec Interface
+
Codec-SUPERB offers an intuitive, out-of-the-box codec interface that allows for easy integration and testing of various
codec models, facilitating quick iterations and experiments.

### Multi-Perspective Leaderboard
-Codec-SUPERB's unique blend of multi-perspective evaluation and an online leaderboard drives innovation in audio codec research by providing a comprehensive assessment and fostering competitive transparency among developers.
+
+Codec-SUPERB's unique blend of multi-perspective evaluation and an online leaderboard drives innovation in audio codec
+research by providing a comprehensive assessment and fostering competitive transparency among developers.

### Standardized Environment
+
We ensure a standardized testing environment to guarantee fair and consistent comparison across all models. This
uniformity brings reliability to benchmark results, making them universally interpretable.

### Unified Datasets
+
We provide a collection of unified datasets, curated to test a wide range of speech processing scenarios. This ensures
that models are evaluated under diverse conditions, reflecting real-world applications.

@@ -50,13 +53,31 @@ pip install -r requirements.txt

## Usage

-Detailed instructions on how to use Codec-SUPERB, including preparing your codec model and executing benchmark tests,
-can be found in the `docs` directory.
+### [Leaderboard]()
+
+### Out of the Box Codec Interface

-## Benchmarking
+```python
+from SoundCodec import codec
+import torchaudio

-Codec-SUPERB supports a comprehensive suite of speech tasks, from speech recognition to audio quality assessment, each
-designed to rigorously evaluate the capabilities of audio codec models.
+# get all available codec
+print(codec.list_codec())
+# load codec by name, use encodec as example
+encodec_24k_6bps = codec.load_codec('encodec_24k_6bps')
+
+# load audio
+waveform, sample_rate = torchaudio.load('sample audio')
+resampled_waveform = waveform.numpy()[-1]
+data_item = {'audio': {'array': resampled_waveform,
+                       'sampling_rate': sample_rate}}
+
+# extract unit
+sound_unit = encodec_24k_6bps.extract_unit(data_item).unit
+
+# sound synthesis
+decoded_waveform = encodec_24k_6bps.synth(sound_unit, local_save=False)['audio']['array']
+```
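
The calls in the README snippet suggest a small interface: `extract_unit` takes a `data_item` dictionary and returns an object with a `.unit` attribute, and `synth` returns a dictionary shaped like `{'audio': {'array': ..., 'sampling_rate': ...}}`. The sketch below mimics that shape with a toy 8-bit quantizer so it runs without any model weights. `ToyCodec` and `ToyUnit` are hypothetical names, the `sampling_rate` stored on the codec is an assumption, and none of this is the actual `BaseCodec` implementation.

```python
from dataclasses import dataclass


@dataclass
class ToyUnit:
    """Stand-in for the object returned by extract_unit (assumed shape)."""
    unit: list[int]


class ToyCodec:
    """Toy codec: quantizes samples in [-1, 1] to 256 levels and back."""

    def __init__(self, sampling_rate: int = 16000):
        self.sampling_rate = sampling_rate

    def extract_unit(self, data_item: dict) -> ToyUnit:
        samples = data_item["audio"]["array"]
        # Clamp to [-1, 1], then map to an integer code in 0..255.
        codes = [round((min(1.0, max(-1.0, s)) + 1.0) / 2.0 * 255) for s in samples]
        return ToyUnit(unit=codes)

    def synth(self, unit: list[int], local_save: bool = False) -> dict:
        # local_save is accepted only to mirror the README call; it is unused here.
        array = [code / 255 * 2.0 - 1.0 for code in unit]
        return {"audio": {"array": array, "sampling_rate": self.sampling_rate}}


toy = ToyCodec()
data_item = {"audio": {"array": [0.0, 0.5, -1.0, 1.0], "sampling_rate": 16000}}
sound_unit = toy.extract_unit(data_item).unit
decoded = toy.synth(sound_unit, local_save=False)["audio"]["array"]
```

The round trip is lossy by design: each decoded sample lands within half a quantization step of its input, which is the kind of reconstruction error a codec benchmark measures.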

## Contribution

@@ -67,7 +88,6 @@ enhancing the benchmarking framework. Please see `CONTRIBUTING.md` for more deta

This project is licensed under the MIT License - see the `LICENSE` file for details.

-
## Reference Audio Codec Repositories:

- https://github.com/ZhangXInFD/SpeechTokenizer
File renamed without changes.
File renamed without changes.
@@ -12,7 +12,7 @@
from torch.nn.utils import spectral_norm
from torch.nn.utils import weight_norm
from librosa.util import normalize
-from AudCodec.base_codec.general import save_audio, ExtractedUnit
+from SoundCodec.base_codec.general import save_audio, ExtractedUnit

LRELU_SLOPE = 0.1

@@ -1,6 +1,6 @@
import nlp2
import torch
-from AudCodec.base_codec.general import save_audio, ExtractedUnit
+from SoundCodec.base_codec.general import save_audio, ExtractedUnit


class BaseCodec:
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.general import save_audio, ExtractedUnit
+from SoundCodec.base_codec.general import save_audio, ExtractedUnit
import torch
from audiotools import AudioSignal

@@ -1,5 +1,5 @@
import torch
-from AudCodec.base_codec.general import save_audio, ExtractedUnit
+from SoundCodec.base_codec.general import save_audio, ExtractedUnit


class BaseCodec:
@@ -1,6 +1,6 @@
import torch
from transformers import AutoModel, AutoProcessor
-from AudCodec.base_codec.general import save_audio, ExtractedUnit
+from SoundCodec.base_codec.general import save_audio, ExtractedUnit


class BaseCodec:
@@ -2,7 +2,7 @@
import torch
import os

-from AudCodec.base_codec.general import save_audio, ExtractedUnit
+from SoundCodec.base_codec.general import save_audio, ExtractedUnit
from audiotools import AudioSignal


File renamed without changes.
@@ -1,6 +1,6 @@
import numpy

-from AudCodec.base_codec.general import save_audio, ExtractedUnit
+from SoundCodec.base_codec.general import save_audio, ExtractedUnit
import torchaudio
import torch
import nlp2
@@ -2,7 +2,7 @@


def load_codec(codec_name):
-    module = __import__(f"codec.{codec_name}", fromlist=[codec_name])
+    module = __import__(f"SoundCodec.codec.{codec_name}", fromlist=[codec_name])
    return module.Codec()
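
`load_codec` resolves a codec module from its name at run time. The `fromlist` argument matters here: without it, `__import__` returns the top-level package rather than the submodule. The sketch below shows the same pattern with `importlib.import_module`, which returns the submodule directly. `load_by_name` is a hypothetical helper, and the standard-library `json` package stands in for `SoundCodec.codec` so the example runs anywhere.

```python
import importlib


def load_by_name(package: str, name: str):
    """Import `package.name` and return the submodule object."""
    return importlib.import_module(f"{package}.{name}")


# Without fromlist, __import__ hands back the top-level package.
top_level = __import__("json.decoder")
# With fromlist (as in load_codec), it returns the submodule itself.
submodule = __import__("json.decoder", fromlist=["decoder"])

module = load_by_name("json", "decoder")
print(top_level.__name__, submodule.__name__, module.__name__)
```

Using the fully qualified `SoundCodec.codec.<name>` path, as the commit does, makes the lookup independent of the caller's working directory once the package is installed.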


@@ -1,6 +1,6 @@
import json
import nlp2
-from AudCodec.base_codec.academicodec import BaseCodec
+from SoundCodec.base_codec.academicodec import BaseCodec


class Codec(BaseCodec):
@@ -1,6 +1,6 @@
import json
import nlp2
-from AudCodec.base_codec.academicodec import BaseCodec
+from SoundCodec.base_codec.academicodec import BaseCodec


class Codec(BaseCodec):
@@ -1,6 +1,6 @@
import json
import nlp2
-from AudCodec.base_codec.academicodec import BaseCodec
+from SoundCodec.base_codec.academicodec import BaseCodec


class Codec(BaseCodec):
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.audiodec import BaseCodec
+from SoundCodec.base_codec.audiodec import BaseCodec
import nlp2


2 changes: 1 addition & 1 deletion AudCodec/codec/dac_16k.py → SoundCodec/codec/dac_16k.py
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.descript_audio_codec import BaseCodec
+from SoundCodec.base_codec.descript_audio_codec import BaseCodec


class Codec(BaseCodec):
2 changes: 1 addition & 1 deletion AudCodec/codec/dac_24k.py → SoundCodec/codec/dac_24k.py
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.descript_audio_codec import BaseCodec
+from SoundCodec.base_codec.descript_audio_codec import BaseCodec


class Codec(BaseCodec):
2 changes: 1 addition & 1 deletion AudCodec/codec/dac_44k.py → SoundCodec/codec/dac_44k.py
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.descript_audio_codec import BaseCodec
+from SoundCodec.base_codec.descript_audio_codec import BaseCodec


class Codec(BaseCodec):
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.encodec import BaseCodec
+from SoundCodec.base_codec.encodec import BaseCodec

class Codec(BaseCodec):
    def config(self):
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.encodec import BaseCodec
+from SoundCodec.base_codec.encodec import BaseCodec

class Codec(BaseCodec):
    def config(self):
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.encodec import BaseCodec
+from SoundCodec.base_codec.encodec import BaseCodec

class Codec(BaseCodec):
    def config(self):
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.encodec import BaseCodec
+from SoundCodec.base_codec.encodec import BaseCodec

class Codec(BaseCodec):
    def config(self):
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.encodec import BaseCodec
+from SoundCodec.base_codec.encodec import BaseCodec

class Codec(BaseCodec):
    def config(self):
@@ -1,6 +1,6 @@
import nlp2

-from AudCodec.base_codec.funcodec import BaseCodec
+from SoundCodec.base_codec.funcodec import BaseCodec


class Codec(BaseCodec):
@@ -1,6 +1,6 @@
import nlp2

-from AudCodec.base_codec.funcodec import BaseCodec
+from SoundCodec.base_codec.funcodec import BaseCodec


class Codec(BaseCodec):
@@ -1,6 +1,6 @@
import nlp2

-from AudCodec.base_codec.funcodec import BaseCodec
+from SoundCodec.base_codec.funcodec import BaseCodec


class Codec(BaseCodec):
@@ -1,6 +1,6 @@
import nlp2

-from AudCodec.base_codec.funcodec import BaseCodec
+from SoundCodec.base_codec.funcodec import BaseCodec


class Codec(BaseCodec):
@@ -1,6 +1,6 @@
import nlp2

-from AudCodec.base_codec.funcodec import BaseCodec
+from SoundCodec.base_codec.funcodec import BaseCodec


class Codec(BaseCodec):
@@ -1,6 +1,6 @@
import nlp2

-from AudCodec.base_codec.funcodec import BaseCodec
+from SoundCodec.base_codec.funcodec import BaseCodec


class Codec(BaseCodec):
@@ -1,4 +1,4 @@
-from AudCodec.base_codec.speech_tokenizer import BaseCodec
+from SoundCodec.base_codec.speech_tokenizer import BaseCodec
import nlp2

class Codec(BaseCodec):
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,7 +1,7 @@
import torch
import torchaudio

-from AudCodec.codec import list_codec, load_codec
+from SoundCodec.codec import list_codec, load_codec

if __name__ == '__main__':
    for sample_file in ['sample1_16k.wav', 'sample2_22k.wav', 'sample3_48k.wav', 'sample4_16k.wav',
@@ -1,6 +1,6 @@
import torchaudio

-from AudCodec.codec import list_codec, load_codec
+from SoundCodec.codec import list_codec, load_codec

if __name__ == '__main__':
    for sample_file in ['sample1_16k.wav', 'sample2_22k.wav', 'sample3_48k.wav', 'sample4_16k.wav',
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -3,7 +3,7 @@
import torch

from benchmarking import compute_metrics
-from AudCodec.codec import load_codec
+from SoundCodec.codec import load_codec
import torchaudio
import numpy as np

2 changes: 1 addition & 1 deletion benchmarking.py
@@ -9,7 +9,7 @@
from datasets import load_dataset, load_from_disk
from collections import defaultdict
from audiotools import AudioSignal
-from AudCodec.base_codec.general import pad_arrays_to_match
+from SoundCodec.base_codec.general import pad_arrays_to_match
from metrics import get_metrics
import psutil
from tqdm.contrib.concurrent import process_map
2 changes: 1 addition & 1 deletion dataset_checker.py
@@ -2,7 +2,7 @@
import itertools
import numpy as np
from datasets import load_dataset
-from AudCodec.codec import list_codec
+from SoundCodec.codec import list_codec


def load_datasets(dataset_name, splits):
6 changes: 3 additions & 3 deletions dataset_creator.py
@@ -1,8 +1,8 @@
import argparse
from datasets import DatasetDict, Audio, load_from_disk
-from AudCodec.codec import load_codec, list_codec
-from AudCodec.dataset import load_dataset
-from AudCodec.dataset.general import extract_unit
+from SoundCodec.codec import load_codec, list_codec
+from SoundCodec.dataset import load_dataset
+from SoundCodec.dataset.general import extract_unit


def run_experiment(dataset_name):
2 changes: 1 addition & 1 deletion dataset_updater.py
@@ -1,6 +1,6 @@
import argparse
from datasets import Audio, Dataset, Value, Sequence
-from AudCodec.codec import load_codec, list_codec
+from SoundCodec.codec import load_codec, list_codec
from datasets import load_dataset


2 changes: 1 addition & 1 deletion setup.py
@@ -1,7 +1,7 @@
from setuptools import setup, find_packages

setup(
-    name='AudCodec',
+    name='SoundCodec',
    version='1.0',
    packages=find_packages(),
    install_requires=[],