feat: audio filter plugin #350

typester · 2025-01-27T19:52:16Z

The rust-sdk implementation (livekit/rust-sdks#559) must be merged before this pull request.


bcherry

lgtm, lmk when it's ready for final review

livekit-rtc/livekit/rtc/_ffi_client.py

theomonnom · 2025-02-06T22:38:26Z

livekit-rtc/livekit/rtc/audio_stream.py

+        enable_filter: Any = None,
+        filter_options: dict[str, Any] | None = None,


I would be more eager to add strict typings here.
I think we should remove the filter_options and put it inside the KrispFilter constructor.
I was imagining something like:

rtc.AudioStream(krisp=KrispFilter(options)) # options can be automatically determined by our third package that will find the dynamic lib inside our wheel

I'm not sure why we make it a generic interface today, feels easier to explicitly mention Krisp, wdyt?

EDIT: another option is to mention Enhanced Noise Suppression or Background Voice Cancellation instead of Krisp

Since I'm not familiar with Python, any suggestion like this is super welcome! Let me think about this
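A minimal sketch of the stricter typing suggested above, assuming a dedicated filter class replaces the `enable_filter: Any` / `filter_options: dict` pair. `KrispFilter` here is a hypothetical dataclass and `AudioStreamSketch` is a stand-in for `rtc.AudioStream`; neither is the SDK's actual implementation, and the `module_path` and `options` fields are illustrative only.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class KrispFilter:
    """Hypothetical typed filter description (not the SDK's real class)."""

    # Path to the dynamic library; None means a helper package would locate it.
    module_path: Optional[str] = None
    # Filter-specific settings live on the filter object, not on the stream.
    options: dict[str, str] = field(default_factory=dict)


class AudioStreamSketch:
    """Stand-in for rtc.AudioStream, reduced to the argument under discussion."""

    def __init__(
        self,
        sample_rate: int = 48000,
        num_channels: int = 1,
        krisp: Optional[KrispFilter] = None,
    ) -> None:
        # Reject loosely typed values early instead of accepting Any.
        if krisp is not None and not isinstance(krisp, KrispFilter):
            raise TypeError("krisp must be a KrispFilter or None")
        self.sample_rate = sample_rate
        self.num_channels = num_channels
        self.krisp = krisp


stream = AudioStreamSketch(krisp=KrispFilter(options={"mode": "nc"}))
```

With this shape, a type checker can flag a wrong argument at the call site, and the stream constructor no longer needs to know which options a given filter accepts.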

theomonnom · 2025-02-06T22:49:34Z

livekit-rtc/livekit/rtc/audio_filter.py

+            {
+                "url": url,
+                "token": token,
+            }


Should we make it strictly typed inside our protobufs definition instead of using json? The auth function could accept those as arguments (no need for a prost dependency)
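To illustrate the difference between an untyped JSON payload and typed arguments, here is a rough sketch. It uses a plain dataclass as a stand-in for a typed protobuf message; `FilterAuth` and `auth_as_json` are hypothetical names, not part of the SDK or its protobuf definitions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FilterAuth:
    """Hypothetical typed stand-in for the url/token pair."""

    url: str
    token: str

    def __post_init__(self) -> None:
        # Typed fields allow validation once, at construction time.
        if not self.url or not self.token:
            raise ValueError("url and token must be non-empty")


def auth_as_json(auth: FilterAuth) -> str:
    """Serialize only at the boundary, if a JSON string is still required."""
    return json.dumps(asdict(auth))


payload = auth_as_json(FilterAuth(url="wss://example.invalid", token="placeholder"))
```

The point of the sketch is only that missing or misspelled keys are caught where the value is built, rather than wherever the JSON is eventually parsed.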

theomonnom · 2025-02-06T22:50:40Z

livekit-rtc/livekit/rtc/room.py

+    def __init__(
+        self,
+        loop: Optional[asyncio.AbstractEventLoop] = None,
+        filters: Optional[List[Any]] = None,


Using the constructor is tricky because users don't construct it on agents, we do it for them. IMO we should only expose this API inside the rtc.AudioStream for now

Yeah, I think so. But the Krisp filter needs to be initialized with the LIVEKIT_URL and TOKEN parameters to verify the token. So, I've initialized it here.

Instead, do you mean that it would be better to pass all of this information in the AudioStream constructor, as you suggested earlier? Or to create a separate init function to pass this info? Something like:

krisp.init(url, token)

or something like that. I think the latter is better

no strong opinion, tho I know dz wanted to avoid the auth on the python side (so grabbing the url/token on the rust side)
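For reference, a minimal sketch of the separate init function floated in this thread, assuming module-level state on the Python side. The names `init`, `require_credentials`, and `_Credentials` are hypothetical. If the url/token end up being read on the Rust side instead, none of this would be needed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class _Credentials:
    url: str
    token: str


# Module-level state set once by init(); None until then.
_credentials: Optional[_Credentials] = None


def init(url: str, token: str) -> None:
    """Store the connection details the filter needs to verify the token."""
    global _credentials
    _credentials = _Credentials(url=url, token=token)


def require_credentials() -> _Credentials:
    """Return stored credentials, failing loudly if init() was never called."""
    if _credentials is None:
        raise RuntimeError("call init(url, token) before creating a filter")
    return _credentials
```

A filter constructor could then call `require_credentials()` internally, which keeps the Room and AudioStream constructors free of auth parameters.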

theomonnom · 2025-02-13T11:27:16Z

livekit-rtc/livekit/rtc/audio_stream.py

@@ -54,6 +55,7 @@ def __init__(
        capacity: int = 0,
        sample_rate: int = 48000,
        num_channels: int = 1,
+        audio_filter: Optional[Tuple[str, dict[str, any]]] = None,


Any way to get something that looks more like #350 (comment)?
This type isn't self-explanatory.

I intended to cover this on the filter module side, because the option parameters will vary depending on the filter module. It looks something like this:

audio_filter=noise_filter.NC()

Please refer to the noise_filter repo for more details

I'm not sure why we want to make this generic. If we have more filters in the future, we could add new keyword arguments. I propose changing the argument name to krisp directly.

I also doubt we're going to have more filters. (~~Noise cancellation~~ Echo cancellation should be exposed directly to Python with a raw API, because we need to use it on external speakers that are not available on the Rust side. It is not tied to any rtc.AudioStream, so I wouldn't design for it inside this PR.)

I think we should narrow down the API for now

Exposing the raw API is better in terms of architecture, and I agreed with this in our previous discussion. But it turned out not to work well with our metric system, didn't it? So we decided to extend AudioStream for now.

In that case, I would say it would be better to handle the filter options on the noise_filter side and keep AudioStream generic. This would make updates to noise_filter easier. What do you think?

Sorry, I meant the echo cancellation in the previous message.

What I mean is: why do we want to keep the name generic, like audio_filter, instead of krisp?
e.g. krisp=KrispFilter(options)
or it could also be noise_cancellation=krisp.AudioFilter(options)

Oh I see. I don't have a strong opinion on this. We're creating a noise filter package named livekit-noise-filter. (Initially, it was named livekit-krisp, but we renamed it.) So, would it be the latter?

I introduced two parameters to the python-sdk.

The first one is for RoomOptions (this is required to register the filter with the room):

await room.connect(
    os.getenv("LIVEKIT_URL"),
    token,
    rtc.RoomOptions(
        auto_subscribe=True,
        audio_filters=[noise_filter],
    ),
)

The second is for AudioStream:

stream = rtc.AudioStream.from_track(
    track=track,
    audio_filter=noise_filter.NC(),
)

So, should the two options be named noise_cancellation_modules and noise_cancellation? How does that sound?
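A rough sketch of how the two proposed names could relate: modules are registered once on the room options, and each stream then refers to one of them. `RoomOptionsSketch`, `NoiseCancellation`, and `stream_settings` are hypothetical stand-ins, not the SDK's classes. The check that a stream may only use a registered module is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class NoiseCancellation:
    """Hypothetical per-stream handle naming the module that processes audio."""

    module_id: str


@dataclass
class RoomOptionsSketch:
    """Stand-in for rtc.RoomOptions with the proposed plural argument."""

    auto_subscribe: bool = True
    noise_cancellation_modules: List[str] = field(default_factory=list)


def stream_settings(
    options: RoomOptionsSketch,
    noise_cancellation: Optional[NoiseCancellation] = None,
) -> dict:
    """Check that a stream only references a module registered on the room."""
    if noise_cancellation is not None and (
        noise_cancellation.module_id not in options.noise_cancellation_modules
    ):
        raise ValueError(
            f"module {noise_cancellation.module_id!r} is not registered on the room"
        )
    return {"noise_cancellation": noise_cancellation}


opts = RoomOptionsSketch(noise_cancellation_modules=["nc"])
settings = stream_settings(opts, NoiseCancellation(module_id="nc"))
```

Using the plural name on the room and the singular name on the stream makes the "register once, select per stream" relationship visible in the call sites.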

typester added 4 commits January 27, 2025 11:49

change rust-sdk to aa3be1b3f1bdd3c1cb0fed686ebeeb30c51c2e64

735f53e

add audio_filter ffi

0674b86

ruff

cb0a348

fix multiple import

d924905

bcherry self-requested a review January 28, 2025 07:19

typester and others added 9 commits February 3, 2025 13:07

update submodule

8d36a7b

update submodule

0602bd2

update submodule

7db3afd

change plugin interface

781c8e0

Merge remote-tracking branch 'origin/main' into typester/audio-filter-plugin

2fad7c6

generated protobuf

ce435eb

ruff

1d3b647

add types

e03cda4

fix types

d93b2f3

bcherry reviewed Feb 6, 2025

livekit-rtc/livekit/rtc/_ffi_client.py (outdated, resolved)

typester added 3 commits February 6, 2025 13:10

this should be RemoteAudioTrack

9c5f506

revert debugging code

dd2e973

fmt

1e49989

typester marked this pull request as ready for review February 6, 2025 21:17

typester changed the title ~~[WIP] feat: audio filter plugin~~ feat: audio filter plugin Feb 6, 2025

typester requested review from bcherry and theomonnom February 6, 2025 21:18

theomonnom reviewed Feb 6, 2025


typester and others added 5 commits February 11, 2025 17:25

update proto

ac18abc

update rust-sdk and proto

1971ca1

new interface

5336388

ruff

637389d

generated protobuf

792bb90

typester requested a review from theomonnom February 12, 2025 20:04

theomonnom reviewed Feb 13, 2025

typester commented Jan 27, 2025
bcherry left a comment
theomonnom Feb 6, 2025 (edited)
typester Feb 7, 2025
theomonnom Feb 6, 2025
theomonnom Feb 6, 2025 (edited)
typester Feb 7, 2025
theomonnom Feb 11, 2025
theomonnom Feb 13, 2025
typester Feb 13, 2025
theomonnom Feb 13, 2025 (edited)
typester Feb 14, 2025
theomonnom Feb 14, 2025 (edited)
typester Feb 14, 2025
