On-device speaker diarization powered by deep learning

Picovoice/falcon

Falcon

Made in Vancouver, Canada by Picovoice

Falcon is an on-device speaker diarization engine. Falcon is:

  • Private: All voice processing runs locally.
  • Cross-Platform:
    • Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
    • Raspberry Pi (3, 4, 5) and NVIDIA Jetson Nano
    • Android and iOS
    • Chrome, Safari, Firefox, and Edge

What is Speaker Diarization?

Speaker diarization, a fundamental step in automatic speech recognition and audio processing, focuses on identifying and separating distinct speakers within an audio recording. Its objective is to divide the audio into segments while precisely identifying the speakers and their respective speaking intervals.
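To make "speakers and their respective speaking intervals" concrete, here is a minimal sketch of how diarization output can be consumed. The `Segment` type is a hypothetical stand-in, not part of any Falcon SDK; its field names mirror the `speaker_tag`, `start_sec`, and `end_sec` members used in the C example further down, and the sample data is made up.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical stand-in for one diarization result."""
    speaker_tag: int
    start_sec: float
    end_sec: float


def talk_time_per_speaker(segments):
    """Sum the duration of every segment, grouped by speaker tag."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.speaker_tag] += seg.end_sec - seg.start_sec
    return dict(totals)


# Made-up data for illustration, not real engine output.
example = [
    Segment(1, 0.0, 4.5),
    Segment(2, 4.5, 7.0),
    Segment(1, 7.0, 9.0),
]
print(talk_time_per_speaker(example))
```

Each segment answers "who spoke, and when"; anything else (talk time, turn counts, per-speaker transcripts) can be derived from that list.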

AccessKey

AccessKey is your authentication and authorization token for deploying Picovoice SDKs, including Falcon. Anyone who uses Picovoice needs a valid AccessKey, and you must keep it secret. You need internet connectivity to validate your AccessKey with Picovoice license servers, even though the speaker diarization itself runs 100% offline.
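One common way to keep a secret out of source control is to read it from the environment at startup. The sketch below assumes an environment variable named `PICOVOICE_ACCESS_KEY`; that name and the `load_access_key` helper are illustrative choices, not something the Falcon SDKs define.

```python
import os


def load_access_key(env_var="PICOVOICE_ACCESS_KEY"):
    """Return the AccessKey from the environment, or raise if it is missing.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your AccessKey."
        )
    return key
```

The returned string can then be passed wherever this README shows `${ACCESS_KEY}`, for example `pvfalcon.create(access_key=load_access_key())`.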

AccessKey also verifies that your usage is within the limits of your account. Everyone who signs up for Picovoice Console receives the Free Tier usage rights described here. If you wish to increase your limits, you can purchase a subscription plan.

Demos

Python Demo

Install the demo package:

pip3 install pvfalcondemo

Run the following in the terminal:

falcon_demo_file --access_key ${ACCESS_KEY} --audio_paths ${AUDIO_PATH}

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.

For more information about Python demos go to demo/python.

C Demo

Build the demo:

cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build

Run the demo:

./demo/c/build/falcon_demo -a ${ACCESS_KEY} -l ${LIBRARY_PATH} -m ${MODEL_PATH} ${AUDIO_PATH}

Web Demo

From demo/web run the following in the terminal:

yarn
yarn start

(or)

npm install
npm run start

Open http://localhost:5000 in your browser to try the demo.

iOS Demo

To run the demo, go to demo/ios/FalconDemo and run:

pod install

Replace let accessKey = "${YOUR_ACCESS_KEY_HERE}" in the file ViewModel.swift with your AccessKey.

Then, using Xcode, open the generated FalconDemo.xcworkspace and run the application.

Android Demo

Using Android Studio, open demo/android/FalconDemo as an Android project and then run the application.

Replace "${YOUR_ACCESS_KEY_HERE}" in the file MainActivity.java with your AccessKey.

SDKs

Python

Install the Python SDK:

pip3 install pvfalcon

Create an instance of the engine and perform speaker diarization on an audio file:

import pvfalcon

falcon = pvfalcon.create(access_key='${ACCESS_KEY}')

print(falcon.process_file('${AUDIO_PATH}'))

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${AUDIO_PATH} with the path to an audio file.
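Printing the raw result is enough for a quick check, but a per-segment line is easier to read. The following sketch assumes each returned segment exposes `speaker_tag`, `start_sec`, and `end_sec` attributes, as the C binding's `pv_segment_t` does; verify this against the Python SDK documentation. `format_segments` is a hypothetical helper, and the demo object is placeholder data.

```python
from types import SimpleNamespace


def format_segments(segments):
    """Render segments one per line, in the same layout as the C example."""
    lines = []
    for seg in segments:
        lines.append(
            f"Speaker: {seg.speaker_tag} -> "
            f"Start: {seg.start_sec:5.2f}, End: {seg.end_sec:5.2f}"
        )
    return lines


# Placeholder object standing in for engine output.
demo = [SimpleNamespace(speaker_tag=1, start_sec=0.0, end_sec=3.25)]
print("\n".join(format_segments(demo)))
```

With a real engine instance, the call would look like `format_segments(falcon.process_file(path))`, provided the attribute names match.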

Finally, when done be sure to explicitly release the resources:

falcon.delete()
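If processing raises an exception, a bare `falcon.delete()` at the end of the script is never reached. One way to guarantee cleanup is a small context manager; `released` below is a hypothetical helper, not part of `pvfalcon`, and it relies only on the `delete()` method shown above.

```python
from contextlib import contextmanager


@contextmanager
def released(engine):
    """Yield the engine, then call its delete() even if the body raises."""
    try:
        yield engine
    finally:
        engine.delete()


# Intended usage (requires pvfalcon and a valid AccessKey):
#
# with released(pvfalcon.create(access_key='${ACCESS_KEY}')) as falcon:
#     print(falcon.process_file('${AUDIO_PATH}'))
```

A plain `try` / `finally` around the processing code achieves the same thing; the wrapper just keeps the call site short.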

C

Create an instance of the engine and perform speaker diarization on an audio file:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#include "pv_falcon.h"

pv_falcon_t *falcon = NULL;
pv_status_t status = pv_falcon_init("${ACCESS_KEY}", "${MODEL_PATH}", &falcon);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

int32_t num_segments = 0;
pv_segment_t *segments = NULL;
status = pv_falcon_process_file(falcon, "${AUDIO_PATH}", &num_segments, &segments);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

for (int32_t i = 0; i < num_segments; i++) {
    pv_segment_t *segment = &segments[i];
    fprintf(
            stdout,
            "Speaker: %d -> Start: %5.2f, End: %5.2f\n",
            segment->speaker_tag,
            segment->start_sec,
            segment->end_sec);
}

pv_falcon_segments_delete(segments);

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console, ${MODEL_PATH} with the path to the default model file (or your custom one), and ${AUDIO_PATH} with the path to an audio file.

Finally, when done be sure to release resources acquired:

pv_falcon_delete(falcon);

Web

Install the web SDK using yarn:

yarn add @picovoice/falcon-web

or using npm:

npm install --save @picovoice/falcon-web

Create an instance of the engine using FalconWorker and diarize an audio file:

import { FalconWorker } from '@picovoice/falcon-web';
import falconParams from '${PATH_TO_BASE64_FALCON_PARAMS}';

function getAudioData(): Int16Array {
  // ... function to get audio data
  return new Int16Array();
}

const falcon = await FalconWorker.create('${ACCESS_KEY}', {
  base64: falconParams,
});

const { segments } = await falcon.process(getAudioData());
console.log(segments);

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console. Finally, when done release the resources using falcon.release().

iOS

The Falcon iOS binding is available via CocoaPods. To import it into your iOS project, add the following line to your Podfile and run pod install:

pod 'Falcon-iOS'

Create an instance of the engine and perform speaker diarization on an audio file:

import Falcon

let falcon = Falcon(accessKey: "${ACCESS_KEY}")

do {
    let audioPath = Bundle(for: type(of: self)).path(forResource: "${AUDIO_FILE_NAME}", ofType: "${AUDIO_FILE_EXTENSION}")
    let segments = falcon.process(audioPath)
} catch { }

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console, ${AUDIO_FILE_NAME} with the name of the audio file and ${AUDIO_FILE_EXTENSION} with the extension of the audio file.

Android

To include the Falcon package in your Android project, ensure you have included mavenCentral() in your top-level build.gradle file and then add the following to your app's build.gradle:

dependencies {
    implementation 'ai.picovoice:falcon-android:${LATEST_VERSION}'
}

Create an instance of the engine and perform speaker diarization on an audio file:

import ai.picovoice.falcon.*;

final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
try {
    Falcon falcon = new Falcon.Builder()
        .setAccessKey(accessKey)
        .build(appContext);

    File audioFile = new File("${AUDIO_FILE_PATH}");
    FalconSegment[] segments = falcon.processFile(audioFile.getAbsolutePath());
} catch (FalconException ex) { }

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console, and ${AUDIO_FILE_PATH} with the path to the audio file.

Finally, when done make sure to explicitly release the resources:

falcon.delete();

For more details, see the Android SDK.

Releases

v1.0.0 — November 28th, 2023

  • Initial release.

FAQ

You can find the FAQ here.