V1.1 (#11)
kenarsa authored Dec 3, 2020
1 parent 2dea5e9 commit a99ad0a
Showing 64 changed files with 903 additions and 323 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,3 +1,5 @@
.idea
node_modules
.DS_Store
demo/c/picovoice_demo_file
demo/c/picovoice_demo_mic
99 changes: 51 additions & 48 deletions README.md
@@ -26,28 +26,28 @@ spoken command:
- **Private & Secure:** Everything is processed offline. Intrinsically private; HIPAA and GDPR compliant.
- **Accurate:** Resilient to noise and reverberation. Outperforms cloud-based alternatives by wide margins.
- **Cross-Platform:** Design once, deploy anywhere. Build using familiar languages and frameworks. Raspberry Pi, BeagleBone,
Android, iOS, Linux (x86_64), macOS (x86_64), Windows (x86_64), and modern web browsers are supported. Enterprise customers
can access the ARM Cortex-M SDK.
- **Self-Service:** Design, train, and test voice interfaces instantly in your browser, using [Picovoice Console](https://picovoice.ai/console/).
- **Reliable:** Runs locally without needing continuous connectivity.
- **Zero Latency:** Edge-first architecture eliminates unpredictable network delay.

## Build with Picovoice

1. **Evaluate:** The Picovoice SDK is a cross-platform library for adding voice to anything. It includes some
pre-trained speech models. The SDK is licensed under Apache 2.0 and available on GitHub to encourage independent
benchmarking and integration testing. You are empowered to make a data-driven decision.

2. **Design:** [Picovoice Console](https://picovoice.ai/console/) is a cloud-based platform for designing voice
interfaces and training speech models, all within your web browser. No machine learning skills are required. Simply
describe what you need with text and export trained models.

3. **Develop:** Exported models can run on Picovoice SDK without requiring constant connectivity. The SDK runs on a wide
range of platforms and supports a large number of frameworks. The Picovoice Console and Picovoice SDK enable you to
design, build and iterate fast.

4. **Deploy:** Deploy at scale without having to maintain complex cloud infrastructure. Avoid unbounded cloud fees,
limitations, and control imposed by big tech.

## Platform Features

@@ -66,11 +66,11 @@ platform.

## License & Terms

The Picovoice SDK is free and licensed under Apache 2.0 including the models released within. [Picovoice Console](https://picovoice.ai/console/) offers
two types of subscriptions: Personal and Enterprise. Personal accounts can train custom speech models that run on the
Picovoice SDK, subject to limitations and strictly for non-commercial purposes. Personal accounts empower researchers,
hobbyists, and tinkerers to experiment. Enterprise accounts can unlock all capabilities of Picovoice Console, are
permitted for use in commercial settings, and have a path to graduate to commercial distribution[<sup>\*</sup>](https://picovoice.ai/pricing/).

## Table of Contents

@@ -304,7 +304,7 @@ both of which [run offline in the browser](https://picovoice.ai/blog/offline-voi

### Python

Install the package:

```bash
pip3 install picovoice
```

```python
from picovoice import Picovoice

keyword_path = ...

def wake_word_callback():
    pass

context_path = ...

def inference_callback(inference):
    print(inference.is_understood)
    print(inference.intent)
    print(inference.slots)

handle = Picovoice(
        keyword_path=keyword_path,
        wake_word_callback=wake_word_callback,
        context_path=context_path,
        inference_callback=inference_callback)
```

`handle` is an instance of the Picovoice runtime engine. It detects utterances of the wake phrase defined in the file located at
`keyword_path`. Upon detection of the wake word, it starts inferring the user's intent from the follow-on voice command within
the context defined by the file located at `context_path`. `keyword_path` is the absolute path to the
[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` extension).
`context_path` is the absolute path to the [Rhino Speech-to-Intent engine](https://github.com/Picovoice/rhino) context file
(with `.rhn` extension). `wake_word_callback` is invoked upon the detection of the wake phrase, and `inference_callback` is
invoked upon completion of the follow-on voice command inference.

When instantiated, the required sample rate can be obtained via `handle.sample_rate`. The expected number of audio samples per
frame is `handle.frame_length`. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio. The
set of supported commands can be retrieved (in YAML format) via `handle.context_info`.

Once instantiated, `handle` can process incoming audio:

```python
while True:
    handle.process(get_next_audio_frame())
```
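
For example, the frames can come from a pre-recorded, single-channel, 16-bit WAV file using Python's built-in `wave` module. This is a minimal sketch rather than part of the SDK: the file path is hypothetical, the recording is assumed to match `handle.sample_rate`, and the snippet also prints `handle.context_info` to show the supported commands.

```python
import struct
import wave

# Hypothetical path to a single-channel, 16-bit PCM WAV file recorded at `handle.sample_rate`.
wav_file = wave.open('/path/to/recording.wav', 'rb')
assert wav_file.getnchannels() == 1
assert wav_file.getsampwidth() == 2
assert wav_file.getframerate() == handle.sample_rate

print(handle.context_info)  # supported commands, in YAML format

def get_next_audio_frame():
    # Read exactly `handle.frame_length` samples; return None once the file is exhausted.
    data = wav_file.readframes(handle.frame_length)
    if len(data) < 2 * handle.frame_length:
        return None
    return struct.unpack('%dh' % handle.frame_length, data)

frame = get_next_audio_frame()
while frame is not None:
    handle.process(frame)
    frame = get_next_audio_frame()
```

A real application would typically read frames from a microphone instead; the per-frame processing loop stays the same.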

When done, resources have to be released explicitly via `handle.delete()`.
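
For instance, a `try`/`finally` block guarantees the release even if processing is interrupted. This sketch reuses the hypothetical `get_next_audio_frame()` helper from above:

```python
try:
    frame = get_next_audio_frame()
    while frame is not None:
        handle.process(frame)
        frame = get_next_audio_frame()
finally:
    # Resources have to be released explicitly.
    handle.delete()
```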

### NodeJS

@@ -376,8 +370,8 @@ yarn add @picovoice/picovoice-node

```bash
npm install @picovoice/picovoice-node
```

The SDK provides the `Picovoice` class. Create an instance of this class using a Porcupine keyword (with `.ppn` extension)
and Rhino context file (with `.rhn` extension), as well as callback functions that will be invoked on wake word detection
and command inference completion events, respectively:

@@ -419,7 +413,7 @@ As the audio is processed through the Picovoice engines, the callbacks will fire.
### .NET

You can install the latest version of Picovoice by adding the latest
[Picovoice NuGet package](https://www.nuget.org/packages/Picovoice/) in Visual Studio or using the .NET CLI.

```bash
dotnet add package Picovoice
```

@@ -455,13 +449,13 @@ Picovoice handle = new Picovoice(keywordPath,
`handle` is an instance of the Picovoice runtime engine that detects utterances of the wake phrase defined in the file located at
`keywordPath`. Upon detection of the wake word, it starts inferring the user's intent from the follow-on voice command within
the context defined by the file located at `contextPath`. `keywordPath` is the absolute path to the
[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` extension).
`contextPath` is the absolute path to the [Rhino Speech-to-Intent engine](https://github.com/Picovoice/rhino) context file
(with `.rhn` extension). `wakeWordCallback` is invoked upon the detection of the wake phrase, and `inferenceCallback` is
invoked upon completion of the follow-on voice command inference.

When instantiated, the required sample rate can be obtained via `handle.SampleRate`. The expected number of audio samples per
frame is `handle.FrameLength`. The Picovoice engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

```csharp
short[] GetNextAudioFrame()
```

@@ -519,16 +513,16 @@ try{

```java
    // ...
} catch (PicovoiceException e) { }
```

`handle` is an instance of Picovoice runtime engine that detects utterances of wake phrase defined in the file located at
`keywordPath`. Upon detection of wake word it starts inferring user's intent from the follow-on voice command within
`handle` is an instance of the Picovoice runtime engine that detects utterances of wake phrase defined in the file located at
`keywordPath`. Upon detection of wake word it starts inferring the user's intent from the follow-on voice command within
the context defined by the file located at `contextPath`. `keywordPath` is the absolute path to
[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` suffix).
[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` extension).
`contextPath` is the absolute path to [Rhino Speech-to-Intent engine](https://github.com/Picovoice/rhino) context file
(with `.rhn` suffix). `wakeWordCallback` is invoked upon the detection of wake phrase and `inferenceCallback` is
(with `.rhn` extension). `wakeWordCallback` is invoked upon the detection of wake phrase and `inferenceCallback` is
invoked upon completion of follow-on voice command inference.

When instantiated, valid sample rate can be obtained via `handle.getSampleRate()`. Expected number of audio samples per
frame is `handle.getFrameLength()`. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
When instantiated, the required sample rate can be obtained via `handle.getSampleRate()`. The expected number of audio samples per
frame is `handle.getFrameLength()`. The Picovoice engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

```java
short[] getNextAudioFrame()
```

@@ -558,7 +552,7 @@ There are two possibilities for integrating Picovoice into an Android application.
[PicovoiceManager](/sdk/android/Picovoice/picovoice/src/main/java/ai/picovoice/picovoice/PicovoiceManager.java) provides
a high-level API for integrating Picovoice into Android applications. It manages all activities related to creating an
input audio stream, feeding it into the Picovoice engine, and invoking user-defined callbacks upon wake word detection and
inference completion. The class can be initialized as follows:

```java
import ai.picovoice.picovoice.PicovoiceManager;

PicovoiceManager manager = new PicovoiceManager(
        // ...
);
```

Sensitivity is the parameter that enables developers to trade miss rate for false alarm rate. It is a floating-point number
within [0, 1]. A higher sensitivity reduces the miss rate at the cost of an increased false alarm rate.

When initialized, input audio can be processed using:

```java
manager.start();
```

Stop the manager with:

```java
manager.stop();
```

When done, be sure to release resources:

```java
manager.delete();
```

@@ -650,7 +644,7 @@

```java
Picovoice picovoice = new Picovoice(
        // ...
    );
```

Sensitivity is the parameter that enables developers to trade miss rate for false alarm rate. It is a floating-point number
within [0, 1]. A higher sensitivity reduces the miss rate at the cost of an increased false alarm rate.

Once initialized, `picovoice` can be used to process incoming audio:

```java
while (true) {
    // pass the next frame of audio to `picovoice` here
}
```

Finally, be sure to explicitly release acquired resources, as the binding class does not rely on the garbage collector
for releasing native resources:

```java
picovoice.delete();
```

@@ -708,6 +702,15 @@ when initialized input audio can be processed using `manager.start()`.

## Releases

### v1.1.0 - December 2nd, 2020

- Improved accuracy.
- Runtime optimizations.
- .NET SDK.
- Java SDK.
- React Native SDK.
- C SDK.

### v1.0.0 - October 22, 2020

- Initial release.
Binary file modified demo/android/Activity/app/src/main/res/raw/porcupine_android.ppn
2 changes: 1 addition & 1 deletion demo/android/Activity/build.gradle
@@ -5,7 +5,7 @@ buildscript {
jcenter()
}
dependencies {
classpath "com.android.tools.build:gradle:4.0.2"
classpath 'com.android.tools.build:gradle:4.1.1'

// NOTE: Do not place your application dependencies here; they belong
// in the individual module build.gradle files
@@ -1,6 +1,6 @@
#Wed Dec 02 11:26:17 PST 2020
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-6.5-bin.zip
Binary file modified demo/android/Activity/picovoice/picovoice-release.aar
Binary file modified demo/android/Activity/porcupine/porcupine-release.aar
Binary file modified demo/android/Activity/rhino/rhino-release.aar
