- Production (Mastering)
- We introduced a new feature that we're very excited about!
A frequently requested feature was "I want to enhance my existing audio files with your mastering chain" or "I want to use FFmpeg in the cloud". Well, now you can!
Here's a simple example, where I've uploaded some recorded speech (say, from a voice note) plus a backing track.
import apiaudio
backgroundId = apiaudio.Media.upload(file_path="background.wav")["mediaId"]
speechId = apiaudio.Media.upload(file_path="speech1.wav")["mediaId"]
timeline = [
{
"files" : [
{
"mediaId" : speechId,
"startAt" : 2,
"endAt" : 14,
}
],
"contentType" : "speech"
},
{
"files" : [
{
"mediaId" : backgroundId,
"startAt" : 0,
"endAt" : 45,
}
],
"contentType" : "sound"
}
]
response = apiaudio.Mastering.create_media_timeline(timeline=timeline, masteringPreset="lightducking")
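If you build timelines like the one above programmatically, a small helper can catch mistakes before the request is sent. The sketch below is not part of the SDK: `make_track` is a hypothetical helper, and the rule that a clip must end after it starts is an assumed sanity check rather than documented API validation.

```python
def make_track(media_id, start_at, end_at, content_type):
    """Build one timeline entry in the shape used by the example above.

    Hypothetical helper, not part of the apiaudio SDK.
    """
    if start_at < 0 or end_at <= start_at:
        # Assumed sanity rule: a clip starts at or after 0 and ends after it starts.
        raise ValueError(f"invalid clip window: startAt={start_at}, endAt={end_at}")
    return {
        "files": [{"mediaId": media_id, "startAt": start_at, "endAt": end_at}],
        "contentType": content_type,
    }


# Placeholder media IDs; real ones come from apiaudio.Media.upload(...)["mediaId"].
timeline = [
    make_track("speech-media-id", 2, 14, "speech"),
    make_track("background-media-id", 0, 45, "sound"),
]
```

The resulting list can be passed as the `timeline` argument in the same way as the hand-written one.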
- Audio filters
- A common challenge with audio is getting it to sound polished.
We've been working hard on leveraging digital signal processing and machine learning to produce beautiful-audio-as-a-service. Have a listen to the demo below.
import apiaudio
apiaudio.api_base = "https://v1.api.audio"
apiaudio.api_key = ""
template = "3am"
preset = "excitermaster"
name = "gabriel"
text = f"""Hi I am {name} using the preset {preset} and the {template} sound template, presenting our new exciter plug-in, to enhance the clarity of our mixes"""
response = apiaudio.Script.create(scriptText=text, scriptName=f"demo-{name}-{preset}", projectName = "demo")
script_id = response["scriptId"]
response = apiaudio.Speech.create(scriptId=script_id, voice=name)
r = apiaudio.Mastering.create(scriptId=script_id, soundTemplate=template, masteringPreset = preset)
print(r)
r = apiaudio.Mastering.download(scriptId=script_id)
-
Firstly, happy New Year from everyone here at Aflorithmic!
- We're constantly improving Api.audio and taking care of small issues and annoyances that we find (or you send us!). We wanted to start telling you about these fixes, so from now on we'll update you every few months about the bug fixes and other small tasks we take care of in between our larger feature projects.
-
Bug fixes
- Transactional emails. We had some issues with our transactional emails. For example, some weren't sent correctly and others contained errors. We've fixed these, and you'll now get reliable transactional emails telling you when your credits are being used up. We're going to rebuild these in the future based on customer feedback as well.
- Console sign up. We fixed some issues with validation in the console sign up. For example, it sometimes accepted incorrectly formatted emails, or didn't give feedback if unaccepted characters were included. This should be a much better user experience.
- Enterprise sign up. We also had some issues with our access codes: sometimes outdated access codes would be accepted.
- Speech
- Voice Cloning
- We shipped an update of the voice cloning API to production, and it is already in use by users. This allows you to integrate Voice Cloning into your services.
- Bug fixes
- We enhanced our normaliser and voice intelligence functionality
- Better pronunciation of digits and telephone numbers (German)
- Media files with zero bytes are corrupted. We now report an error for them in the mastering endpoints. Thanks to our customers who told us about this.
- Speed Cloning. To make Voice Cloning a time-efficient process, we’ve developed the world’s first Speed Cloning feature, which lets users clone their voices with just 30 minutes of recording time, compared to the industry standard of 6 hours.
- Italian script 🇮🇹
- We're working on multiple languages and support for various European Languages.
We just shipped the Italian Script. In the future we'll have our first Italian voice for a customer.
We're excited to continually invest in more languages!
- Bug Fixes. We fixed some unexpected behaviour in the SDK around media files, scripts, and the interaction between them. This should improve your user experience. Thanks to our users for reporting these bugs, we’re happy to make our product even better!
-
Version 1 - New Billing & Analytics in Console
-
De-esser!
- De-essing is the process of attenuating or reducing sibilance, or harsh high-frequency sounds that come from dialogue or vocals using the letters S, F, X, SH, and soft Cs.
It’s often a necessary process when mixing audio, but it’s rarely easy—especially when you’re just getting started. Many factors contribute to the complex nature of de-essing, from the way split-band processors can impact the character of a sound, to the manner in which the human voice can change from sibilance to sibilance.
Here's an example
import apiaudio
import os
apiaudio.api_base = "https://v1.api.audio"
apiaudio.api_key = os.environ["API_KEY"]
# generate a female voice with and without the de-esser for an A/B test
preset_list = ["default", "deesserfemale"]  # for a male voice try ["default", "deessermale"]
name = "vicki"  # for a male voice try "brandon"
text = f"""Hi I am {name} and this is the result of applying de-essing to my voice, she sells seashells on the seashore. Sand, sent, sink, sonar, sun"""
for preset in preset_list:
    try:
        # one script per preset, so the two results can be compared
        response = apiaudio.Script.create(scriptText=text, scriptName=f"testing-{name}-{preset}", projectName="testing")
        script_id = response["scriptId"]
        response = apiaudio.Speech.create(scriptId=script_id, voice=name)
        r = apiaudio.Mastering.create(scriptId=script_id, soundTemplate="", masteringPreset=preset)
        print(r)
        r = apiaudio.Mastering.download(scriptId=script_id)
    except Exception as e:
        print(e)
We look forward to shipping more audio quality improvements like this.
We shipped SMS-based authentication after some of our users reported issues with email-based authentication. We hope this improves your experience.
We had a bug in our billing for updating your credit card details. This is now fixed. Sorry for any inconvenience!
We updated our billing functionality. Some of the improvements are backend and reliability improvements. However, we also wanted to share the following: we're introducing starter and enterprise plans. For access to Enterprise you'll need an access code from your account manager.
And here you can see the Enterprise sign up
We are doing this to make sign-up easier for both individual users and companies, and because Enterprise plans vary in price depending on your expected usage.
Performance improvements in script-get(): better response times when fetching large numbers of scripts.
We had a bug in our shareUrl setup, which meant that users weren't able to easily share audio. We've fixed this bug.
mastering = apiaudio.Mastering.create(
scriptId="concert-ad",
soundTemplate="house",
share=True
)
# Check the response
print('Response from mastering', mastering)
# Listen and share your audio file
print('Listen to your audio here', mastering['shareUrl'])
This will get a response like
Response from mastering {'shareUrl': 'https://console.api.audio/share?id=e3b91a92', 'message': 'Mastering completed successfully', 'url': 'https://v1.api.audio/url/aaecb3/concert-ad__band~nickelback__city~berlin.mp3', 'warnings': ''}
Listen to your audio here https://console.api.audio/share?id=e3b91a92
Where the key url for the share functionality is https://console.api.audio/share?id=id_1
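If you need the share identifier on its own (for example, to store it alongside the script), it can be read from the `shareUrl` with the standard library. This is a minimal sketch; `share_id_from_url` is a hypothetical helper name, and it assumes the identifier is always carried in the `id` query parameter, as in the sample response above.

```python
from urllib.parse import urlparse, parse_qs


def share_id_from_url(share_url):
    """Return the value of the `id` query parameter of a share URL, or None."""
    query = parse_qs(urlparse(share_url).query)
    values = query.get("id")
    return values[0] if values else None


# The shareUrl from the sample response above.
share_id = share_id_from_url("https://console.api.audio/share?id=e3b91a92")
print(share_id)
```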
You can see how it looks here
- We updated the Javascript SDK
get_org_data()
- Get your organization's data, including orgId, orgName etc.
- Parameters:
  - None.
- Example:
org_data = apiaudio.Organization.get_org_data()
list_child_orgs()
- List your child organizations.
- Parameters:
  - None.
- Example:
child_orgs = apiaudio.Organization.list_child_orgs()
get_secrets()
- Get your api key, webhook url and webhook secret.
- Parameters:
  - None.
- Example:
secrets = apiaudio.Organization.get_secrets()
## Script enhancements
- You must supply a project name to list by module.
- You must supply a project name and a module name to list by script name.
- A project name, module name, or script name cannot begin with _
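These naming rules can also be checked on the client before a request is made. The function below is a hypothetical sketch, not part of the SDK; the API performs its own validation, and the error messages here are made up for illustration.

```python
def check_script_names(project_name=None, module_name=None, script_name=None):
    """Raise ValueError if the names break the rules listed above.

    Hypothetical client-side check; the API remains the source of truth.
    """
    if module_name is not None and project_name is None:
        raise ValueError("a project name is required to list by module")
    if script_name is not None and (project_name is None or module_name is None):
        raise ValueError("project and module names are required to list by script name")
    names = (("project", project_name), ("module", module_name), ("script", script_name))
    for label, value in names:
        if value is not None and value.startswith("_"):
            raise ValueError(f"{label} name cannot begin with '_'")
    return True


print(check_script_names(project_name="ads", module_name="summer"))
```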
apiaudio.script.list_projects() OR (GET) /script/list_projects
Will list all the projects within your organization
apiaudio.script.list_modules() OR (GET) /script/list_modules
Will list all the modules within your organization. The values returned will be in the format <project_name>/<module_name>
apiaudio.script.list_script_names() OR (GET) /script/list_script_names
Will list all the scripts within your organization. The values returned will be in the format <project_name>/<module_name>/<script_name>
apiaudio.script.delete_multiple() OR (DELETE) /script/scripts
Deletes multiple scripts from a project/module/script. Must supply project name to delete by module (etc)
You can now supply verbose=False to script list:
resB = apiaudio.Script.list(projectName="x", verbose=False)
This returns only the scriptIds and the project/module/script (PMS) names in the response.
Listing all scripts returns a pagination token (results are limited to 1,000 at a time).
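When a listing is split across pages, the usual pattern is to keep requesting pages until no token comes back. The sketch below shows that loop in isolation. `fetch_page` is a hypothetical callable standing in for the real request, and the `items` and `nextToken` key names are assumptions for illustration; check the actual response for the real field names.

```python
def list_all(fetch_page):
    """Collect items from every page.

    `fetch_page(token)` is a hypothetical callable that returns a dict with an
    "items" list and, while more pages remain, a "nextToken" value.
    """
    items, token = [], None
    while True:
        page = fetch_page(token)
        items.extend(page.get("items", []))
        token = page.get("nextToken")
        if not token:
            return items


def fake_fetch(token):
    """Stand-in for the API: three scripts served two at a time."""
    data = ["p/m/a", "p/m/b", "p/m/c"]
    start = int(token or 0)
    end = start + 2
    page = {"items": data[start:end]}
    if end < len(data):
        page["nextToken"] = str(end)
    return page


all_scripts = list_all(fake_fetch)
print(all_scripts)
```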
One of our customers reported a bug with some Unicode characters, which affected specific voices. We've fixed this, so you should now be able to use all voices with Unicode text. The example below uses an ellipsis character.
text = """Just being transparent… """
We updated some of our pricing to be more in line with the api pricing page. You can see it here https://docs.api.audio/docs/api-pricing
### Voice Cloner SDK
- [Voice Cloner SDK](https://github.com/aflorithmic/apiaudio-python/pull/121), which improves:
  - the SDK's readability and the accuracy of the API's representations (object member annotations in IDEs!),
  - testability (proper client tests, increased probability of catching breaking changes),
  - simplicity of version-specific changes (e.g. SDK-side transformations, API-side version-specific feature flags, extra keys ignored in previous versions etc.)
- We now have over 600 voices, improving our customer experience!
- We introduced a new partner, DeepZen, which has amazing emotive voices. Give them a try :)
- You can listen to them here in the library
We recently migrated our ML platform from an older version of Kubeflow to a newer one. We've rolled this out to all customers.
This has a bunch of advantages:
- Running on a well-maintained distribution (by AWS), so it is easier to develop on, enabling faster feature delivery
- AWS services are better integrated, making the platform more reliable
- Fixed our security vulnerabilities, because we care about customer trust (this also addresses Log4J issues we had)
- New inference API with faster inference, so faster voices
- We’re able to roll out faster and more reliably, and to do more frequent updates, enabling us to develop better features, faster
- The deployment is way easier (we created a python package)
- Our ML team is able to access logs from the dashboard, so able to fix bugs faster :)
## Voice builder
We improved our voice creation capabilities by adding automatic voice previews. Files are previewed automatically as soon as the model finishes training, with proper status updates. This improves our ability to deliver great voices.
Our connectors functionality got a redesign. This redesign means adding new connectors will be much faster!
We've got some awesome voice improvements coming, and a much better voice cloning experience.
- We've been working on improvements such as showing you the child orgs in your super org, and allowing you to click through to the child org from your console.
## Voice Cloner
- We shipped a new alpha release of Voice Cloner, with a much improved process and a faster time to your first voice.
- Getting started: we revamped the reference part of the API docs.
We launched a Discussion so that you can ask us your questions directly. In addition, we have launched an updated FAQ page.
We updated our Dictionary feature, which we call Lexi, to make it even more user friendly.
# correct the word sapiens
r = apiaudio.Lexi.register_custom_word(word="sapiens", replacement="saypeeoons", lang="en")
print(r)
You can also look at this longer example below
text = "hello I am in the city of <!reading> today. My name is Sam - and lexi is live in production. Try it out at aflorithmic."
# register two words in our custom dicts
r = apiaudio.Lexi.register_custom_word(word="Sam", replacement="hackerman", lang="en")
print(r)
r = apiaudio.Lexi.register_custom_word(word="lexi", replacement="lexi a k a the awesome word replacement machine", lang="en")
print(r)
# list all our dicts
r = apiaudio.Lexi.list_custom_dicts()
print(r)
# list words in our en dict
r = apiaudio.Lexi.list_custom_words(lang="en")
print(r)
# usual api stuff
script = apiaudio.Script.create(scriptText=text)
print(script)
r = apiaudio.Speech.create(scriptId=script["scriptId"], voice="sara", useDictionary=True)
print(r)
r = apiaudio.Speech.download(scriptId=script["scriptId"])
print(r)
# we can delete words also!
r = apiaudio.Lexi.delete_custom_word(word="sam", lang="en")
print(r)
We love hearing what you'll build.
We've been listening to our user feedback about our docs and we've made some changes:
- Welcome: our welcome page is clearer and more benefits driven.
- Dynamic Versioning: we made our example easier to learn and adopt.
- Roles and Permissions: we launched a NEW page with an explanation of our Roles and Permissions functionality.
- Webhooks: we use webhooks to notify your application when an event happens in your organization. Webhooks are particularly useful for asynchronous events, such as producing audio files that take a long time to process. This was requested by customers as well :)
## Voice capture app
- Within our Voice Capture App, we have invested in improving the performance of our backend processes, with a particular focus on enhancing the customer experience. It's available to select users here: Voice Cloning
- We fixed a phoneme compatibility bug in Visemes: it wasn't working with all input phonemes. If you want to access this feature and you are in the avatar business, please get in touch to discuss!
- We optimised our Voice Intelligence feature to be 3x faster. Have fun using it :)
After careful discussion with our wonderful customers, we decided to update our pricing policy. We did this to make things simpler and we also connected price more to value.
WHAT HAS CHANGED?
-
Instead of 5 tiers we now only have 4. The enterprise plan is now a fully managed service and can be found on Aflorithmic.ai.
-
FREE plan - This is a free trial period of 30 days, granting you 250 production credits to get your feet wet.
-
INDIE plan - At just $39 per month you can start building your own audio production environment without a watermark and including 1,100 production credits per month.
-
PRODUCTION plan - If you are a developer with a commercial project in mind, this is the plan for you. With 8,000 production credits and at a lower additional cost per credit it’s designed to build a lot.
-
PROFESSIONAL plan - This plan is for organizations with a lot of tech muscle who still want to stay in full control of their product, but don’t want to move to an enterprise plan yet. Literally anything can be customized.
You can read more here in the Pricing email
- We've been working hard on our docs lately. One improvement we're proud of is the new Quickstart which will help you produce beautiful audio in minutes.
- Some customers reported issues with loudness. We've deployed fixes to some of our messner voices (available on some of our plans) and will continue to deploy these changes. This enhances the acoustic quality of the voices and improves their naturalness to an almost human-like level!
Some of our customers told us that they wanted to pay with their existing payment method. So this week, thanks to our partners at Stripe, we added a whole range of new payment methods. These include:
- SEPA Direct Debit
- BACS Direct Debit
- iDEAL
- EPS
- Giropay
- Bancontact
- Carte Bancaire
This is on top of Google Pay, Apple Pay and Card that we already offer.
You can even see the payment method in your local currency
And if you're in some parts of the Eurozone you'll see
We hope this helps you pay us the way that you want. And helps us add more value to our customers.
- We now have a new category in our library called Character. It mainly contains gaming voices and voices with effects. Please have a listen (Character link) and let us know what you think!
Users often want the ability to handle and upload sound templates themselves - especially if you've got the creative ability. This feature allows you to manage and upload sound templates. (Needs more explanations)
- We fixed a bug in our urls for sharing audio, so your shared content now works for up to 7 days. Thanks to those who helped discover and fix that bug!
We're working on our normaliser and we'll be rolling out some changes next week. This will allow the likes of cm to be pronounced correctly, and it will also work for dates and some times.
This only works with German, so we're sorry if you don't read that. But we're big believers in supporting many languages :) In fact, we support over 50 languages.
import os
import apiaudio
apiaudio.api_key = os.environ["API_KEY"]
some_text = """
Die Größe des Täters wurde mit 2 cm angegeben. Im August wurde es über 2 °C warm. Die Strecke ist 234 km lang. Meine Schuhe wiegen nur 300 gr.
Mit über 18,1 Kg ist der Fernseher ein echtes Schwergewicht. Die neuen LCD Bildschirme arbeiten mit 100 hz. All diese Geräte arbeiten mit 110 V.
Der Heizstrahler hat eine Leistung von 300 W. Um circa 14:43 Uhr ereignete sich der Vorfall. Das ganze fing schon um 3:22 an. Ab 9.00 Uhr ist angezapft. Mit nur 30 min ist das Meeting eher kurz. Putin hat sich in einer Rede in die Tradition von Zar Peter dem Großen (1988-2002) gestellt. Am 31.10.1982 fand das Event statt. Das war am 02.04.2021. In der Umfrage sagten 1/3 der Befragten das Gleiche. Sie brauchen ¼ Tasse Wasser. Das Glas Bier kostet 3,30€. Der Meter Fichtenholz kostet 0,46€. Das Glas Fanta kostet 1,95 EUR. Das Glas Cola kostet 1,40 EURO. Und ein Kasten 12,00 euro. Alles Weitere kommt mit 0,03 euro noch hinzu. Davon zog er noch 4,03 eur ab. Apple berechnet dafür in den USA 1290$. Microsoft jedoch nur 34,43 USD. Den Deal zahlte er demnach pro Gramm – soll er jedenfalls. Es handelte sich um 5.550 Männer und 5550 Frauen. jetzt bei scout24.com. Es gab 200 Lieferanten. Und 900 Liefer, Anten. Peugeot. Drogendealer.
"""
VOICE="lena"
#you can change this to another German voice such as 'erika','bernd' or'greta'
script = apiaudio.Script.create(scriptText=some_text)
r = apiaudio.Speech.create(scriptId=script["scriptId"], useTextNormaliser=True, voice=VOICE)
print(r)
apiaudio.Speech.download(scriptId=script["scriptId"])
os.rename("default__section~1of1.wav", f"{VOICE}_with_norm.wav")
And the various acronyms, currencies and dates will all be correctly pronounced. In our tests this improves the user experience big time.
We've been working hard on our German voice capture through our Voice Cloner app and also improving our ML pipelines. Our beta customers love our new voices, which score the highest in our tests of any voices we've ever built. If you're interested in cloning your voice in German, please reach out to bjoern[at]aflorithmic.ai to discuss cloning your voice for your brand.
At Aflorithmic we try every day to "delight our customers", and so we've been investing in our monitoring, bug tracking and understanding the workflows of our users. We want to mention just one improvement, though there are many others in the background.
- We've been working on better monitoring and understanding bugs for our users, so we recently launched a new system for better monitoring and reproducing errors. This will help us respond better to you and your queries.
-
We've fixed a few performance bugs in mastering builder, these are incremental changes and you might not notice the difference, but the journey to a world-class product is built upon incremental customer centric change :)
-
One customer reported an issue with our [Voice Cloner Tab](https://console.api.audio/voice-cloner). It turned out that this wasn't loading if you had lots of scripts stored. We improved the performance and fixed this issue. This only affects some plans, Corporate in particular.
-
We fixed a bug where the Tax Collection Id was not being accepted for some users when signing up with Stripe. We're very sorry for this problem and any issues it caused when upgrading your accounts.
We've made some enhancements and improvements to our billing systems. We'll ship these next week.
- We launched OpenAPI so you can see in Swagger style our endpoints. If you're used to this workflow let us know we'd love to talk some more about it. Feel free to reach out to peadar[at]aflorithmic[dot]ai
A common customer request was: "I want to understand what my customers' usage is." For example, if you're a distribution partner or agency, you'll encounter this workflow.
This allows you to see usage of your child organisations :) Have a look below. It's in production so you can try it out!
- When you use syncTTS with metadata=True, a url will also be generated:
response = apiaudio.SyncTTS.create(text="Hello", voice="shelly", metadata=True)
{
"audio_data": string, # base64 encoded
"url": string,
"sampling_rate": string,
"event_data": [
{
"phoneme": string,
"viseme": string,
"offset": float,
"duration": float
}
]
}
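To make the shape of this response concrete, here is a small sketch that decodes the audio and turns `event_data` into a viseme timeline. The field names follow the schema above; `unpack_sync_tts` is a hypothetical helper, and the sample response is made up for illustration rather than real API output. The schema doesn't state the units of `offset` and `duration`, so the sketch passes them through unchanged.

```python
import base64


def unpack_sync_tts(response):
    """Return (audio_bytes, viseme_timeline) from a metadata=True response.

    The timeline is a list of (viseme, start, end) tuples, where end is
    offset + duration.
    """
    audio = base64.b64decode(response["audio_data"])
    timeline = [
        (event["viseme"], event["offset"], event["offset"] + event["duration"])
        for event in response.get("event_data", [])
    ]
    return audio, timeline


# Made-up response for illustration only; values are not real API output.
sample = {
    "audio_data": base64.b64encode(b"\x00\x01\x02").decode("ascii"),
    "url": "https://example.invalid/audio.wav",
    "sampling_rate": "22050",
    "event_data": [
        {"phoneme": "h", "viseme": "k", "offset": 0.0, "duration": 0.5},
        {"phoneme": "e", "viseme": "E", "offset": 0.5, "duration": 0.25},
    ],
}
audio, timeline = unpack_sync_tts(sample)
```

An avatar renderer could then step through `timeline` to pick the mouth shape for each time window.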
-
One customer reported some unicode issues with non-latin based alphabets. We're sorry for this error and we've fixed it for a range of languages including Arabic, Telugu and Greek. If you find any issues like this let us know.
-
Breaking Change: Since the 4th of July, using certain sound templates in conjunction with mastering section properties resulted in sections that were in fact too long. We've fixed these errors and the fix is already shipped. This will improve your experience, but you may notice some differences in the behaviour of our mastering engine.
## Wednesday 13th July 2022
- We added two features to the syncTTS resource:
  - Retrieve a URL to the audio instead of raw data (Note: size limits don't apply in this case!):
dictionary = apiaudio.SyncTTS.create(text="Hello, how are you?", voice="joanna", url=True)
url = dictionary["url"]
  - Specify the format of the audio (currently supported formats are: "mp3", "wav", "pcm"):
mp3_bytes = apiaudio.SyncTTS.create(text="Hello, how are you?", voice="joanna", format="mp3")
- We added one of our partners, Cereproc. You can see the voices here.
- You can try out a voice from Cereproc with this example:
import apiaudio
apiaudio.api_key="API-KEY"
script = apiaudio.Script.create(scriptText="<<soundSegment::intro>><<sectionName::intro>>Hello world. Welcome to API dot audio.<<soundSegment::main>><<sectionName::main>> Create audio in a few easy steps.")
response = apiaudio.Speech.create(scriptId=script.get("scriptId"), voice="dakota")
response = apiaudio.Mastering.create(scriptId=script.get("scriptId"), soundTemplate = "parisianmorning" )
file = apiaudio.Mastering.download(scriptId=script.get("scriptId"))
We've been updating the console a lot.
You can see some images here of Get Started
You can see the changelog here
### Voice cloner
- We shipped a new version of voice cloner
- This offers an up to 3x faster user experience.
-
We improved superorg and integrated billing, so child orgs' API consumption now contributes to your bill. We also added metering and analytics (you'll need to contact [email protected] for that).
-
SDK support for superorgs and the org-assuming mechanism, for both the JS and Python SDKs. See the SDK here
We've optimised our pricing for usage.
- 250 free credits on sign up, instead of 500.
- Monthly allowance for free plans, instead of giving them time till the end of month regardless what day they signed up. This way, free organisations will have 28 days to use their credits. We will add some email marketing about this as well.
- UX improvements
- Bug free :)
- Setup page
- Silence detector (hands free operation)
- AES version 2 - better UX for our customers (threshold adjustments)
How can I get started? You'll be able to sign up with https://voice-cloning.api.audio/
- We're delighted to ship a much requested feature from customers! Webhooks
- If you're looking for a good explanation of webhooks we recommend this blog post but basically these are automated messages (akin to SMS) sent from APIs when something happens in an API. A common use case for us is mastering requests. Some mastering requests take some time to run, so you may want to simply set these up with webhooks and have a better customer experience :)
You can see some image here
You'll receive the logs of the webhooks here
Here's some code as well. Let's say you have a long-running mastering request that might take 30s. Rather than waiting for the request to finish, you can simply add a callback url.
apiaudio.Mastering().create(scriptId=script_id, callback_url='call-based callback url')
We also have some awesome security features such as verifying signatures. We'll add more to the docs about this soon.
# x_aflr_secret holds the value of the x-aflr-secret header; tolerance defaults to 300 seconds
apiaudio.Webhooks.verify(event_body, x_aflr_secret, clients_webhook_secret, tolerance=300)
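The signing scheme behind this verification isn't described here. As a rough illustration of what signature checking with a time tolerance usually involves, here is a generic HMAC-SHA256 sketch. Everything about the scheme (signing `<timestamp>.<body>`, hex encoding, the `sign` and `verify` helper names, the sample payload) is an assumption made for illustration, not api.audio's implementation; in practice, use the SDK's own verify method.

```python
import hashlib
import hmac
import time


def sign(body, secret, timestamp):
    """Hex HMAC-SHA256 over "<timestamp>.<body>" (assumed scheme, for illustration)."""
    message = f"{timestamp}.{body}".encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify(body, signature, secret, timestamp, tolerance=300, now=None):
    """Return True if the signature matches and the timestamp is within tolerance seconds."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance:
        return False  # too old, or too far in the future: possible replay
    expected = sign(body, secret, timestamp)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(expected, signature)


body = '{"event": "mastering.done"}'  # made-up payload
sig = sign(body, "my-webhook-secret", timestamp=1_000)
ok = verify(body, sig, "my-webhook-secret", timestamp=1_000, now=1_100)
stale = verify(body, sig, "my-webhook-secret", timestamp=1_000, now=2_000)
```

The tolerance window is what limits replay attacks: a captured request stops verifying once it is older than the allowed number of seconds.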
- Our free plan has watermarks. If you want these removed, you'll have to upgrade. We've added watermarks in multiple languages. Here are some examples:
WATER_MARKS = {
"default" : ". Created with API audio.",
"en" : ". Created with API audio.",
"pl" : ". Stworzone przy użyciu API audio.",
"ga" : ". Cruthaithe le fuaim API audio",
"tr" : ". api.audio ile oluşturuldu",
"fr" : ". créé avec API audio",
"ca-es" : ". creat amb API audio",
"ca" : ". creat amb API audio",
"el" : ". Δημιουργήθηκε με το api.audio",
"nl" : ". Gemaakt met API audio",
"de" : ". Erstellt mit api punkt audio.",
"pt-br" : ". Criado com api.audio",
"pt" : ". Criado com api.audio",
"hu" : ". Az api.audio felhasználásával készült.",
"et" : ". valmistaja Api audio",
"hi" : ". yeh aawaaz api audio se banayi gayi",
"zh" : ". 用Api audio製作的",
"ua" : "створено за допомогою API крапка аудіо",
"bn" : "API অডিও দিয়ে তৈরি করা |"
}
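A table like this needs a rule for languages and regional variants that have no entry of their own. The sketch below shows one plausible lookup: exact code first, then the base language, then `default`. The fallback order and the `watermark_for` name are assumptions for illustration; the source doesn't say how the service selects a watermark.

```python
# A few entries copied from the table above.
WATER_MARKS = {
    "default": ". Created with API audio.",
    "pt": ". Criado com api.audio",
    "de": ". Erstellt mit api punkt audio.",
}


def watermark_for(language_code):
    """Pick a watermark: exact code, then base language, then "default".

    The fallback order is an assumption for this sketch.
    """
    code = language_code.lower()
    base = code.split("-")[0]
    return WATER_MARKS.get(code) or WATER_MARKS.get(base) or WATER_MARKS["default"]


print(watermark_for("de"))
print(watermark_for("pt-PT"))
print(watermark_for("ja"))
```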
- We're going to add to the console the SuperOrg functionality enabling users to administer functionality for the companies that are using their accounts.
- There's a lot more fine grained control coming as well, but this is an enterprise ready feature and requested by numerous customers. If you want a demo feel free to reach out.
Online (sync) Visemes 2.0: We have taken all of our customers' feedback on board, and we are now confident in launching a new version, the result of 3 months of R&D. There are substantial improvements in alignment, speed and latency. They are deployed in our custom msnr voices.
- Some customers reported that the amplitude of some of our msnr voices was louder on the second sentence than on the first. We created a fix for this, and we hope it resolves the error.
- We have new partnerships for our beta customers. Please reach out to us to learn more and to try out the new voices.
v.0.16.3
- Listing superorgs - the ability to list these organisations.
- Billing integration so each superorg and child org is charged correctly.
Our Data Capture App, used as part of our voice cloning process, has been rebranded as Voice Cloner.
These are the top bug fixes this week, which should result in a much better user experience:
- Microphone not detected on some Mobile Phones. (Fixed)
- Play button not playing back full audio. (Fixed)
- “Record” button wasn’t recording full sentences. (Fixed)
- AutoGain was causing distortion in audio. (Fixed)
You can try it out here Voice Cloner
### Bug Fixes
- We fixed a bug that was showing hidden voices in the voice library. This was hurting the user experience.
- We've shipped the following to select Beta customers. If you want access let us know and we'll give you access.
It's the best voice ever created by our internal TTS research team. It's called margareta-v1
# script text
text = """
Hallo Peadar. Ich wurde am 20.06.2022 in der Softwareschmiede von Aflorithmic in Produktion eingesetzt.
"""
# script creation
script = apiaudio.Script.create(scriptText=text, scriptName="breaking_news")
r = apiaudio.Speech().create(
scriptId=script.get("scriptId"),
voice="margareta-v1",
speed=110,
)
template = "breakingnews"
response = apiaudio.Mastering().create(
scriptId=script.get("scriptId"),
soundTemplate=template
)
print(response)
file = apiaudio.Mastering().download(
scriptId=script.get("scriptId"), destination=".")
print(file)
You can listen to an example here
margareta-v1-example.mp4
- We discovered a billing error with some voices on our platform (only affecting IBM voices). This impacted only some customers, all of whom have been informed and refunded. We've added alarms and detection mechanisms and enhanced quality control to prevent this going forward. We're sorry for any inconvenience.
- We're implementing changes to handle this and working on our reliability and monitoring.
We've been working hard on our console in the recent weeks. And you can view it here
You can see the easier view of Total api calls, Script api calls, Speech api calls and Mastering api calls
If you click on these you'll see some other information including this awesome donut chart
And if you want to dive deep into this have a look here at the logs
These are just some highlights of the stuff we've improved based on customer feedback :)
v0.16.2
- We introduced a new function called set_assume_org_id for the upcoming super organization feature. By using this method, you can, as a super organization, assume the id of a child organization and make calls on its behalf.
A super organization is loosely modelled on superuser. So you can as a company ACME have specific criteria and permissions - and then you can share these with your child organisations. This allows you fine grained control of users and their roles and permissions and the ability to share settings and voices across orgs. We were informed by IAM from AWS in our design. If you want access let us know, we're working hard on this. This is part of a whole
- Voices We have some great partnerships coming up with 2 new voice providers. Reach out to us if you want to know more :)
Under msnr we will soon have another German voice. We're testing this voice, margareta-v1, with some beta customers.
(Updated above)
We've also improved our performance and invested more in
v0.16.1
- We noticed in testing some additional latency caused by our feature flag implementation - so we refactored this to make it much faster. This was mostly visible in our custom voice implementations.
- We added the ability to share audio with other users. It's a magic link feature. For example
curl -X POST \
'https://v1.api.audio/mastering' \
--header 'x-api-key: API-KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"scriptId": "xx",
"share": true
}'
This returns a URL. If you go to
https://console.api.audio/share?id=your_id
you'll get a page that you can share with your organisation and via WhatsApp and other social networks.
- We've updated our voice library page to have faster loading and we've added some other improvements in the GET request for voices so you can have a better voice discovery experience. Most of this is backend and performance enhancements.
- Enhanced query parameters for voice discoverability
GET https://v1.api.audio/voice?tags=upbeat,storytelling
This returns voices that have both of these tags. We also introduced pagination.
Pagination
(Not yet available in the SDK)
Use the parameter limit to set the number of returned voices. Use offset to iterate through the results.
E.g.
GET https://v1.api.audio/voice?limit=10&offset=10
GET https://v1.api.audio/voice?limit=10&offset=20
GET https://v1.api.audio/voice?limit=10&offset=30
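Since pagination isn't in the SDK yet, the page URLs have to be built by hand. The sketch below generates them with the standard library. `page_urls` is a hypothetical helper, and it assumes `offset` counts voices from 0, so the first page is `offset=0`; the optional `tags` argument reuses the comma-separated form shown earlier.

```python
from urllib.parse import urlencode

BASE_URL = "https://v1.api.audio/voice"


def page_urls(limit, pages, tags=None):
    """Yield one URL per page, starting at offset 0 (assumed zero-based)."""
    for page in range(pages):
        params = {"limit": limit, "offset": page * limit}
        if tags:
            params["tags"] = ",".join(tags)
        # safe="," keeps the comma-separated tag list unescaped
        yield f"{BASE_URL}?{urlencode(params, safe=',')}"


urls = list(page_urls(limit=10, pages=3))
tagged = list(page_urls(limit=10, pages=1, tags=["upbeat", "storytelling"]))
```

Each URL can then be requested with your usual HTTP client, sending the `x-api-key` header as in the curl example.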
Query body in POST /voice
A query JSON body is available in the POST method.
Supported operators:
"$gt", "$gte", "$lt", "$lte", "$contains", "$is_in", "$ne"
Basic query (attribute equals x):
{
"query": {
"language": "english",
"provider": "polly"
}
}
List specific languages:
{
"query": {
"language": {
"$is_in": ["english", "spanish", "polish"]
}
}
}
List only private and public_paid tiers:
{
"query": {
"tier": {
"$ne": "public"
}
}
}
List spanish voices where priority is less than or equal to 3:
{
"query": {
"language": "spanish",
"priority": { "$lte": 3 }
}
}
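To make the operator semantics concrete, here is a small local matcher that applies a query of this shape to a list of voice records. It only illustrates how such queries can be read and is not the server's implementation. The meaning given to `$contains` (the attribute contains the argument) is an assumption, and the voice records are made up.

```python
OPERATORS = {
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
    "$ne": lambda value, arg: value != arg,
    "$is_in": lambda value, arg: value in arg,
    # Assumed meaning: the attribute (string or list) contains the argument.
    "$contains": lambda value, arg: arg in value,
}


def matches(voice, query):
    """Return True if `voice` satisfies every condition in `query`."""
    for attribute, condition in query.items():
        if attribute not in voice:
            return False
        value = voice[attribute]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$lte": 3}; all operators must hold.
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # Plain form: attribute equals x.
            return False
    return True


# Made-up voice records for illustration.
voices = [
    {"alias": "a", "language": "spanish", "priority": 2, "tier": "public"},
    {"alias": "b", "language": "spanish", "priority": 5, "tier": "private"},
    {"alias": "c", "language": "english", "priority": 1, "tier": "public_paid"},
]
query = {"language": "spanish", "priority": {"$lte": 3}}
selected = [v["alias"] for v in voices if matches(v, query)]
```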
There's a lot you can do with this, so we hope this makes your developer experience easier.
-
Enhancements of voices. We've been testing more natural pauses in our custom voices with some users. We contacted the customers affected, and some asked not to have this feature enabled. Please contact us if you want to use the old voices. We do feel that these voices are more natural, and our beta testing was positive. These will be available soon under msnr voices in our API.
-
As a developer, I want to share my audio with my team, bosses, or business colleagues so they can try and test the wonders of api.audio. We call this a virality link, and it will be available in the console soon.
v0.16.0
- We added a new endFormat parameter (for making sure your audio is in the correct format for a target audience). This week we added the Alexa preset:
m = apiaudio.Mastering().create(scriptId=GLOBAL_SCRIPT_ID, endFormat="mp3_alexa")
In the future we will add other endFormats - please tell us which ones you'd like.
- To continue to be able to offer free content to those of you still getting to know API.audio, we have added a watermark to our files.
- For our corporate plan users we've enabled sandboxing. This allows you to safely test API requests without using up credits. Please contact your account manager for further details.
- Our feature (pronunciation dictionary) has had some usability enhancements. The biggest change is adding useDictionary as a boolean. The addition of useDictionary and the change in behaviour is likely to present some breaking changes. We've notified any customers who are affected. Here's an example:
scriptText = """Hello I am reading a book in the city of <!location>reading<!> today"""
script = apiaudio.Script.create(scriptText=scriptText)
speech = apiaudio.Speech.create(scriptId=script["scriptId"], voice="Ryan", useDictionary=True)
print(speech)
We had some third party downtime with some voices from one provider. We notified them and fixed this issue.
We have removed old code and streamlined other code using our Mastering engine. This will allow us to add features more easily whilst removing unnecessary complexity. We also made various performance improvements and work in the background. Although not all of this will be customer facing, these incremental improvements are very important.
(These are features that aren't added yet, but will be released next week)
-
Enhancements of voices. We've been testing more natural pauses in our custom voices with some users. We contacted the customers affected, and some asked not to have this feature enabled. Please contact us if you want to use the old voices. We do feel that these voices are more natural, and our beta testing was positive. These will be available soon under msnr voices in our API.
-
As a developer, I want to share my audio with my team, bosses, or business colleagues so they can try and test the wonders of api.audio. We call this a virality link, and it will be available in the console soon.
-
We've been developing the usability of our data capture app for voice cloning (this is available in some plans). You'll see this in the next few weeks.