Wolle edited this page Jan 7, 2025 · 46 revisions

audioI2S FAQ

What is this?

The library decodes MP3 and AAC streams and plays 8-bit or 16-bit WAV files. The audio data can come from the Internet, an SD card or SPIFFS. Many radio stations can be received. Playlists are unpacked and a connection to the (first) URL is established; supported formats are *.pls, *.m3u and *.asx. SSL connections are possible.

Examples:

connecttohost("http://online.rockarsenal.ru:8000/rockarsenal_aacplus");
connecttoSD("click.mp3");
connecttoFS(SD, "/wave_test/Wav_868kb.wav");
connecttoFS(SPIFFS, "wobble.mp3");
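
The playlist unpacking described above can be pictured with a small host-side sketch. It is not the library's parser: the function name firstPlaylistUrl is hypothetical, only *.m3u and *.pls are handled (*.asx is XML and left out), and it assumes that an entry is a line starting with http:// or https://, optionally prefixed by "FileN=" as in *.pls files.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns the first stream URL found in the text of a .m3u or .pls playlist,
// or an empty string if there is none. Illustrative helper, not library code.
std::string firstPlaylistUrl(const std::string& playlist) {
    std::istringstream lines(playlist);
    std::string line;
    while (std::getline(lines, line)) {
        // strip trailing CR (from CRLF line endings), blanks and tabs
        while (!line.empty() && (line.back() == '\r' || line.back() == ' ' || line.back() == '\t')) {
            line.pop_back();
        }
        std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;        // empty line
        line = line.substr(start);
        if (line[0] == '#' || line[0] == '[') continue;  // m3u comment or pls section header
        // .pls entries look like "File1=http://..."
        if (line.compare(0, 4, "File") == 0) {
            std::size_t eq = line.find('=');
            if (eq != std::string::npos) line = line.substr(eq + 1);
        }
        if (line.compare(0, 7, "http://") == 0 || line.compare(0, 8, "https://") == 0) {
            return line;
        }
    }
    return "";
}
```

Only the first entry is returned, which mirrors the "(first) URL" behaviour described above; further entries are ignored.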

Stations can be received at up to 320 kbit/s. A good connection is a prerequisite for this. Many, but not all, stations that run smoothly in the VLC player work on the ESP32 without dropouts. Shortly before the input buffer runs empty, the message "slow stream, dropouts are possible" appears in the serial monitor. If the connection is lost, the library tries to re-establish it. Tip: the AAC decoder supports SBR (Spectral Band Replication). To enable it, activate 'AAC_ENABLE_SBR' in 'aac_decoder.h'. However, this requires about another 60 KB of RAM. In SBR mode, PSRAM cannot be used because of its longer access time.


Which external DACs can be used?

Basically all 16-bit DACs that have the pins DIN, BCLK and LRC. The PCM5102A delivers good results. Most GPIOs can be used:

setPinout(uint8_t BCLK, uint8_t LRC, uint8_t DOUT); 

The ESP32-A1S can also be used; the library https://github.com/Yveaux/AC101 can be integrated for this purpose. See the examples folder: https://github.com/schreibfaul1/ESP32-audioI2S/tree/master/examples/ESP32-A1S
MCLK is required in certain cases (only GPIO0, GPIO1 and GPIO3 are supported).

i2s_mclk_pin_select(const uint8_t pin);

For DACs such as the PT8211, you can switch from the I2S standard to the Japanese (Least Significant Bit Justified) format. For this there is the command setI2SCommFMT_LSB(true) which has to be executed before activating the I2S interface (i.e. before connectTo ...)

setI2SCommFMT_LSB(true);

Can the Arduino IDE be used?

Yes. The library can be downloaded as a zip file and is installed in the Arduino IDE via the library manager.

Arduino IDE Library

Tip: Use the partition scheme 'Huge App' so that there is enough memory for your own extensions

Arduino IDE Partition Scheme


What about PSRAM?

If available, PSRAM can be used. PSRAM is recognized automatically. The input buffer is then automatically relocated and enlarged.

without PSRAM, inputBufferSize is about 6.25KBytes

without PSRAM

with PSRAM, inputBufferSize is about 29KBytes

with PSRAM

8000 or 30000 bytes are allocated, part of which is used internally to avoid copying the audio data during operation

The decoders automatically detect whether PSRAM is available. If so, the buffers are created in PSRAM, leaving more space for your own projects
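
To get a feeling for what these buffer sizes mean, the following sketch converts a buffer size and a stream bit rate into the playing time a full buffer can bridge. The helper name bufferedMillis is hypothetical and the calculation ignores metadata and decoder buffers, so treat the result as a rough upper bound.

```cpp
#include <cstdint>

// Milliseconds of audio held by a full input buffer at a given stream bit rate.
// Returns 0 if the bit rate is unknown (0).
uint32_t bufferedMillis(uint32_t bufferBytes, uint32_t bitrateBitsPerSec) {
    if (bitrateBitsPerSec == 0) return 0;
    // 64-bit intermediate so that bytes * 8 * 1000 cannot overflow
    uint64_t bitsTimesThousand = static_cast<uint64_t>(bufferBytes) * 8u * 1000u;
    return static_cast<uint32_t>(bitsTimesThousand / bitrateBitsPerSec);
}
```

With the 6399-byte buffer reported in the serial log of the simple project further down, a 320 kbit/s stream is buffered for roughly 160 ms and a 64 kbit/s stream for roughly 800 ms. This is why high bit rates are more sensitive to a slow connection when no PSRAM is available.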


How to adjust the balance and volume?

Internally, the volume is divided into 64 steps. By default, setVolume() accepts 22 steps (0...21), which are assigned to the 64 internal volume steps via a table. This creates a logarithmic curve, which is the ideal solution for buttons or touchpads. Some external devices (e.g. AC101, ES8388 ...) require a larger range of values. The default maximum (21) can be overwritten with setVolumeSteps(uint8_t steps). In this way, value ranges can be redefined, e.g. (0...63) or (0...99). The balance attenuates the left or right channel (values between -16 ... 16).

setBalance(-16); // mutes the left channel
setVolume(21);  // max loudness
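
How a balance value could translate into per-channel gain is sketched below. This is an assumption for illustration only: the function balanceToGain and the linear attenuation are not taken from the library, which may use a different characteristic. The sketch only reproduces the documented end points (-16 mutes the left channel, 0 leaves both channels untouched).

```cpp
// Gain factors (0.0 ... 1.0) for the left and right channel.
struct ChannelGain {
    float left;
    float right;
};

// Maps a balance value (-16 ... 16) to channel gains, assuming linear attenuation.
ChannelGain balanceToGain(int balance) {
    if (balance < -16) balance = -16;  // clamp to the documented range
    if (balance > 16) balance = 16;
    ChannelGain g{1.0f, 1.0f};
    if (balance < 0) g.left = (16 + balance) / 16.0f;   // negative values attenuate left
    if (balance > 0) g.right = (16 - balance) / 16.0f;  // positive values attenuate right
    return g;
}
```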

The volume control stages are not linear. They follow a nonlinear control characteristic so that a linear adjustment covers a large dynamic range. Two different curves are implemented: curve 0 follows a quadratic curve, curve 1 a logarithmic curve. Which curve is chosen depends on personal preference and the hardware used.
Call: setVolume(uint8_t vol, uint8_t curve);

Volume Settings Dynamics
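
As an illustration of a quadratic characteristic, the sketch below maps a user volume step onto the 64 internal steps (0...63). The function volumeToInternal is hypothetical; the library uses its own table and formulas, so the intermediate values here will not match it exactly.

```cpp
#include <cstdint>

// Maps vol (0 ... maxStep) onto 0 ... 63 with a quadratic curve:
// internal = 63 * (vol / maxStep)^2, rounded to the nearest integer.
uint8_t volumeToInternal(uint8_t vol, uint8_t maxStep) {
    if (maxStep == 0) return 0;
    if (vol > maxStep) vol = maxStep;  // clamp to the configured range
    uint32_t num = static_cast<uint32_t>(vol) * vol * 63u;
    uint32_t den = static_cast<uint32_t>(maxStep) * maxStep;
    return static_cast<uint8_t>((num + den / 2) / den);
}
```

With the default maximum of 21, step 0 maps to 0 and step 21 to 63. The lower steps are spaced closely, which is what makes small adjustments at low volume audible without jumps.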


How To Change Bass And Treble?

Yes, that is possible. There are built-in IIR filters to simulate a 3 band equalizer.

setTone(int8_t gainLowPass, int8_t gainBandPass, int8_t gainHighPass); // values can be between -40 ... +6 (dB)

setTone(0, 0, 0) is the default setting. If you want to go deeper into the rabbit hole, take a look at the routine IIR_calculateCoefficients(int8_t G0, int8_t G1, int8_t G2). The limit frequencies are specified there. The filter formulas I used can be found here: https://www.earlevel.com/main/2012/11/26/biquad-c-source-code/ The filter effect can be evaluated graphically here:

lowpass
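
For readers who want to experiment outside the ESP32, here is a sketch of one biquad stage using the low-shelf formulas from the earlevel article linked above. It assumes that the bass band behaves like a low shelf. The names Biquad, lowShelf and dcGain, as well as the 500 Hz cutoff used in the usage note, are illustrative; the filter types and limit frequencies the library really uses are defined in IIR_calculateCoefficients.

```cpp
#include <cmath>

// Coefficients for y[n] = a0*x[n] + a1*x[n-1] + a2*x[n-2] - b1*y[n-1] - b2*y[n-2]
struct Biquad {
    double a0, a1, a2, b1, b2;
};

// Low-shelf biquad; gainDb is clamped to the documented range -40 ... +6 dB.
Biquad lowShelf(double gainDb, double cutoffHz, double sampleRateHz) {
    if (gainDb < -40.0) gainDb = -40.0;
    if (gainDb > 6.0) gainDb = 6.0;
    const double pi = 3.14159265358979323846;
    const double K = std::tan(pi * cutoffHz / sampleRateHz);
    const double V = std::pow(10.0, std::fabs(gainDb) / 20.0);  // linear gain factor
    const double r2 = std::sqrt(2.0);
    const double r2V = std::sqrt(2.0 * V);
    Biquad c;
    if (gainDb >= 0.0) {  // boost
        const double norm = 1.0 / (1.0 + r2 * K + K * K);
        c.a0 = (1.0 + r2V * K + V * K * K) * norm;
        c.a1 = 2.0 * (V * K * K - 1.0) * norm;
        c.a2 = (1.0 - r2V * K + V * K * K) * norm;
        c.b1 = 2.0 * (K * K - 1.0) * norm;
        c.b2 = (1.0 - r2 * K + K * K) * norm;
    } else {              // cut
        const double norm = 1.0 / (1.0 + r2V * K + V * K * K);
        c.a0 = (1.0 + r2 * K + K * K) * norm;
        c.a1 = 2.0 * (K * K - 1.0) * norm;
        c.a2 = (1.0 - r2 * K + K * K) * norm;
        c.b1 = 2.0 * (V * K * K - 1.0) * norm;
        c.b2 = (1.0 - r2V * K + V * K * K) * norm;
    }
    return c;
}

// Gain of the filter at 0 Hz (z = 1), useful as a sanity check.
double dcGain(const Biquad& c) {
    return (c.a0 + c.a1 + c.a2) / (1.0 + c.b1 + c.b2);
}
```

A call such as lowShelf(6.0, 500.0, 44100.0) yields a DC gain of 10^(6/20), about a factor of 2, while a gain of 0 dB produces a filter that passes the signal unchanged.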


The server requires access data

audio_info: authentification failed, wrong credentials?

The name and password can be transferred when the destination is called, use:

connecttohost("http://xxxx", "name", "password");

What audio events are there?

The events are declared as weak functions. This means they can be implemented in your sketch, but do not have to be.

audio_info outputs the current status, suitable for debugging and troubleshooting

void audio_info(const char *info)

audio_id3data many mp3 files contain information about artists, albums or bands. The data is read from the file and can be used further via this event

void audio_id3data(const char *info)

audio_eof_mp3 is called after the end of an audio file. info contains the file name. With this event it is possible to create playlists

void audio_eof_mp3(const char *info)

Playlist example:

void audio_eof_mp3(const char *info){  //end of file
    Serial.print("eof_mp3: "); Serial.println(info);
    static int i=0;
    if(i==0) audio.connecttoSD("/wave_test/If_I_Had_a_Chicken.mp3");
    if(i==1) audio.connecttoSD("/wave_test/test_8bit_stereo.wav");
    if(i==2) audio.connecttoSD("/wave_test/test_16bit_mono.wav");
    i++;
    if(i==3) i=0;
}

audio_eof_stream the same as audio_eof_mp3 for podcasts or files transferred from the server

void audio_eof_stream(const char *info)

audio_showstation many radio stations provide their names at the beginning of the connection. This can be used in your own applications

void audio_showstation(const char *info)

audio_showstreamtitle if the radio station transmits information about the artist, music track ... in its metadata, this event is called

void audio_showstreamtitle(const char *info)
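
Many stations send the title as "Artist - Title", as in the StreamTitle lines of the serial log further down, but this is a convention and not guaranteed. A minimal sketch for splitting such a string, assuming info holds plain text like "Michael Bolton - Soul Provider -- 1989" (the names TrackInfo and splitStreamTitle are hypothetical):

```cpp
#include <cstddef>
#include <string>

struct TrackInfo {
    std::string artist;  // empty if no separator was found
    std::string title;
};

// Splits "Artist - Title" at the first " - ".
// Without a separator the whole text is treated as the title.
TrackInfo splitStreamTitle(const std::string& streamTitle) {
    TrackInfo t;
    const std::string sep = " - ";
    std::size_t pos = streamTitle.find(sep);
    if (pos == std::string::npos) {
        t.title = streamTitle;
        return t;
    }
    t.artist = streamTitle.substr(0, pos);
    t.title = streamTitle.substr(pos + sep.size());
    return t;
}
```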

audio_bitrate returns the current bit rate as text

void audio_bitrate(const char *info)

audio_commercial commercials are often played at the beginning of the broadcast (and during the program). Info contains the expected duration of the advertisement. So the sound can be switched off for the time being

void audio_commercial(const char *info)

audio_icyurl if the station has a homepage this is called here

void audio_icyurl(const char *info)

audio_lasthost the URL that is called via connecttohost does not have to be the current URL. Sometimes it is redirected to another URL that can be read out here.

void audio_lasthost(const char *info)

audio_id3image mp3 files can contain pictures. The event passes a reference to the current mp3 file, the position of the beginning of the picture and its size. In the example, the cover image is extracted and written to SD.

void audio_id3image(File& file, const size_t pos, const size_t size) { // cover image
  Serial.printf("id3image found at pos: %u, length: %u\n", pos, size);
  uint8_t buf[1024];
  file.seek(pos + 1); // skip 1 byte encoding
  char mimeType[255] = {0}; // mime-type (null terminated)
  for (uint8_t i = 0u; i < 254; i++) { // the last byte always stays 0 as terminator
      mimeType[i] = file.read();
      if (uint8_t(mimeType[i]) == 0) break;
  }
  Serial.printf("MimeType: %s\n", mimeType);
  uint8_t imageType = file.read(); // image type (1 Byte)
  Serial.printf("ImageType: %d\n", imageType);
  for (uint8_t i = 0u; i < 255; i++) { // description (null terminated)
    buf[i] = file.read();
    if (uint8_t(buf[i]) == 0) break;
  }
  // raw image data
  File coverFile = SD.open("/cover.jpg", FILE_WRITE);
  if (!coverFile) { Serial.print("Cannot open /cover.jpg\n"); return; }
  size_t len = size;
  while(len) {
      size_t toRead = (len < sizeof(buf)) ? len : sizeof(buf); // never read past the image
      size_t bytesRead = file.read(buf, toRead);
      if(bytesRead == 0) break; // end of file or read error, avoid an endless loop
      coverFile.write(buf, bytesRead);
      len -= bytesRead;
  }
  Serial.print("Cover file written\n");
  coverFile.close();
}
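
The header layout that the callback walks through (encoding byte, MIME type, picture type, description, image data) can also be parsed from a memory buffer, which makes it easy to test on a PC. The sketch below is hypothetical helper code. It assumes that the buffer starts at the encoding byte and that the description uses a single-byte text encoding, so that one zero byte terminates it; UTF-16 descriptions are not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

struct ApicHeader {
    bool valid;               // false if the buffer ended before the header was complete
    uint8_t textEncoding;     // first byte of the frame body
    std::string mimeType;     // e.g. "image/jpeg"
    uint8_t pictureType;      // 3 = front cover in ID3v2
    std::string description;
    std::size_t imageOffset;  // offset of the raw image data from the start of the body
};

ApicHeader parseApic(const uint8_t* body, std::size_t len) {
    ApicHeader h{false, 0, "", 0, "", 0};
    std::size_t p = 0;
    if (len < 1) return h;
    h.textEncoding = body[p++];
    while (p < len && body[p] != 0) h.mimeType += static_cast<char>(body[p++]);
    if (p >= len) return h;   // MIME type not terminated
    ++p;                      // skip terminator
    if (p >= len) return h;
    h.pictureType = body[p++];
    while (p < len && body[p] != 0) h.description += static_cast<char>(body[p++]);
    if (p >= len) return h;   // description not terminated
    ++p;                      // skip terminator
    h.imageOffset = p;
    h.valid = true;
    return h;
}
```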

audio_oggimage OGG can contain embedded images in the comment header. If these are larger than an OggS frame, the image is fragmented and embedded in further OggS frames. The image data is always Base64 encoded.

OGG METADATABLOCKPICTURE

void audio_oggimage(File& audiofile, std::vector<uint32_t> vec){ //OGG blockpicture
    log_i("oggimage:..  " ANSI_ESC_GREEN "---------------------------------------------------------------------------");
    log_i("oggimage:..  " ANSI_ESC_GREEN "ogg metadata blockpicture found:");
    for(int i = 0; i < vec.size(); i += 2) {
        log_i("oggimage:..  " ANSI_ESC_GREEN "segment %02i, pos %07i, len %05i", i / 2, vec[i], vec[i + 1]);
    }
    log_i("oggimage:..  " ANSI_ESC_GREEN "---------------------------------------------------------------------------");
}
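
Because the image data is Base64 encoded, it has to be decoded before it can be displayed or saved. The decoder below is a generic sketch (the function name base64Decode is an assumption, not a library call). It is meant to be applied to the concatenated text of all segments, since a fragment boundary does not have to fall on a 4-character boundary.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Decodes standard Base64 text. Decoding stops at the first '=' padding
// character; characters outside the alphabet (e.g. line breaks) are skipped.
std::vector<uint8_t> base64Decode(const std::string& in) {
    std::vector<uint8_t> out;
    uint32_t acc = 0;  // bit accumulator, only the low bits are used
    int bits = 0;      // number of not yet emitted bits in acc
    for (char ch : in) {
        int v;
        if (ch >= 'A' && ch <= 'Z') v = ch - 'A';
        else if (ch >= 'a' && ch <= 'z') v = ch - 'a' + 26;
        else if (ch >= '0' && ch <= '9') v = ch - '0' + 52;
        else if (ch == '+') v = 62;
        else if (ch == '/') v = 63;
        else if (ch == '=') break;
        else continue;
        acc = (acc << 6) | static_cast<uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<uint8_t>((acc >> bits) & 0xFFu));
        }
    }
    return out;
}
```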

audio_process_i2s gives access to the decoded audio signal before it is passed on to external devices via I2S. If continueI2S is true, the signal is written to the I2S DMA. You can also manipulate the signal here. The example shows how a sine tone is mixed into an audio stream.

void audio_process_i2s(int16_t* outBuff, uint16_t validSamples, uint8_t bitsPerSample, uint8_t channels, bool *continueI2S){

    // one period of a sine wave (40 samples)
    static const int16_t sineWaveTable[] = {
         0,   3743,   7377,  10793,  14082,  17136,  19848,  22113,  23825,  24908,
      25311,  24908,  23825,  22113,  19848,  17136,  14082,  10793,   7377,   3743,
         0,  -3743,  -7377, -10793, -14082, -17136, -19848, -22113, -23825, -24908,
     -25311, -24908, -23825, -22113, -19848, -17136, -14082, -10793,  -7377,  -3743
    };
    // wrap at the real table length, otherwise zero padding would be played
    static const uint8_t tabLen = sizeof(sineWaveTable) / sizeof(sineWaveTable[0]);

    static uint8_t tabPtr = 0;
    // assumes 2 channels, 16 bit, interleaved (left, right, left, right, ...)
    for(int i = 0; i < validSamples; i++){
        int16_t tone = sineWaveTable[tabPtr] / 50;
        outBuff[i * 2]     += tone; // channel left
        outBuff[i * 2 + 1] += tone; // channel right
        tabPtr++;
        if(tabPtr == tabLen) tabPtr = 0;
    }
    *continueI2S = true;
}
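
The example above adds the tone without limiting the result, so a loud passage plus the tone can exceed the 16-bit range and wrap around, which is audible as a crack. A minimal sketch of a saturating addition that could be used instead (the helper name saturatingAdd is an assumption):

```cpp
#include <cstdint>

// Adds two 16-bit samples and clips the result to the int16_t range
// instead of letting it wrap around.
int16_t saturatingAdd(int16_t a, int16_t b) {
    int32_t sum = static_cast<int32_t>(a) + static_cast<int32_t>(b);
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return static_cast<int16_t>(sum);
}
```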

What else is there?

There are other useful functions for building MP3 players, for example

setConnectionTimeout() In some cases it can make sense to change the threshold value for establishing a connection. By default, 250ms are set for unencrypted connections and 2700ms for SSL connections.

uint16_t timeout_ms = 300;
uint16_t timeout_ms_ssl = 3000;
audio.setConnectionTimeout(timeout_ms, timeout_ms_ssl);

getAudioFileDuration() indicates the expected length of an audio file in seconds. With a constant bit rate (CBR) the value is exact. With a variable bit rate (VBR) the duration is estimated from the first 100 mp3 frames and can therefore deviate slightly from the actual playback time.

uint32_t getAudioFileDuration()
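
The arithmetic behind such an estimate is simple and can be sketched as follows. The helper names are hypothetical and the library's own calculation may differ in detail. For CBR the duration follows directly from the size of the audio data and the bit rate; for VBR an average bit rate over a number of sampled frames is used in place of the constant one.

```cpp
#include <cstdint>

// Duration in seconds of audioBytes of audio data at a constant bit rate.
uint32_t cbrDurationSeconds(uint32_t audioBytes, uint32_t bitrateBitsPerSec) {
    if (bitrateBitsPerSec == 0) return 0;
    return static_cast<uint32_t>((static_cast<uint64_t>(audioBytes) * 8u) / bitrateBitsPerSec);
}

// Average bit rate over the sampled frames, e.g. the first frames of a VBR file.
uint32_t averageBitrate(const uint32_t* frameBitrates, uint32_t frameCount) {
    if (frameCount == 0) return 0;
    uint64_t sum = 0;
    for (uint32_t i = 0; i < frameCount; ++i) sum += frameBitrates[i];
    return static_cast<uint32_t>(sum / frameCount);
}
```

For example, 4,000,000 bytes of audio data at 128 kbit/s play for 250 seconds. If the sampled frames of a VBR file average 160 kbit/s, the same amount of data is estimated at 200 seconds, and the estimate is only as good as those frames are representative of the whole file.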

getAudioCurrentTime() returns the current playing time in seconds

uint32_t getAudioCurrentTime()

An example program could look like this:

Ticker ticker;
void setup()  {
	...
	ticker.attach(1, tcr1s);
	...
}

void tcr1s(){
    uint32_t act=audio.getAudioCurrentTime();
    uint32_t afd=audio.getAudioFileDuration();
    uint32_t pos =audio.getFilePos();
    log_i("pos =%i", pos);
    log_i("audioTime: %i:%02d - duration: %i:%02d", (act/60), (act%60) , (afd/60), (afd%60));
}

The output in the serial monitor

Audio Duration

This works with local files (SD, FFat, SD_MMC, SPIFFS) and with web files in wav or mp3 format. The current time for AAC-coded files (m4a) cannot be precisely determined and is therefore estimated using the mean value of the bit rate.

Sometimes you want to play an audio file in a loop.
setFileLoop() determines the position after the audio header internally. At the end of the file, playback jumps back to this audio start position.

audio.setFileLoop(true);

In some projects there is only one audio amplifier or speaker. Then it makes sense to convert the stereo signal into a mono signal. With forceMono(true); the mean value is calculated from the signal of both channels and placed on the left and right channel.

audio.forceMono(true);  // change stereo to mono
audio.forceMono(false); // default, stereo will be played
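
What the mean-value mixdown does to the sample data can be sketched like this. The function mixToMono is illustrative and not the library's implementation; it assumes interleaved 16-bit stereo samples (left, right, left, right, ...).

```cpp
#include <cstddef>
#include <cstdint>

// Replaces both channels of every stereo frame with the mean of left and right.
void mixToMono(int16_t* samples, std::size_t frames) {
    for (std::size_t i = 0; i < frames; ++i) {
        // 32-bit intermediate so that the sum cannot overflow
        int32_t mean = (static_cast<int32_t>(samples[2 * i]) + samples[2 * i + 1]) / 2;
        samples[2 * i] = static_cast<int16_t>(mean);
        samples[2 * i + 1] = static_cast<int16_t>(mean);
    }
}
```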

setAudioTaskCore(uint8_t coreID) The audio task takes the data from the buffer, decodes it and feeds the I2S. The audio.loop() function, on the other hand, fills the buffer, takes care of the entire control flow, processes all data that is not audio, such as the metadata, and generates the events. For good performance, the audio task should not run on the core of the Arduino loop task. By default, the audio task runs on core 0, but this can be changed here.


A simple project to receive a webstream

Here is a simple example program. You need an ESP32 developer board and an external DAC (e.g. PCM5102A).

#include "Arduino.h"
#include "WiFi.h"
#include "Audio.h"

#define I2S_DOUT      26  // connect to DAC pin DIN
#define I2S_BCLK      27  // connect to DAC pin BCK
#define I2S_LRC       25  // connect to DAC pin LCK

Audio audio;

const char* ssid =     "SSID";
const char* password = "password";

void setup() {
    Serial.begin(115200);
    WiFi.begin(ssid, password);
    while (WiFi.status() != WL_CONNECTED) delay(1500);
    audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
    audio.connecttohost("http://s1.knixx.fm/dein_webradio_64.aac"); // 64 kbit/s aac+
}

void loop() {
    audio.loop();
}

void audio_info(const char *info){
    Serial.print("info        "); Serial.println(info);
}

The output in the serial monitor:

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1216
ho 0 tail 12 room 4
load:0x40078000,len:10944
load:0x40080400,len:6388
entry 0x400806b4
info        PSRAM not found, inputBufferSize = 6399 bytes
info        buffers freed, free Heap: 228148 bytes
info        Connect to new host: "http://s1.knixx.fm/dein_webradio_64.aac"
info        Connect to "s1.knixx.fm" on port 80, extension "/dein_webradio_64.aac"
info        Connected to server
info        Server: nginx/1.14.2
info        audio/aac seen.
info        format is aac
info        AACDecoder has been initialized, free Heap: 199916 bytes
info        chunked data transfer
info        Connection: close
info        ice-audio-info: channels=2;samplerate=44100;bitrate=64
info        icy-description: Wir spielen Musik von den 60ern bis Heute! Und immer um halb aktuelle Country-Music.
info        icy-genre: variety,pop,oldies,country
info        icy-name: knixx.fm - Dein Webradio. / 64 kbp/s aac+
info        icy-pub: 1
info        icy-url: https://knixx.fm
info        Cache-Control: no-cache, no-store
info        Access-Control-Allow-Origin: *
info        Access-Control-Allow-Headers: Origin, Accept, X-Requested-With, Content-Type
info        Access-Control-Allow-Methods: GET, OPTIONS, HEAD
info        Expires: Mon, 26 Jul 1997 05:00:00 GMT
info        X-Frame-Options: SAMEORIGIN
info        X-Content-Type-Options: nosniff
info        Switch to DATA, bitrate is 64000, metaint is 4096
info        inputbuffer is being filled
info        StreamTitle="Michael Bolton - Soul Provider -- 1989"
info        stream ready
info        buffer filled in 7 ms
info        syncword found at pos 0
info        AAC Channels=1
info        AAC SampleRate=22050
info        AAC BitsPerSample=16
info        AAC Bitrate=64000
info        StreamTitle="Symbol - The Most Beautiful Girl In The World -- 1994"

building it on a breadboard:

Simple_Project

the schematic:

Simple_Project Schematic


Want to build a simple internet radio?

There are displays for the Raspberry Pi with a resolution of 480x320 pixels and an SPI bus. These are particularly suitable; see the radio folder.

Simple_WiFi_Radio