
Issue with Quantizing EEGNet Model on ESP32-S3 (AIV-744) #194

manhdatbn93 opened this issue Jan 21, 2025 · 6 comments


manhdatbn93 commented Jan 21, 2025

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate.
  • Provided a clear description of your suggestion.
  • Included any relevant context or examples.

Issue or Suggestion Description

Hi Team, I am currently working on quantizing the EEGNet model for deployment on my custom board based on the ESP32-S3. My goal is to use this model for a simple classification task: detecting eye blink and eye non-blink from EEG data.

Here is the EEGNet model I am using: https://github.com/YuDongPan/DL_Classifier/blob/main/Model/EEGNet.py

Issue
When I attempt to quantize this model using ESPDL, I encounter an error in the avg_pool2d function. The error details are as follows:

File "~\Python\Python310\lib\site-packages\ppq\executor\op\torch\default.py", line 800, in AveragePool_forward
output = F.avg_pool2d(
RuntimeError: Given input size: (96x1x1). Calculated output size: (96x1x0). Output size is too small

My batch shape is:

torch.Size([32, 1, 8, 500])
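For context, PyTorch's "Output size is too small" error means a pooling window is larger than the spatial dimension it is applied to, so the output-length formula floor((n - k) / s) + 1 yields zero. A minimal arithmetic sketch (the kernel and stride values here are illustrative, not EEGNet's actual configuration):

```python
def pool_out(n: int, kernel: int, stride: int) -> int:
    """Output length along one axis for an unpadded pooling window."""
    return (n - kernel) // stride + 1

# A 500-sample time axis survives two rounds of pooling:
t = pool_out(500, 4, 4)   # 125
t = pool_out(t, 8, 8)     # 15

# But pooling an axis that has already collapsed to 1 yields 0,
# which matches the "(96x1x1) -> (96x1x0)" failure in the traceback:
print(pool_out(1, 2, 2))  # 0
```

This suggests checking which pooling layer in EEGNet ends up receiving a 1x1 spatial input, e.g. because an earlier layer has already consumed the 8-channel axis.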

Here is the function for quantizing

quant_ppq_graph = espdl_quantize_torch(
        model=model,
        espdl_export_file=ESPDL_MODLE_PATH,
        calib_dataloader=dataloader,
        calib_steps=32,
        input_shape=[1] + [1] + INPUT_SHAPE,
        target=TARGET,
        num_of_bits=NUM_OF_BITS,
        collate_fn=collate_fn2,
        setting=None,
        device=DEVICE,
        error_report=True,
        skip_export=False,
        export_test_values=False,
        verbose=1,
    )

I would appreciate it if you could help identify the root cause of this issue and provide guidance on:

  1. Whether the pooling operations in EEGNet require modification for compatibility with ESPDL quantization.
  2. Any specific changes needed in the model or data pipeline for successful quantization.
  3. Do you have any example of Quantize model with 2-dimensions input like EEG input signal? The input shape like this: [batch_size, channels, samples].

Thank you for your support. Please let me know if you need additional details about the setup or the error logs.

Additional Details
Data Input Shape: [batch_size, 1, 8, 500]
Target Board: ESP32-S3
Quantization Tool: ESPDL
Framework: PyTorch

Full modified code; the rest is the same as the example:

def collate_eeg_fn(x: Tuple) -> torch.Tensor:
    return torch.cat([sample[0].unsqueeze(0).unsqueeze(1) for sample in x], dim=0)

if __name__ == "__main__":
    BATCH_SIZE = 32
    INPUT_SHAPE = [8, 500]
    DEVICE = "cpu"  #  'cuda' or 'cpu', if you use cuda, please make sure that cuda is available
    TARGET = "esp32s3"
    NUM_OF_BITS = 8
    ESPDL_MODLE_PATH = "models/torch/eegnet.espdl"

    model = eeg.EEGNet(8, 500, 2)
    model = model.to(DEVICE)
    # Load the pretrained weights
    pretrained_weights_path = "eyeblink_eegnet.pt"
    model.load_state_dict(torch.load(pretrained_weights_path))

    loaded_dataset = torch.load("eyeblink_dataset.pt", weights_only=True)
    print(f"Loaded dataset with {len(loaded_dataset)} samples.")

    dataloader = DataLoader(
            dataset=loaded_dataset,
            batch_size=BATCH_SIZE,
            shuffle=False,
            num_workers=1,
            pin_memory=False,
            collate_fn=collate_eeg_fn,
        )

    for batch in dataloader:
        print(batch.shape)
        break
    
   
    # quant_setting, model = quant_setting_eegnet(
    #     model, ["LayerwiseEqualization_quantization"]
    # )

    quant_ppq_graph = espdl_quantize_torch(
        model=model,
        espdl_export_file=ESPDL_MODLE_PATH,
        calib_dataloader=dataloader,
        calib_steps=32,
        input_shape=[1] + [1] + INPUT_SHAPE,
        target=TARGET,
        num_of_bits=NUM_OF_BITS,
        collate_fn=collate_fn2,
        setting=None,
        device=DEVICE,
        error_report=True,
        skip_export=False,
        export_test_values=False,
        verbose=1,
    )
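One sanity check worth running before calibration is that the batches produced by the collate function really match the input_shape passed to espdl_quantize_torch (here [1, 1, 8, 500]); a disagreement between the two is a common cause of shape errors inside the PPQ executor. A hedged, self-contained sketch using a synthetic dataset in place of eyeblink_dataset.pt:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

INPUT_SHAPE = [8, 500]

def collate_eeg_fn(batch):
    # Stack (sample, label) tuples into a [B, 1, 8, 500] tensor, as above.
    return torch.cat([s[0].unsqueeze(0).unsqueeze(1) for s in batch], dim=0)

# Synthetic stand-in for the real dataset: 64 samples of shape [8, 500].
dataset = TensorDataset(torch.randn(64, *INPUT_SHAPE), torch.zeros(64))
loader = DataLoader(dataset, batch_size=32, collate_fn=collate_eeg_fn)

batch = next(iter(loader))
# Everything except the batch dimension must match the calibration input_shape.
assert list(batch.shape)[1:] == [1] + INPUT_SHAPE, batch.shape
print(batch.shape)  # torch.Size([32, 1, 8, 500])
```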

Attached is the model I exported as *.onnx, even though the run failed.
[Image: exported ONNX model graph]

@github-actions github-actions bot changed the title Issue with Quantizing EEGNet Model on ESP32-S3 Issue with Quantizing EEGNet Model on ESP32-S3 (AIV-744) Jan 21, 2025
@BlueSkyB (Collaborator) commented:

We will download the code to analyze the issue. If it's convenient for you to upload the ONNX model file, it will accelerate our debugging process.

@manhdatbn93 (Author) commented Jan 23, 2025

Hi @BlueSkyB, yes. I have uploaded my trained PyTorch model (*.pt) and the ONNX model produced by the quantization script in the zip file below:

eyeblink_trained_model.zip

@manhdatbn93 (Author) commented:

Is it feasible to run a small model like mine on an ESP32-S3 module without PSRAM? The ESP32-S3-WROOM-1-N16 I received does not have a PSRAM option.
When I try to flash an example from GitHub, it always triggers a CPU reset due to PSRAM.

@BlueSkyB (Collaborator) commented:

You can disable the configuration of CONFIG_SPIRAM by using idf.py menuconfig. If there is no PSRAM, the system will default to using internal RAM. In this case, you must ensure that the model is small enough so that the memory required for its parameters and feature maps is less than the available system memory; otherwise, it will not function properly.

Alternatively, you can refer to the document https://docs.espressif.com/projects/esp-dl/en/latest/tutorials/how_to_load_model.html to disable param_copy, but this will significantly reduce performance.

Additionally, your model contains the Elu operator, which is currently not supported by ESP-DL. You can check the supported operators at https://github.com/espressif/esp-dl/blob/master/operator_support_state.md. We will support it in the future.
You can also add the operator yourself by referring to https://docs.espressif.com/projects/esp-dl/en/latest/tutorials/how_to_add_a_new_module%28operator%29.html
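To judge whether a model can fit in internal RAM at all, a rough back-of-the-envelope estimate helps: after 8-bit quantization each parameter takes one byte, and the peak feature map must fit alongside the parameters. The layer sizes below are purely illustrative placeholders, not the real EEGNet numbers:

```python
# Rough int8 footprint estimate: parameters plus the largest activation buffer.
# Each entry is (param_count, feature_map_elements); all values illustrative.
layers = [
    (96 * 1 * 1 * 64,  96 * 8 * 500),  # temporal conv
    (96 * 96 * 8 * 1,  96 * 1 * 125),  # spatial conv
    (176 * 16,         16),            # fc1
    (16 * 2,           2),             # fc2
]

param_bytes = sum(p for p, _ in layers)            # 1 byte per int8 weight
peak_activation_bytes = max(f for _, f in layers)  # 1 byte per int8 activation

total_kib = (param_bytes + peak_activation_bytes) / 1024
print(f"~{total_kib:.0f} KiB needed")
```

Compared against the roughly 380 KiB of internal RAM reported in the boot log further down, this kind of estimate shows quickly whether running without PSRAM is even viable.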

@manhdatbn93 (Author) commented Jan 23, 2025

Hi @BlueSkyB, thank you so much for your information.

I have modified the model to make it simpler for testing, and it successfully passes the quantization script.

I imported the eegnet.espdl (18 KB) in the same way as demonstrated in the mobilenet_v2 example, but I encountered the following issue:
Warning:

calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT

Update (2025-01-24): this warning went away once I switched to my ESP32-S3 EVK with PSRAM support, so the allocations apparently fall back to PSRAM there. However, the error below is still present.

Error:

assert failed: virtual std::vector<std::vector> dl::module::Gemm::get_output_shape(std::vector<std::vector>& ) dl_module_gemm.hpp:71 (input_shapes[0][input_shapes[0].size() - 1] == filter-

Do you know how to resolve the warning and the error?
I added some debug code in the dl_module_gemm.hpp module, as shown below:

ESP_LOGI("input_shapes", "get_output_shape");
ESP_LOGI("input_shapes", "input_shapes size: %d", input_shapes.size());
ESP_LOGI("input_shapes", "input_shapes[0] shape: %s", shape_to_string(input_shapes[0]).c_str());
ESP_LOGI("filter", "filter shape: %s", shape_to_string(filter->shape).c_str());
ESP_LOGI("filter", "filter->shape[2]: %d", filter->shape[2]);
ESP_LOGI("test", "input_shapes[0][input_shapes[0].size() - 1]: %d", input_shapes[0][input_shapes[0].size() - 1]);
assert(input_shapes.size() == 1);
assert(filter->shape.size() == 4);
assert(filter->shape[0] == 1);
assert(filter->shape[1] == 1);
assert(input_shapes[0][input_shapes[0].size() - 1] == filter->shape[2]);

And here is the log

I (536) input_shapes: get_output_shape
I (536) input_shapes: input_shapes size: 1
I (546) input_shapes: input_shapes[0] shape: [1, 176]
I (546) filter: filter shape: [1, 1, 352, 16]
I (556) filter: filter->shape[2]: 352
I (556) test: input_shapes[0][input_shapes[0].size() - 1]: 176

It seems something is missing in my code, because 176 = 352 / 2.
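The failing device-side assertion can be mirrored on the host before flashing; the shapes below are taken from the log, and the helper name is hypothetical. The factor of two (176 vs. 352) hints that the Flatten output at runtime carries half the features the exported Gemm weight expects, e.g. due to a layout (NCHW vs. NHWC) disagreement introduced around the Transpose/Flatten pair:

```python
def check_gemm_shapes(flatten_shape, filter_shape):
    """Mirror ESP-DL's Gemm check: the last input dim must equal the
    filter's input-feature dimension (filter laid out as [1, 1, in, out])."""
    in_features = flatten_shape[-1]
    filter_in, filter_out = filter_shape[2], filter_shape[3]
    if in_features != filter_in:
        raise ValueError(
            f"Gemm mismatch: Flatten yields {in_features} features, "
            f"but the weight expects {filter_in}"
        )
    return filter_out

# Shapes taken from the device log above:
try:
    check_gemm_shapes([1, 176], [1, 1, 352, 16])
except ValueError as e:
    print(e)
```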

For reference, I have included the log from the monitor below.

I (476) dl::Model: PPQ_Operation_0: Transpose
I (476) dl::Model: /flatten/Flatten: Flatten
I (486) dl::Model: /fc1/Gemm: Gemm
W (486) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
W (496) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (506) dl::Model: /prelu3/PRelu: PRelu
W (506) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (516) dl::Model: /fc2/Gemm: Gemm
W (516) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
W (526) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT

assert failed: virtual std::vector<std::vector<int> > dl::module::Gemm::get_output_shape(std::vector<std::vector<int> >&) dl_module_gemm.hpp:71 (input_shapes[0][input_shapes[0].size() - 1] == filter-


Backtrace: 0x40375a8d:0x3fc99560 0x4037b071:0x3fc99580 0x40381a4d:0x3fc995a0 0x42021979:0x3fc996c0 0x4203b631:0x3fc996f0 0x4203c706:0x3fc998f0 0x4202f3b8:0x3fc99960 0x4202f857:0x3fc999f0 0x42009dd2:0x3fc99a20 0x4209251f:0x3fc99a50 0x4037b899:0x3fc99a80
--- To exit from IDF monitor please use "Ctrl+]". Alternatively, you can use Ctrl+T Ctrl+X to exit.
I (23) boot: ESP-IDF HEAD-HASH-NOTFOUND 2nd stage bootloader
I (23) boot: compile time Jan 23 2025 23:01:11
I (23) boot: Multicore bootloader
I (24) boot: chip revision: v0.2
I (27) boot: efuse block revision: v1.3
I (30) boot.esp32s3: Boot SPI Speed : 80MHz
I (34) boot.esp32s3: SPI Mode       : DIO
I (38) boot.esp32s3: SPI Flash Size : 16MB
I (42) boot: Enabling RNG early entropy source...
I (46) boot: Partition Table:
I (49) boot: ## Label            Usage          Type ST Offset   Length
I (55) boot:  0 factory          factory app      00 00 00010000 003e8000
I (62) boot:  1 model            Unknown data     01 82 003f8000 000fa000
I (68) boot: End of partition table
I (72) esp_image: segment 0: paddr=00010020 vaddr=3c0a0020 size=10708h ( 67336) map
I (91) esp_image: segment 1: paddr=00020730 vaddr=3fc93b00 size=02af0h ( 10992) load
I (94) esp_image: segment 2: paddr=00023228 vaddr=40374000 size=0cdf0h ( 52720) load
I (106) esp_image: segment 3: paddr=00030020 vaddr=42000020 size=930d8h (602328) map
I (213) esp_image: segment 4: paddr=000c3100 vaddr=40380df0 size=02c04h ( 11268) load
I (215) esp_image: segment 5: paddr=000c5d0c vaddr=600fe100 size=0001ch (    28) load
I (222) boot: Loaded app from partition at offset 0x10000
I (223) boot: Disabling RNG early entropy source...
I (238) cpu_start: Multicore app
I (247) cpu_start: Pro cpu start user code
I (247) cpu_start: cpu freq: 240000000 Hz
I (247) app_init: Application information:
I (250) app_init: Project name:     eyeblink_v2
I (255) app_init: App version:      1
I (260) app_init: Compile time:     Jan 23 2025 23:00:21
I (266) app_init: ELF file SHA256:  84eb69f48...
I (271) app_init: ESP-IDF:          HEAD-HASH-NOTFOUND
I (277) efuse_init: Min chip rev:     v0.0
I (281) efuse_init: Max chip rev:     v0.99
I (286) efuse_init: Chip rev:         v0.2
I (291) heap_init: Initializing. RAM available for dynamic allocation:
I (298) heap_init: At 3FC97108 len 00052608 (329 KiB): RAM
I (304) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (311) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (317) heap_init: At 600FE11C len 00001ECC (7 KiB): RTCRAM
I (324) spi_flash: detected chip: gd
I (327) spi_flash: flash io: dio
I (331) sleep_gpio: Configure to isolate all GPIO pins in sleep state
I (339) sleep_gpio: Enable automatic switching of GPIO sleep configuration
I (346) main_task: Started on CPU0
I (376) main_task: Calling app_main()
I (376) EYEBLINK_DETECTION_TEST: get into app_main
I (376) FbsLoader: The storage free size is 32000 KB
I (376) FbsLoader: The partition size is 1000 KB
I (386) dl::Model: model:main_graph, version:0

I (396) dl::Model: /temporal_conv/Conv: Conv
W (396) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
W (406) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (416) dl::Model: /prelu1/PRelu: PRelu
W (416) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (426) dl::Model: /pool1/AveragePool: AveragePool
I (436) dl::Model: /spatial_conv/Conv: Conv
W (436) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
W (446) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (456) dl::Model: /prelu2/PRelu: PRelu
W (456) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (466) dl::Model: /pool2/AveragePool: AveragePool
I (476) dl::Model: PPQ_Operation_0: Transpose
I (476) dl::Model: /flatten/Flatten: Flatten
I (486) dl::Model: /fc1/Gemm: Gemm
W (486) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
W (496) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (506) dl::Model: /prelu3/PRelu: PRelu
W (506) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
I (516) dl::Model: /fc2/Gemm: Gemm
W (516) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT
W (526) calloc_aligned: heap_caps_aligned_calloc failed, retry with MALLOC_CAP_8BIT

assert failed: virtual std::vector<std::vector<int> > dl::module::Gemm::get_output_shape(std::vector<std::vector<int> >&) dl_module_gemm.hpp:71 (input_shapes[0][input_shapes[0].size() - 1] == filter-


Backtrace: 0x40375a8d:0x3fc99560 0x4037b071:0x3fc99580 0x40381a4d:0x3fc995a0 0x42021979:0x3fc996c0 0x4203b631:0x3fc996f0 0x4203c706:0x3fc998f0 0x4202f3b8:0x3fc99960 0x4202f857:0x3fc999f0 0x42009dd2:0x3fc99a20 0x4209251f:0x3fc99a50 0x4037b899:0x3fc99a80
--- 0x40375a8d: panic_abort at ~/esp/v5.4/esp-idf/components/esp_system/panic.c:454
0x4037b071: esp_system_abort at ~/esp/v5.4/esp-idf/components/esp_system/port/esp_system_chip.c:92
0x40381a4d: __assert_func at ~/esp/v5.4/esp-idf/components/newlib/assert.c:80
0x42021979: dl::module::Gemm::get_output_shape(std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >&) at ~/Firmware/eyeblink_v2/esp-dl/dl/module/include/dl_module_gemm.hpp:71 (discriminator 1)
0x4203b631: dl::memory::MemoryManagerGreedy::get_tensor_info_from_fbs(fbs::FbsModel*, std::vector<dl::module::Module*, std::allocator<dl::module::Module*> >, std::vector<dl::memory::TensorInfo*, std::allocator<dl::memory::TensorInfo*> >&) at ~/Firmware/eyeblink_v2/esp-dl/dl/model/src/dl_memory_manager_greedy.cpp:140
0x4203c706: dl::memory::MemoryManagerGreedy::alloc(fbs::FbsModel*, std::vector<dl::module::Module*, std::allocator<dl::module::Module*> >&) at ~/Firmware/eyeblink_v2/esp-dl/dl/model/src/dl_memory_manager_greedy.cpp:41 (discriminator 1)
0x4202f3b8: dl::Model::build(unsigned int, dl::memory_manager_t, bool) at ~/Firmware/eyeblink_v2/esp-dl/dl/model/src/dl_model_base.cpp:171
0x4202f857: dl::Model::Model(char const*, fbs::model_location_type_t, int, dl::memory_manager_t, unsigned char*, bool) at ~/Firmware/eyeblink_v2/esp-dl/dl/model/src/dl_model_base.cpp:20
 (inlined by) dl::Model::Model(char const*, fbs::model_location_type_t, int, dl::memory_manager_t, unsigned char*, bool) at ~/Firmware/eyeblink_v2/esp-dl/dl/model/src/dl_model_base.cpp:12
0x42009dd2: app_main at~/Firmware/eyeblink_v2/main/main.cpp:103 (discriminator 1)
0x4209251f: main_task at ~/esp/v5.4/esp-idf/components/freertos/app_startup.c:208
0x4037b899: vPortTaskWrapper at ~/esp/v5.4/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:139





ELF file SHA256: 84eb69f48

Rebooting...

And here is the code in the app_main function:

extern "C" void app_main(void)
{
    ESP_LOGI(TAG, "get into app_main");
    Model *model = new Model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION, 0, MEMORY_MANAGER_GREEDY, nullptr, false);
    delete model;
    ESP_LOGI(TAG, "exit app_main");
}

My ONNX model (with ReLU changed to PReLU):

[Image: ONNX model graph]

@BlueSkyB (Collaborator) commented Jan 26, 2025

> Hi @BlueSkyB, Yes. I have uploaded my trained model in pytorch (*.pt) and ONNX model that I got from quantize script in zip file
>
> eyeblink_trained_model.zip

Hi @manhdatbn93. We have fixed the bug, and the model now runs normally. Please update both esp-ppq and esp-dl.
After updating, if you still run into issues with the network you want to deploy, feel free to open another issue. The network you mentioned later can also be retried after the update.
