Error during training #5

Closed
kslz opened this issue Dec 30, 2021 · 4 comments

kslz commented Dec 30, 2021

The output is as follows:

-----------  Configuration Arguments -----------
batch_size: 32
gpus: 0
input_shape: (None, 1, 128, 128)
learning_rate: 0.001
num_classes: 10
num_epoch: 50
num_workers: 4
save_model: models/
test_list_path: dataset/test_list.txt
train_list_path: dataset/train_list.txt
------------------------------------------------
W1230 03:44:49.589563  1696 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 6.0, Driver API Version: 11.2, Runtime API Version: 10.2
W1230 03:44:49.593922  1696 device_context.cc:422] device: 0, cuDNN Version: 7.6.
-------------------------------------------------------------------------------
   Layer (type)         Input Shape          Output Shape         Param #    
===============================================================================
     Conv2D-1        [[1, 1, 128, 128]]    [1, 64, 64, 64]         3,136     
   BatchNorm2D-1     [[1, 64, 64, 64]]     [1, 64, 64, 64]          256      
      ReLU-1         [[1, 64, 64, 64]]     [1, 64, 64, 64]           0       
    MaxPool2D-1      [[1, 64, 64, 64]]     [1, 64, 32, 32]           0       
     Conv2D-2        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-2     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
      ReLU-2         [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-3        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-3     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
   BasicBlock-1      [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-4        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-4     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
      ReLU-3         [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-5        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-5     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
   BasicBlock-2      [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-6        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-6     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
      ReLU-4         [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-7        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-7     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
   BasicBlock-3      [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-9        [[1, 64, 32, 32]]     [1, 128, 16, 16]       73,728     
   BatchNorm2D-9     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-5         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-10       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-10     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
     Conv2D-8        [[1, 64, 32, 32]]     [1, 128, 16, 16]        8,192     
   BatchNorm2D-8     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-4      [[1, 64, 32, 32]]     [1, 128, 16, 16]          0       
     Conv2D-11       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-11     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-6         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-12       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-12     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-5      [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-13       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-13     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-7         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-14       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-14     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-6      [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-15       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-15     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-8         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-16       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-16     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-7      [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-18       [[1, 128, 16, 16]]     [1, 256, 8, 8]        294,912    
  BatchNorm2D-18      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-9          [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-19        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-19      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
     Conv2D-17       [[1, 128, 16, 16]]     [1, 256, 8, 8]        32,768     
  BatchNorm2D-17      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-8      [[1, 128, 16, 16]]     [1, 256, 8, 8]           0       
     Conv2D-20        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-20      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-10         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-21        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-21      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-9       [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-22        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-22      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-11         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-23        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-23      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-10      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-24        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-24      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-12         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-25        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-25      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-11      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-26        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-26      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-13         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-27        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-27      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-12      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-28        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-28      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-14         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-29        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-29      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-13      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-31        [[1, 256, 8, 8]]      [1, 512, 4, 4]       1,179,648   
  BatchNorm2D-31      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
      ReLU-15         [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-32        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-32      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
     Conv2D-30        [[1, 256, 8, 8]]      [1, 512, 4, 4]        131,072    
  BatchNorm2D-30      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
   BasicBlock-14      [[1, 256, 8, 8]]      [1, 512, 4, 4]           0       
     Conv2D-33        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-33      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
      ReLU-16         [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-34        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-34      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
   BasicBlock-15      [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-35        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-35      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
      ReLU-17         [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-36        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-36      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
   BasicBlock-16      [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
AdaptiveAvgPool2D-1   [[1, 512, 4, 4]]      [1, 512, 1, 1]           0       
     Linear-1            [[1, 512]]            [1, 10]             5,130     
===============================================================================
Total params: 21,300,554
Trainable params: 21,266,506
Non-trainable params: 34,048
-------------------------------------------------------------------------------
Input size (MB): 0.06
Forward/backward pass size (MB): 28.00
Params size (MB): 81.26
Estimated Total Size (MB): 109.32
-------------------------------------------------------------------------------

Epoch 0: StepDecay set learning rate to 0.001.
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
ERROR:root:DataLoader reader thread raised an exception!
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 411, in _thread_loop
    batch = self._get_data()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 525, in _get_data
    batch.reraise()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 168, in reraise
    raise self.exc_type(msg)
ValueError: DataLoader worker(2) caught ValueError with message:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 320, in _worker_loop
    batch = fetcher.fetch(indices)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/fetcher.py", line 99, in fetch
    data = [self.dataset[idx] for idx in batch_indices]
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/fetcher.py", line 99, in <listcomp>
    data = [self.dataset[idx] for idx in batch_indices]
  File "/content/AudioClassification-PaddlePaddle/reader.py", line 36, in __getitem__
    spec_mag = load_audio(audio_path, mode=self.model, spec_len=self.spec_len)
  File "/content/AudioClassification-PaddlePaddle/reader.py", line 14, in load_audio
    crop_start = random.randint(0, spec_mag.shape[1] - spec_len)
  File "/usr/lib/python3.7/random.py", line 222, in randint
    return self.randrange(a, b+1)
  File "/usr/lib/python3.7/random.py", line 200, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,-15, -15)


Traceback (most recent call last):
  File "train.py", line 125, in <module>
    train(args)
  File "train.py", line 85, in train
    for batch_id, (spec_mag, label) in enumerate(train_loader()):
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 585, in __next__
    data = self._reader.read_next_var_list()
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:166)

/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")

kslz commented Dec 30, 2021

I modified the dataset a bit, and the error changed again...
Total params: 21,300,554
Trainable params: 21,266,506
Non-trainable params: 34,048

Input size (MB): 0.06
Forward/backward pass size (MB): 28.00
Params size (MB): 81.26
Estimated Total Size (MB): 109.32

Epoch 0: StepDecay set learning rate to 0.001.
ERROR:root:DataLoader reader thread raised an exception!
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 411, in _thread_loop
    batch = self._get_data()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 525, in _get_data
    batch.reraise()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 168, in reraise
    raise self.exc_type(msg)
ValueError: DataLoader worker(1) caught ValueError with message:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 320, in _worker_loop
    batch = fetcher.fetch(indices)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/fetcher.py", line 99, in fetch
    data = [self.dataset[idx] for idx in batch_indices]
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/fetcher.py", line 99, in <listcomp>
    data = [self.dataset[idx] for idx in batch_indices]
  File "/content/AudioClassification-PaddlePaddle/reader.py", line 36, in __getitem__
    spec_mag = load_audio(audio_path, mode=self.model, spec_len=self.spec_len)
  File "/content/AudioClassification-PaddlePaddle/reader.py", line 14, in load_audio
    crop_start = random.randint(0, spec_mag.shape[1] - spec_len)
  File "/usr/lib/python3.7/random.py", line 222, in randint
    return self.randrange(a, b+1)
  File "/usr/lib/python3.7/random.py", line 200, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,-26, -26)

Traceback (most recent call last):
  File "train.py", line 125, in <module>
    train(args)
  File "train.py", line 85, in train
    for batch_id, (spec_mag, label) in enumerate(train_loader()):
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 585, in __next__
    data = self._reader.read_next_var_list()
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:166)


@yeyupiaoling (Owner)

The problem is still in the data. Please check your data.
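For context, the failing line in both tracebacks is `crop_start = random.randint(0, spec_mag.shape[1] - spec_len)` in `reader.py`. `randint(0, b)` calls `randrange(0, b + 1)`, so a reported stop of -15 means `b = -16`: the spectrogram had 16 fewer frames than `spec_len`. The second traceback's -26 means 27 fewer. In other words, some clips appear to be too short for the crop. Below is a minimal sketch of a guard that zero-pads short clips instead of raising. It assumes `spec_len` is 128 (suggested by `input_shape`, not confirmed in this thread), and `choose_crop` and `fit_to_length` are hypothetical helpers, not functions from the repository. Plain lists stand in for the spectrogram array.

```python
import random


def choose_crop(width, spec_len, rng=random):
    """Return (crop_start, pad) for a spectrogram that has `width` frames.

    If the clip is long enough, pick a random crop start and no padding.
    If it is too short, start at 0 and report how many frames to pad.
    """
    if width >= spec_len:
        return rng.randint(0, width - spec_len), 0
    return 0, spec_len - width


def fit_to_length(spec, spec_len, rng=random):
    """Crop or zero-pad every row of `spec` to exactly `spec_len` frames.

    `spec` is a list of equal-length rows (one row per frequency bin).
    """
    width = len(spec[0])
    start, pad = choose_crop(width, spec_len, rng)
    return [row[start:start + spec_len] + [0.0] * pad for row in spec]


if __name__ == "__main__":
    # A clip that is 16 frames short of an assumed spec_len of 128.
    short_clip = [[1.0] * 112 for _ in range(4)]
    fixed = fit_to_length(short_clip, 128)
    print(len(fixed[0]))  # every row now has spec_len frames
```

Padding keeps short clips in the training set. Dropping them from the data list is the other option, and which one is appropriate depends on the dataset.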


kslz commented Jan 13, 2022

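> The problem is still in the data. Please check your data.

Could you please provide a dataset archive that works? Just a sample would be enough.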

@yeyupiaoling (Owner)

See the "生成数据列表" (Generate data list) section of the documentation.
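When building the data list, one way to avoid the error above is to leave out clips that are too short. The following is only a sketch under stated assumptions: it assumes each list line has the form `path<TAB>label`, which is not shown in this thread, and the minimum duration is a placeholder because it depends on the sample rate and hop length used to compute the spectrogram. `wav_duration` and `split_by_duration` are hypothetical names. `wav_duration` only handles PCM WAV files, so for other formats a different duration function would have to be passed in.

```python
import wave


def wav_duration(path):
    """Duration in seconds of a PCM WAV file, using only the standard library."""
    with wave.open(path, "rb") as f:
        return f.getnframes() / f.getframerate()


def split_by_duration(lines, min_seconds, get_duration=wav_duration):
    """Split 'path<TAB>label' lines into (kept, dropped) by clip duration.

    `get_duration` maps a path to a duration in seconds; it is a parameter
    so that non-WAV formats can be supported by the caller.
    """
    kept, dropped = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, _label = line.split("\t")
        if get_duration(path) >= min_seconds:
            kept.append(line)
        else:
            dropped.append(line)
    return kept, dropped


if __name__ == "__main__":
    # Durations are faked here so the sketch runs without audio files.
    fake_durations = {"a.wav": 0.5, "b.wav": 3.0}
    kept, dropped = split_by_duration(
        ["a.wav\t0\n", "b.wav\t1\n"], min_seconds=1.5,
        get_duration=fake_durations.get,
    )
    print(kept, dropped)
```

Printing the dropped entries makes it easy to see which files triggered the `randrange` error.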
