get_data_from_batch problem in NMT example #42

ozancaglayan · 2015-09-23T13:49:29Z

Hi,

I'm seeing a very odd problem with the machine translation example. In stream.py, the PaddingWithEOS class reimplements the get_data_from_batch method.

On a system with up-to-date Theano, blocks and fuel, the RNNSearch model runs correctly. I'm trying to implement another model on top of this example that also uses the stream.py module for stream processing, but with masks disabled: all sequences are padded with EOS, so every sequence has length 32.

The problem is that with this code I get a Theano exception. I traced it back to the following: Blocks never received the padded batch that the PaddingWithEOS.get_data_from_batch() method should have returned. I then looked for this method in the fuel repository, but it isn't mentioned anywhere in the code. In fact, I couldn't find any caller of this method anywhere in the blocks, fuel, or blocks-examples code trees :) Then I discovered that the method was renamed to get_batch() in:

commit 667e81fd1e4c02dece0ef0abc71a7870b18506bc
Author: Vincent Dumoulin <[email protected]>
Date:   Tue Jul 7 15:26:40 2015 -0400

    Adapt Filter, Cache, Batch, Unpack, Padding to new Transformer interface

When I renamed it to get_batch in stream.py, my code started to work. The big question is: how come the current RNNSearch code works without an exception?

Thanks.
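
For readers following along, the effect of the rename described above can be sketched in a few lines. This is only an illustration, not Fuel's real code: the classes Transformer, StalePadding and FixedPadding, the helper pad, and the EOS index 2 are all hypothetical stand-ins. The sketch assumes, as reported above, that the base class only ever calls get_batch, so an override that still uses the old name is never invoked and raises no error.

```python
def pad(batch, value):
    """Right-pad every sequence in the batch to the longest length."""
    width = max(len(seq) for seq in batch)
    return [seq + [value] * (width - len(seq)) for seq in batch]


class Transformer:
    """Hypothetical stand-in for a stream transformer base class."""

    def __init__(self, batches):
        self.batches = batches

    def get_batch(self, batch):
        # Default behaviour in this sketch: pad with zeros.
        return pad(batch, 0)

    def __iter__(self):
        # Only the new method name is ever called by the base class.
        for batch in self.batches:
            yield self.get_batch(batch)


class StalePadding(Transformer):
    EOS = 2  # assumed EOS index, for illustration only

    def get_data_from_batch(self, batch):
        # Old method name: nothing calls it, so it is dead code.
        return pad(batch, self.EOS)


class FixedPadding(Transformer):
    EOS = 2  # assumed EOS index, for illustration only

    def get_batch(self, batch):
        # New method name: this override is actually used.
        return pad(batch, self.EOS)


data = [[[5, 6, 7], [8]]]  # one batch holding two toy sequences
stale_output = list(StalePadding(data))
fixed_output = list(FixedPadding(data))
print(stale_output)
print(fixed_output)
```

In this sketch the stale subclass silently falls back to the base class's zero padding, while the renamed override pads with the EOS index.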


rizar · 2015-10-01T16:16:55Z

Can you please try the most recent Fuel?

orhanf · 2016-01-06T20:13:05Z

@ozancaglayan are you still having a problem with this?

ghost · 2016-02-12T19:52:27Z

To follow up on this question, I added source_sentence to the training monitor. The data shows that source_sentence and target_sentence are actually padded with zeros. Is it possible that get_data_from_batch is never called?

source: [[ 1.83600000e+03 1.74000000e+02 9.00000000e+00 ..., 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[ 1.00000000e+00 1.00000000e+00 1.00000000e+00 ..., 0.00000000e+00
0.00000000e+00 0.00000000e+00]
[ 1.73000000e+02 5.00000000e+00 3.45000000e+02 ..., 0.00000000e+00
0.00000000e+00 0.00000000e+00]

source_mask: [[ 1. 1. 1. ..., 0. 0. 0.]
[ 1. 1. 1. ..., 0. 0. 0.]
[ 1. 1. 1. ..., 0. 0. 0.]
...,
[ 1. 1. 1. ..., 0. 0. 0.]
[ 1. 1. 1. ..., 0. 0. 0.]
[ 1. 1. 1. ..., 0. 0. 0.]]
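
A check like the one on the monitor output above can also be done programmatically. The following is a rough sketch under stated assumptions: padding_values is a hypothetical helper, the toy arrays are made up for illustration (they are not the monitored data), and the EOS index 2 is assumed. It collects the values found at positions where the mask is zero, which shows whether the batch was padded with zeros or with the EOS index.

```python
def padding_values(batch, mask):
    """Return the set of values found wherever the mask is zero."""
    values = set()
    for row, mask_row in zip(batch, mask):
        for token, keep in zip(row, mask_row):
            if keep == 0:
                values.add(float(token))
    return values


# Toy data, made up for illustration only.
toy_mask = [[1, 1, 1, 0, 0],
            [1, 1, 0, 0, 0]]
zero_padded = [[7, 8, 9, 0, 0],
               [4, 5, 0, 0, 0]]
eos_padded = [[7, 8, 9, 2, 2],   # 2 is an assumed EOS index
              [4, 5, 2, 2, 2]]

zero_result = padding_values(zero_padded, toy_mask)
eos_result = padding_values(eos_padded, toy_mask)
print(zero_result)
print(eos_result)
```

If the helper reports only zeros at the masked positions, the EOS-padding override did not run for that batch. The function iterates row by row, so it should accept nested lists as well as two-dimensional arrays.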
