Keeping blocks-examples small #49

rizar · 2015-10-01T16:28:48Z

Oops, I first created this one empty.

Blocks-examples is an extremely important repository in Blocks family. As @dwf said in an off-line conversation, there is very little sense in having a framework without working examples of how it can be used.

That said, we will have to update all the examples here every time anything is changed in Blocks. This is why it is essential to limit the growth of this repo. One way to do it is to stick to examples for which ratio 'educational value/size' is the largest one.

Of the examples currently available machine translation demo is by far the largest one, and I think should live in a separate repository. I do not know why it was moved from https://github.com/kyunghyuncho/NMT to Blocks-examples in the first place, but I think this historical error should be fixed.

rizar · 2015-10-01T16:36:06Z

@orhanf , would you mind? It we go this way, you will not have to wait for reviews for weeks to make changes in your code.

(don't pay attention to all this random closing and opening)

orhanf · 2015-10-01T20:08:20Z

@rizar i'm okay with any consensus.

I am only worried about a few points. If this example is going to be maintained in another repo, then it will be something like "an NMT implementation using (some version of) Blocks", not a reference "Blocks example for NMT", what i am trying to say is, the example will work only with a specific blocks, fuel, picklable_itertools version. It still has to be synched periodically to be a proper and valid example, otherwise it will be outdated and hardly be called as a blocks-example. I can continue working on this separate repo and i am totally okay with it, but synching issue has to be solved somehow.

This was moved from https://github.com/kyunghyuncho/NMT because back then, it was a private repo (for WMT'15 challenge), and was made puclic only last month or the other. Also, it was seriously out of sync with blocks, hence i adapted almost everything with a clean start.

Since this was originally assigned to me as a ccw ticket - a loong one :) , i am also curious about supervisors' opinions, @bartvm and @dmitriy-serdyuk.

One additional thing specific to machine_translation example: I think there are lots of utilities/features that are not necessary or do not have to be there, like stream, prepare_data.py, BleuValidator, early_stopping, checkpointing etc. Because the actual core computational graph is only about 300 lines of code, the rest is, i think, optional, and may be we can leave a minimalist, stripped down version of NMT at blocks-examples and maintain the rest at another repo ?

rizar · 2015-10-01T21:51:18Z

@orhanf , thanks for your quick response and active position.

Regarding you first concern: indeed, every example will have to be regularly synchronized to be compatible with the latest Blocks. This is exactly the reason why I think we should have as few as possible Blocks examples, but we can have an unlimited number of example projects using Blocks. What I would like to do is to transfer the machine translation code from the first category to the second one. Because it is very expensive for us in terms of manpower to bear full responsibility that the latest NMT code is 100% correct and compatible with the latest Blocks. For the same reason DRAW and speech recognition implementations should not be in Blocks-examples.

Also, the machine translation example, upon separation from Blocks-examples, can use a fixed version of Blocks. We do plan to make regular versions, expect 0.1 to be released within one month the latest.

All the utilities are in fact very useful for an NMT implementation, so I am not sure that having an NMT example without them makes sense. On the other hand, reverse_words is in fact a stripped down example of machine translation.

I agree that it would be great to hear @bartvm and @dmitriy-serdyuk. As far as I understood from preliminary offline discussions @dmitriy-serdyuk shares my concerns.

Just to make it clear here: machine translation demo is a great piece of code which is very useful for the lab and for the community in general, but I would like to keep it out of the area of direct responsibility of Blocks+Fuel development team. My concern is sustainability here: the code is growing much faster than the number of active developers...

dwf · 2015-11-11T22:31:38Z

Just a thought: would it be possible to have an extremely stripped down version of an encoder-decoder-with-attention in blocks-examples, with a pointer to the whole WMT15 shebang somewhere else?

orhanf · 2015-11-12T03:29:27Z

I agree with @dwf on this, or may be mt example can be moved under Example Projects completely, like the other real world examples ?

dwf · 2015-11-12T03:31:09Z

Actually @orhanf, looking at it, I think I just described the reverse_words example. 😄

rizar · 2015-11-12T03:38:29Z

Hey guys, this is just to say that I agree that the MT example should be in
separate repository. reverse_words was intended to be the toy example of
Encoder-Decoder with attention, and I think that differences between it and
MT are not that significant, while it is much shorter.

On 11 November 2015 at 22:31, David Warde-Farley [email protected]
wrote:

Actually @orhanf https://github.com/orhanf, looking at it, I think I
just described the reverse_words example. [image: 😄]

—
Reply to this email directly or view it on GitHub
#49 (comment)
.

bartvm · 2015-11-12T03:46:56Z

Although I don't have anything against separate repositories, I think it
would be nice if there was an easy way for them to share things like
.travis.yml files and the like, so that when Travis breaks (like it did
when miniconda changed its path) we don't have to go through a dozen
repositories to make the same change everywhere.

On Wed, Nov 11, 2015 at 10:38 PM, Dzmitry Bahdanau <[email protected]

wrote:

Hey guys, this is just to say that I agree that the MT example should be in
separate repository. reverse_words was intended to be the toy example of
Encoder-Decoder with attention, and I think that differences between it and
MT are not that significant, while it is much shorter.

On 11 November 2015 at 22:31, David Warde-Farley <[email protected]

wrote:

Actually @orhanf https://github.com/orhanf, looking at it, I think I
just described the reverse_words example. [image: 😄]

—
Reply to this email directly or view it on GitHub
<
#49 (comment)

.

—
Reply to this email directly or view it on GitHub
#49 (comment)
.

dwf · 2015-11-20T01:14:15Z

An annoying data point: we have 15 open tickets on blocks-examples, and as of today, 9 of them are about NMT (including this one). I'm unwatching this repository so I don't keep getting email about NMT being broken.

rizar closed this as completed Oct 1, 2015

rizar reopened this Oct 1, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Keeping blocks-examples small #49

Keeping blocks-examples small #49

rizar commented Oct 1, 2015

rizar commented Oct 1, 2015

orhanf commented Oct 1, 2015

rizar commented Oct 1, 2015

dwf commented Nov 11, 2015

orhanf commented Nov 12, 2015

dwf commented Nov 12, 2015

rizar commented Nov 12, 2015

bartvm commented Nov 12, 2015

dwf commented Nov 20, 2015

Keeping blocks-examples small #49

Keeping blocks-examples small #49

Comments

rizar commented Oct 1, 2015

rizar commented Oct 1, 2015

orhanf commented Oct 1, 2015

rizar commented Oct 1, 2015

dwf commented Nov 11, 2015

orhanf commented Nov 12, 2015

dwf commented Nov 12, 2015

rizar commented Nov 12, 2015

bartvm commented Nov 12, 2015

dwf commented Nov 20, 2015