Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keeping blocks-examples small #49

Open
rizar opened this issue Oct 1, 2015 · 9 comments
Open

Keeping blocks-examples small #49

rizar opened this issue Oct 1, 2015 · 9 comments

Comments

@rizar
Copy link
Contributor

rizar commented Oct 1, 2015

Oops, I first created this one empty.

Blocks-examples is an extremely important repository in Blocks family. As @dwf said in an off-line conversation, there is very little sense in having a framework without working examples of how it can be used.

That said, we will have to update all the examples here every time anything is changed in Blocks. This is why it is essential to limit the growth of this repo. One way to do it is to stick to examples for which ratio 'educational value/size' is the largest one.

Of the examples currently available machine translation demo is by far the largest one, and I think should live in a separate repository. I do not know why it was moved from https://github.com/kyunghyuncho/NMT to Blocks-examples in the first place, but I think this historical error should be fixed.

@rizar rizar closed this as completed Oct 1, 2015
@rizar
Copy link
Contributor Author

rizar commented Oct 1, 2015

@orhanf , would you mind? It we go this way, you will not have to wait for reviews for weeks to make changes in your code.

(don't pay attention to all this random closing and opening)

@rizar rizar reopened this Oct 1, 2015
@orhanf
Copy link
Contributor

orhanf commented Oct 1, 2015

@rizar i'm okay with any consensus.

I am only worried about a few points. If this example is going to be maintained in another repo, then it will be something like "an NMT implementation using (some version of) Blocks", not a reference "Blocks example for NMT", what i am trying to say is, the example will work only with a specific blocks, fuel, picklable_itertools version. It still has to be synched periodically to be a proper and valid example, otherwise it will be outdated and hardly be called as a blocks-example. I can continue working on this separate repo and i am totally okay with it, but synching issue has to be solved somehow.

This was moved from https://github.com/kyunghyuncho/NMT because back then, it was a private repo (for WMT'15 challenge), and was made puclic only last month or the other. Also, it was seriously out of sync with blocks, hence i adapted almost everything with a clean start.

Since this was originally assigned to me as a ccw ticket - a loong one :) , i am also curious about supervisors' opinions, @bartvm and @dmitriy-serdyuk.

One additional thing specific to machine_translation example: I think there are lots of utilities/features that are not necessary or do not have to be there, like stream, prepare_data.py, BleuValidator, early_stopping, checkpointing etc. Because the actual core computational graph is only about 300 lines of code, the rest is, i think, optional, and may be we can leave a minimalist, stripped down version of NMT at blocks-examples and maintain the rest at another repo ?

@rizar
Copy link
Contributor Author

rizar commented Oct 1, 2015

@orhanf , thanks for your quick response and active position.

Regarding you first concern: indeed, every example will have to be regularly synchronized to be compatible with the latest Blocks. This is exactly the reason why I think we should have as few as possible Blocks examples, but we can have an unlimited number of example projects using Blocks. What I would like to do is to transfer the machine translation code from the first category to the second one. Because it is very expensive for us in terms of manpower to bear full responsibility that the latest NMT code is 100% correct and compatible with the latest Blocks. For the same reason DRAW and speech recognition implementations should not be in Blocks-examples.

Also, the machine translation example, upon separation from Blocks-examples, can use a fixed version of Blocks. We do plan to make regular versions, expect 0.1 to be released within one month the latest.

All the utilities are in fact very useful for an NMT implementation, so I am not sure that having an NMT example without them makes sense. On the other hand, reverse_words is in fact a stripped down example of machine translation.

I agree that it would be great to hear @bartvm and @dmitriy-serdyuk. As far as I understood from preliminary offline discussions @dmitriy-serdyuk shares my concerns.

Just to make it clear here: machine translation demo is a great piece of code which is very useful for the lab and for the community in general, but I would like to keep it out of the area of direct responsibility of Blocks+Fuel development team. My concern is sustainability here: the code is growing much faster than the number of active developers...

@dwf
Copy link
Contributor

dwf commented Nov 11, 2015

Just a thought: would it be possible to have an extremely stripped down version of an encoder-decoder-with-attention in blocks-examples, with a pointer to the whole WMT15 shebang somewhere else?

@orhanf
Copy link
Contributor

orhanf commented Nov 12, 2015

I agree with @dwf on this, or may be mt example can be moved under Example Projects completely, like the other real world examples ?

@dwf
Copy link
Contributor

dwf commented Nov 12, 2015

Actually @orhanf, looking at it, I think I just described the reverse_words example. 😄

@rizar
Copy link
Contributor Author

rizar commented Nov 12, 2015

Hey guys, this is just to say that I agree that the MT example should be in
separate repository. reverse_words was intended to be the toy example of
Encoder-Decoder with attention, and I think that differences between it and
MT are not that significant, while it is much shorter.

On 11 November 2015 at 22:31, David Warde-Farley [email protected]
wrote:

Actually @orhanf https://github.com/orhanf, looking at it, I think I
just described the reverse_words example. [image: 😄]


Reply to this email directly or view it on GitHub
#49 (comment)
.

@bartvm
Copy link
Member

bartvm commented Nov 12, 2015

Although I don't have anything against separate repositories, I think it
would be nice if there was an easy way for them to share things like
.travis.yml files and the like, so that when Travis breaks (like it did
when miniconda changed its path) we don't have to go through a dozen
repositories to make the same change everywhere.

On Wed, Nov 11, 2015 at 10:38 PM, Dzmitry Bahdanau <[email protected]

wrote:

Hey guys, this is just to say that I agree that the MT example should be in
separate repository. reverse_words was intended to be the toy example of
Encoder-Decoder with attention, and I think that differences between it and
MT are not that significant, while it is much shorter.

On 11 November 2015 at 22:31, David Warde-Farley <[email protected]

wrote:

Actually @orhanf https://github.com/orhanf, looking at it, I think I
just described the reverse_words example. [image: 😄]


Reply to this email directly or view it on GitHub
<
#49 (comment)

.


Reply to this email directly or view it on GitHub
#49 (comment)
.

@dwf
Copy link
Contributor

dwf commented Nov 20, 2015

An annoying data point: we have 15 open tickets on blocks-examples, and as of today, 9 of them are about NMT (including this one). I'm unwatching this repository so I don't keep getting email about NMT being broken.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants