Rouge #103

YuhengHuang42 · 2020-05-21T03:11:48Z

Related to issue #6
implement ROUGE-N, ROUGE-W, ROUGE-L

add rouge-l and rouge-w

codecov-commenter · 2020-05-21T03:18:53Z

Codecov Report

Merging #103 into master will decrease coverage by 0.77%.
The diff coverage is 86.30%.

@@            Coverage Diff             @@
##           master     #103      +/-   ##
==========================================
- Coverage   94.41%   93.63%   -0.78%     
==========================================
  Files          64       65       +1     
  Lines        1611     1776     +165     
==========================================
+ Hits         1521     1663     +142     
- Misses         90      113      +23

Impacted Files	Coverage Δ
torchnlp/word_to_vector/fast_text.py	`100.00% <ø> (ø)`
torchnlp/metrics/rouge.py	`86.06% <86.06%> (ø)`
torchnlp/metrics/__init__.py	`100.00% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

PetrochukM

Thanks for the large contribution! With such a large contribution, there is a large responsibility to write maintainable and correct code.

Can you please take more time to ensure the code is readable and tested?

Thanks again!

PetrochukM · 2020-07-04T03:31:19Z

torchnlp/metrics/rouge.py

+class Ngrams(object):
+    """
+    datastructure for n grams.
+    if `exclusive`, datastructure is set


Can you please formally define Args?

PetrochukM · 2020-07-04T03:33:57Z

torchnlp/metrics/rouge.py

+    return ngram_set
+
+
+def _frp_rouge_n(eval_count, ref_count, overlapping_count):


Should this be factored out as a generic f1 score utility function?
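
A shared helper along these lines is one possibility. This is only a sketch: the name `f_score_from_counts`, the `beta` parameter, and the dictionary return format are assumptions for illustration and are not taken from the PR. It computes precision, recall, and the F-beta score from raw counts, which would cover the ROUGE-N case.

```python
def f_score_from_counts(eval_count, ref_count, overlapping_count, beta=1.0):
    """Return precision, recall and F-beta computed from raw counts.

    Hypothetical helper, not part of the PR.

    Args:
        eval_count: number of n-grams in the evaluated (candidate) text.
        ref_count: number of n-grams in the reference text.
        overlapping_count: number of n-grams shared by both texts.
        beta: relative weight of recall versus precision (1.0 gives F1).
    """
    # Guard against empty inputs instead of dividing by zero.
    precision = overlapping_count / eval_count if eval_count > 0 else 0.0
    recall = overlapping_count / ref_count if ref_count > 0 else 0.0
    denominator = beta**2 * precision + recall
    if denominator == 0:
        f_score = 0.0
    else:
        f_score = (1 + beta**2) * precision * recall / denominator
    return {"f": f_score, "p": precision, "r": recall}
```

With the zero guards in one place, callers would not need their own epsilon terms.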

PetrochukM · 2020-07-04T03:34:35Z

torchnlp/metrics/rouge.py

+    p_lcs = llcs / n
+    if not beta:
+        beta = p_lcs / (r_lcs + 1e-12)
+    num = (1 + (beta**2)) * r_lcs * p_lcs


Do you have a paper or link to learn more about this algorithm?
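
For context, the hunk resembles the LCS-based F-measure described in the ROUGE paper that the ROUGE-W docstring in this PR links to (https://www.aclweb.org/anthology/W04-1013.pdf): recall is LCS length over reference length, precision is LCS length over candidate length, and the two are combined with a weight `beta`. The sketch below is an independent illustration under that assumption. The names `lcs_length` and `rouge_l_scores` are hypothetical, and a fixed `beta=1.0` default is used rather than the PR's fallback.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of two token sequences."""
    table = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, x_token in enumerate(x, start=1):
        for j, y_token in enumerate(y, start=1):
            if x_token == y_token:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(x)][len(y)]


def rouge_l_scores(candidate, reference, beta=1.0):
    """Sentence-level LCS recall, precision and F-measure (hypothetical helper)."""
    llcs = lcs_length(candidate, reference)
    m, n = len(reference), len(candidate)
    r_lcs = llcs / m if m else 0.0  # recall: LCS over reference length
    p_lcs = llcs / n if n else 0.0  # precision: LCS over candidate length
    denominator = r_lcs + beta**2 * p_lcs
    f_lcs = (1 + beta**2) * r_lcs * p_lcs / denominator if denominator else 0.0
    return {"f": f_lcs, "p": p_lcs, "r": r_lcs}
```

For example, `rouge_l_scores("police kill the gunman".split(), "police killed the gunman".split())` shares the subsequence "police the gunman", so recall and precision are both 3/4.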

PetrochukM · 2020-07-04T03:37:47Z

torchnlp/metrics/rouge.py

+
+
+def get_rouge_w(evaluated_sentence, reference_sentence,
+                f=lambda x: x**2, inv_f=lambda x: math.sqrt(x)):


Suggested change

-                f=lambda x: x**2, inv_f=lambda x: math.sqrt(x)):
+                f=lambda x: x**2, inv_f=math.sqrt):

PetrochukM · 2020-07-04T03:38:11Z

torchnlp/metrics/rouge.py

+    Computes ROUGE-W of two sequences, namely evaluated_sentence and reference sentece
+    Reference: https://www.aclweb.org/anthology/W04-1013.pdf
+    Args:
+        evaluated_sentence: a sentence that have been produced by the summarizer


Can you make sure all the arguments are defined, including the optional arguments?

PetrochukM · 2020-07-04T03:50:21Z

torchnlp/metrics/rouge.py

+    return lcs_seq_wrd(len(x), len(y))
+
+
+def _w_lcs(x, y, func=lambda x: x**2):


This algorithm is fairly complex. Can you provide a full test suite to make sure it's correct?
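
As a starting point for such a suite, the sketch below pairs a reference weighted-LCS implementation with a few hand-checkable cases. It is not the PR's `_w_lcs`: the function `weighted_lcs` and the test name are hypothetical, and the dynamic program is written from the description in the ROUGE paper linked in the PR docstring (https://www.aclweb.org/anthology/W04-1013.pdf), where a run of `k` consecutive matches is scored `f(k)`. The sequences are modeled on that paper's worked example.

```python
def weighted_lcs(x, y, f=lambda k: k**2):
    """Weighted LCS score of two sequences (hypothetical reference version)."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]  # best weighted score so far
    w = [[0] * (n + 1) for _ in range(m + 1)]  # consecutive-match run ending at (i, j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                k = w[i - 1][j - 1]
                # Extending a run of length k to k + 1 adds f(k + 1) - f(k).
                c[i][j] = c[i - 1][j - 1] + f(k + 1) - f(k)
                w[i][j] = k + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
                w[i][j] = 0  # a mismatch breaks the run
    return c[m][n]


def test_weighted_lcs():
    x = list("ABCDEFG")
    # Four consecutive matches score f(4) = 16.
    assert weighted_lcs(x, list("ABCDHIK")) == 16
    # Four scattered matches score 4 * f(1) = 4.
    assert weighted_lcs(x, list("AHBKCID")) == 4
    # Edge cases: empty input and identical input.
    assert weighted_lcs(x, []) == 0
    assert weighted_lcs(x, x) == 49
    # A linear weighting reduces to the plain LCS length.
    assert weighted_lcs(x, list("AHBKCID"), f=lambda k: k) == 4
```

Comparing the PR's implementation against a small reference like this on random inputs would be another cheap check.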

PetrochukM · 2020-07-04T03:50:58Z

torchnlp/metrics/rouge.py

+       dictionary. WLCS-based F-measure score, P-score and R-score
+    """
+    if strict:
+        assert(check_increase(f))


Thanks for adding invariant checks!

PetrochukM · 2020-07-04T03:51:25Z

torchnlp/metrics/rouge.py

+    Returns:
+        bool, if f(x + y) > f(x) + f(y) or not
+    """
+    for i in range(100000):


Should this constant (100000) be parameterized?
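
One way to parameterize it is sketched below. The body of the PR's check is not shown in this hunk, so this is an assumption-laden illustration only: `is_superadditive` and `max_value` are hypothetical names, and the check simply tests the docstring's condition `f(x + y) > f(x) + f(y)` over positive integers up to a caller-supplied bound.

```python
def is_superadditive(f, max_value=100):
    """Check f(x + y) > f(x) + f(y) for all positive integers x, y <= max_value.

    Hypothetical helper; `max_value` replaces a hard-coded loop bound.
    """
    for x in range(1, max_value + 1):
        # The condition is symmetric in x and y, so y >= x is enough.
        for y in range(x, max_value + 1):
            if f(x + y) <= f(x) + f(y):
                return False
    return True
```

A small default keeps the check fast, and callers who want a more thorough sweep can raise the bound.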

PetrochukM · 2020-07-04T03:52:22Z

torchnlp/metrics/rouge.py

+    return True
+
+
+def _frp_rouge_w(wlcs, m, n, f=lambda x: x**2, inv_f=lambda x: math.sqrt(x),


Do you need to repeat these lambdas a second time? I'm thinking about the principles of DRY.
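
A possible way to avoid the repetition is to define the default weighting pair once at module level and reuse it in every signature. In this sketch the names `DEFAULT_WEIGHT`, `DEFAULT_INVERSE_WEIGHT`, and `wlcs_recall` are all hypothetical. The recall formula is the WLCS recall from the ROUGE paper linked in the PR docstring, and is assumed rather than copied from the PR.

```python
import math


def _square(x):
    """Default weighting function f(k) = k^2."""
    return x**2


# Single source of truth for the default weighting function and its inverse.
DEFAULT_WEIGHT = _square
DEFAULT_INVERSE_WEIGHT = math.sqrt


def wlcs_recall(wlcs, m, f=DEFAULT_WEIGHT, inv_f=DEFAULT_INVERSE_WEIGHT):
    """WLCS-based recall: inv_f(wlcs / f(m)), with m the reference length."""
    return inv_f(wlcs / f(m))
```

Changing the default weighting would then touch one place instead of every function signature.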

PetrochukM · 2020-07-04T03:52:35Z

torchnlp/metrics/rouge.py

+       n: number of words in candidate summary
+       f: weighting function
+       inv_f: inverse function of weighting function
+       beta: beta = P_lcs / R_lcs when ∂ F_lcs / ∂ R_lcs = ∂ F_lcs / ∂ P_lcs.


`strict` is not defined :(

YuhengHuang42 · 2020-07-04T04:51:39Z

> Thanks for the large contribution! With such a large contribution, there is a large responsibility to write maintainable and correct code.
>
> Can you please take more time to ensure the code is readable and tested?
>
> Thanks again!

Sorry for the inconvenience. I should have mentioned that the code here was written mainly to learn the project, so I didn't focus on making it maintainable.
Some undefined variables may be the result of merge problems.

I take responsibility for this. However, I am not currently available to rewrite the code and do the testing, and I cannot promise to do so in the future. I will try to fix the problems when I am available.
Still, I will list some references so that anyone interested in this work can help.

Rouge. This implementation is more complete than the TensorFlow one.

What is ROUGE and how it works for evaluation of summarization tasks? I used this documentation as a reference for ROUGE-N and tested against the examples there.

My friend may be able to give more paper references, since ROUGE-L and ROUGE-W were implemented by him (@ST-Saint).

PetrochukM · 2020-07-04T04:56:11Z

:) No problem!

Do you feel like you learned a lot?

If you'd like, I can close the PR and you don't have to worry about fixing the implementation.

YuhengHuang42 · 2020-07-04T04:58:37Z

> :) No problem!
>
> Do you feel like you learned a lot?
>
> If you'd like, I can close the PR and you don't have to worry about fixing the implementation.

Yes, sorry again for the inconvenience. I will open another PR if I rewrite the code and do the testing. I shouldn't have taken up your time with something that was mainly a learning exercise. I hope I can contribute to the project in the future. Thanks a lot!

PetrochukM · 2020-07-04T05:04:45Z

I'm happy to help! Thanks for contributing!

YuhengHuang42 and others added 8 commits May 16, 2020 11:57

Support loading fasttext model from custom file

c45977d

fix problem in .flake8

878722c

add rouge-N

a09b9c3

add rouge-l and rouge-w

c1e90df

fix annotation errors

f5c9ce5

Merge pull request #1 from ST-Saint/rouge-l

e25c6be

add rouge-l and rouge-w

fix code-style problem

dd7e6e6

making fast_text the same with maste branch

8d30049

PetrochukM added 2 commits June 30, 2020 22:07

Merge branch 'master' into rouge

112f6d0

Merge branch 'master' into rouge

6d60aa8

PetrochukM suggested changes Jul 4, 2020

PetrochukM closed this Jul 4, 2020
