MUR Task #17

Open
gyhou123 opened this issue Apr 26, 2023 · 1 comment

@gyhou123

Hello, I have a question about the following line of code.

_, mlm_tgt_encodings, *_ = self.utt_encoder.bert(context_mlm_targets[ctx_mlm_mask], context_utts_attn_mask[ctx_mlm_mask])

context_mlm_targets[ctx_mlm_mask] represents the tokenized utterances before [MASK] is applied.
context_utts_attn_mask[ctx_mlm_mask] represents the attention mask after [MASK] is applied.

They don't match.
Why not recalculate the attention mask?
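
For readers following along, here is a minimal sketch of the indexing being discussed; the tensor names come from the snippet above, while the shapes and values are assumed purely for illustration:

```python
import torch

# Assumed toy shapes: batch of 2 contexts, 4 utterances each, 10 tokens per utterance.
context_mlm_targets = torch.randint(0, 30522, (2, 4, 10))        # original (unmasked) token ids
context_utts_attn_mask = torch.ones(2, 4, 10, dtype=torch.long)  # 1 = real token position
ctx_mlm_mask = torch.tensor([[True, False, True, False],
                             [False, True, False, False]])       # which utterances were selected for masking

# Both tensors are indexed by the same utterance-level boolean mask, so the
# selected token ids and attention masks stay row-aligned:
tgt_ids  = context_mlm_targets[ctx_mlm_mask]      # shape (3, 10)
tgt_attn = context_utts_attn_mask[ctx_mlm_mask]   # shape (3, 10)
```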

@guxd
Owner

guxd commented May 24, 2023

By saying [MASK], do you mean masking utterances in contexts or masking words in utterances?
If the former, then 'context_utts_attn_mask' represents the attention mask before [MASK].
Please check Line 249 in data_loader.py: context_utts_attn_mask = [[1]*len(utt) for utt in context], which does not set masked positions to 0.
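
To make that point concrete, here is a hedged reconstruction of the cited line; the example context values are assumed:

```python
# Assumed toy input: one list of token ids per utterance in the context.
context = [[101, 2054, 2003, 102], [101, 2009, 2003, 1037, 3899, 102]]

# data_loader.py, Line 249: the attention mask is built from the *original*
# utterance lengths, so every real token position gets a 1.
context_utts_attn_mask = [[1] * len(utt) for utt in context]
print(context_utts_attn_mask)  # [[1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]

# Masked positions are not zeroed out here, which (if the reading above is
# right) is why the same mask can be paired with the unmasked targets.
```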
