The current bigram model was built from a corpus of just over 1 million tokens (words and punctuation); the production version will probably need a corpus about five times that size. The current sparseness is illustrated by the prediction context "The j", where "Jewish" and "Jews" both appear in the prediction list, something that would not be expected after analysis of a larger, more representative English corpus.
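To make the sparseness concrete, here is a minimal sketch (not the project's actual code; all names are illustrative) of how a bigram predictor can surface near-duplicate candidates when the training corpus is small:

```python
from collections import Counter, defaultdict

def build_bigram_model(tokens):
    """Count word bigrams from a flat token list."""
    model = defaultdict(Counter)
    for prev, curr in zip(tokens, tokens[1:]):
        model[prev.lower()][curr] += 1
    return model

def predict(model, prev_word, prefix, k=5):
    """Rank the k most frequent successors of prev_word matching prefix."""
    candidates = model[prev_word.lower()]
    matches = [(w, c) for w, c in candidates.items()
               if w.lower().startswith(prefix.lower())]
    return [w for w, _ in sorted(matches, key=lambda m: -m[1])[:k]]

# With only ~1M tokens, the counts behind a context like ("the", "j")
# are thin, so closely related forms such as "Jewish" and "Jews" can
# both rank highly instead of one dominating.
tokens = "the Jewish community and the Jews of the city".split()
model = build_bigram_model(tokens)
print(predict(model, "The", "j"))  # e.g. ['Jewish', 'Jews']
```

With a larger corpus, the counts for the two forms would diverge enough that one would typically outrank the other rather than both appearing side by side.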