I've uploaded word clouds for every language in "docs". I also realized that Roni Kaufman and Clément Godbarge have already made the charts I had in mind: language/category and language/parent tag correlations. If you agree, I think I might as well use theirs and add my own comments and observations on what this data can tell us. Thank you!
Can you describe how you created these word clouds?
How are you populating the data? We noticed that not all terms/phrases were represented and some seemed cut off.
What app are you using to generate the clouds?
Do the colors of the words have any significance/meaning?
Perhaps it would be better to generate a vocabulary diversity score for each language and plot it as a bar chart or scatter plot?
--> These questions of methodology (i.e. how you made these charts) should be answered in this repo and also in your final paper.
It would be great to use what Roni and Clement have already generated if they are of use to you and your argument. Just remember to cite them
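For reference, the diversity-score idea above could be sketched roughly as follows. This is only a minimal illustration, not the method used in this repo: it assumes the cleaned data can be read as (language, text) pairs, and it uses the type-token ratio (unique words divided by total words) as the score. The function names, the sample rows, and the output filename are all hypothetical. Note that the type-token ratio tends to drop as texts get longer, so languages with very different amounts of text are not directly comparable without some normalization.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase the text and keep runs of letters only (Unicode-aware).
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens):
    # Unique words divided by total words; 0.0 for an empty list.
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def diversity_by_language(rows):
    # rows: iterable of (language, text) pairs (an assumed layout).
    tokens_by_lang = defaultdict(list)
    for lang, text in rows:
        tokens_by_lang[lang].extend(tokenize(text))
    return {lang: type_token_ratio(toks) for lang, toks in tokens_by_lang.items()}


def plot_scores(scores, path="diversity.png"):
    # Optional bar chart; requires matplotlib to be installed.
    import matplotlib.pyplot as plt

    langs = sorted(scores, key=scores.get, reverse=True)
    plt.bar(langs, [scores[lang] for lang in langs])
    plt.ylabel("Type-token ratio")
    plt.xlabel("Language")
    plt.savefig(path)
    plt.close()


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [("fr", "eau eau sel"), ("la", "aqua sal")]
    print(diversity_by_language(sample))
```

Calling `plot_scores(diversity_by_language(rows))` would then write the bar chart to an image file that can be committed alongside the word clouds.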
I created the word clouds using an online word cloud generator (https://www.freewordcloudgenerator.com/), following Terry's advice. I extracted the texts from the tc dataset after cleaning the data from GitHub in Excel. Maybe some text was lost in the process - I'll double-check. I'll take your suggestion about diversity scores plotted as bar charts, maybe with a different color for each tag in the manuscript, in order to provide another visualization of the language/tag correlation.
I'll cite Roni and Clement of course, and make sure to discuss my methodology in my paper. Thank you again for your precious help!
@njr2128 @tcatapano