Finding a Proper Fix/Replacement for Coreferee? #9
Comments
From Dr. Lynch in #7 (now merged with the current issue): Early in the development of the components, an issue was found with the coreferee/spaCy module: it makes errors when it calculates antecedents for third-person plural pronouns. The workaround was to develop a link to a separate, untested BERT server. In the GRE passage I used for testing, the pronoun "they" referred to animates, but the coreferee module overrode the semantic valence of the verb and preferred the syntactically most prominent potential antecedent, which was inanimate. We need to (a) develop unit tests for this that can reliably evaluate whether the full system is working; and (b) evaluate the relative cost of doing this probability evaluation with BERT, spaCy/Coreferee, and LanguageTool, which appears to have a built-in probability estimation feature. The code that uses the BERT service is located in awe_components/components/utility_functions.py under ResolveReferences. The code that runs the BERT service is under AWE_Workbench.
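As a starting point for (a) and (b), here is a minimal sketch of a harness that scores any resolver for accuracy and per-case latency. Everything in it is an assumption for illustration: the names `PronounCase`, `evaluate`, and `first_candidate_resolver` are hypothetical, the two sentences are invented (they are not the GRE passage), and the stand-in resolver only imitates a "most prominent antecedent wins" bias. A real test would wrap the coreferee pipeline, the BERT service call in ResolveReferences, or LanguageTool behind the same callable signature.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PronounCase:
    """One hand-labelled test item (hypothetical structure)."""
    text: str
    pronoun: str
    candidates: List[str]  # candidate antecedents, in order of appearance
    expected: str          # the antecedent a human annotator would pick


# A resolver takes a case and returns the antecedent it chose.
Resolver = Callable[[PronounCase], str]

# Invented examples: in the first one the subject is inanimate, so a
# prominence-based choice is wrong; in the second it happens to be right.
CASES = [
    PronounCase(
        text="The reports alarmed the parents because they worried about safety.",
        pronoun="they",
        candidates=["reports", "parents"],
        expected="parents",
    ),
    PronounCase(
        text="The students opened the books after they sat down.",
        pronoun="they",
        candidates=["students", "books"],
        expected="students",
    ),
]


def first_candidate_resolver(case: PronounCase) -> str:
    """Stand-in resolver: always picks the first (most prominent) candidate."""
    return case.candidates[0]


def evaluate(resolver: Resolver, cases: List[PronounCase]) -> Dict[str, float]:
    """Return accuracy and mean wall-clock seconds per case for a resolver."""
    if not cases:
        raise ValueError("need at least one test case")
    correct = 0
    start = time.perf_counter()
    for case in cases:
        if resolver(case).lower() == case.expected.lower():
            correct += 1
    elapsed = time.perf_counter() - start
    return {
        "accuracy": correct / len(cases),
        "seconds_per_case": elapsed / len(cases),
    }


if __name__ == "__main__":
    print(evaluate(first_candidate_resolver, CASES))
```

Running the same `CASES` through each backend would give comparable accuracy and cost figures; the latency number is only meaningful if model loading and server startup are kept outside the timed loop.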
Related to coreferee issue here: richardpaulhudson/coreferee#29 Other related:
Moved issue to AWE_Components, since that's where the solution will sit.
From Paul: multiple_essay_report.py is a script used to visualize document/token features of our spaCy NLP pipeline at work. This script can be used to verify whether pronouns like "they" are properly used in a document. Currently, Paul has noted that coreferee (which is used for coreference resolution) fails to do this properly with the current version of spaCy, specifically for pronoun antecedents. Our text examples include essays written for the GRE, which use pronouns in ways that spaCy and coreferee are not trained to handle or identify properly. Now that we are updating to spaCy 3.6+, we need to see whether coreferee continues to do poorly; finally, we would like to see whether other coreference modules perform better.