This is a Python module for creating count-based distributional models for semantic analysis. It was developed within the Nephological Semantics project at KU Leuven, written mostly by Tao Chen in collaboration with Dirk Geeraerts, Dirk Speelman, Kris Heylen, Weiwei Zhang, Karlien Franco, Stefano De Pascale and Mariana Montes.
The code can be used as it is, but it still lacks thorough automated tests.
To use this code, clone this repository, add it to your Python module search path and then import the nephosem
library:
import sys
# Make the cloned repository visible to Python's import system
sys.path.append('/path/to/repository')
import nephosem
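If the import fails, the usual cause is a wrong repository path. The following is a minimal sketch of a more defensive setup, not part of nephosem itself: the helper names `add_repository_to_path` and `is_importable` are hypothetical and only use the Python standard library.

```python
import importlib.util
import sys
from pathlib import Path


def add_repository_to_path(repo_dir):
    """Append repo_dir to sys.path (only once) and return the resolved path."""
    resolved = str(Path(repo_dir).expanduser().resolve())
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


def is_importable(module_name):
    """Return True if Python can locate a top-level module on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None
```

With these helpers you would call `add_repository_to_path('/path/to/repository')` and then check `is_importable('nephosem')` before running `import nephosem`. If the check returns `False`, the path most likely does not point to the directory that contains the `nephosem` package.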
The theoretical framework and methodology followed in the project were presented by Mariana Montes and Karlien Franco at the II Jornadas de Lingüística y Gramática Española on October 1, 2021. You can watch the presentation in English or dubbed into Spanish.
Schütze, Hinrich. 1998. Automatic Word Sense Discrimination. Computational Linguistics 24(1). 97–123.
De Pascale, Stefano. 2019. Token-based vector space models as semantic control in lexical lectometry. Leuven: KU Leuven PhD Dissertation. (8 November, 2019).
De Pascale, Stefano & Weiwei Zhang. 2021. Scoring with Token-based Models. A Distributional Semantic Replication of Sociolectometric Analyses in Geeraerts, Grondelaers, and Speelman (1999). In Gitte Kristiansen, Karlien Franco, Stefano De Pascale, Laura Rosseel & Weiwei Zhang (eds.), Cognitive Sociolinguistics Revisited, 186–199. De Gruyter. https://doi.org/10.1515/9783110733945-021.
Montes, Mariana. 2021. Cloudspotting: visual analytics for distributional semantics. Leuven: KU Leuven PhD Dissertation.
Montes, Mariana, Karlien Franco & Kris Heylen. 2021. Indestructible Insights. A Case Study in Distributional Prototype Semantics. In Gitte Kristiansen, Karlien Franco, Stefano De Pascale, Laura Rosseel & Weiwei Zhang (eds.), Cognitive Sociolinguistics Revisited, 251–263. De Gruyter. https://doi.org/10.1515/9783110733945-021.
Montes, Mariana & Kris Heylen. 2022. Visualizing Distributional Semantics. In Dennis Tay & Molly Xie Pan (eds.), Data Analytics in Cognitive Linguistics. Methods and Insights. Mouton De Gruyter.
Heylen, Kris, Dirk Speelman & Dirk Geeraerts. 2012. Looking at word meaning. An interactive visualization of Semantic Vector Spaces for Dutch synsets. In Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH, 16–24. Avignon.
Heylen, Kris, Thomas Wielfaert, Dirk Speelman & Dirk Geeraerts. 2015. Monitoring polysemy: Word space models as a tool for large-scale lexical semantic analysis. Lingua 157. 153–172.
Speelman, Dirk, Stefan Grondelaers, Benedikt Szmrecsanyi & Kris Heylen. 2020. Schaalvergroting in het syntactische alternantieonderzoek: Een nieuwe analyse van het presentatieve er met automatisch gegenereerde predictoren. Nederlandse Taalkunde 25(1). 101–123. https://doi.org/10.5117/NEDTAA2020.1.005.SPEE.