Become a sponsor to Adrien Barbaresi
Hi there!
I enjoy exploring new ideas and sharing what I learn while I build in the open. I also maintain projects that have been widely adopted and downloaded millions of times, and that now rely on your support.
My current focus:
- Developing software solutions and resources (Trafilatura, Simplemma, Courlan)
- Actively maintaining packages (Htmldate, Py3langid)
- Publishing a list of natural language processing resources for German (German-NLP)
By supporting me, you will give me more incentive and resources to add new features to the projects. You will help maintain and enhance popular packages, ensuring their growth, robustness and ease of use for everyone.
Thank you a thousand times if you consider sponsoring me!
Featured work
- adbar/trafilatura
  Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
  Python 3,814
- adbar/German-NLP
  Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German
- adbar/courlan
  Clean, filter and sample URLs to optimize data collection - Python & command-line - Deduplication, spam, content and language filters
  Python 129
- adbar/htmldate
  Fast and robust date extraction from web pages, with Python or on the command-line
  Python 121
- adbar/simplemma
  Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
  Python 149
- adbar/py3langid
  Faster, modernized fork of the language identification tool langid.py
  Python 50