bratao
👻 Improving inefficiencies
  • Escavador

Highlights

  • Pro

Organizations

@FORMAS

Stars

NLP

90 repositories

Repository for the paper "Named Entity Recognition for Entity Linking: What Works and What's Next" (EMNLP 2021).

Python 75 2 Updated Feb 22, 2022

Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets

Python 128 15 Updated Nov 12, 2022

Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptatio…

Python 328 37 Updated Jul 6, 2023

A codebase that makes differentially private training of transformers easy.

Python 168 23 Updated Dec 9, 2022

Memory Efficient Attention (O(sqrt(n))) for Jax and PyTorch

Python 180 19 Updated Jan 6, 2023

Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch

Python 12 4 Updated Jan 16, 2022

Repository containing code for "How to Train BERT with an Academic Budget" paper

Python 311 46 Updated Sep 18, 2023

Implementation of RETRO, DeepMind's retrieval-based attention net, in PyTorch

Python 857 106 Updated Oct 30, 2023

A Serverless Text Annotation Tool for Corpus Development

JavaScript 56 19 Updated Dec 16, 2024

Automatically create Faiss knn indices with the most optimal similarity search parameters.

Python 833 74 Updated May 21, 2024

Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning

Python 709 61 Updated Apr 7, 2023

skweak: A software toolkit for weak supervision applied to NLP tasks

Python 923 75 Updated Sep 2, 2024

Represent, send, store and search multimodal data

Python 3,013 232 Updated Nov 22, 2024

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 21,817 1,492 Updated Feb 15, 2025

Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition, EACL 2021"

Python 383 41 Updated May 11, 2023

Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.

Python 392 33 Updated Dec 4, 2024

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 13,150 892 Updated Feb 7, 2025

A machine learning tool for fishing entities

Java 258 24 Updated Feb 14, 2025

SpanNER: Named Entity Re-/Recognition as Span Prediction

Python 124 18 Updated May 13, 2022

Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework)

Python 186 10 Updated Jun 24, 2022

A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.

Python 100 25 Updated Oct 4, 2023

For optimization algorithm research and development.

Python 490 37 Updated Feb 10, 2025

Fast and memory-efficient exact attention

Python 15,475 1,464 Updated Feb 12, 2025

Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models

Python 2,959 597 Updated Jul 19, 2024

Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

Python 1,251 179 Updated Feb 15, 2025

🕷️ The pipeline for the OSCAR corpus

Rust 166 14 Updated Dec 18, 2023

TLS and HTTP signature and fingerprint library

Python 39 2 Updated Jun 17, 2022

Large-scale pretrained models for goal-directed dialog

Python 862 112 Updated Dec 10, 2023

A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.

Python 6 Updated Nov 18, 2024

Download and load spaCy models on-the-fly

Python 14 1 Updated Feb 9, 2023