Splink4 banner #2252

Merged
merged 3 commits into from
Jul 15, 2024
13 changes: 7 additions & 6 deletions README.md
@@ -7,19 +7,20 @@
[![Documentation](https://img.shields.io/badge/API-documentation-blue)](https://moj-analytical-services.github.io/splink/)

> [!IMPORTANT]
-> Development has begun on Splink 4 on the `splink4_dev` branch. Splink 3 is in maintenance mode and we are no longer accepting new features. We welcome contributions to Splink 4. Read more on our latest [blog](https://moj-analytical-services.github.io/splink/blog/2024/04/02/splink-3-updates-and-splink-4-development-announcement---april-2024.html).
+> 🎉 Splink 4 is nearing release! We'd love your feedback - try it by installing the [prerelease](https://pypi.org/project/splink/4.0.0.dev7/). Examples of new syntax are [here](https://robinl.github.io/splink/demos/examples/examples_index.html) and a blog about our aims is [here](https://moj-analytical-services.github.io/splink/blog/2024/04/02/splink-3-updates-and-splink-4-development-announcement---april-2024.html) 🎉
+

# Fast, accurate and scalable probabilistic data linkage

Splink is a Python package for probabilistic record linkage (entity resolution) that allows you to deduplicate and link records from datasets that lack unique identifiers.

## Key Features

-⚡ **Speed:** Capable of linking a million records on a laptop in around a minute.
-🎯 **Accuracy:** Support for term frequency adjustments and user-defined fuzzy matching logic.
-🌐 **Scalability:** Execute linkage in Python (using DuckDB) or big-data backends like AWS Athena or Spark for 100+ million records.
-🎓 **Unsupervised Learning:** No training data is required for model training.
-📊 **Interactive Outputs:** A suite of interactive visualisations help users understand their model and diagnose problems.
+⚡ **Speed:** Capable of linking a million records on a laptop in around a minute.
+🎯 **Accuracy:** Support for term frequency adjustments and user-defined fuzzy matching logic.
+🌐 **Scalability:** Execute linkage in Python (using DuckDB) or big-data backends like AWS Athena or Spark for 100+ million records.
+🎓 **Unsupervised Learning:** No training data is required for model training.
+📊 **Interactive Outputs:** A suite of interactive visualisations help users understand their model and diagnose problems.

Splink's linkage algorithm is based on Fellegi-Sunter's model of record linkage, with various customisations to improve accuracy.

2 changes: 1 addition & 1 deletion docs/overrides/main.html
@@ -3,6 +3,6 @@
{% block announce %}
<!-- Add announcement here, including arbitrary HTML -->

-<center>🎉 Check out our new <a href="https://moj-analytical-services.github.io/splink/blog/2024/04/02/splink-3-updates-and-splink-4-development-announcement---april-2024.html">Blog Post</a> to see the latest features in Splink 3, plus a sneak peak of what we have in store for Splink 4! 🎉</center>
+<center>🎉 Splink 4 is nearing release! We'd love your feedback - try it by installing the <a href ="https://pypi.org/project/splink/4.0.0.dev7/">prerelease</a>. Examples of new syntax are <a href="https://robinl.github.io/splink/demos/examples/examples_index.html">here</a> a blog about our aims is <a href="https://moj-analytical-services.github.io/splink/blog/2024/04/02/splink-3-updates-and-splink-4-development-announcement---april-2024.html">here</a> 🎉</center>

{% endblock %}