Commit

Extract all the spark specific code
jelmerk committed Dec 30, 2023
1 parent 0f71351 commit 71fb2e7
Showing 87 changed files with 57 additions and 7,139 deletions.
7 changes: 6 additions & 1 deletion .github/workflows/ci.yml
@@ -1,5 +1,8 @@
name: CI pipeline

permissions:
checks: write

on:
pull_request:
paths:
@@ -9,6 +12,8 @@ on:
- '*'
tags-ignore:
- 'v[0-9]+.[0-9]+.[0-9]+'
paths-ignore:
- '**.md'

jobs:
ci-pipeline:
@@ -40,7 +45,7 @@ jobs:
3.9
- name: Build and test
run: |
sbt -java-home "$JAVA_HOME_17_X64" clean +test -DsparkVersion="$SPARK_VERSION"
sbt -java-home "$JAVA_HOME_8_X64" clean +test -DsparkVersion="$SPARK_VERSION"
- name: Publish Unit test results
uses: mikepenz/action-junit-report@v4
with:
9 changes: 6 additions & 3 deletions .github/workflows/publish.yml
@@ -1,11 +1,13 @@
name: Publish pipeline

on:
workflow_dispatch:
permissions:
contents: read

on:
push:
tags:
- 'v[0-9]+.[0-9]+.[0-9]+'
workflow_dispatch:

jobs:
publish-artifacts:
@@ -45,4 +47,5 @@ jobs:
PASSPHRASE: ${{ secrets.GPG_PASSPHRASE }}
- name: Publish artifacts
run: |
sbt -java-home "$JAVA_HOME_17_X64" clean +publishSigned -DsparkVersion="$SPARK_VERSION"
sbt -java-home "$JAVA_HOME_8_X64" clean +publishSigned -DsparkVersion="$SPARK_VERSION"
sbt sonatypeBundleRelease
5 changes: 5 additions & 0 deletions .github/workflows/release.yml
@@ -1,5 +1,8 @@
name: Release pipeline

permissions:
contents: write

on:
workflow_dispatch:
inputs:
@@ -13,6 +16,8 @@ jobs:
steps:
- name: Checkout main branch
uses: actions/checkout@v3
with:
token: ${{ secrets.RELEASE_TOKEN }}
- name: Release
run: |
git config --global user.email "[email protected]"
16 changes: 0 additions & 16 deletions .run/Template ScalaTest.run.xml

This file was deleted.

1 change: 0 additions & 1 deletion .sbtopts

This file was deleted.

2 changes: 1 addition & 1 deletion .sdkmanrc
@@ -1,5 +1,5 @@
# Enable auto-env through the sdkman_auto_env config
# Add key=value pairs of SDKs to use below
java=17.0.9-amzn
java=8.0.382-amzn
scala=2.12.18
sbt=1.9.8
95 changes: 0 additions & 95 deletions FAQ.md

This file was deleted.

37 changes: 6 additions & 31 deletions README.md
@@ -1,35 +1,10 @@
[![Build Status](https://github.com/jelmerk/hnswlib/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/jelmerk/hnswlib/actions/workflows/ci.yml)
[![Build Status](https://github.com/jelmerk/hnswlib-spark/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/jelmerk/hnswlib/actions/workflows/ci.yml)

Hnswlib
=======
Hnswlib spark
=============

Spark / PySpark integration for the [hsnwlib](https://github.com/jelmerk/hnswlib) project that was originally part of
the core hnswlib project.

Java implementation of the [the Hierarchical Navigable Small World graphs](https://arxiv.org/abs/1603.09320) (HNSW) algorithm for doing approximate nearest neighbour search.

The index is thread safe, serializable, supports adding items to the index incrementally and has experimental support for deletes.

It's flexible interface makes it easy to apply it to use it with any type of data and distance metric.

The following distance metrics are currently pre-packaged :

- bray curtis dissimilarity
- canberra distance
- correlation distance
- cosine distance
- euclidean distance
- inner product
- manhattan distance

It comes with [spark integration](https://github.com/jelmerk/hnswlib/tree/master/hnswlib-spark), [pyspark integration](https://github.com/jelmerk/hnswlib/tree/master/hnswlib-pyspark) and a [scala wrapper](https://github.com/jelmerk/hnswlib/tree/master/hnswlib-scala) that should feel native to scala developers

To find out more about how to use this library take a look at the [hnswlib-examples](https://github.com/jelmerk/hnswlib/tree/master/hnswlib-examples) module or browse the documentation
To find out more about how to use this library take a look at the [hnswlib-spark-examples](https://github.com/jelmerk/hnswlib-spark/tree/master/hnswlib-spark-examples) module or browse the documentation
in the readme files of the submodules

Sponsors
--------

![YourKIT logo](https://www.yourkit.com/images/yklogo.png)

YourKit is the creator of [YourKit Java Profiler](https://www.yourkit.com/java/profiler),
[YourKit .NET Profiler](https://www.yourkit.com/.net/profiler/),
and [YourKit YouMonitor](https://www.yourkit.com/youmonitor/).