FuzzyWuzzy MIT? #84
Comments
It is not possible to use python-Levenshtein in an MIT-licensed library, which is the reason FuzzyWuzzy has the GPL license.
@maxbachmann One could also look into calling the C code directly through the Java Native Interface (JNI) instead of porting it. I don't have a lot of experience with what that means performance-wise, though.
@Chase22 I already do something similar for Python. When using small strings with a fast similarity metric, there is a relevant performance impact. However, the main reason for this is that Python calls functions with a list of arguments and a hashmap of named arguments, which has to be parsed on each call. I can think of the following advantages/disadvantages of the JNI:
+ Probably less maintenance, since it reuses a big part of the code. Note, however, that in Python the wrapper that calls the C++ code is actually much bigger than the code itself (partially because much of it is generated). The C++ library has around 5k lines of code, while the wrapper has over 50k lines of code.
- I guess it would have to be compiled for each platform, which can be a pain.
-/+ Performance-wise I am unsure as well. The JNI might add relevant overhead (e.g. if all strings have to be copied). However, the algorithms make heavy use of bitwise operations, which might be slower in pure Java. So this might go either way.
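To make the "heavy use of bitwise operations" point concrete, here is a minimal pure-Java sketch of a bit-parallel Levenshtein distance (Hyyrö's variant of Myers' bit-vector algorithm). It is not taken from any library discussed in this thread. The class and method names are hypothetical, and limiting the first string to 64 characters (one `long` word) is a simplifying assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not code from any library in this thread.
final class BitParallelLevenshtein {

    // Levenshtein distance with one 64-bit word per DP column.
    // Assumption: s1 has at most 64 characters.
    static int distance(String s1, String s2) {
        int m = s1.length();
        if (m == 0) {
            return s2.length();
        }
        if (m > 64) {
            throw new IllegalArgumentException("s1 must have at most 64 characters");
        }

        // For each character of s1, a bitmask of the positions where it occurs.
        Map<Character, Long> patternMask = new HashMap<>();
        for (int i = 0; i < m; i++) {
            patternMask.merge(s1.charAt(i), 1L << i, (a, b) -> a | b);
        }

        long vp = ~0L;              // positions where the vertical delta is +1
        long vn = 0L;               // positions where the vertical delta is -1
        long last = 1L << (m - 1);  // bit of the last row of the DP matrix
        int dist = m;

        for (int j = 0; j < s2.length(); j++) {
            long x = patternMask.getOrDefault(s2.charAt(j), 0L);
            long d0 = (((x & vp) + vp) ^ vp) | x | vn;  // diagonal delta is 0
            long hp = vn | ~(d0 | vp);                  // horizontal delta +1
            long hn = d0 & vp;                          // horizontal delta -1

            if ((hp & last) != 0) dist++;
            if ((hn & last) != 0) dist--;

            hp = (hp << 1) | 1L;
            hn = hn << 1;
            vp = hn | ~(d0 | hp);
            vn = hp & d0;
        }
        return dist;
    }
}
```

Each character of `s2` costs a constant number of word operations instead of a full row of the dynamic-programming table. Whether this is faster than a JNI call into C++ for short strings would have to be measured; the sketch only shows that the technique itself needs nothing beyond Java's `long` arithmetic.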
GPL applies to the original Python code ONLY. If you rewrite it in Java, isn't that no longer considered use of the original code, but a completely different code base? GPL does not apply to the algorithm, only to the use of the original code "as is".
GPL applies to any derivative work. https://github.com/xdrop/fuzzywuzzy/blob/e8376dfdc1c0cb72f7924f3a347bfcd39855dbeb/diffutils/src/me/xdrop/diffutils/DiffUtils.java is pretty much a 1:1 copy of a GPL licensed implementation. It doesn't hide this either:
I am pretty confident this counts as a derivative work.
Try this in your favourite AI tool: "If I rewrite some original Python code, licensed under GPL, into a different language (Java), does the original GPL still apply to the new codebase?" Here is what Google's Gemini said: "No, rewriting the code in a different language typically results in a new copyrighted work, and the GPL license wouldn't apply to the new Java codebase in itself. However, the GPL license might still affect how you can distribute your Java code if the original Python code interacts with GPL-licensed libraries or functions."
I certainly wouldn't change the license of a mechanical translation of source code without consulting a lawyer. I mean, you can do whatever you want at your own risk 🤷‍♂️
You can include "line-by-line rewrite" in the prompt and get the same result. But I understand your reasons for keeping the GPL license, unfortunately. Thank you for the quick answers!
@maxbachmann Btw, the authors of the original GPL Python library, fuzzywuzzy, perhaps realized their mistake with GPL and transitioned the code to a new MIT-licensed repository, thefuzz. This is all under the same company, SeatGeek. So could you re-align your code with the new repository (and even pick up some recent fixes, as thefuzz seems to be frequently maintained) and re-publish your nice Java library under an MIT license too?
I am not the author of this library. I am the author of There have been multiple people who wanted to work on this already, but they all disappeared before implementing anything 🤷‍♂️
Has @xdrop abandoned this repo?
There's an MIT version in Python.
Can we have the same for Java?
The license is the biggest issue that I and 90% of other developers are facing.
And the worst thing is that there is no alternative Java library that offers even the bare minimum of the performance this library has.
I've searched everywhere.
A Levenshtein distance port for Java is available, but it performs very poorly for the use case where you match a user's input (2-3 chars) against a list of strings,
e.g. matching "sai" with school names.
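The short-query problem can be illustrated with a small sketch. Plain Levenshtein distance is dominated by the length difference between a 3-character query and a long name, so unrelated short strings outrank the intended match. One common workaround, similar in spirit to a "partial ratio" score, is to compare the query against every query-length window of the candidate and keep the best result. The code below is a minimal sketch under that assumption, not this library's implementation. The class name, method names and school names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; names and data are invented for this example.
final class ShortQueryMatcher {

    // Classic two-row dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Smallest distance between the query and any query-length window
    // of the candidate (case-insensitive).
    static int bestWindowDistance(String query, String candidate) {
        String q = query.toLowerCase(Locale.ROOT);
        String c = candidate.toLowerCase(Locale.ROOT);
        if (c.length() <= q.length()) {
            return levenshtein(q, c);
        }
        int best = Integer.MAX_VALUE;
        for (int i = 0; i + q.length() <= c.length(); i++) {
            best = Math.min(best, levenshtein(q, c.substring(i, i + q.length())));
        }
        return best;
    }

    // Returns a copy of the candidates, best window match first.
    static List<String> rank(String query, List<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((String c) -> bestWindowDistance(query, c)));
        return sorted;
    }
}
```

With the made-up candidates "Delhi Public School" and "Sai International School", `rank("sai", ...)` puts the second name first because one of its windows matches the query exactly, whereas the plain distance from "sai" to the full lowercased name is 21. Scanning every window is slow for large lists, so a real implementation would need something faster.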