Demo considers 'MECHANICALLY SEPARATED CHICKEN' as vegan #9
Comments
Good point! Will be added. Thx!
Fixed!
I disagree; the fundamental problem is still there. Yes, at the very least, search the entire string; don't just match the exact string.
I understand what you mean, but a wildcard search is not the answer either. It could match fragments of unrelated words, which could make it worse than matching the exact string.
@fluxsauce how about adding a regex search for obvious meat / fish species?
That would work; I'd call it a component match and fill it with terms that shouldn't false positive, such as: pig, pork, lard, beef, ribs, fillet, poultry, chicken, turkey, eggs, sheep, mutton, lamb, goat, rabbit, caviar, roe, honey, venison, steak. It'd be a shorter list than whole ingredients.
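A minimal sketch of such a component match, written in Python for illustration and not taken from the library itself. The names `ANIMAL_TERMS` and `find_animal_terms` are hypothetical; the term list is the one suggested above. Word boundaries keep "pig" from matching inside "pigment", which is the main risk of a plain wildcard search.

```python
import re
from typing import List

# Hypothetical component list, copied from the terms suggested in this thread.
ANIMAL_TERMS = [
    "pig", "pork", "lard", "beef", "ribs", "fillet", "poultry", "chicken",
    "turkey", "eggs", "sheep", "mutton", "lamb", "goat", "rabbit", "caviar",
    "roe", "honey", "venison", "steak",
]

# \b anchors each term to word boundaries so substrings of longer,
# unrelated words are not matched.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in ANIMAL_TERMS) + r")\b",
    re.IGNORECASE,
)


def find_animal_terms(ingredient: str) -> List[str]:
    """Return every listed term that appears as a whole word in the ingredient."""
    return [match.group(0).lower() for match in _PATTERN.finditer(ingredient)]
```

With this sketch, "MECHANICALLY SEPARATED CHICKEN" is flagged through its "chicken" component, while "pigment" is not. Note that "vegan chicken" is flagged as well, since the match has no notion of context.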
I'd be wary of searching for particular substrings like "chicken", because you can easily also end up needing to add prefixes and such to avoid false negatives on things like "vegan chicken" or "chicken alternative", or "chicken tofu", etc. Seems like there's a different (better) solution out there, but I'm not sure what it is. The above solution(s) work somewhat if you prefer false negatives over false positives, though.
It could be mitigated with a blacklist of known false positives like "chicken alternative". There's a reason why I personally avoid signature-based scanners: it's a constant "two steps forward and one step back" of exceptions. Try valid US street address parsing as an example; it seems simple until it isn't :-)
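As a rough illustration of the exception-list idea, here is a Python sketch, assuming a shortened term list and a hand-maintained set of exception phrases. `KNOWN_EXCEPTIONS` and `flags_animal_term` are hypothetical names, and the phrases are the examples mentioned in this thread.

```python
import re

# Shortened, hypothetical term list; a real one would be longer.
ANIMAL_TERMS = ["chicken", "beef", "pork", "honey"]

# Hypothetical exception phrases that should not count as animal products.
KNOWN_EXCEPTIONS = ["vegan chicken", "chicken alternative", "chicken tofu"]


def flags_animal_term(ingredient: str) -> bool:
    """Return True if an animal term remains after removing exception phrases."""
    text = ingredient.lower()
    # Blank out each known exception first so its words cannot match below.
    for phrase in KNOWN_EXCEPTIONS:
        text = text.replace(phrase, " ")
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text)
        for term in ANIMAL_TERMS
    )
```

Every new product name ("chick'n style chicken pieces", say) needs another entry, which is exactly the maintenance cost described above.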
https://github.com/hmontazeri/is-vegan/blame/25e87fef5b88f92319f001c6af96ff658fddcbf2/README.md#L152
Consider using fuzzy matching with a degree of confidence instead of string matching.
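One possible reading of this suggestion, sketched in Python with the standard library's `difflib.SequenceMatcher`. The list `NON_VEGAN`, the function names, and the 0.85 threshold are all assumptions chosen for illustration; the library's actual data and API may differ.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical stand-in for the library's list of non-vegan ingredients.
NON_VEGAN = ["mechanically separated chicken", "beef", "gelatin"]


def best_match(ingredient: str, known: List[str] = NON_VEGAN) -> Tuple[str, float]:
    """Return the closest known entry and a similarity score in [0, 1]."""
    text = ingredient.strip().lower()
    scored = [(SequenceMatcher(None, text, entry).ratio(), entry) for entry in known]
    score, entry = max(scored)
    return entry, score


def is_probably_non_vegan(ingredient: str, threshold: float = 0.85) -> bool:
    """Treat the ingredient as non-vegan when the best score reaches the threshold."""
    _, score = best_match(ingredient)
    return score >= threshold
```

A misspelling such as "mechanicaly seperated chicken" still scores high against the known entry, while "apple" does not. Whole-string similarity mainly helps with typos and casing; it does not by itself solve the partial-match problem discussed in the comments.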