Adding WordNet as synonym generator #41

Merged: 5 commits, Mar 6, 2024

HonzaCuhel
Contributor

This PR includes:

  • creating an abstract class for synonym generation
  • adding a WordNet synonym generator
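
As a rough illustration of the first bullet, an abstract base class lets the pipeline swap synonym backends behind one interface. The sketch below is not the code from this PR: the class names, the `synonyms_number` parameter, and the method names are assumptions made up for the example, and the dictionary-backed subclass is a toy stand-in for a real backend such as a language model or WordNet.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SynonymGenerator(ABC):
    """Hypothetical base class: each backend supplies synonyms for one word."""

    def __init__(self, synonyms_number: int = 3) -> None:
        self.synonyms_number = synonyms_number

    @abstractmethod
    def generate_synonyms(self, word: str) -> List[str]:
        """Return up to `synonyms_number` synonyms for `word`."""

    def generate_synonyms_for_list(self, words: List[str]) -> Dict[str, List[str]]:
        """Shared logic: map every word to the synonyms of its backend."""
        return {word: self.generate_synonyms(word) for word in words}


class LookupSynonymGenerator(SynonymGenerator):
    """Toy backend reading from an in-memory table (illustration only)."""

    def __init__(self, table: Dict[str, List[str]], synonyms_number: int = 3) -> None:
        super().__init__(synonyms_number)
        self.table = table

    def generate_synonyms(self, word: str) -> List[str]:
        # Unknown words yield an empty list; known words are truncated.
        return self.table.get(word, [])[: self.synonyms_number]
```

With this shape, a caller only depends on `generate_synonyms_for_list`, so replacing one backend with another does not touch the calling code.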

Comparison of Mistral synonym generator vs WordNet:

| Object | Mistral Generated Synonyms | WordNet Generated Synonyms |
|---|---|---|
| astronaut | ['astronaut', 'cosmonaut', 'spaceman'] | ['spaceman', 'astronaut', 'cosmonaut'] |
| cat | ['feline', 'kitty', 'puss'] | ['throw up', 'bozo', 'kat'] |
| dog | ['cat', 'pup', 'hound'] | ['chase', 'weenie', 'wienerwurst'] |
| person | ['individual', 'human', 'being'] | ['mortal', 'individual', 'soul'] |
| horse | ['horse', 'equine', 'nag'] | ['sawhorse', 'knight', 'gymnastic horse'] |
| car | ['automobile', 'vehicle', 'transportation'] | ['cable car', 'automobile', 'elevator car'] |
| alien | ['Extraterrestrial', 'extraneous', 'foreign'] | ['disaffect', 'extraterrestrial', 'foreigner'] |
| city | ['town', 'urban area', 'metropolis'] | ['metropolis', 'urban center', 'city'] |
| plane | ['level', 'flat', 'even'] | ['aeroplane', 'planer', 'shave'] |
| airport | ['airport', 'terminal', 'aviation'] | ['aerodrome', 'airport', 'drome'] |
| robot | ['machine', 'automaton', 'mechanism'] | ['golem', 'robot', 'automaton'] |
| tractor | ['Tractor', 'Truck', 'Vehicle'] | ['tractor'] |
| bus | ['1 vehicle', '2 coach', '3 publictransport'] | ['motorbus', 'passenger vehicle', 'autobus'] |
| bicycle | ['bicycle', 'bike', 'bicycle'] | ['pedal', 'cycle', 'wheel'] |

Latency of generating 3 synonyms per object for 14 objects (measured on a machine with 30 GB RAM and an L4 24 GB GPU):

  • Mistral: 8 seconds
  • WordNet: 1 second
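
A comparison like this can be reproduced with a small timing harness. The sketch below is an assumption about how such a measurement might be taken, not the script used for the numbers above; `time_synonym_generation` and the stand-in generator are hypothetical names.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_synonym_generation(
    generate: Callable[[str], List[str]],
    objects: List[str],
) -> Tuple[Dict[str, List[str]], float]:
    """Run `generate` once per object and return (results, elapsed seconds)."""
    start = time.perf_counter()  # monotonic clock intended for interval timing
    results = {name: generate(name) for name in objects}
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    # Stand-in generator; swap in a real backend to compare latencies.
    def echo_generator(word: str) -> List[str]:
        return [word]

    _, seconds = time_synonym_generation(echo_generator, ["cat", "dog", "bus"])
    print(f"generation took {seconds:.3f} s")
```

Model loading time is deliberately outside the timed region here; whether the figures above include it is not stated.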


github-actions bot commented Mar 1, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|---|---|---|---|---|
| 919 | 444 | 48% | 0% | 🟢 |

New Files

| File | Coverage | Status |
|---|---|---|
| datadreamer/prompt_generation/lm_synonym_generator.py | 32% | 🟢 |
| datadreamer/prompt_generation/wordnet_synonym_generator.py | 81% | 🟢 |
| TOTAL | 56% | 🟢 |

Modified Files

| File | Coverage | Status |
|---|---|---|
| datadreamer/pipelines/generate_dataset_from_scratch.py | 44% | 🟢 |
| datadreamer/prompt_generation/__init__.py | 100% | 🟢 |
| datadreamer/prompt_generation/synonym_generator.py | 84% | 🟢 |
| TOTAL | 76% | 🟢 |

updated for commit: 8b05c99 by action🐍

@sokovninn
Member

@HonzaCuhel is it possible to configure WordNet to always return nouns? For the word "bear" it currently returns verbs.

@HonzaCuhel
Contributor Author

Yes, it's possible


github-actions bot commented Mar 6, 2024

Test Results

  6 files  ± 0    6 suites  ±0   45m 51s ⏱️ - 2m 44s
 81 tests + 5   33 ✅ + 3   48 💤 + 2  0 ❌ ±0 
486 runs  +30  198 ✅ +18  288 💤 +12  0 ❌ ±0 

Results for commit 8b05c99. ± Comparison against base commit bd40543.

@HonzaCuhel
Contributor Author

@sokovninn I changed the code so that the generator returns only noun synonyms.
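
For reference, NLTK's WordNet reader accepts a part-of-speech filter, which is one way to get noun-only results. The following is a minimal sketch under that assumption and is not taken from the merged code; `pick_synonyms` and `noun_synonyms` are hypothetical helpers, and `noun_synonyms` needs `nltk` plus the downloaded `wordnet` corpus.

```python
from typing import Iterable, List


def pick_synonyms(lemma_names: Iterable[str], n: int = 3) -> List[str]:
    """Keep the first `n` distinct lemma names, in order of appearance."""
    picked: List[str] = []
    for name in lemma_names:
        # WordNet joins multi-word lemmas with underscores, e.g. "cable_car".
        candidate = name.replace("_", " ")
        if candidate not in picked:
            picked.append(candidate)
        if len(picked) == n:
            break
    return picked


def noun_synonyms(word: str, n: int = 3) -> List[str]:
    """Look up noun senses only, so a word like "bear" yields no verb lemmas."""
    # Imported lazily so pick_synonyms stays usable without nltk installed.
    # Requires: pip install nltk, then nltk.download("wordnet").
    from nltk.corpus import wordnet as wn

    names = (
        lemma_name
        for synset in wn.synsets(word, pos=wn.NOUN)  # restrict to noun synsets
        for lemma_name in synset.lemma_names()
    )
    return pick_synonyms(names, n)
```

Calling `noun_synonyms("bear")` would then draw only from noun synsets; the exact words returned depend on the installed WordNet version.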

@sokovninn sokovninn merged commit 3804ca0 into dev Mar 6, 2024
9 checks passed
@sokovninn sokovninn deleted the feature/wordnet-synonyms branch April 8, 2024 20:28