fixing data_directory not actually changing download directory and enforced spacy and thinc version requirements (#229)

Co-authored-by: Ethan Xia <[email protected]>
Co-authored-by: seanzhangkx8 <[email protected]>
3 people authored Nov 8, 2024
1 parent 3a708b2 commit e1a1b19
Showing 3 changed files with 13 additions and 7 deletions.
12 changes: 7 additions & 5 deletions convokit/util.py
@@ -6,7 +6,7 @@
 import warnings
 import zipfile
 from typing import Dict
-
+from .convokitConfig import ConvoKitConfig
 import requests


@@ -108,15 +108,16 @@ def download(

     custom_data_dir = data_dir

-    data_dir = os.path.expanduser("~/.convokit/")
-
+    config = ConvoKitConfig()
+    data_dir = config.data_directory
+    data_dir = os.path.expanduser(data_dir)
     # pkg_resources.resource_filename("convokit", "")
     if not os.path.exists(data_dir):
         os.mkdir(data_dir)
     if not os.path.exists(os.path.join(data_dir, "downloads")):
         os.mkdir(os.path.join(data_dir, "downloads"))

-    dataset_path = os.path.join(data_dir, "downloads", name)
+    dataset_path = os.path.join(data_dir, name)

     if custom_data_dir is not None:
         dataset_path = os.path.join(custom_data_dir, name)
@@ -192,7 +193,8 @@ def download_local(name: str, data_dir: str):
     :return: string path to local Corpus
     """
     custom_data_dir = data_dir
-    data_dir = os.path.expanduser("~/.convokit/")
+    config = ConvoKitConfig()
+    data_dir = config.data_directory

     # pkg_resources.resource_filename("convokit", "")
     if not os.path.exists(data_dir):
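Note on the path logic in convokit/util.py: the hunks for `download` resolve the dataset location in three steps (read the configured directory, expand `~`, then let an explicit `data_dir` argument override the result). A minimal sketch of that precedence follows. The helper name `resolve_dataset_path` and its `configured_dir` parameter are hypothetical and are not part of ConvoKit; the sketch takes the configured directory as a plain string rather than reading it from `ConvoKitConfig`, and it omits the directory creation that the real function performs.

```python
import os
from typing import Optional


def resolve_dataset_path(
    name: str,
    configured_dir: str,
    custom_data_dir: Optional[str] = None,
) -> str:
    """Hypothetical helper illustrating the precedence used in the diff."""
    # An explicit data_dir argument takes precedence over the configured directory.
    if custom_data_dir is not None:
        return os.path.join(custom_data_dir, name)
    # Expand "~" so a configured value such as "~/.convokit/" becomes an absolute path.
    base_dir = os.path.expanduser(configured_dir)
    return os.path.join(base_dir, name)
```

For example, `resolve_dataset_path("my-corpus", "~/.convokit/")` yields a path under the user's home directory, while passing a third argument such as `"/data"` places the dataset under `/data` regardless of the configured value. The dataset name `my-corpus` is a placeholder.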
5 changes: 4 additions & 1 deletion docs/source/troubleshooting.rst
@@ -12,7 +12,10 @@ Issues

 **Error associated with Numpy**

-ConvoKit currently requires Numpy 1.x.x, as Numpy 2.x is known to cause compatibility issues. Please verify your Numpy version. We are working on supporting Numpy 2.x and appreciate your understanding.
+spaCy versions before 3.8.2 are not compatible with NumPy 2.0.0+ because of compatibility issues with thinc. spaCy 3.8.2 is compatible with NumPy 2.0.0+ but currently requires thinc>=8.3.0,<8.4.0, so as a temporary solution ConvoKit now enforces spacy>=3.8.2 and thinc>=8.3.0,<8.4.0. We will continue to keep an eye on spaCy releases and update the requirements if new releases address this issue.
+For additional insight into the issue:
+`spaCy issue #13528 <https://github.com/explosion/spaCy/issues/13528>`_
+`thinc issue #939 <https://github.com/explosion/thinc/issues/939>`_

 -----------------------------

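Note on the version constraints described in the troubleshooting text: the requirement amounts to a lower bound on spaCy (3.8.2) and a half-open range on thinc (at least 8.3.0, below 8.4.0). The sketch below shows one way such a range check could be written by hand. The function names `parse_release` and `in_range` are hypothetical, and the parsing is deliberately simplified: it reads only the leading numeric part of a version string and ignores pre-release or local suffixes, so it is not a full PEP 440 implementation.

```python
import re
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the numeric release with trailing zeros dropped, e.g. "8.3.0" -> (8, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    # Drop trailing zeros so that "8.4" and "8.4.0" compare as equal.
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def in_range(version: str, minimum: str, upper: Optional[str] = None) -> bool:
    """Check minimum <= version, and version < upper when an upper bound is given."""
    release = parse_release(version)
    if release < parse_release(minimum):
        return False
    return upper is None or release < parse_release(upper)
```

With these definitions, `in_range("8.3.2", "8.3.0", "8.4.0")` is true and `in_range("8.4.0", "8.3.0", "8.4.0")` is false, matching the thinc range quoted above. The installed version of a package can be read with the standard library's `importlib.metadata.version`, and the third-party `packaging` library is the more robust choice when full specifier semantics are needed.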
3 changes: 2 additions & 1 deletion setup.py
@@ -44,7 +44,7 @@
         "matplotlib>=3.0.0",
         "pandas>=0.23.4",
         "msgpack-numpy>=0.4.3.2",
-        "spacy>=2.3.5",
+        "spacy>=3.8.2",
         "scipy>=1.1.0",
         "scikit-learn>=0.20.0",
         "nltk>=3.4",
@@ -56,6 +56,7 @@
         "pymongo>=4.0",
         "pyyaml>=5.4.1",
         "dnspython>=1.16.0",
+        "thinc>=8.3.0,<8.4.0",
     ],
     extras_require={
         "craft": ["torch>=0.12"],