All analyses/models developed here are based on the data uploaded to the Mushroom Observer.
I am assuming you use Linux. If you use conda, run
conda env create --name <env> --file environment.yml
If you use pip, run
pip install -r requirements.txt
- Consider all ranks together. Alternative: Use only one rank.
- Map all IDs to the smallest ID within a synonym group that is not deprecated.
- Should there be no non-deprecated ID within a synonym group, map all IDs from the group to the smallest deprecated ID within the group.
- Use a sample of the peak around 2.5 as a holdout set and check balanced precision for different values of vote cache. Use a tabular model for this experiment.
- We can map lat and long to north, south, east, and west in a tabular model.
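
The synonym-mapping rule from the list above (smallest non-deprecated ID in a group, falling back to the smallest deprecated ID) could be sketched roughly as follows. This is only an illustration, not the code used in this repository: the function name `canonical_ids`, the `(name_id, synonym_id, deprecated)` tuple layout, and the choice to treat a name without a synonym group as its own group are all assumptions.

```python
from collections import defaultdict


def canonical_ids(names):
    """Map every name ID to one canonical ID per synonym group.

    `names` is an iterable of (name_id, synonym_id, deprecated) tuples.
    This layout is hypothetical; adapt it to the actual data export.
    """
    groups = defaultdict(list)
    for name_id, synonym_id, deprecated in names:
        # Assumption: a name without a synonym group is its own group.
        if synonym_id is None:
            key = ("self", name_id)
        else:
            key = ("syn", synonym_id)
        groups[key].append((name_id, deprecated))

    mapping = {}
    for members in groups.values():
        accepted = [i for i, dep in members if not dep]
        if accepted:
            # Smallest ID in the group that is not deprecated.
            target = min(accepted)
        else:
            # No non-deprecated ID: fall back to the smallest deprecated ID.
            target = min(i for i, _ in members)
        for i, _ in members:
            mapping[i] = target
    return mapping


if __name__ == "__main__":
    example = [
        (5, 1, True), (3, 1, False), (7, 1, False),  # group 1 has accepted IDs
        (9, 2, True), (8, 2, True),                  # group 2 is fully deprecated
        (4, None, False),                            # no synonym group
    ]
    print(canonical_ids(example))
```

Building the mapping once and applying it as a lookup keeps the label space stable regardless of the order in which observations are read.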