Skip to content
This repository has been archived by the owner on Jun 1, 2022. It is now read-only.

Update location providers based on cvs: / walgreens: / etc concordance identifiers #707

Closed
simonw opened this issue Jun 29, 2021 · 28 comments
Labels
importers Tools that import data into VIAL

Comments

@simonw
Copy link
Collaborator

simonw commented Jun 29, 2021

Part of #705.

Key idea here is to ensure that all of our locations with a cvs: concordance are attached to the CVS provider, ditto for walgreens: and various other common concordance authorities.

This will allow us to pick up the appointments_url from those providers, if it is set, when we serve locations to our new API and to the Mapbox export used on www.VaccinateTheStates.com

@simonw simonw added the importers Tools that import data into VIAL label Jun 29, 2021
@simonw
Copy link
Collaborator Author

simonw commented Jun 29, 2021

The https://vial.calltheshots.us/api/docs#post-apiupdatelocations /api/updateLocations API can be used for this - it accepts provider_name and provider_type which do this:

vial/vaccinate/api/views.py

Lines 157 to 161 in 9cb542e

if location_data.get("provider_type"):
kwargs["provider"] = Provider.objects.update_or_create(
name=location_data["provider_name"],
defaults={"provider_type": location_data["provider_type"]},
)[0]

@simonw
Copy link
Collaborator Author

simonw commented Jun 29, 2021

So I can run a script in a Jupyter notebook that retrieves the locations that need to be updated using https://vial.calltheshots.us/api/searchLocations?authority=cvs - maybe even add an extra search parameter for ?provider_null=1 to help with that.

@simonw
Copy link
Collaborator Author

simonw commented Jun 29, 2021

simonw added a commit that referenced this issue Jun 29, 2021
@simonw simonw changed the title Update locaiton providers based on cvs: / walgreens: / etc concordance identifiers Update location providers based on cvs: / walgreens: / etc concordance identifiers Jun 30, 2021
@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Being able to use ?format=ids to return just the IDs of matching results would help with this (and all sorts of other things too).

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

This means I can know get the ID of every single location that is tagged with the cvs: authority but does not currently have a provider using:

https://vial-staging.calltheshots.us/api/searchLocations?format=ids&provider_null=1&authority=cvs&all=1

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Here's a map of all locations attached to the provider called CVS: https://vial-staging.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1 - 3832 locations

All locations with a cvs: concordance identifier: https://vial-staging.calltheshots.us/api/searchLocations?authority=cvs&format=map&all=1 - 2798 locations

All locations with a cvs: authority that are not attached to any providers: https://vial-staging.calltheshots.us/api/searchLocations?authority=cvs&format=map&all=1&provider_null=1 - 109 locations

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Interestingly this suggests that, at least on staging, we have CVS locations that are NOT tagged with a cvs: concordance identifier.

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

https://vial-staging.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1&exclude.authority=cvs shows a map of every location attached to the CVS provider that does NOT have a cvs: concordance. 1322 locations.

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Deploying this stuff to production to answer these questions there.

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

In production we only have 72 locations with provider CVS that don't have a cvs: concordance, all of which are in California: https://vial.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1&exclude.authority=cvs

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Spotted something weird: https://vial.calltheshots.us/api/searchLocations?authority=cvs returns 17707 results. But that seems like a LOT of CVSs.

https://www.google.com/search?client=firefox-b-1-d&q=how+many+branches+of+CVS suggests there are 9,967 branches in the USA.

On a hunch, I pulled and de-duped the full list of IDs using this:

curl 'https://vial.calltheshots.us/api/searchLocations?authority=cvs&format=ids&all=1' -H 'Authorization: Bearer 30:...' \
  | jq | sort | uniq | wc -l

Which returned 10,883 - so there's a bug in the API where multiple results can be returned.

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

This SQL query shows locations with multiple cvs: concordances:

select location.public_id, count(*), array_agg(authority || ':' || identifier)
from concordance_location join concordance_identifier on concordance_identifier.id = concordanceidentifier_id
join location on location_id = location.id
where authority = 'cvs' group by location.public_id
having count(*) > 1
order by count(*) desc

https://vial.calltheshots.us/dashboard/?sql=select+location.public_id%2C+count%28%2A%29%2C+array_agg%28authority+%7C%7C+%27%3A%27+%7C%7C+identifier%29%0D%0Afrom+concordance_location+join+concordance_identifier+on+concordance_identifier.id+%3D+concordanceidentifier_id%0D%0Ajoin+location+on+location_id+%3D+location.id%0D%0Awhere+authority+%3D+%27cvs%27+group+by+location.public_id%0D%0Ahaving+count%28%2A%29+%3E+1%0D%0Aorder+by+count%28%2A%29+desc%3AYNvYK31mkC1Kq8Tzr1xIQsunYgPUAqZJ2Axb-4VmNYg

There are a LOT - grouping by their duplicate count gives:

count	n
2	4193
3	358
4	259
5	189
6	64
7	9
8	2

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

That fixed it:

curl 'https://vial.calltheshots.us/api/searchLocations?authority=cvs&format=ids&all=1' \
  -H 'Authorization: Bearer 30:...' \
  | jq | sort | uniq | wc -l
   10883

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&size=0 returns 7112 results - I'm going to assign them all to the CVS provider using the API.

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Running this now:

import httpx
from tqdm.notebook import tqdm

api_key = "30:..."
url = "https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&all=1&format=ids"
ids = httpx.get(url, headers={"Authorization": "Bearer {}".format(api_key)}).json()

def make_chunks(sequence, n):
    for i in range(0, len(sequence), n): 
        yield sequence[i:i + n]

chunks = list(make_chunks(ids, 100))

for chunk in tqdm(chunks):
    httpx.post("https://vial.calltheshots.us/api/updateLocations", timeout=20, json={
      "update": {
        id: {
          "provider_type": "Pharmacy",
          "provider_name": "CVS",
        } for id in chunk
      },
      "revision_comment": "Issue #707"
    }, headers={"Authorization": "Bearer {}".format(api_key)})

Hitting https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&size=0 while it is running shows the number reducing as it works.

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

OK, what other authorities should I do?

Here's the top list: https://vial.calltheshots.us/dashboard/?sql=select%20%22authority%22%2C%20count%28%2A%29%20as%20n%20from%20%28select%20id%2C%20authority%2C%20identifier%2C%20created_at%20from%20concordance_identifier%29%20as%20results%20group%20by%20%22authority%22%20order%20by%20n%20desc%3AydXh_yEhn8L7UhP0pZuK5zTvFVCUmDm6ilDzvbORH7I

authority	n
placekey	89932
vaccinefinder_org	55414
us_carbon_health	46036
vaccinespotter_org	38904
getmyvax_org	32587
us_giscorps_vaccine_providers	23034
cvs	13934
walgreens	9015
google_places	8592
il_sfsites	8031
al_arcgis	7189
health_mart	4960
walmart	4679
tx_arcgis	4304
vtrcks	4284
rite_aid	4079
kroger	2713
wa_state	2526
publix	2377
fl_state	2303
sc_arcgis	2257
nc_myspot_gov	2041
md_arcgis	1731
mo_arcgis	1701
safeway	1660
wi_arcgis_map	1206
az_arcgis	1173
in_arcgis	1140
arcgis	1121
vaccinate_nj	1088
pa_arcgis	923
vaxfinder_gov	779
mn_gov	733
nyc_arcgis	707
la_tableau	703
ca_los_angeles_gov	694
ok_vaccinate_gov	639
sams	573
hyvee	552
ky_govstatus	551
costco	538
prepmod	505
ia_state	501
little_clinic	448
winn_dixie	431
tn_vaccinate_gov	428
albertson	390
sav_on	328
heb	311
weis	300
osco	288
meijer	256
stop_and_shop	253
il_juvare	240
wa_prepmod	237
vons	234
ct_gov	233
ct_covidvaccinefinder_gov	233
harris_teeter	229
giant_eagle	213
hart	207
vaccinefinder	205
smiths	184
jewel_osco	180
ri_arcgis	163
immunizenevada_org	162
fred_meyer	160
hannaford	156
ga_dph	153
giant_food	152
shop_rite	148
king_soopers	148
ak_arcgis	147
ak_clinic_list	147
ma_immunizations	138
acme	137
ct	129
price_chopper	129
duane_reade	126
frys_food_and_drug	123
tom_thumb	118
giant	115
sf_gov	110
food_city	108
kaiser_permanente	108
wegmans	103
thrifty_white	95
wyo_appt_portal	94
thrifty	87
_tag_provider	84
ingles	84
co_colorado_gov	82
cub_pharmacy	80
ralphs	77
ct_state	64
covidvaccinefinder_gov	60
pick_n_save	59
brookshire	58
dillons	54
genoa_healthcare	54
shaws	46
ny_northwell_health	43
raleys	41
marianos	40
pavilions	37
market_street	36
united_supermarket	36
city_market	33
al_jefferson	33
food_lion	32
randalls	32
qfc	32
albertsons_market	29
pharmaca	27
haggen	26
harps	24
ilvaccine_org	24
az_pinal_ph_vaccinelocations_gov	23
ca_metrolink	23
us_physicians_immediate	23
homeland	21
hartig	21
market_32	20
kaiser_health_plan	20
harmons	18
medicap	18
harveys	17
metro_market	12
carrs	11
bakers	11
star_market	10
wv_dhhr	9
dc_district	9
pay_less	9
gerbes	6
kta_super_stores	6
big_y	6
az_ph_pinal_clinics_gov	6
tx_memorialhermann	5
fresco_y_mas	5
jayc	5
amigos	3
drugco	2
_issue	2
market_bistro	1
pak_n_save	1

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

I'm going to start with these ones:

  • - cvs
  • - walgreens
  • - health_mart
  • - walmart
  • - rite_aid
  • - kroger
  • - safeway
  • - sams
  • - hyvee
  • - costco

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

I'm going to update all of the California "Rite-Aid Pharmacy" records to use "Rite-Aid" as their provider instead, then I'll rename the "Rite-Aid Pharmacy" provider in VIAL to "OBSOLETE Rite-Aid DO NOT USE"

Done that, now https://vial.calltheshots.us/api/searchLocations?provider=Rite-Aid&format=map&all=1 has 2228 locations.

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
importers Tools that import data into VIAL
Projects
None yet
Development

No branches or pull requests

1 participant