A collection of R scripts to import PATSTAT (Autumn 2020 version) into a PostgreSQL database, set up indices and foreign keys, and create some summary statistics and auxiliary variables.
- First, set up a PostgreSQL server (check here for instructions)
- Unzip the PATSTAT zip files twice (yielding many `csv` files such as `tlsXXX_partXX.csv`) in the folder of the corresponding `R` script. This is inconvenient but necessary because of a limitation of the `R`-internal `unzip()` function.
- Run the `RMD` script `main_notebook.Rmd`, placed in the same folder as the PATSTAT zip files. You only need to enter your personal details to connect to your database in the `dbConnect()` call.
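
The import step above can be sketched roughly as follows. This is a minimal illustration, not the code in `main_notebook.Rmd`: it assumes the `DBI` package (and `RPostgres` as the driver), and the helpers `table_from_file()` and `import_parts()`, the table naming, and all connection values are hypothetical placeholders.

```r
# Hypothetical helper: derive a table name from a part file name,
# e.g. "tls201_part01.csv" -> "tls201". The real table names may differ.
table_from_file <- function(path) {
  sub("_part[0-9]+\\.csv$", "", basename(path))
}

# Hypothetical helper: append every unzipped part file in `dir` to its table.
# All columns are read as text here to keep types consistent across parts;
# this is a simplification, and proper column types would be set separately.
import_parts <- function(con, dir = ".") {
  files <- sort(list.files(dir, pattern = "^tls[0-9]+_part[0-9]+\\.csv$",
                           full.names = TRUE))
  for (f in files) {
    part <- read.csv(f, colClasses = "character")
    DBI::dbWriteTable(con, table_from_file(f), part, append = TRUE)
  }
  invisible(files)
}

# Example usage (placeholder credentials, assumed RPostgres driver):
# con <- DBI::dbConnect(RPostgres::Postgres(), dbname = "patstat",
#                       host = "localhost", port = 5432,
#                       user = "your_user", password = "your_password")
# import_parts(con)
# DBI::dbDisconnect(con)
```

Only the arguments of the `dbConnect()` call need to be adapted to your own server.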
Warning: While the data input and the creation of the foreign keys run fairly smoothly, the creation of the indices takes quite some time. It is best to run it overnight. Alternatively, run the calls one by one.
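
Running the index calls one by one could look roughly like the sketch below, which also reports how long each statement took. It assumes an open `DBI` connection `con`. The helpers `index_sql()` and `run_one_by_one()` and the table and column names in the usage comment are illustrative and not taken from the scripts in this repository.

```r
# Hypothetical helper: build a single CREATE INDEX statement.
# IF NOT EXISTS (PostgreSQL 9.5+) makes it safe to re-run after an interruption.
index_sql <- function(table, column) {
  sprintf("CREATE INDEX IF NOT EXISTS %s_%s_idx ON %s (%s);",
          table, column, table, column)
}

# Run the statements one at a time and print the elapsed time of each.
run_one_by_one <- function(con, statements) {
  for (sql in statements) {
    elapsed <- system.time(DBI::dbExecute(con, sql))[["elapsed"]]
    message(sprintf("%.1f s: %s", elapsed, sql))
  }
  invisible(NULL)
}

# Example usage (illustrative table and column names):
# run_one_by_one(con, c(index_sql("tls201_appln", "appln_id"),
#                       index_sql("tls211_pat_publn", "appln_id")))
```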
Have fun!