python-intensive

Credits:

Many of these materials were adapted from those produced by Software Carpentry. Thank you!

Project: Using Python for Information Retrieval

In this unit, we'll use Python to turn a collection of loose text documents into a real-life database. (Note: This database was created for a project by R. Terman and E. Voeten, using much the same process you'll be learning here.)

The lecture and problem set will draw on your new Python skills, especially working with text, lists, and dictionaries; writing for-loops, conditional statements, and functions; and "thinking" like a programmer.

About the Data

We'll be creating a database from Universal Periodic Review outcome reports.

The Universal Periodic Review (UPR) is a process run by the United Nations Human Rights Council, which involves a periodic review of the human rights records of all 193 UN Member States.

Reviews take place through an interactive discussion between the State under review and other UN Member States. During this discussion, any UN Member State can pose questions, offer comments, and/or make recommendations to the State under review. The State under review can then respond, stating which recommendations it rejects, accepts, will consider, etc. Reports are then drawn up detailing this discussion.

We will be analyzing outcome reports from the 2014 Universal Periodic Reviews of 42 countries, which we retrieved here and formatted as text documents.

The goal is to convert these semi-structured texts to a tabular dataset of recommendations with the following variables:

Text of recommendation (text)
Country to which the recommendation is directed (to)
Country that is making the recommendation (from)
The year when the review took place (year)
The response to the recommendation, i.e., whether the reviewed country accepts it, rejects it, etc. (decision)
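
As a rough illustration of the target format, one recommendation can be held as a dictionary keyed by the five variable names above, and a list of such dictionaries can be written out as a table. This is only a minimal sketch: the helper names `make_row` and `rows_to_csv` are hypothetical, the sample values are placeholders rather than real UPR data, and the unit itself may build the dataset differently.

```python
import csv
import io

# Column names taken from the variable list above.
FIELDNAMES = ["text", "to", "from", "year", "decision"]


def make_row(text, to, from_, year, decision):
    """Build one table row as a dictionary.

    `from_` has a trailing underscore because `from` is a reserved word
    in Python; the dictionary key itself is still "from".
    """
    return {
        "text": text.strip(),   # drop stray whitespace around the text
        "to": to,
        "from": from_,
        "year": int(year),      # store the year as a number
        "decision": decision,
    }


def rows_to_csv(rows):
    """Return the rows as CSV text, with a header line first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values for illustration only (not taken from a real report).
rows = [
    make_row("  Example recommendation text.  ", "Country A", "Country B", "2014", "accepted"),
]
csv_text = rows_to_csv(rows)
print(csv_text)
```

Each dictionary corresponds to one row of the final dataset, so extracting the recommendations from a report amounts to producing a list of these dictionaries.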

In other words, we want to turn this:

into this:
