1 Potential Questions

souribe edited this page Nov 5, 2017 · 1 revision

Welcome to the project-dataPlus wiki!

Potential Data Science Questions (Need 3 questions with accompanying paragraph):

Questions should answer some of the following:

  • Goals: What are you trying to use data science to do? What decision are you informing?
  • Prior Knowledge: What do you know or assume beforehand? Why?
  • Social Relationship: How may you interact with other stakeholders when conducting your analysis?
  • Models: How may you represent your objects of analysis?
  • Choices: What options exist for the decision you are informing?
  • Outcomes: Given these choices, what are the outcomes of deciding on a given choice?

  1. College Scorecard Data: https://collegescorecard.ed.gov/data/documentation/

Possible Question(s):

  • Have 4-year universities in the US made an effort to diversify their campuses through their admissions decisions?
  • (Sopheak’s preferred question) To what extent does household income affect a college's decision to admit a student?

We hope to use College Scorecard data to analyze the factors colleges consider when admitting students. These factors include household income, a student's location relative to the college, student demographics, and standardized exam scores. By taking a holistic view of the yearly admission demographics of all or some colleges, we could identify their admission trends and the student demographics they take into account when admitting students. For example, do prestigious schools weigh a student's SAT/ACT scores more heavily than their background (household income)? Does a school make any substantial effort to increase the admission of POC students or to diversify its campus? We can learn these admission trends and share them with students as insight to supplement their college application and exploration process.
We assume that colleges, private or public, pay close attention to how they spend their annual budget. With that in mind, their spending on things such as student resources, class sizes, and building renovations may be tied to their revenue. Therefore, colleges may choose to accept students who are more likely to bring in revenue (through tuition or future donations) over stronger applicants from lower-income families.
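
As a first pass at the income question, one option is to check whether a school's admission rate moves together with the typical family income of its students. The sketch below is only an illustration under several assumptions: the column names `ADM_RATE` and `MD_FAMINC` are ones we believe appear in the Scorecard data dictionary but should be verified against the documentation linked above, the sample rows are invented, and Pearson correlation is our choice of measure rather than anything the data prescribes. As far as we understand, the Scorecard reports one row per institution, so this can only describe school-level patterns and not how any individual applicant was evaluated.

```python
import csv
import io
import math

# Hypothetical rows invented for illustration; these are NOT real Scorecard values.
# Column names are assumed and should be checked against the data dictionary.
SAMPLE_CSV = """INSTNM,ADM_RATE,MD_FAMINC
College A,0.15,95000
College B,0.40,70000
College C,0.65,52000
College D,0.80,41000
College E,NULL,38000
College F,0.55,PrivacySuppressed
"""


def to_float(value):
    """Return value as a float, or None when it is missing or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def load_pairs(csv_text, x_col, y_col):
    """Collect (x, y) pairs from rows where both columns hold numbers."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        x = to_float(row.get(x_col))
        y = to_float(row.get(y_col))
        if x is not None and y is not None:
            pairs.append((x, y))
    return pairs


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete rows")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("a column with no variation has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


pairs = load_pairs(SAMPLE_CSV, "MD_FAMINC", "ADM_RATE")
r = pearson(pairs)
print(f"{len(pairs)} complete rows, correlation = {r:.2f}")
```

A correlation at the school level would not by itself show that colleges favor higher-income applicants; it would only tell us whether the question is worth a closer look.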

  2. Q: How realistic is the match percentage on Netflix?

We want to use data science to examine whether the match ratings on Netflix are realistic, that is, whether they surface TV shows and movies that the user would actually enjoy based on the TV shows and movies they previously liked. We know that Netflix has a matching system, and we are assuming that every new TV show or movie shown on the home screen under each genre, as well as under “Suggestions for You”, has been chosen based on what the user previously watched. We will try to contact Netflix and ask whether they would be willing to share their algorithm for educational purposes. We would also survey users, asking whether the suggested TV shows and movies were things they would want to watch.
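
If the survey records the match percentage shown for each suggested title along with the respondent's answer, one simple check is whether high-match titles are liked more often than low-match ones. The sketch below assumes a hypothetical survey format and an arbitrary cutoff of 80; the responses are invented and nothing here reflects how Netflix actually computes its match score.

```python
# Hypothetical survey responses, invented for illustration:
# (match percentage shown, whether the respondent said they would watch it)
RESPONSES = [
    (95, True), (90, True), (85, False), (80, True),
    (70, False), (65, True), (55, False), (50, False),
]

HIGH_MATCH_CUTOFF = 80  # arbitrary threshold chosen for this sketch


def liked_share(responses, low, high):
    """Share of positive answers among responses with low <= match < high.

    Returns None when no response falls in the range.
    """
    selected = [liked for match, liked in responses if low <= match < high]
    if not selected:
        return None
    return sum(selected) / len(selected)


high_share = liked_share(RESPONSES, HIGH_MATCH_CUTOFF, 101)
low_share = liked_share(RESPONSES, 0, HIGH_MATCH_CUTOFF)
print(f"high-match liked share: {high_share}, low-match liked share: {low_share}")
```

If the two shares turn out to be similar in real survey data, that would suggest the match percentage carries little information about what users want to watch.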

  3. Q: Do restaurants in affluent areas get better reviews and ratings?

The goal of this question is to find out whether an area's socioeconomic status affects the reviews of its restaurants. In lower-income communities, restaurant owners often do not have the budget to improve their store's appearance for a better customer experience. Food review sites such as Yelp and TripAdvisor list each business's location, business type, and average rating, all of which can be used in the analysis. The results will help us understand whether the socioeconomic status of an area affects the ratings of its restaurants.
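
One way to set up this comparison is to group restaurants by the income level of their area and compare average ratings across groups. The sketch below is a minimal illustration with invented records. It assumes each restaurant has already been matched to a median household income for its area, which would have to come from a separate source since review sites provide only the location, and the bracket cutoffs are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration. The income figure is assumed
# to have been joined in from a separate source using the restaurant's location.
RESTAURANTS = [
    {"name": "Restaurant A", "area_median_income": 38000, "rating": 3.5},
    {"name": "Restaurant B", "area_median_income": 45000, "rating": 4.0},
    {"name": "Restaurant C", "area_median_income": 62000, "rating": 4.0},
    {"name": "Restaurant D", "area_median_income": 80000, "rating": 3.0},
    {"name": "Restaurant E", "area_median_income": 120000, "rating": 4.5},
    {"name": "Restaurant F", "area_median_income": 150000, "rating": 3.5},
]

# Hypothetical bracket cutoffs in USD: (lower bound, upper bound, label)
INCOME_BRACKETS = [
    (0, 50000, "lower"),
    (50000, 100000, "middle"),
    (100000, float("inf"), "higher"),
]


def bracket_for(income):
    """Return the label of the bracket whose range contains the income."""
    for low, high, label in INCOME_BRACKETS:
        if low <= income < high:
            return label
    raise ValueError(f"income {income} does not fall in any bracket")


def mean_rating_by_bracket(records):
    """Average the ratings of all restaurants within each income bracket."""
    grouped = defaultdict(list)
    for record in records:
        grouped[bracket_for(record["area_median_income"])].append(record["rating"])
    return {label: mean(ratings) for label, ratings in grouped.items()}


summary = mean_rating_by_bracket(RESTAURANTS)
for label, average in summary.items():
    print(f"{label}: {average:.2f}")
```

With real data, the business type could be added as a second grouping key so that, for example, fast food and fine dining are not compared directly.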