Mid-term Peer Review -- zh378 #10

Open
zxhuang96 opened this issue Nov 11, 2019 · 0 comments

This project aims to use big data techniques to predict whether a building permit will be issued and how many days a permit application takes to move from the filing stage to the issued stage. The data sets used are the SF Permit Data and the NYC Permit Data, both of which contain a large number of records and many useful features. The results of the project could benefit several groups, such as homeowners, property developers, and real estate agents.

I like many aspects of the project and the team’s achievements. First, the team discussed the physical limitations of the data sets in detail, along with their plans to overcome these challenges. Second, the team produced nice visualizations of the data sets and analyzed the plots clearly. Third, the team has run solid preliminary analyses of the data and has proposed promising plans to improve the results.

Here are several suggestions that I hope can be useful for the team:

  • A clearer and more detailed description of the data sets would help, such as how many features are present and what type each feature is.
  • The report mentions that about 1300 data points are missing location data, so it would be advisable for the team to discuss how they plan to handle these missing values.
  • The team mentioned at the end of the report that some additional data sets, such as crime rate data and transportation access data, will be needed. I would recommend that the team explain how they plan to obtain these data and what challenges might arise in using them.
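
To make the first two suggestions concrete, here is a minimal sketch of the kind of per-feature summary that could accompany the data description: for each column, a coarse inferred type and a count of missing values. The column names and the three sample rows are made up for illustration and are not taken from the SF or NYC permit data, and the helpers `infer_type` and `summarize_columns` are hypothetical names, not part of the team's code.

```python
import csv
import io


def infer_type(values):
    """Guess a coarse type for a column from its non-missing string values."""
    non_missing = [v for v in values if v.strip() != ""]
    if not non_missing:
        return "empty"
    # Try the strictest type first: every value must parse for the type to apply.
    for caster, name in ((int, "integer"), (float, "float")):
        try:
            for v in non_missing:
                caster(v)
            return name
        except ValueError:
            continue
    return "text"


def summarize_columns(rows):
    """Return {column: (inferred_type, missing_count)} for a list of dict rows."""
    summary = {}
    if not rows:
        return summary
    for column in rows[0].keys():
        values = [(row[column] or "") for row in rows]
        missing = sum(1 for v in values if v.strip() == "")
        summary[column] = (infer_type(values), missing)
    return summary


# Made-up sample rows (assumption): empty fields stand in for missing values.
SAMPLE_CSV = """permit_id,permit_type,estimated_cost,latitude,longitude
1,alteration,5000,37.77,-122.42
2,new construction,250000.5,,
3,alteration,,37.80,-122.41
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_columns(rows)
for column, (kind, missing) in summary.items():
    print(f"{column}: type={kind}, missing={missing} of {len(rows)}")
```

A table like this, run on the real files, would answer how many features there are and what type each one is, and it would show at a glance which columns (such as the location fields) need a stated missing-value strategy.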