This is a machine learning project on predicting housing prices in Boston, the capital of Massachusetts in the United States. The dataset (Boston Housing Price) was taken from the StatLib library, which is maintained at Carnegie Mellon University. It is freely available for download from the UCI Machine Learning Repository, and it can also be loaded directly from the scikit-learn library in versions before 1.2 (the load_boston loader was removed in scikit-learn 1.2). The dataset consists of 506 observations of 14 attributes: 13 features and one outcome. The median value of owner-occupied homes in $1000s, denoted by MEDV, is the outcome (dependent variable) in our model. Below is a brief description of each feature and the outcome in our dataset.

Variables:
- 1.CRIM – per capita crime rate by town
- 2.ZN – proportion of residential land zoned for lots over 25,000 sq. ft.
- 3.CHAS – Charles River dummy variable (1 if tract bounds river; else 0)
- 4.NOX – nitric oxides concentration (parts per 10 million)
- 5.RM – average number of rooms per dwelling
- 6.AGE – proportion of owner-occupied units built prior to 1940
- 7.DIS – weighted distances to five Boston employment centres
- 8.RAD – index of accessibility to radial highways
- 9.INDUS – proportion of non-retail business acres per town
- 10.TAX – full-value property-tax rate per $10,000
- 11.PTRATIO – pupil-teacher ratio by town
- 12.B – 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- 13.LSTAT – % lower status of the population
- 14.MEDV – median value of owner-occupied homes in $1000s
This project uses the following software and libraries:
- Jupyter Notebook
- Python
- numpy
- pandas
- matplotlib
- seaborn
- scikit-learn
The following models are implemented in this project:
- A Linear Regression model written from scratch, with gradient descent as the optimization technique and R-squared (R² score) and Root Mean Squared Error (RMSE) as evaluation metrics.
- LinearRegression model from the scikit-learn library.
- DecisionTree Regression model from the scikit-learn library.
- Random Forest Regression model from the scikit-learn library.
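The from-scratch approach in the first item above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names (`standardize`, `fit_linear_regression_gd`, `rmse`, `r2_score`), the learning rate, the iteration count, and the synthetic data standing in for the Boston features are all assumptions made for the example.

```python
import numpy as np


def standardize(X):
    """Scale each column to zero mean and unit variance so gradient descent converges quickly."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def fit_linear_regression_gd(X, y, learning_rate=0.1, n_iters=2000):
    """Fit y ~ X @ weights + bias by batch gradient descent on the mean squared error.

    learning_rate and n_iters are illustrative defaults, not tuned values.
    """
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(n_iters):
        residuals = X @ weights + bias - y
        # Gradients of mean((X @ w + b - y) ** 2) with respect to w and b.
        grad_w = (2.0 / n_samples) * (X.T @ residuals)
        grad_b = 2.0 * residuals.mean()
        weights -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weights, bias


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-in data (NOT the Boston dataset): 200 rows, 3 features.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(200, 3))
y = X_raw @ np.array([3.0, -2.0, 0.5]) + 22.0 + rng.normal(scale=0.1, size=200)

X = standardize(X_raw)
weights, bias = fit_linear_regression_gd(X, y)
y_pred = X @ weights + bias

print("RMSE:", rmse(y, y_pred))
print("R2:", r2_score(y, y_pred))
```

Standardizing the features first matters here: columns such as TAX and NOX live on very different scales, and without scaling a single learning rate tends to either diverge or converge very slowly.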
The aim is to analyze the Boston Housing data, build the model that performs best on it, and predict the value of a house given its features.
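One way to pick the best-performing model is to fit each scikit-learn regressor on the same train/test split and compare RMSE and R² on the held-out rows. The sketch below assumes synthetic data in place of the Boston features, an 80/20 split, and default hyperparameters apart from `random_state` and `n_estimators`; none of these choices are taken from the notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the Boston dataset): 300 rows, 4 features.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = 22.0 + X @ np.array([4.0, -3.0, 1.5, 0.0]) + rng.normal(scale=0.5, size=300)

# Hold out 20% of the rows for evaluation (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "LinearRegression": LinearRegression(),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=42),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "rmse": float(np.sqrt(mean_squared_error(y_test, y_pred))),
        "r2": float(r2_score(y_test, y_pred)),
    }

# Lower RMSE on the held-out rows is treated as "better" in this sketch.
best_model = min(results, key=lambda name: results[name]["rmse"])

for name, scores in results.items():
    print(f"{name}: RMSE={scores['rmse']:.3f}, R2={scores['r2']:.3f}")
print("Best by RMSE:", best_model)
```

The ranking on this synthetic, purely linear data says nothing about the ranking on the real dataset; the point is only the comparison procedure.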
- You can find the complete code and details in the .ipynb notebook.