Analysis of flight delay data using logistic regression, random forest, and k-nearest neighbors classification
A large travel agency has asked us to predict whether a flight will be canceled based on several factors. The agency can sell tickets for only three airlines (AA, UA, and DL) and would like to advise its customers on which airline has the lowest risk of cancellation. Using the dataset provided:
- Build a model to predict whether a flight will be canceled.
- Write your own function that uses the model output to predict whether a future flight will be canceled.
- Provide fully commented code and model output for your analysis.
- Provide a recommendation on which airline is most reliable.
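
As a rough illustration of the second task, the sketch below shows one way a hand-written function could turn logistic regression output (an intercept plus coefficients) into a cancellation prediction. Everything specific here is an assumption made for the example: the coefficient values are placeholders rather than fitted results, the `distance_100mi` feature is hypothetical, and AA is arbitrarily treated as the baseline carrier. Substitute the terms and values from your own fitted model.

```python
import math

# HYPOTHETICAL coefficients: placeholders for illustration only, NOT fitted
# to the agency's dataset. Replace them with the output of your own model.
EXAMPLE_COEFFICIENTS = {
    "intercept": -3.0,
    "carrier_UA": 0.4,        # AA is assumed to be the baseline carrier
    "carrier_DL": -0.2,
    "distance_100mi": -0.01,  # assumed feature: distance in hundreds of miles
}

CARRIERS = ("AA", "UA", "DL")


def predict_cancellation(carrier, distance_miles,
                         coefficients=EXAMPLE_COEFFICIENTS, threshold=0.5):
    """Return (probability, is_canceled) for a single future flight."""
    if carrier not in CARRIERS:
        raise ValueError(f"unknown carrier: {carrier!r}")

    # Linear predictor (log-odds): intercept + carrier effect + distance effect.
    log_odds = coefficients["intercept"]
    log_odds += coefficients.get(f"carrier_{carrier}", 0.0)  # baseline adds 0
    log_odds += coefficients["distance_100mi"] * distance_miles / 100.0

    # The logistic (sigmoid) function maps log-odds to a probability in (0, 1).
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, probability >= threshold


if __name__ == "__main__":
    # Compare the three carriers on an otherwise identical flight.
    for carrier in CARRIERS:
        prob, canceled = predict_cancellation(carrier, distance_miles=500)
        print(f"{carrier}: P(cancel) = {prob:.3f}, predicted canceled = {canceled}")
```

Holding the other inputs fixed and varying only the carrier, as in the loop at the end, is one simple way to compare airlines for the final recommendation. Note that a 0.5 threshold may rarely predict a cancellation if cancellations are uncommon in the data, so the threshold is left as a parameter.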