To view the HTML report in your browser, open https://htmlpreview.github.io/ and paste this link: https://github.com/i-a-m-s-k/Cardiovascular-Diseases-Analysis/blob/main/SI544_Final_Project.html
This project uses the Behavioral Risk Factor Surveillance System (BRFSS) data set, one of the data sets provided by the National Cardiovascular Disease (CVD) Surveillance System. Another data set in that system comes from the National Health Interview Survey (NHIS), which has monitored the health of the nation since 1957. The BRFSS data set aims to provide a comprehensive picture of the public health burden of CVDs and associated risk factors in the United States. The data are organized by location (region) and indicator, and they cover both CVDs (e.g., heart failure) and risk factors (e.g., hypertension).
Reference: Data provided by the Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Division for Heart Disease and Stroke Prevention (DHDSP), National Cardiovascular Disease Surveillance System. https://chronicdata.cdc.gov/Heart-Disease-Stroke-Prevention/Behavioral-Risk-Factor-Surveillance-System-BRFSS-N/ikwk-8git Accessed on: 1 December 2021
The purpose of this project is to explore and analyze the Behavioral Risk Factor Surveillance System (BRFSS) data set provided by the Centers for Disease Control and Prevention (CDC). The analysis examines the relationships among cardiovascular disease indicators and risk-factor indicators, looking for trends, patterns, and associations that help characterize the public health burden of CVDs and associated risk factors in the United States.
The analysis begins by importing the BRFSS data set into R and cleaning the data by removing rows with missing values. The cleaned data set is then merged with another data set containing category scores in binary format, where 1 denotes "cardiovascular diseases" and 0 denotes "risk factors".
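The cleaning and merging steps described above can be sketched in base R roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`LocationAbbr`, `Category`, `Data_Value`, `CategoryScore`) and all values are assumptions made up for the example, and the real analysis would read the downloaded BRFSS file instead of building a toy data frame.

```r
# Toy stand-in for the BRFSS extract. Column names and values are
# hypothetical; they are not taken from the real data set.
brfss <- data.frame(
  LocationAbbr = c("MI", "OH", "MI", "TX", "OH"),
  Category     = c("Cardiovascular Diseases", "Risk Factors",
                   "Risk Factors", "Cardiovascular Diseases",
                   "Risk Factors"),
  Data_Value   = c(4.2, 31.5, NA, 3.9, 29.8),
  stringsAsFactors = FALSE
)

# Remove every row that contains at least one missing value.
brfss_clean <- na.omit(brfss)

# Lookup table holding the binary category score:
# 1 = cardiovascular diseases, 0 = risk factors.
category_scores <- data.frame(
  Category      = c("Cardiovascular Diseases", "Risk Factors"),
  CategoryScore = c(1L, 0L),
  stringsAsFactors = FALSE
)

# Attach the score to each cleaned row by matching on Category.
brfss_scored <- merge(brfss_clean, category_scores, by = "Category")

print(brfss_scored)
```

Note that `na.omit()` drops a row if any column is missing, which matches "removing rows with missing values" but can discard more data than filtering on a single column would.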
The analysis includes exploratory data analysis, which involves visualizing the data using plots and summarizing the data using descriptive statistics. The plots include bar plots, scatter plots, and box plots, which are used to identify relationships between various factors related to cardiovascular diseases and risk factors.
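As a rough sketch of this exploratory step, the base R snippet below computes descriptive statistics per group and draws a box plot and a bar plot. The data frame `eda` and its columns `CategoryScore` and `Data_Value` are assumed names with invented values, used only to show the shape of the workflow.

```r
# Hypothetical cleaned data: CategoryScore is 1 for cardiovascular
# diseases and 0 for risk factors; Data_Value holds made-up percentages.
eda <- data.frame(
  CategoryScore = c(1, 1, 1, 0, 0, 0),
  Data_Value    = c(4.2, 3.9, 5.1, 31.5, 29.8, 33.0)
)

# Descriptive statistics for the whole column and the mean per group.
overall_summary <- summary(eda$Data_Value)
group_means <- tapply(eda$Data_Value, eda$CategoryScore, mean)
print(overall_summary)
print(group_means)

# Box plot of the value distribution in each group
# (groups are ordered 0, then 1).
boxplot(Data_Value ~ CategoryScore, data = eda,
        names = c("Risk factors", "CVDs"),
        xlab = "Category", ylab = "Data value")

# Bar plot of the group means.
barplot(group_means,
        names.arg = c("Risk factors", "CVDs"),
        ylab = "Mean data value")
```

When run non-interactively with `Rscript`, base graphics are written to a PDF file instead of being shown on screen.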
The analysis also includes inferential statistics, namely hypothesis testing and confidence interval estimation. Hypothesis testing checks whether the difference between two groups is statistically significant, while confidence interval estimation gives a range of plausible values for the population parameter of interest.
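One way to illustrate both ideas in base R is with `t.test()`, which reports a p-value and a confidence interval in a single call. The sketch below assumes a two-sample comparison of means using Welch's t-test (the default in `t.test()`); the source does not say which test the project uses, and the vectors `cvd` and `risk` contain invented values.

```r
# Made-up values for two groups; not taken from the BRFSS data.
cvd  <- c(4.2, 3.9, 5.1, 4.6, 4.0)
risk <- c(31.5, 29.8, 33.0, 30.2, 32.1)

# Two-sample Welch t-test. Null hypothesis: the two population
# means are equal.
result <- t.test(risk, cvd, conf.level = 0.95)

# p-value for the hypothesis test.
print(result$p.value)

# 95% confidence interval for the difference in means (risk - cvd).
print(result$conf.int)

# One-sample call: 95% confidence interval for the mean of one group.
ci_risk <- t.test(risk, conf.level = 0.95)$conf.int
print(ci_risk)
```

If the confidence interval for the difference excludes zero, the corresponding two-sided test rejects the null hypothesis at the matching significance level, so the two outputs tell a consistent story.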
Overall, the analysis provides insights into the public health burden of CVDs and associated risk factors in the United States and can be used to inform public health policies and interventions aimed at reducing the burden of CVDs and associated risk factors.