suhyun0115/ml_project

Machine Learning Project

A machine learning analysis project for predicting diabetes.


๐Ÿ‘จโ€๐Ÿ‘งโ€๐Ÿ‘ฆ Team DBDBDeep ์†Œ๊ฐœ

์กฐ์„œํ˜„ ๊น€์œ ์ง„ ์ด์ˆ˜ํ˜„

์กฐ์„œํ˜„ / ๊น€์œ ์ง„ / ์ด์ˆ˜ํ˜„


🗒️ Table of Contents (INDEX)

   Ⅰ. Project Concept and Analysis Libraries
   Ⅱ. Project Direction
   Ⅲ. DataSets & Analysis Variables
   Ⅳ. Data Preprocessing (dataset information and processing)
   Ⅴ. Machine-Learning (model information)
   Ⅵ. Final Model
   Ⅶ. Service Deployment


INDEX. Ⅰ Project Concept & Analysis Libraries

Concept

[Predicting diabetes from environmental factors (lifestyle habits)]

📚 Skills

  • Programming
  • Framework
  • Tools
  • Git
import pandas
import numpy
import sklearn
import streamlit
import joblib
import wordcloud

INDEX. Ⅱ Project Direction

  • Analysis of the causes of diabetes onset

    1) Genetic causes

    ○ Genetic defects in pancreatic beta cells
    ○ Genetic defects in the insulin receptor
    ○ Genes that reduce insulin action

    2) Environmental causes

    ○ Stress
    ○ Aging
    ○ Obesity
    ○ Lack of exercise
    ○ Infection
    ○ Trauma
    ○ Surgery
    ○ Pregnancy and medication
    ○ Poor diet

INDEX. Ⅲ DataSets & Analysis Variables

  • DataSets

  • Preliminary data analysis

    • To narrow down the columns to analyze, missing values were filled with the fillna() function -> replaced with 0

    [Figure: fillna]

  • Analysis variables

[Figure: NHIS_2018]

# Select the variables for the diabetes analysis
import pandas as pd
df_a = pd.read_csv('samadult.csv')
df_a = df_a[['SEX','AGE_P','R_MARITL','DIBEV1','HYPEV','PREGNOW','DEP_2','AFLHCA18','BMI',
            'AFLHC29_','AFLHC31_','AFLHC32_','AFLHC33_','SMKEV','ALC1YR','CHLEV','VIGNO',
            'AUSUALPL','ASICNHC','HIT1A']]
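
The fillna() step mentioned under the preliminary data analysis can be sketched roughly as follows. This is only an illustration: `df_demo` and its values are invented stand-ins for a few columns of samadult.csv, and counting the missing values first is an added convenience rather than something the README states.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few columns of samadult.csv; the values are invented.
df_demo = pd.DataFrame({
    "DIBEV1": [1, 2, np.nan, 2],
    "BMI": [27.1, np.nan, 31.5, 22.8],
    "PREGNOW": [np.nan, np.nan, 2, np.nan],
})

# Count missing values per column first, to see which columns are sparse.
missing_before = df_demo.isna().sum()

# Replace every missing value with 0, as described in the README.
df_filled = df_demo.fillna(0)

print(missing_before)
print(df_filled)
```

Note that filling with 0 makes "missing" indistinguishable from a real 0, so it is probably best suited to screening columns rather than to final model inputs.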

[Figure: diabetes_age_count3] [Figure: cholesterol]

[Figure: diabetes_age_sex]

INDEX. Ⅳ Data Preprocessing (dataset information and processing)

  • Columns used

    • We judged that environmental and lifestyle factors would affect the incidence of diabetes
    • Diabetes prediction is carried out per environmental factor, including demographic factors such as sex and age

  • Data preprocessing

  1. Create the base DataFrame
  2. EDA (exploratory data analysis)
  3. Standardize survey answers to the codes 1/2
  4. Clean null values and outliers
  5. Restructure column names
    • Reorganized into df_01, df_02, df_03, df_04, df_05, df_06
  6. Scale the data and save it as CSV
    • one-hot encoding
    • Suffix _1 renamed to _yes
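
A minimal sketch of steps 3, 4 and 6 above, under several assumptions: the data in `raw` is invented, answers other than 1/2 are treated as invalid and dropped, min-max scaling is used because the README does not name a scaler, and the CSV is written to an in-memory buffer instead of a file because the output file name is not given.

```python
import io

import pandas as pd

# Hypothetical survey rows; codes other than 1/2 stand for invalid answers.
raw = pd.DataFrame({
    "DIBEV1": [1, 2, 2, 9, 1, 2],
    "SMKEV": [1, 1, 2, 2, 7, 2],
    "AGE_P": [34, 61, 47, 52, 70, 29],
})
yes_no_cols = ["DIBEV1", "SMKEV"]

# Steps 3-4: keep only rows whose yes/no answers are exactly 1 or 2.
valid = raw[yes_no_cols].isin([1, 2]).all(axis=1)
clean = raw.loc[valid].copy()

# Step 6: one-hot encode a categorical answer, then rename the _1 suffix to _yes.
encoded = pd.get_dummies(clean, columns=["SMKEV"], dtype=int)
encoded = encoded.rename(columns=lambda c: c[:-2] + "_yes" if c.endswith("_1") else c)

# Min-max scale the numeric column to [0, 1] (the choice of scaler is an assumption).
age = encoded["AGE_P"]
encoded["AGE_P"] = (age - age.min()) / (age.max() - age.min())

# Save as CSV; a StringIO buffer stands in for the real output file.
buf = io.StringIO()
encoded.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```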

INDEX. Ⅴ Machine-Learning (model information)

  1. SVC
  2. Decision Tree
  3. KNN
  4. AdaBoost
  5. Naive Bayes
  6. Random Forest
  7. XGBoost
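
A comparison of the classifiers listed above could look roughly like the sketch below. It is not the project's actual experiment: the data comes from scikit-learn's `make_classification` rather than the NHIS file, all hyperparameters are arbitrary, `GaussianNB` is assumed as the Naive Bayes variant, and XGBoost is included only if the `xgboost` package happens to be installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the preprocessed survey data.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "SVC": SVC(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "NaiveBayes": GaussianNB(),
    "RandomForest": RandomForestClassifier(random_state=0),
}

# XGBoost lives in a separate package; skip it when it is not available.
try:
    from xgboost import XGBClassifier
    models["XGBoost"] = XGBClassifier(n_estimators=50, random_state=0)
except ImportError:
    pass

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>13}: {acc:.3f}")
```

Accuracy on a single split is a coarse yardstick; for imbalanced labels such as a diabetes diagnosis, recall or F1 would likely be more informative.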

INDEX. Ⅵ Final Model

  • AdaBoost

image
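
Since the library list includes joblib, one plausible way to carry the final AdaBoost model over to the service is to persist the fitted estimator and reload it later. The sketch below assumes exactly that; the synthetic data, the hyperparameters and the file name `adaboost_demo.joblib` are all hypothetical.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic data standing in for the preprocessed training set.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Fit the final model (hyperparameters here are illustrative only).
final_model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)

# Persist the fitted model and load it back, as a serving app would.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "adaboost_demo.joblib")  # hypothetical file name
    joblib.dump(final_model, model_path)
    restored = joblib.load(model_path)

# The restored model should reproduce the original predictions.
same_predictions = bool((restored.predict(X) == final_model.predict(X)).all())
print(same_predictions)
```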

INDEX. Ⅶ Service Deployment

image
