dev : roughly put together a neural network

Jyeo-Archive · May 9, 2018 · c48f982
1 parent f10b176
commit c48f982

Showing 4 changed files with 357 additions and 2 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,3 @@
+.vscode/
+
+__pycache__/
diff --git a/README.md b/README.md
@@ -1,2 +1,141 @@
-# da-vinci-code-ai
-A Da Vinci Code board game AI I made after getting mad about losing to a friend
+# Monorial
+Monorial (모노리얼) is an AI for the [Coda](https://en.wikipedia.org/wiki/Coda_(board_game)) game.
+
+Coda is commonly known as Da Vinci Code (다빈치 코드).
+
+## Gameplay of Coda
+
+- https://en.wikipedia.org/wiki/Coda_(board_game)
+- https://namu.wiki/w/%EB%8B%A4%EB%B9%88%EC%B9%98%20%EC%BD%94%EB%93%9C(%EB%B3%B4%EB%93%9C%20%EA%B2%8C%EC%9E%84)
+
+## Implementation
+
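+### Basic idea
+Machine learning needs...
+
+- data
+- output
+- target function
+- algorithm to minimize loss
+
+In our application...
+
+- data : the state of the tiles on the board, as currently known to the AI
+- output : the result of reasoning about one of the opponent's tiles
+- target function : takes the current tile state of the board as input and returns a prediction for one of the opponent's tiles.
+- algorithm to minimize loss : did the guess at the opponent's tile succeed?
+
+Some players take a long-term view of the game and use strategies such as bluffing, but that is not considered here.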
+
+### Things to consider
+
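+- When a joker is drawn, the AI also has to decide for itself where to place it
+- It has to choose which tile is the most advantageous one to reason about
+
+### End-to-end machine learning
+Leaving only the joker placement problem for later, the basic approach is end-to-end machine learning (the output is obtained without human intervention), and the training data will come from having two instances of the program duel each other.
+
+So the development order is...
+
+1. neural network (initial weights set randomly, scaled appropriately)
+2. input data, get output
+3. compare the output with the answer (the opponent's tile), get loss
+4. compute the gradient and update
+5. automate steps 1~4 so that the programs learn by playing each other
+6. profit!!! (will I really get this far?)
+7. if it works, a GUI (or maybe run a web server)
+
+Whatever... for now, let's just hack it all together!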
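Steps 2 to 4 of the list above can be sketched with a single-layer softmax classifier. This is only an illustration under stated assumptions, not the project's network: the random data, the names `X`, `labels`, `W`, `b`, and the learning rate are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: 64 board states (26 values in {0, 1, 2})
# and, for each state, the index of the opponent tile that was correct.
X = rng.integers(0, 3, size=(64, 26)).astype(float)
labels = rng.integers(0, 26, size=64)

W = np.zeros((26, 26))   # weights (zeros are fine for a single layer)
b = np.zeros(26)         # biases
learning_rate = 0.01     # assumed value, small enough for stable descent


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def loss_and_grads(X, labels, W, b):
    n = X.shape[0]
    y = softmax(X @ W + b)                           # step 2: get output
    loss = -np.log(y[np.arange(n), labels]).mean()   # step 3: get loss
    dz = y.copy()
    dz[np.arange(n), labels] -= 1                    # softmax output minus one-hot target
    dz /= n
    return loss, X.T @ dz, dz.sum(axis=0)


losses = []
for _ in range(100):
    loss, dW, db = loss_and_grads(X, labels, W, b)
    losses.append(loss)
    W -= learning_rate * dW                          # step 4: update
    b -= learning_rate * db

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

With all-zero parameters the first loss equals `ln(26)`, the loss of a uniform guess over 26 tiles, and it should fall from there as the updates are applied.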
+
+## Development
+
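+### Things about deep learning
+This is by the standard of my own ignorant self.
+
+1. From the overview alone I feel like I know what it's about, but a few pages later formulas and symbols I have never seen or heard of show up
+2. I can't get a feel for how to actually apply it
+
+### Coding an AI the hacky way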
+```Python
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+```
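+For a start, I just went ahead and created an `app.py` file.
+```Python
+# 1. neural network (initial weights set randomly, scaled appropriately)
+# 2. input data, get output
+# 3. compare the output with the answer (the opponent's tile), get loss
+# 4. compute the gradient and update
+# 5. automate steps 1~4 so that the programs learn by playing each other
+# 6. profit!!! (will I really get this far?)
+# 7. if it works, a GUI (or maybe run a web server)
+```
+I put the development order from above in as comments.
+
+Thinking about it, even step 1 is very hard to implement...
+
+I tried to use the example code from <밑바닥부터 시작하는 딥러닝> (Deep Learning from Scratch), but that is not easy either.
+
+I'll drop the idea of taking shortcuts and just read from the beginning and write it myself.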
+
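+### Array shapes of each layer in the neural network
+
+![picture 1](images/pic_1.png)
+
+The input is the current tile state, made up of 26 elements covering every tile, `(0~11 + joker)*(black & white)=13*2=26`,
+
+and the output is likewise a one-dimensional array of 26 elements.
+
+For now, let's start by building a 3-layer network like the one in the figure above.
+
+Each element of the input and output takes one of the three values below, depending on its state.
+
+- `0 (state unknown)`
+- `1 (tile held by the AI)`
+- `2 (tile held by the opponent)`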
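To make the encoding concrete, here is a small sketch that builds the 26-element state vector. The index layout (black 0 to 11, black joker, then white 0 to 11, white joker) and the helper names `tile_index` and `encode_state` are assumptions for illustration; the README does not fix an ordering.

```python
import numpy as np

UNKNOWN, MINE, OPPONENT = 0, 1, 2   # the three states listed above
JOKER = 12                          # assumed rank used for the joker
COLORS = ("black", "white")


def tile_index(color, rank):
    """Map (color, rank) to a position 0..25; rank is 0..11 or JOKER."""
    return COLORS.index(color) * 13 + rank


def encode_state(my_tiles, known_opponent_tiles):
    """Return the 26-element state vector for the given tiles."""
    state = np.full(26, UNKNOWN, dtype=np.int64)
    for color, rank in my_tiles:
        state[tile_index(color, rank)] = MINE
    for color, rank in known_opponent_tiles:
        state[tile_index(color, rank)] = OPPONENT
    return state


state = encode_state(
    my_tiles=[("black", 3), ("white", JOKER)],
    known_opponent_tiles=[("white", 0)],
)
print(state.shape)                      # (26,)
print(state[3], state[25], state[13])   # 1 1 2
```

Reshaping the result with `state.reshape(1, -1)` gives the `(1, 26)` batch shape that a layer computing `x @ w + b` expects.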
+
+### Implemented the shapes
+```Python
+>>> net = Network(26, [50, 100], 26)
+>>> net.params['w1'].shape
+(26, 50)
+>>> net.params['w2'].shape
+(50, 100)
+>>> net.params['w3'].shape
+(100, 26)
+```
+This looks like it should do.
+
+```
+w1 shape : (26, 50)
+L b1 shape : (50,)
+w2 shape : (50, 100)
+L b2 shape : (100,)
+w3 shape : (100, 26)
+L b3 shape : (26,)
+```
+Including the biases, it comes out like this.
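The shapes line up because each layer computes `x @ w + b`, so the column count of one weight matrix must equal the row count of the next. A quick check with plain NumPy, independent of the `Network` class (the random arrays and the `forward` helper are stand-ins invented for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [26, 50, 100, 26]   # input, two hidden layers, output

# One (w, b) pair per layer, shaped as in the listing above.
params = [
    (0.01 * rng.standard_normal((n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]


def forward(x, params):
    """Affine layers with ReLU in between; returns the raw output scores."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0)   # ReLU on hidden layers only
        print(f"after layer {i + 1}: {x.shape}")
    return x


batch = rng.integers(0, 3, size=(4, 26)).astype(float)   # 4 hypothetical states
scores = forward(batch, params)
```

For a batch of 4 states the intermediate shapes are `(4, 50)`, `(4, 100)` and `(4, 26)`.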
+
+### Computing the gradient, but...
+```Python
+>>> from app import *
+>>> x = np.random.rand(1, 26)
+>>> t = np.random.rand(1, 26)
+>>> net = Network(26, [50, 100], 26)
+>>> grads = net.numerical_gradient(x, t)
+(error occurred)
+PS C:\Users\JunhoYeo\Documents\GitHub\monorial>
+```
+
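+Whoa!! When I used the numerical differentiation function from the book, the process was terminated with `Python의 작동이 중지되었습니다` ("Python has stopped working") :(
+
+I guess I have to use backpropagation after all...
+
+For now I used the book's example code and applied it to the 3-layer network.
+
+I'll commit for now and do the rest in the school computer lab.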
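One plausible cause of the crash, judging only from the commented-out method in `app.py`: `Network.numerical_gradient` calls `self.numerical_gradient(...)`, which is the method itself rather than a standalone helper, so it would recurse without end. The commented-out `sys.setrecursionlimit(10000)` line points the same way. A minimal sketch of a standalone central-difference helper follows; the name `numerical_gradient_of` and the test function are assumptions for this example.

```python
import numpy as np


def numerical_gradient_of(f, x, h=1e-4):
    """Central-difference gradient of the scalar function f at array x."""
    grad = np.zeros_like(x, dtype=float)
    for idx in np.ndindex(x.shape):
        original = x[idx]
        x[idx] = original + h
        f_plus = f(x)
        x[idx] = original - h
        f_minus = f(x)
        x[idx] = original              # restore the entry before moving on
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad


x = np.array([[1.0, -2.0], [0.5, 3.0]])
grad = numerical_gradient_of(lambda v: np.sum(v ** 2), x)
print(grad)   # close to 2 * x
```

Inside the class, the method would then call such a helper (for example `numerical_gradient_of(loss_W, self.params['w1'])`) instead of itself. Even then, numerical differentiation evaluates the loss twice per parameter, so backpropagation remains the faster choice for training.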
+
+## Reference
+- https://namu.wiki/w/%EA%B8%B0%EA%B3%84%ED%95%99%EC%8A%B5
+- http://sanghyukchun.github.io/76
+- https://github.com/WegraLee/deep-learning-from-scratch
diff --git a/app.py b/app.py
@@ -0,0 +1,213 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+# 1. neural network (initial weights set randomly, scaled appropriately)
+import sys
+import collections
+import numpy as np
+
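+# sys.setrecursionlimit(10000) # set the maximum recursion depth
+
+def sigmoid(x): # sigmoid function: activation function used in the hidden layers
+    return 1 / (1 + np.exp(-x))
+
+def softmax(a): # softmax function: output-layer activation, applied per row so batched input works
+    c = np.max(a, axis=-1, keepdims=True)  # subtract the row max for numerical stability
+    exp_a = np.exp(a - c)
+    return exp_a / np.sum(exp_a, axis=-1, keepdims=True)
+
+def cross_entropy_error(y, t): # cross-entropy error: the loss function (handles single samples and batches)
+    if y.ndim == 1:
+        t, y = t.reshape(1, t.size), y.reshape(1, y.size)
+    if t.size == y.size:  # one-hot targets: convert to class indices so the indexing below is valid
+        t = t.argmax(axis=1)
+    batch_size = y.shape[0]
+    return -np.sum(np.log(y[np.arange(batch_size), t.astype('int64')])) / batch_size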
+    '''
+    >>> t = [0, 0, 1]
+    >>> y = [0.5, 0.5]
+    >>> cross_entropy_error(np.array(y), np.array(t))
+    2.0794415416798357
+    '''
+    # Tested as shown above; it works normally
+
+class Relu:
+    def __init__(self):
+        self.mask = None
+
+    def forward(self, x):
+        self.mask = (x <= 0)
+        out = x.copy()
+        out[self.mask] = 0
+        return out
+
+    def backward(self, dout):
+        dout[self.mask] = 0
+        dx = dout
+        return dx
+
+class Affine:
+    def __init__(self, W, b):
+        self.W = W
+        self.b = b
+        self.x = None
+        self.original_x_shape = None
+        # gradients of the weight and bias parameters
+        self.dW = None
+        self.db = None
+
+    def forward(self, x):
+        # handle tensor (multi-dimensional) input
+        self.original_x_shape = x.shape
+        x = x.reshape(x.shape[0], -1)
+        self.x = x
+        out = np.dot(self.x, self.W) + self.b
+        return out
+
+    def backward(self, dout):
+        dx = np.dot(dout, self.W.T)
+        self.dW = np.dot(self.x.T, dout)
+        self.db = np.sum(dout, axis=0)
+        dx = dx.reshape(*self.original_x_shape)  # restore the shape of the input data (tensor support)
+        return dx
+
+class SoftmaxWithLoss:
+    def __init__(self):
+        self.loss = None # loss value
+        self.y = None    # output of softmax
+        self.t = None    # target labels (one-hot encoded)
+
+    def forward(self, x, t):
+        self.t = t
+        self.y = softmax(x)
+        self.loss = cross_entropy_error(self.y, self.t)
+        return self.loss
+
+    def backward(self, dout=1):
+        batch_size = self.t.shape[0]
+        if self.t.size == self.y.size: # when the target labels are one-hot encoded
+            dx = (self.y - self.t) / batch_size
+        else:
+            dx = self.y.copy()
+            dx[np.arange(batch_size), self.t] -= 1
+            dx = dx / batch_size
+        return dx
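The one-hot branch above relies on the identity that the gradient of softmax followed by cross-entropy, taken with respect to the pre-softmax scores, is `(y - t) / batch_size`. A small self-contained check against central differences (all names here are local to the example and are assumptions, not part of `app.py`):

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def loss(z, t):
    """Mean cross-entropy of softmax(z) against one-hot targets t."""
    return -np.sum(t * np.log(softmax(z))) / z.shape[0]


rng = np.random.default_rng(0)
z = rng.standard_normal((3, 26))              # scores for a batch of 3
t = np.eye(26)[rng.integers(0, 26, size=3)]   # one-hot targets

analytic = (softmax(z) - t) / z.shape[0]

numeric = np.zeros_like(z)
h = 1e-5
for idx in np.ndindex(z.shape):
    z_plus, z_minus = z.copy(), z.copy()
    z_plus[idx] += h
    z_minus[idx] -= h
    numeric[idx] = (loss(z_plus, t) - loss(z_minus, t)) / (2 * h)

print(np.max(np.abs(analytic - numeric)))   # expected to be very small
```

Agreement between the two arrays is the usual sanity check that a backward pass matches its forward pass.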
+
+class Network:
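+    # input_size : number of neurons in the input layer (layer 1)
+    # hidden_size : number of neurons in the hidden layers
+        # hidden_size[0] : number of neurons in the first hidden layer (layer 2)
+        # hidden_size[1] : number of neurons in the second hidden layer (layer 3)
+    # output_size : number of neurons in the output layer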
+    def __init__(self, input_size, hidden_size, output_size):
+        weight_init_std = 0.01
+        self.params = {}
+        self.params['w1'] = weight_init_std * np.random.randn(input_size, hidden_size[0])
+        self.params['b1'] = np.zeros(hidden_size[0])
+        self.params['w2'] = weight_init_std * np.random.randn(hidden_size[0], hidden_size[1])
+        self.params['b2'] = np.zeros(hidden_size[1])
+        self.params['w3'] = weight_init_std * np.random.randn(hidden_size[1], output_size)
+        self.params['b3'] = np.zeros(output_size)
+        # create the layers
+        self.layers = collections.OrderedDict()
+        self.layers['Affine1'] = Affine(self.params['w1'], self.params['b1'])
+        self.layers['Relu1'] = Relu()
+        self.layers['Affine2'] = Affine(self.params['w2'], self.params['b2'])
+        self.layers['Relu2'] = Relu()
+        self.layers['Affine3'] = Affine(self.params['w3'], self.params['b3'])
+        self.lastLayer = SoftmaxWithLoss()
+
+    # def predict(self, x):
+    #     w1, w2, w3 = self.params['w1'], self.params['w2'], self.params['w3']
+    #     b1, b2, b3 = self.params['b1'], self.params['b2'], self.params['b3']
+    #     # layer 1
+    #     a1 = np.dot(x, w1) + b1
+    #     z1 = sigmoid(a1)
+    #     # layer 2
+    #     a2 = np.dot(z1, w2) + b2
+    #     z2 = sigmoid(a2)
+    #     # layer 3
+    #     a3 = np.dot(z2, w3) + b3
+    #     y = softmax(a3)
+    #     return y
+    def predict(self, x):
+        for layer in self.layers.values():
+            x = layer.forward(x)
+        return x
+
+    # def loss(self, x, t):
+    #     y = self.predict(x)
+    #     return cross_entropy_error(y, t)
+    def loss(self, x, t):
+        y = self.predict(x)
+        return self.lastLayer.forward(y, t)
+
+    # def accuracy(self, x, t):
+    #     y = self.predict(x)
+    #     y = np.argmax(y, axis=1)
+    #     t = np.argmax(t, axis=1)
+    #     return np.sum(y == t) / float(x.shape[0])
+    def accuracy(self, x, t):
+        y = self.predict(x)
+        y = np.argmax(y, axis=1)
+        if t.ndim != 1 : 
+            t = np.argmax(t, axis=1)
+        accuracy = np.sum(y == t) / float(x.shape[0])
+        return accuracy
+
+    # def numerical_gradient(self, x, t):
+    #     loss_W = lambda W: self.loss(x, t)
+    #     grads = {}
+    #     grads['w1'] = self.numerical_gradient(loss_W, self.params['w1'])
+    #     grads['b1'] = self.numerical_gradient(loss_W, self.params['b1'])
+    #     grads['w2'] = self.numerical_gradient(loss_W, self.params['w2'])
+    #     grads['b2'] = self.numerical_gradient(loss_W, self.params['b2'])
+    #     grads['w3'] = self.numerical_gradient(loss_W, self.params['w3'])
+    #     grads['b3'] = self.numerical_gradient(loss_W, self.params['b3'])
+    #     return grads
+
+    def gradient(self, x, t):
+        # forward
+        self.loss(x, t)
+        # backward
+        dout = 1
+        dout = self.lastLayer.backward(dout)
+        layers = list(self.layers.values())
+        layers.reverse()
+        for layer in layers:
+            dout = layer.backward(dout)
+        # store the results
+        grads = {}
+        grads['w1'], grads['b1'] = self.layers['Affine1'].dW, self.layers['Affine1'].db
+        grads['w2'], grads['b2'] = self.layers['Affine2'].dW, self.layers['Affine2'].db
+        grads['w3'], grads['b3'] = self.layers['Affine3'].dW, self.layers['Affine3'].db
+        return grads
+
+if __name__ == '__main__':
+    net = Network(26, [50, 100], 26)
+    # x = np.array([1.0, 0.5])
+    # y = network.predict(x)
+    # print(y)
+    print('w1 shape : ' + str(net.params['w1'].shape))
+    print('L b1 shape : ' + str(net.params['b1'].shape))
+    print('w2 shape : ' + str(net.params['w2'].shape))
+    print('L b2 shape : ' + str(net.params['b2'].shape))
+    print('w3 shape : ' + str(net.params['w3'].shape))
+    print('L b3 shape : ' + str(net.params['b3'].shape))
+
+    # x = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]])
+    # print(x)
+    # y = net.predict(x)
+    # print(y)
+
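+# 2. input data, get output
+
+# 3. compare the output with the answer (the opponent's tile), get loss
+
+# 4. compute the gradient and update
+
+# 5. automate steps 1~4 so that the programs learn by playing each other
+
+# 6. profit!!! (will I really get this far?)
+
+# 7. if it works, a GUI (or maybe run a web server)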
diff --git a/images/pic_1.png b/images/pic_1.png