[DL] soft max, 다중분류모델

티스토리 뷰

Artificial Intelligence/Deep Learning

[DL] soft max, 다중분류모델

inee0727 2022. 7. 4. 20:58

soft max(다중분류)

소프트 맥스는 세개 이상으로 분류하는 다중 클래스 분류에서 사용되는 활성화 함수이다.
소프트 맥스 함수는 분류될 클래스가 n개라고 할때 n 차원의 벡터를 입력받아 각 클래스에 속할 확률을 추정한다. 소프트 맥스는 확률의 총합이 1이므로 어떤 분류에 속할 확률이 가장 높을지를 쉽게 인지 할 수 있다.
분류라는 것은 로지스틱 회귀 모델에 의해 파생되는 것이므로 기계학습에 대해 자세히 알기 위해서는 로지스틱 회귀 모델에 대해 자세히 알 필요가 있다.

HOW TO 다중분류모델?

1. one hot encoding 처리를 한다. (by 케라스 / 사이킷런 / 판다스 겟더미)
from tensorflow.keras.utils import to_categorical
y = to_categorical(y)

원-핫 인코딩(One-Hot Encoding)은 여러 기법 중 단어를 표현하는 가장 기본적인 표현 방법

문자를 기계가 이해할 수 있는 숫자로 바꾼 결과 또는 그 과정을 임베딩(Embedding)이라고 한다.
가장 간단한 형태의 임베딩은 문장에 어떤 단어가 많이 쓰였는지를 파악하여 글쓴이의 의도를 알 수 있다.

원-핫 인코딩은 단어 집합의 크기를 벡터의 차원으로 하고, 표현하고 싶은 단어의 인덱스에 1의 값을 부여하고,
다른 인덱스에는 0을 부여하는 벡터 표현 방식이다. 이렇게 표현된 벡터를 원-핫 벡터(One-Hot Vector)라고 한다.
텍스트를 숫자로 표현하는 대표적인 방법으로 하나의 1과 수많은 0으로 표현하는 방법이다.

2. 데이터 분류 시 셔플 주의!! shuffle=True

3. y 분류 값에 대한 숫자만큼(y 값의 종류만큼)을 노드로 빼주며, 마지막 레이러의 활성화 함수를 softmax로 설정한다.

print(np.unique(y)) >> y 라벨 값 : 3
예) model.add(Dense(3, activation='softmax'))

4. loss로 categorical_crossentropy 설정

- 이진분류모델에서는 loss로 binary_crossentropy
- 다중분류모델에서는 loss로 categorical_crossentropy

5. accuracy_score 구할 때 : argmax 사용

라이브러리 호출

import numpy as np
from sklearn.datasets import load_iris #1. 데이터
from sklearn.model_selection import train_test_split #1. 데이터

from tensorflow.python.keras.models import Sequential #2. 모델구성
from tensorflow.python.keras.layers import Dense #2. 모델구성

1. 데이터_불러오기

datasets = load_iris()

x = datasets['data']
y = datasets.target

print(datasets.DESCR)
print(datasets.feature_names)

150행 : Number of Instances: 150 (50 in each of three classes : Iris-Setosa / Iris-Versicolour / Iris-Virginica)
4열 : sepal length / sepal width / petal length / petal width

1. 데이터_one hot encoding 처리

print('x,y shape :',x.shape, y.shape) #(150, 4) #(150, )
print("y의 라벨값(y의 고유값) :", np.unique(y)) #y의 라벨값(y의 고유값) [0 1 2]

from tensorflow.keras.utils import to_categorical 
y = to_categorical(y)

print('one hot encoding 처리 후 y shape :',y.shape) #(150,3)

x,y shape : (150, 4) (150,)

y의 라벨값(y의 고유값) : [0 1 2]

one hot encoding 처리 후 y shape : (150, 3)

1. 데이터_train과 test로 데이터 나누기

x_train, x_test, y_train, y_test = train_test_split( x, y, train_size = 0.8, shuffle=True, random_state=68 )

2. 모델구성

model = Sequential()
model.add(Dense(5,input_dim = 4))
model.add(Dense(10, activation='relu'))
model.add(Dense(25, activation='relu'))
model.add(Dense(20, activation='relu'))
model.add(Dense(3, activation='softmax'))

- sigmoid가 극 값으로 가면 미분값이 매우 작아지는 현상이 일어나 relu 등장 (기울기 소실 = gradient vanishing)
- relu는 0에서 1사이로 출력값이 제한되지도 않고, 0 이상일 때 일정한 기울기를 가지고 있어 gradient vanishing 문제 해결

3. 컴파일, 훈련

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

from tensorflow.python.keras.callbacks import EarlyStopping
earlyStopping = EarlyStopping(monitor='val_loss', patience=300, mode='auto', verbose=1, 
                              restore_best_weights=True)        


hist = model.fit(x_train, y_train, epochs=100, batch_size=50,validation_split=0.2, callbacks=[earlyStopping],verbose=1)

이진분류모델에서는 loss로 binary_crossentropy
다중분류모델에서는 loss로 categorical_crossentropy

4-1. 평가

#[loss, acc 출력방법 1]
loss, acc = model.evaluate(x_test, y_test)
print('loss : ' , loss)
print('accuracy : ', acc) 

#[loss, acc 출력방법 2]
results = model.evaluate(x_test, y_test)
print('loss : ' , results[0])
print('accuracy : ', results[1])

1/1 [==============================] - 0s 23ms/step - loss: 0.2238 - accuracy: 0.9333

loss : 0.22375254333019257

accuracy : 0.9333333373069763

4-2. 예측 (argmax 사용)

from sklearn.metrics import r2_score, accuracy_score  

y_test = np.argmax(y_test, axis= 1)
print(y_test)

y_predict = model.predict(x_test)
y_predict = np.argmax(y_predict, axis= 1)
print(y_predict)

acc= accuracy_score(y_test, y_predict)
print('acc스코어 : ', acc)

[2 0 0 1 1 1 1 0 1 0 0 0 2 2 1 1 2 2 2 2 2 1 0 2 1 0 2 0 2 2]

[2 0 0 1 1 1 1 0 1 0 0 0 2 2 1 1 2 1 2 2 2 1 0 2 1 0 2 0 2 1]

acc스코어 : 0.9333333333333333

'Artificial Intelligence > Deep Learning' 카테고리의 다른 글

[DL] summary()로 모델구조 확인 (0)	2022.07.06
[DL] 회귀/분류모델 표 정리 (0)	2022.07.05
[DL] validation/metrics/val_loss/early stopping/history (0)	2022.07.02
[DL] activation 활성화함수 (0)	2022.07.02
[DL] DNN(Deep Neural Net) 전체적인 Flow (0)	2022.06.29

공지사항

최근에 올라온 글

최근에 달린 댓글

Total

Today

Yesterday

링크

TAG more

« 2025/05 »
일	월	화	수	목	금	토
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

글 보관함

minimalism

티스토리 뷰