
Using TensorFlow 2.0's ImageDataGenerator, we will load image data from a local folder and build a generator that applies image augmentation and feeds the images to a model.

Even with a small amount of image data, ImageDataGenerator alone can generate a wide variety of augmented data.

After applying image augmentation, we will visualize how a Convolutional Neural Network extracts features from the photos, then build a model and train it with the ImageDataGenerator.



import urllib.request
import zipfile
import numpy as np
import os

from IPython.display import Image

import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
%%javascript
// Raise the notebook's auto-scroll threshold so long outputs stay expanded
IPython.OutputArea.auto_scroll_threshold = 50;

STEP 1. Load Dataset & Define Folder

This is a binary classification problem distinguishing cats from dogs.

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

List the contents of the downloaded dataset folder.

os.listdir(PATH)
['vectorize.py', 'train', 'validation']

The dataset is provided split into train and validation folders.

We define the path to each folder; the ImageDataGenerators for them come in the next step.

train_path = os.path.join(PATH, 'train')
validation_path = os.path.join(PATH, 'validation')

STEP 2. ImageDataGenerator

Image RGB values are represented in the 0-255 range; we rescale them to the 0-1 range.

original_datagen = ImageDataGenerator(rescale=1./255)
  • rescale: rescales the pixel values of the image
  • rotation_range: randomly rotates the image (in degrees)
  • width_shift_range: shifts the image horizontally
  • height_shift_range: shifts the image vertically
  • shear_range: applies a shear transformation to the image
  • zoom_range: zooms in on the image
  • horizontal_flip: flips the image horizontally
  • fill_mode: how to fill the empty pixels created when the image is shifted or sheared
These augmentations are applied at random as images are loaded; a quick way to preview their combined effect is shown after the code cell below.
training_datagen = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest')
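
To see what these parameters do, one option is to repeatedly call the generator's random_transform on a single image and plot the results. A minimal sketch, assuming the first image in the train/cats folder (the exact file will differ per machine):

import os
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Pick one sample image from the downloaded dataset (hypothetical choice: first cat image)
sample_dir = os.path.join(train_path, 'cats')
sample_file = os.path.join(sample_dir, os.listdir(sample_dir)[0])
img = img_to_array(load_img(sample_file, target_size=(150, 150)))  # (150, 150, 3), values 0-255

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for ax in axes:
    # random_transform applies one random combination of the configured augmentations
    augmented = training_datagen.random_transform(img) / 255.0  # divide only for display
    ax.imshow(augmented)
    ax.axis('off')
plt.show()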

STEP 3. Make Generator

We create the generators with flow_from_directory, which reads images straight from the folder structure and infers the labels from the subfolder names.

original_generator = original_datagen.flow_from_directory(train_path, 
                                                          batch_size=128, 
                                                          target_size=(150, 150), 
                                                          class_mode='binary'
                                                         )
Found 2000 images belonging to 2 classes.
training_generator = training_datagen.flow_from_directory(train_path, 
                                                          batch_size=128, 
                                                          shuffle=True,
                                                          target_size=(150, 150), 
                                                          class_mode='binary'
                                                         )
Found 2000 images belonging to 2 classes.
validation_generator = training_datagen.flow_from_directory(validation_path, 
                                                            batch_size=128, 
                                                            shuffle=True,
                                                            target_size=(150, 150), 
                                                            class_mode='binary'
                                                           )
Found 1000 images belonging to 2 classes.
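
Before visualizing, it helps to confirm how flow_from_directory mapped the folder names to integer labels, since the class_map used below assumes 0 is Cats and 1 is Dogs. A quick check on the generator defined above:

# Folders are assigned indices in alphabetical order, so 'cats' -> 0 and 'dogs' -> 1 is expected.
print(training_generator.class_indices)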

STEP 4. Visualization

import matplotlib.pyplot as plt

%matplotlib inline
class_map = {
    0: 'Cats',
    1: 'Dogs', 
}
print('Original images')

for x, y in original_generator:
    print(x.shape, y.shape)
    print(y[0])
    
    fig, axes = plt.subplots(2, 5)
    fig.set_size_inches(15, 6)
    for i in range(10):
        axes[i//5, i%5].imshow(x[i])
        axes[i//5, i%5].set_title(class_map[int(y[i])], fontsize=15)
        axes[i//5, i%5].axis('off')
    plt.show()
    break
    
print('Augmented images')
    
for x, y in training_generator:
    print(x.shape, y.shape)
    print(y[0])
    
    fig, axes = plt.subplots(2, 5)
    fig.set_size_inches(15, 6)
    for i in range(10):
        axes[i//5, i%5].imshow(x[i])
        axes[i//5, i%5].set_title(class_map[int(y[i])], fontsize=15)
        axes[i//5, i%5].axis('off')
    
    plt.show()
    break
Original images
(128, 150, 150, 3) (128,)
1.0
Augmented images
(128, 150, 150, 3) (128,)
0.0

Convolution Neural Network (CNN)

Through the convolution - activation - pooling sequence, the network extracts the key features from each part of the image.

A convolution layer takes a single image and outputs multiple feature maps, one per filter.

A 3 x 3 filter is the most commonly used size.

Also, an image passed through a 3 x 3 filter with no padding shrinks by 2 pixels in each spatial dimension.
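
As a quick check of that size reduction, a 3 x 3 convolution with 'valid' padding (the Keras default) removes one pixel from each border, so a 150 x 150 input comes out as 148 x 148. A small sketch with a random dummy input, separate from the model built later:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D

dummy = tf.random.normal((1, 150, 150, 3))   # one fake RGB image
out = Conv2D(64, (3, 3))(dummy)              # padding='valid' by default
print(out.shape)                             # (1, 148, 148, 64)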

Image('https://devblogs.nvidia.com/wp-content/uploads/2015/11/fig1.png')
for x, y in original_generator:
    pic = x[:5]
    break
pic.shape
(5, 150, 150, 3)
plt.imshow(pic[0])
<matplotlib.image.AxesImage at 0x7f4b8fadd828>

After passing through Conv2D

conv2d = Conv2D(64, (3, 3), input_shape=(150, 150, 3))
conv2d_activation = Conv2D(64, (3, 3), activation='relu', input_shape=(150, 150, 3))
fig, axes = plt.subplots(8, 8)
fig.set_size_inches(16, 16)
for i in range(64):
    axes[i//8, i%8].imshow(conv2d(pic)[0,:,:,i], cmap='gray')
    axes[i//8, i%8].axis('off')
fig, axes = plt.subplots(8, 8)
fig.set_size_inches(16, 16)
for i in range(64):
    axes[i//8, i%8].imshow(conv2d_activation(pic)[0,:,:,i], cmap='gray')
    axes[i//8, i%8].axis('off')
plt.imshow(conv2d(pic)[0,:,:,34], cmap='gray')
<matplotlib.image.AxesImage at 0x7f4b74351e10>
plt.imshow(conv2d_activation(pic)[0,:,:,34], cmap='gray')
<matplotlib.image.AxesImage at 0x7f4b742190f0>

After passing through MaxPooling2D

fig, axes = plt.subplots(8, 8)
fig.set_size_inches(16, 16)
for i in range(64):
    axes[i//8, i%8].imshow(MaxPooling2D(2, 2)(conv2d(pic))[0, :, :, i], cmap='gray')
    axes[i//8, i%8].axis('off')
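
The pooling layer halves each spatial dimension. A quick shape check, reusing the conv2d layer and the pic batch defined earlier:

# Conv2D with valid padding then 2x2 max pooling: 150 -> 148 -> 74
feature_maps = conv2d(pic)                   # (5, 148, 148, 64)
pooled = MaxPooling2D(2, 2)(feature_maps)    # (5, 74, 74, 64)
print(feature_maps.shape, pooled.shape)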

STEP 5. Build Model

model = Sequential([
    Conv2D(16, (3, 3), padding='same', activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D(2, 2), 
    Conv2D(32, (3, 3), padding='same', activation='relu'),
    MaxPooling2D(2, 2), 
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    MaxPooling2D(2, 2), 
    Flatten(), 
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_2 (Conv2D)            (None, 150, 150, 16)      448       
_________________________________________________________________
max_pooling2d_64 (MaxPooling (None, 75, 75, 16)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 75, 75, 32)        4640      
_________________________________________________________________
max_pooling2d_65 (MaxPooling (None, 37, 37, 32)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 37, 37, 64)        18496     
_________________________________________________________________
max_pooling2d_66 (MaxPooling (None, 18, 18, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 20736)             0         
_________________________________________________________________
dense (Dense)                (None, 512)               10617344  
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513       
=================================================================
Total params: 10,641,441
Trainable params: 10,641,441
Non-trainable params: 0
_________________________________________________________________
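
The bulk of the parameters sits in the first Dense layer: the Flatten output has 18 * 18 * 64 = 20,736 values, and each of the 512 units has one weight per input plus a bias. A quick arithmetic check against the summary above:

flatten_size = 18 * 18 * 64              # 20,736 values out of Flatten
dense_params = flatten_size * 512 + 512  # weights plus biases
print(dense_params)                      # 10617344, matching the summary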

STEP 6. optimizer, loss

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
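
Since the model ends in a single sigmoid unit, binary_crossentropy is the matching loss: for a label y and predicted probability p it computes -(y * log(p) + (1 - y) * log(1 - p)). A tiny numeric sketch with hypothetical values:

import tensorflow as tf

# Label 1 with predicted probability 0.8 -> loss is about -log(0.8) = 0.223
bce = tf.keras.losses.BinaryCrossentropy()
print(bce([1.0], [0.8]).numpy())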

STEP 7. train

epochs=25
history = model.fit(training_generator, 
                    validation_data=validation_generator,
                    epochs=epochs)
Train for 16 steps, validate for 8 steps
Epoch 1/25
16/16 [==============================] - 14s 878ms/step - loss: 0.8127 - acc: 0.5020 - val_loss: 0.6907 - val_acc: 0.4990
Epoch 2/25
16/16 [==============================] - 14s 851ms/step - loss: 0.6878 - acc: 0.5315 - val_loss: 0.6773 - val_acc: 0.6490
Epoch 3/25
16/16 [==============================] - 14s 848ms/step - loss: 0.6688 - acc: 0.5855 - val_loss: 0.6407 - val_acc: 0.6620
Epoch 4/25
16/16 [==============================] - 14s 846ms/step - loss: 0.6417 - acc: 0.6290 - val_loss: 0.6237 - val_acc: 0.6410
Epoch 5/25
16/16 [==============================] - 14s 845ms/step - loss: 0.6258 - acc: 0.6350 - val_loss: 0.6527 - val_acc: 0.6170
Epoch 6/25
16/16 [==============================] - 13s 841ms/step - loss: 0.6159 - acc: 0.6640 - val_loss: 0.6147 - val_acc: 0.6560
Epoch 7/25
16/16 [==============================] - 13s 841ms/step - loss: 0.5955 - acc: 0.6785 - val_loss: 0.6009 - val_acc: 0.6820
Epoch 8/25
16/16 [==============================] - 14s 846ms/step - loss: 0.5797 - acc: 0.6990 - val_loss: 0.5807 - val_acc: 0.6870
Epoch 9/25
16/16 [==============================] - 13s 843ms/step - loss: 0.5707 - acc: 0.6975 - val_loss: 0.5891 - val_acc: 0.6770
Epoch 10/25
16/16 [==============================] - 13s 842ms/step - loss: 0.5648 - acc: 0.7160 - val_loss: 0.6064 - val_acc: 0.6550
Epoch 11/25
16/16 [==============================] - 13s 840ms/step - loss: 0.5712 - acc: 0.7045 - val_loss: 0.5826 - val_acc: 0.6950
Epoch 12/25
16/16 [==============================] - 14s 847ms/step - loss: 0.5332 - acc: 0.7375 - val_loss: 0.5847 - val_acc: 0.6790
Epoch 13/25
16/16 [==============================] - 14s 846ms/step - loss: 0.5168 - acc: 0.7410 - val_loss: 0.5584 - val_acc: 0.7070
Epoch 14/25
16/16 [==============================] - 13s 843ms/step - loss: 0.5211 - acc: 0.7450 - val_loss: 0.5489 - val_acc: 0.7050
Epoch 15/25
16/16 [==============================] - 14s 846ms/step - loss: 0.5078 - acc: 0.7455 - val_loss: 0.5565 - val_acc: 0.7280
Epoch 16/25
16/16 [==============================] - 13s 841ms/step - loss: 0.5068 - acc: 0.7575 - val_loss: 0.5402 - val_acc: 0.7270
Epoch 17/25
16/16 [==============================] - 14s 846ms/step - loss: 0.4909 - acc: 0.7605 - val_loss: 0.5196 - val_acc: 0.7410
Epoch 18/25
16/16 [==============================] - 13s 841ms/step - loss: 0.4974 - acc: 0.7625 - val_loss: 0.5265 - val_acc: 0.7390
Epoch 19/25
16/16 [==============================] - 13s 843ms/step - loss: 0.4712 - acc: 0.7825 - val_loss: 0.5444 - val_acc: 0.7130
Epoch 20/25
16/16 [==============================] - 13s 839ms/step - loss: 0.4724 - acc: 0.7690 - val_loss: 0.5204 - val_acc: 0.7420
Epoch 21/25
16/16 [==============================] - 14s 849ms/step - loss: 0.4628 - acc: 0.7730 - val_loss: 0.5220 - val_acc: 0.7370
Epoch 22/25
16/16 [==============================] - 13s 842ms/step - loss: 0.4556 - acc: 0.7850 - val_loss: 0.5058 - val_acc: 0.7570
Epoch 23/25
16/16 [==============================] - 13s 840ms/step - loss: 0.4505 - acc: 0.7885 - val_loss: 0.5409 - val_acc: 0.7270
Epoch 24/25
16/16 [==============================] - 14s 844ms/step - loss: 0.4599 - acc: 0.7840 - val_loss: 0.5111 - val_acc: 0.7370
Epoch 25/25
16/16 [==============================] - 13s 843ms/step - loss: 0.4474 - acc: 0.7790 - val_loss: 0.4963 - val_acc: 0.7490
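
The 16 training steps and 8 validation steps per epoch in the log follow directly from the dataset and batch sizes: 2,000 training images and 1,000 validation images at a batch size of 128.

import math

print(math.ceil(2000 / 128))   # 16 training steps per epoch
print(math.ceil(1000 / 128))   # 8 validation steps per epoch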

STEP 8. Training Results Visualization

plt.figure(figsize=(9, 6))
plt.plot(np.arange(1, epochs+1), history.history['loss'])
plt.plot(np.arange(1, epochs+1), history.history['val_loss'])
plt.title('Loss / Val Loss', fontsize=20)
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend(['loss', 'val_loss'], fontsize=15)
plt.show()
plt.figure(figsize=(9, 6))
plt.plot(np.arange(1, epochs+1), history.history['acc'])
plt.plot(np.arange(1, epochs+1), history.history['val_acc'])
plt.title('Acc / Val Acc', fontsize=20)
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend(['acc', 'val_acc'], fontsize=15)
plt.show()
