kwan's note
Kaggle - Petals to the Metal (95% correct w. EfficientNet)
Last time, I ran the Petals to the Metal classification task with Inception v3, the third version of GoogLeNet.
This time, to squeeze out more performance, I added data augmentation and swapped Inception v3 for EfficientNet, one of the strongest recent models.
www.kaggle.com/c/tpu-getting-started
reminder-by-kwan.tistory.com/119
Inception v3 is an efficient model in its own right, but it is a development of the original GoogLeNet idea and, unlike Inception-ResNet (introduced alongside Inception v4), it does not incorporate ResNet's residual idea.
With ResNet's residual connections, training can skip over parts that hinder learning (unnecessary nodes, layers that cause overfitting, and so on) while still propagating gradients, which makes learning very efficient.
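As a quick illustration (my own sketch, not code from the competition notebook), a residual connection in Keras can be written like this:

import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    # two conv layers whose output is added back onto the input (the skip connection);
    # assumes x already has `filters` channels so the shapes match for the Add
    shortcut = x
    y = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.Add()([shortcut, y])  # gradients can flow straight through the shortcut
    return layers.Activation('relu')(y)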
Here, however, I wanted to use EfficientNet instead.
EfficientNet searches for the optimal combination of width scaling (increasing the number of filters), depth scaling (increasing the number of layers), and resolution scaling (increasing the input image resolution), and scales all three together.
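For reference, the EfficientNet paper (Tan & Le, 2019) fixes base coefficients of roughly alpha=1.2 for depth, beta=1.1 for width, and gamma=1.15 for resolution under the constraint alpha * beta^2 * gamma^2 ≈ 2, then raises all three to a single compound exponent phi. A small sketch of the idea:

# Compound scaling from the EfficientNet paper: depth, width and resolution
# are scaled together by one exponent phi instead of being tuned separately.
alpha, beta, gamma = 1.2, 1.1, 1.15  # base coefficients found by grid search in the paper

def compound_scale(phi):
    depth_mult = alpha ** phi        # more layers
    width_mult = beta ** phi         # more filters per layer
    resolution_mult = gamma ** phi   # larger input images
    return depth_mult, width_mult, resolution_mult

print(compound_scale(1))  # scaling of the B1 variant relative to the B0 baseline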
With this approach I ended up in the top 14%.
import tensorflow as tf

# IMAGE_SIZE, HEIGHT, WIDTH, CHANNELS, BATCH_SIZE, STEPS_PER_EPOCH, EPOCHS and
# GCS_DS_PATH are defined in earlier notebook cells that are not shown in this post.
AUTO = tf.data.AUTOTUNE

def decode_image(image_data):
    image = tf.image.decode_jpeg(image_data, channels=3)
    image = tf.cast(image, tf.float32) / 255.0  # convert image to floats in [0, 1] range
    image = tf.reshape(image, [*IMAGE_SIZE, 3])  # explicit size needed for TPU
    return image

def read_labeled_tfrecord(example):
    LABELED_TFREC_FORMAT = {
        "image": tf.io.FixedLenFeature([], tf.string),  # tf.string means bytestring
        "class": tf.io.FixedLenFeature([], tf.int64),   # shape [] means single element
    }
    example = tf.io.parse_single_example(example, LABELED_TFREC_FORMAT)
    image = decode_image(example['image'])
    label = tf.cast(example['class'], tf.int32)
    return image, label  # returns a dataset of (image, label) pairs

def read_unlabeled_tfrecord(example):
    UNLABELED_TFREC_FORMAT = {
        "image": tf.io.FixedLenFeature([], tf.string),  # tf.string means bytestring
        "id": tf.io.FixedLenFeature([], tf.string),     # shape [] means single element
        # class is missing; this competition's challenge is to predict flower classes for the test dataset
    }
    example = tf.io.parse_single_example(example, UNLABELED_TFREC_FORMAT)
    image = decode_image(example['image'])
    idnum = example['id']
    return image, idnum  # returns a dataset of (image, id) pairs

def load_dataset(filenames, labeled=True, ordered=False):
    # Read from TFRecords. For optimal performance, read from multiple files at once,
    # disregarding data order. Order does not matter since we will be shuffling the data anyway.
    ignore_order = tf.data.Options()
    if not ordered:
        ignore_order.experimental_deterministic = False  # disable order, increase speed
    dataset = tf.data.TFRecordDataset(filenames)  # automatically interleaves reads from multiple files
    dataset = dataset.with_options(ignore_order)  # uses data as soon as it streams in, rather than in its original order
    dataset = dataset.map(read_labeled_tfrecord if labeled else read_unlabeled_tfrecord)
    # returns a dataset of (image, label) pairs if labeled=True or (image, id) pairs if labeled=False
    return dataset

def get_validation_dataset():
    dataset = load_dataset(tf.io.gfile.glob(GCS_DS_PATH + '/tfrecords-jpeg-512x512/val/*.tfrec'), labeled=True, ordered=False)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.cache()
    return dataset

def get_test_dataset(ordered=False):
    dataset = load_dataset(tf.io.gfile.glob(GCS_DS_PATH + '/tfrecords-jpeg-512x512/test/*.tfrec'), labeled=False, ordered=ordered)
    dataset = dataset.batch(BATCH_SIZE)
    return dataset

def data_augment(image, label):
    # random square crop between 80% and 100% of the original height, resized back below
    crop_size = tf.random.uniform([], int(HEIGHT * .8), HEIGHT, dtype=tf.int32)
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_saturation(image, lower=0.7, upper=1.5)
    image = tf.image.random_contrast(image, lower=0.9, upper=1.5)
    image = tf.image.random_brightness(image, max_delta=.2)
    # image = tf.image.adjust_gamma(image, gamma=.6)
    image = tf.image.random_crop(image, size=[crop_size, crop_size, CHANNELS])
    image = tf.image.resize(image, size=[HEIGHT, WIDTH])
    return image, label

def get_training_dataset():
    dataset = load_dataset(tf.io.gfile.glob(GCS_DS_PATH + '/tfrecords-jpeg-512x512/train/*.tfrec'), labeled=True)
    dataset = dataset.map(data_augment, num_parallel_calls=AUTO)
    dataset = dataset.repeat()  # the training dataset must repeat for several epochs
    dataset = dataset.shuffle(100000)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTO)  # prefetch next batch while training (autotune prefetch buffer size)
    return dataset

training_dataset = get_training_dataset()
validation_dataset = get_validation_dataset()
This builds the datasets.
For augmentation, I applied random flips and adjusted saturation, contrast, and brightness.
Last time, I set the shuffle buffer smaller than the dataset, so the data was not shuffled properly; this time I set it to a sufficiently large number.
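The early_stopping callback passed to model.fit below is defined in a notebook cell that is not shown in this post. A minimal sketch consistent with the min_delta of 0.003 mentioned near the end (the monitor and patience values are my assumptions):

# Hypothetical reconstruction: only min_delta=0.003 is confirmed by the text;
# monitor and patience are assumptions.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',  # stop when validation loss stops improving (assumed)
    min_delta=0.003,     # stated later in the post
    patience=5,          # assumed; consistent with stopping a few epochs after the best val_loss
    verbose=1,
)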
# `efficient` is the EfficientNet Keras port, presumably imported in an earlier
# cell (e.g. `import efficientnet.tfkeras as efficient`); `layers` is tf.keras.layers.
with strategy.scope():
    pt_model = efficient.EfficientNetB7(
        input_shape=(512, 512, 3),
        weights='imagenet',
        include_top=False
    )
    # the Inception v3 base from the previous post, kept commented out for comparison:
    """pt_model = tf.keras.applications.InceptionV3(
        include_top=False, weights='imagenet', input_tensor=None,
        input_shape=[*IMAGE_SIZE, 3]
    )"""
    model = tf.keras.Sequential([
        pt_model,
        layers.GlobalAveragePooling2D(),
        layers.Dense(104, activation='softmax'),  # 104 flower classes
    ])
    model.compile(
        optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['sparse_categorical_accuracy']
    )
model.summary()
historical = model.fit(training_dataset,
                       steps_per_epoch=STEPS_PER_EPOCH,
                       epochs=EPOCHS,
                       validation_data=validation_dataset,
                       callbacks=[early_stopping])
Training the pretrained model as above, the model did not learn cleanly toward the end and the validation metrics oscillated.
Since the oscillation was not small, I thought it would be better to decay the learning rate, but I was using the Adam optimizer and the goal at this stage was only to train the final layer, so I left it as it was.
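For reference, a hedged sketch of the decay idea that was not used in this run: reduce the learning rate whenever the validation loss plateaus.

# Hypothetical alternative, not used in this run; all values here are assumptions.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss',  # watch the oscillating validation loss
    factor=0.5,          # halve the learning rate on each plateau
    patience=2,          # epochs without improvement before decaying
    min_lr=1e-6,
)
# model.fit(..., callbacks=[early_stopping, reduce_lr])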
Downloading data from https://github.com/Callidior/keras-applications/releases/download/efficientnet/efficientnet-b7_weights_tf_dim_ordering_tf_kernels_autoaugment_notop.h5
258441216/258434480 [==============================] - 3s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
efficientnet-b7 (Functional) (None, 16, 16, 2560) 64097680
_________________________________________________________________
global_average_pooling2d (Gl (None, 2560) 0
_________________________________________________________________
dense (Dense) (None, 104) 266344
=================================================================
Total params: 64,364,024
Trainable params: 64,053,304
Non-trainable params: 310,720
_________________________________________________________________
Epoch 1/30
99/99 [==============================] - 758s 1s/step - loss: 2.5024 - sparse_categorical_accuracy: 0.4525 - val_loss: 1.4078 - val_sparse_categorical_accuracy: 0.7047
Epoch 2/30
99/99 [==============================] - 93s 939ms/step - loss: 0.6331 - sparse_categorical_accuracy: 0.8323 - val_loss: 0.6531 - val_sparse_categorical_accuracy: 0.8370
Epoch 3/30
99/99 [==============================] - 93s 938ms/step - loss: 0.4097 - sparse_categorical_accuracy: 0.8907 - val_loss: 0.5397 - val_sparse_categorical_accuracy: 0.8739
Epoch 4/30
99/99 [==============================] - 92s 926ms/step - loss: 0.3325 - sparse_categorical_accuracy: 0.9049 - val_loss: 0.4794 - val_sparse_categorical_accuracy: 0.8933
Epoch 5/30
99/99 [==============================] - 92s 925ms/step - loss: 0.2311 - sparse_categorical_accuracy: 0.9389 - val_loss: 0.4324 - val_sparse_categorical_accuracy: 0.9033
Epoch 6/30
99/99 [==============================] - 92s 929ms/step - loss: 0.2045 - sparse_categorical_accuracy: 0.9410 - val_loss: 0.4142 - val_sparse_categorical_accuracy: 0.9022
Epoch 7/30
99/99 [==============================] - 92s 929ms/step - loss: 0.1844 - sparse_categorical_accuracy: 0.9462 - val_loss: 0.4292 - val_sparse_categorical_accuracy: 0.9027
Epoch 8/30
99/99 [==============================] - 92s 925ms/step - loss: 0.1397 - sparse_categorical_accuracy: 0.9585 - val_loss: 0.4287 - val_sparse_categorical_accuracy: 0.9127
Epoch 9/30
99/99 [==============================] - 92s 927ms/step - loss: 0.1346 - sparse_categorical_accuracy: 0.9624 - val_loss: 0.4026 - val_sparse_categorical_accuracy: 0.9141
Epoch 10/30
99/99 [==============================] - 92s 929ms/step - loss: 0.1052 - sparse_categorical_accuracy: 0.9682 - val_loss: 0.4498 - val_sparse_categorical_accuracy: 0.9076
Epoch 11/30
99/99 [==============================] - 92s 927ms/step - loss: 0.1031 - sparse_categorical_accuracy: 0.9708 - val_loss: 0.3650 - val_sparse_categorical_accuracy: 0.9251
Epoch 12/30
99/99 [==============================] - 92s 929ms/step - loss: 0.0902 - sparse_categorical_accuracy: 0.9737 - val_loss: 0.4129 - val_sparse_categorical_accuracy: 0.9200
Epoch 13/30
99/99 [==============================] - 91s 924ms/step - loss: 0.0956 - sparse_categorical_accuracy: 0.9705 - val_loss: 0.4477 - val_sparse_categorical_accuracy: 0.9197
Epoch 14/30
99/99 [==============================] - 92s 927ms/step - loss: 0.0849 - sparse_categorical_accuracy: 0.9755 - val_loss: 0.6063 - val_sparse_categorical_accuracy: 0.8947
Epoch 15/30
99/99 [==============================] - 92s 930ms/step - loss: 0.0891 - sparse_categorical_accuracy: 0.9715 - val_loss: 0.4078 - val_sparse_categorical_accuracy: 0.9235
Epoch 16/30
99/99 [==============================] - 92s 928ms/step - loss: 0.0758 - sparse_categorical_accuracy: 0.9761 - val_loss: 0.5323 - val_sparse_categorical_accuracy: 0.9138
Epoch 00016: early stopping
Next, I unfroze the model to make it trainable and continued training; because of the oscillation above, I set the learning rate fairly small so that training would not diverge.
with strategy.scope():
    model.trainable = True
    model.compile(
        # `lr` and `decay` are the legacy argument names; newer Keras versions
        # use `learning_rate` and a schedule instead.
        optimizer=tf.keras.optimizers.SGD(lr=0.01, momentum=0.9, decay=0.001),
        loss='sparse_categorical_crossentropy',
        metrics=['sparse_categorical_accuracy']
    )
historical = model.fit(training_dataset,
                       epochs=EPOCHS,
                       steps_per_epoch=STEPS_PER_EPOCH,
                       validation_data=validation_dataset,
                       callbacks=[early_stopping])
Epoch 1/30
99/99 [==============================] - 686s 1s/step - loss: 0.0600 - sparse_categorical_accuracy: 0.9845 - val_loss: 0.3759 - val_sparse_categorical_accuracy: 0.9340
Epoch 2/30
99/99 [==============================] - 91s 922ms/step - loss: 0.0397 - sparse_categorical_accuracy: 0.9871 - val_loss: 0.3406 - val_sparse_categorical_accuracy: 0.9383
Epoch 3/30
99/99 [==============================] - 91s 922ms/step - loss: 0.0365 - sparse_categorical_accuracy: 0.9892 - val_loss: 0.3212 - val_sparse_categorical_accuracy: 0.9407
Epoch 4/30
99/99 [==============================] - 90s 913ms/step - loss: 0.0272 - sparse_categorical_accuracy: 0.9927 - val_loss: 0.3171 - val_sparse_categorical_accuracy: 0.9418
Epoch 5/30
99/99 [==============================] - 90s 913ms/step - loss: 0.0301 - sparse_categorical_accuracy: 0.9910 - val_loss: 0.3100 - val_sparse_categorical_accuracy: 0.9426
Epoch 6/30
99/99 [==============================] - 90s 911ms/step - loss: 0.0266 - sparse_categorical_accuracy: 0.9907 - val_loss: 0.3052 - val_sparse_categorical_accuracy: 0.9432
Epoch 7/30
99/99 [==============================] - 90s 914ms/step - loss: 0.0240 - sparse_categorical_accuracy: 0.9942 - val_loss: 0.3037 - val_sparse_categorical_accuracy: 0.9432
Epoch 8/30
99/99 [==============================] - 90s 907ms/step - loss: 0.0180 - sparse_categorical_accuracy: 0.9953 - val_loss: 0.3002 - val_sparse_categorical_accuracy: 0.9453
Epoch 9/30
99/99 [==============================] - 90s 909ms/step - loss: 0.0201 - sparse_categorical_accuracy: 0.9935 - val_loss: 0.2993 - val_sparse_categorical_accuracy: 0.9459
Epoch 10/30
99/99 [==============================] - 90s 907ms/step - loss: 0.0185 - sparse_categorical_accuracy: 0.9953 - val_loss: 0.2989 - val_sparse_categorical_accuracy: 0.9445
Epoch 11/30
99/99 [==============================] - 90s 911ms/step - loss: 0.0183 - sparse_categorical_accuracy: 0.9953 - val_loss: 0.2965 - val_sparse_categorical_accuracy: 0.9453
Epoch 12/30
99/99 [==============================] - 90s 908ms/step - loss: 0.0156 - sparse_categorical_accuracy: 0.9961 - val_loss: 0.2945 - val_sparse_categorical_accuracy: 0.9459
Epoch 13/30
99/99 [==============================] - 90s 913ms/step - loss: 0.0204 - sparse_categorical_accuracy: 0.9943 - val_loss: 0.2928 - val_sparse_categorical_accuracy: 0.9453
Epoch 14/30
99/99 [==============================] - 90s 909ms/step - loss: 0.0178 - sparse_categorical_accuracy: 0.9955 - val_loss: 0.2925 - val_sparse_categorical_accuracy: 0.9467
Epoch 15/30
99/99 [==============================] - 90s 910ms/step - loss: 0.0170 - sparse_categorical_accuracy: 0.9951 - val_loss: 0.2923 - val_sparse_categorical_accuracy: 0.9467
Epoch 16/30
99/99 [==============================] - 90s 910ms/step - loss: 0.0164 - sparse_categorical_accuracy: 0.9964 - val_loss: 0.2903 - val_sparse_categorical_accuracy: 0.9461
Epoch 17/30
99/99 [==============================] - 90s 907ms/step - loss: 0.0146 - sparse_categorical_accuracy: 0.9963 - val_loss: 0.2904 - val_sparse_categorical_accuracy: 0.9477
Epoch 18/30
99/99 [==============================] - 90s 907ms/step - loss: 0.0128 - sparse_categorical_accuracy: 0.9965 - val_loss: 0.2915 - val_sparse_categorical_accuracy: 0.9459
Epoch 00018: early stopping
This time I wondered whether training had instead stopped because the learning rate was too low.
However, given that the early-stopping min_delta of 0.003 was not unreasonably small, and that the sparse_categorical_accuracy had already climbed to 0.9965, I judged the result acceptable and did not train any further.
I submitted this model and scored an accuracy of 0.94358, placing in the top 19%.
www.kaggle.com/kwanyun/efficient-pre-trained?scriptVersionId=56427518
Finally, with the parameter tuning mentioned above (lowering the learning rate in the first training stage and raising it in the second), I reached an accuracy of 0.95165 and a top-14% ranking.
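The exact tuned learning rates are not recorded in this post; a hypothetical sketch of the two-stage change, with all values being illustrative assumptions:

# Stage 1 (train the new head): a smaller Adam learning rate than the 1e-3 default.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-4),  # assumed value
    loss='sparse_categorical_crossentropy',
    metrics=['sparse_categorical_accuracy'],
)
# Stage 2 (fine-tune the unfrozen model): a larger SGD learning rate than the earlier 0.01.
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.02, momentum=0.9),  # assumed value
    loss='sparse_categorical_crossentropy',
    metrics=['sparse_categorical_accuracy'],
)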