hye-log

[Boostcamp AI Tech] WEEK 04_DAY 16

Boostcourse/AI Tech 4th cohort


iihye_ 2022. 10. 13. 01:42

🚀 Individual Study


[1] What Is Computer Vision?

1. Course Overview

1) AI is human intelligence implemented on a computer

2) Information is acquired through the five senses (sight, sound, touch, taste, smell) as well as more complex perception (social, face, touch, speech)

3) When a human observes a scene with the eye -> an image forms on the retina behind the lens -> it is sent to the brain -> the brain interprets it

4) A computer captures an image through a camera -> interprets it with an algorithm -> converts it into a high-level representation

5) computer graphics : rendering an image from information about a scene

- computer vision : the inverse process of computer graphics

6) Computer vision research with machine learning

- input (data extraction) -> feature extraction (features hand-crafted by experts) -> classification -> output (result)

7) Computer vision research with deep learning

- input (data extraction) -> feature extraction + classification (both learned end-to-end) -> output (result)

 

2. Image Classification

1) Input (image) -> Classifier -> Output (a category or class label)

2) Solving classification with k Nearest Neighbors (k-NN)

- Using all the data in the world is impossible, and defining the similarity between images is also difficult
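The k-NN idea above can be sketched in a few lines. This is only a toy illustration: the 2-D feature vectors, the labels, and the choice of Euclidean distance are all assumptions, since the notes leave the similarity measure open (which is exactly the hard part for real images).

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    # Assumption: Euclidean distance stands in for "image similarity".
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two clusters in a 2-D feature space.
train = [
    ((0.0, 0.0), "cat"),
    ((0.1, 0.2), "cat"),
    ((1.0, 1.0), "dog"),
    ((0.9, 1.1), "dog"),
]
print(knn_predict(train, (0.2, 0.1), k=3))
```

Every query has to be compared against every stored sample, so both memory and prediction time grow with the dataset, which is one reason this approach does not scale.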

3) Convolutional Neural Networks(CNN)

- Compresses a vast amount of data into the system (the network's parameters)

- A fully connected (FC) network looks at every pixel to compute a single feature

- A locally connected network takes spatial structure into account and connects each unit only to a local region

- Used as a backbone, a CNN supports a variety of tasks
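To get a feel for the difference between full and local connectivity, here is a rough parameter count. The sizes (a 224×224×3 input, 64 output features, a 3×3 kernel) are assumptions picked for illustration, and the convolution case additionally assumes the same filter is shared across all positions.

```python
def fc_params(height, width, channels, out_features):
    # Every output unit is connected to every input value, plus one bias each.
    return height * width * channels * out_features + out_features


def conv_params(kernel, in_channels, out_channels):
    # One kernel x kernel x in_channels filter per output channel, plus one
    # bias each; the filter is reused at every spatial position.
    return kernel * kernel * in_channels * out_channels + out_channels


print("fully connected:", fc_params(224, 224, 3, 64))
print("3x3 convolution:", conv_params(3, 3, 64))
```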

 

3. CNN architectures for image classification

1) AlexNet

- Builds on the Conv - Pool - Conv - Pool - FC - FC structure of LeNet-5, scaled up (AlexNet itself has 5 conv layers and 3 FC layers)

- Deeper (7 hidden layers)

- More parameters (60 million parameters)

- Trained on more data (ImageNet); uses ReLU

- Because of GPU memory limits, the network is drawn as two streams split across two GPUs, with cross connections at some layers

- Uses Local Response Normalization (LRN) -> nowadays Batch Normalization is used instead

- Uses an 11×11 conv filter in the first layer

- With a K×K conv (stride 1) followed by a P×P pool layer, each value in the pooling layer depends on an input patch (receptive field) of size (P+K-1)×(P+K-1)
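The (P+K-1) expression can be checked with a small receptive-field calculator, reading it as the size of the input patch that one pooled value depends on. This is a minimal sketch; the function name and the (kernel, stride) layer encoding are my own.

```python
def receptive_field(layers):
    """Receptive field (one side, in input pixels) of a single output unit.

    layers is a list of (kernel_size, stride) pairs ordered from input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


K, P = 11, 2
# K x K conv with stride 1, followed by P x P pooling.
print(receptive_field([(K, 1), (P, P)]), "==", P + K - 1)
```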

2) VGGNet

- Deeper architecture (16, 19 layers)

- Simpler architecture (uses only 3×3 conv filters and 2×2 max pooling)

- Better performance (improves on AlexNet)

- Better generalization (applicable to other tasks even without fine-tuning)
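One commonly cited reason small 3×3 filters work well when stacked: several stride-1 3×3 layers cover the same receptive field as one large filter with fewer weights. The sketch below checks that arithmetic; the channel count of 64 is an assumption, and biases are ignored.

```python
def stacked_receptive_field(num_layers, kernel=3):
    # Each extra stride-1 layer widens the receptive field by (kernel - 1).
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    # Weights of one conv layer with `channels` inputs and outputs (no bias).
    return kernel * kernel * channels * channels


channels = 64  # assumed channel count, for illustration only
print("receptive field of three 3x3 layers:", stacked_receptive_field(3))
print("weights, three 3x3 layers:", 3 * conv_weights(3, channels))
print("weights, one 7x7 layer:   ", conv_weights(7, channels))
```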


[2] Data Augmentation

1. Data augmentation

1) A neural network learns by compressing the characteristics of its dataset

2) Training data covers only a tiny fraction of real-world data

3) There is a gap between training data and real data

- For example, if a model is trained mostly on bright images, it may be confused when it is given a dark image at test time

4) Augmented data narrows the gap between training data and real data and makes the dataset richer

5) Uses crop, shear, brightness, perspective, rotate, and so on
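A few of these operations can be sketched on a tiny grayscale image stored as nested lists. This is only an illustration under assumptions: the function names, the brightness range of ±40, the crop size, and the 50% rotation probability are all made up here, and a real pipeline would use an image library instead.

```python
import random


def adjust_brightness(img, delta):
    # Shift every pixel by delta and clamp to the valid 0..255 range.
    return [[min(255, max(0, p + delta)) for p in row] for row in img]


def rotate90(img):
    # Rotate 90 degrees clockwise: reverse the rows, then transpose.
    return [list(row) for row in zip(*img[::-1])]


def random_crop(img, size, rng):
    # Pick a random size x size window inside the image.
    top = rng.randint(0, len(img) - size)
    left = rng.randint(0, len(img[0]) - size)
    return [row[left:left + size] for row in img[top:top + size]]


def augment(img, rng):
    # Chain a random brightness shift, an optional rotation, and a crop.
    out = adjust_brightness(img, rng.randint(-40, 40))
    if rng.random() < 0.5:
        out = rotate90(out)
    return random_crop(out, size=2, rng=rng)


image = [[10, 50, 90], [130, 170, 210], [230, 240, 250]]
print(augment(image, random.Random(0)))
```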

 

2. Leveraging pre-trained information

1) A high-quality dataset requires correspondingly good labels

2) Transfer learning aims to reach high performance on a related task with little effort by reusing previously learned knowledge

3) Approach 1. Attach a new FC layer after the pre-trained conv layers (the conv weights stay frozen and only the new layer is trained)

4) Approach 2. Train the conv layers slowly with a low learning rate and the new FC layer quickly with a high learning rate
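The two approaches differ only in how fast each part of the network is allowed to move. Here is a minimal sketch of that idea with a hand-written SGD step over parameter groups; the scalar "weights", gradients, and learning rates are placeholders I made up, and the grouping merely mirrors the per-group learning-rate idea found in common deep learning optimizers.

```python
def sgd_step(param_groups):
    """One plain SGD update where each group uses its own learning rate."""
    for group in param_groups:
        for p in group["params"]:
            p["value"] -= group["lr"] * p["grad"]


# Placeholder scalars standing in for real weight tensors.
conv_w = {"value": 1.0, "grad": 0.5}  # pre-trained conv weights
fc_w = {"value": 0.0, "grad": 0.5}    # newly attached FC layer

# Approach 2: low learning rate for the backbone, high for the new head.
# Setting the backbone lr to 0.0 would correspond to Approach 1 (frozen conv).
groups = [
    {"params": [conv_w], "lr": 1e-4},
    {"params": [fc_w], "lr": 1e-2},
]
sgd_step(groups)
print(conv_w["value"], fc_w["value"])
```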

5) Knowledge distillation

- teacher-student learning : transfers the knowledge of a teacher (big model) to a student (small model)

- Knowledge distillation : the student is trained to predict outputs similar to the teacher model's

- T (Softmax with temperature) : spreads the softmax output into a smoother distribution

e.g.) (5, 10) -> (Normal Softmax) (0.0067, 0.9933)

                    -> (Softmax with temperature, values consistent with T = 100) (0.4875, 0.5125)
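The example numbers can be reproduced with a small softmax that divides the logits by a temperature. The temperature value T = 100 is inferred from the second row (it is not stated in the notes), so treat it as an assumption.

```python
import math


def softmax(logits, temperature=1.0):
    # Divide the logits by the temperature, then apply a numerically stable softmax.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


print([round(p, 4) for p in softmax([5, 10])])                   # normal softmax
print([round(p, 4) for p in softmax([5, 10], temperature=100)])  # softened
```

A larger temperature pushes the output toward a uniform distribution, so the relative scores of the non-winning classes remain visible to the student.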

6) Hard label vs. Soft label

- Hard label (one-hot vector) : exactly one of the N classes is the answer

- Soft label : every class gets a value between 0 and 1

7) Distillation Loss

- KL div(Soft label, Soft prediction)

- The difference between the teacher network's and the student network's outputs (the student learns what the teacher knows)

8) Student Loss

- CrossEntropy(Hard label, Soft prediction)

- The difference between the student network's prediction and the true label (the student learns the right answer)
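Both losses can be written out for a single sample as a rough sketch. The logits, the temperature T = 4, and the weight alpha = 0.5 are made-up values; combining the two terms as a weighted sum is a common convention and an assumption here, as is using the student's temperature-1 softmax for the student loss.

```python
import math


def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same classes.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(true_class, probs):
    # Cross-entropy against a one-hot (hard) label reduces to -log p[true class].
    return -math.log(probs[true_class])


teacher_logits = [2.0, 6.0, 1.0]  # hypothetical teacher outputs
student_logits = [1.0, 4.0, 2.0]  # hypothetical student outputs
true_class = 1
T = 4.0                           # assumed temperature

soft_label = softmax(teacher_logits, T)       # teacher's softened output
soft_prediction = softmax(student_logits, T)  # student's softened output
distillation_loss = kl_divergence(soft_label, soft_prediction)
student_loss = cross_entropy(true_class, softmax(student_logits))

alpha = 0.5  # assumed weighting between the two terms
total_loss = alpha * distillation_loss + (1 - alpha) * student_loss
print(distillation_loss, student_loss, total_loss)
```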

 

3. Leveraging unlabeled dataset for training

1) Semi-supervised learning : Unsupervised(No label) + Fully supervised(fully labeled)

2) Pseudo-labeling unlabeled data using pre-trained model

- Train a model on the labeled data

- Use this pre-trained model to predict labels (pseudo-labels) for the unlabeled data

- Train the model on the labeled dataset together with the pseudo-labeled dataset
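The three steps can be sketched with a deliberately simple stand-in model. A nearest-centroid classifier on scalar inputs replaces the neural network here, and the data is invented, so this only shows the flow of pseudo-labeling, not a realistic setup.

```python
def fit_centroids(samples):
    """Fit a nearest-centroid model: returns {label: mean of x} for (x, label) pairs."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    # Predict the label whose centroid is closest to x.
    return min(centroids, key=lambda label: abs(centroids[label] - x))


# Hypothetical toy data.
labeled = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
unlabeled = [0.5, 2.0, 8.0, 9.5]

model = fit_centroids(labeled)                                # 1. train on labeled data
pseudo_labeled = [(x, predict(model, x)) for x in unlabeled]  # 2. predict pseudo-labels
model = fit_centroids(labeled + pseudo_labeled)               # 3. retrain on both sets
print(pseudo_labeled, model)
```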

3) Self-training : Augmentation + Teacher-Student networks + Semi-supervised learning

- Train a teacher model on the labeled data

- Use the teacher model to generate pseudo-labels for the unlabeled data

- Train a student model on the labeled data and the pseudo-labeled data

- Treat the student model as the new teacher model and train a new student model

- Repeat this process
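The loop above can be sketched as follows. The model is again a toy nearest-centroid classifier on scalar inputs, and "augmentation" is just small random noise added to the student's inputs; both are stand-ins chosen for illustration, as are the data and the number of rounds.

```python
import random


def fit_centroids(samples):
    """Fit a nearest-centroid model: returns {label: mean of x} for (x, label) pairs."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def augment(x, rng):
    # Stand-in augmentation: jitter the input with small uniform noise.
    return x + rng.uniform(-0.3, 0.3)


def self_train(labeled, unlabeled, rounds=3, seed=0):
    rng = random.Random(seed)
    teacher = fit_centroids(labeled)  # train the teacher on labeled data
    for _ in range(rounds):
        # The teacher generates pseudo-labels for the unlabeled data.
        pseudo = [(x, predict(teacher, x)) for x in unlabeled]
        # The student trains on augmented labeled + pseudo-labeled data.
        student = fit_centroids([(augment(x, rng), y) for x, y in labeled + pseudo])
        # The student becomes the teacher for the next round.
        teacher = student
    return teacher


labeled = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
unlabeled = [0.5, 2.0, 8.0, 9.5]
final_model = self_train(labeled, unlabeled)
print(predict(final_model, 1.5), predict(final_model, 8.5))
```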

 



🚀 Today's Retrospective

In the morning I took the computer vision domain lectures! After the lectures I worked on the assignments: implementing ResNet and training a model with augmentation. Implementing ResNet is hard.. I'll try it again tomorrow.. In the peer session we talked about how to form new teams for the Level 2 course and which areas each of us is interested in. I haven't posted on the teammate-finding board yet, so I wrote down what kind of project I want to do and what projects I have done so far. I still really don't know which field I should go into.. During the Boostcamp small-talk time we did some light networking on ZEP, and the ZEP metaverse was fascinating +_+ Meeting all kinds of people, I really envied those who had clearly settled on their field.. I decided I should settle on one field before forming a team...

 
