hye-log

[Boostcamp AI Tech] WEEK 04_DAY 17

Boostcourse/AI Tech 4th cohort


iihye_ 2022. 10. 14. 02:54

🚀 Individual study


[3] Image Classification 2

1. Problems with deeper layers

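1) Does a deeper network always give higher performance?

- No

- Computational complexity increases

- Deeper networks are harder to optimize : gradient vanishing (gradients become too small) / exploding (gradients become too large), and the degradation problem (accuracy gets worse as more layers are added)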

 

2. CNN architecture for image classification 2

1) GoogLeNet

(1) Inception module

- Uses 1×1, 3×3, and 5×5 conv filters in parallel

- 1×1 convs reduce the amount of computation (the spatial size stays the same; only the number of channels changes)
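To make the saving from the 1×1 conv concrete, here is a rough multiply-accumulate count for a 5×5 branch with and without a 1×1 bottleneck in front of it. The feature-map size and channel counts are hypothetical numbers picked for illustration, not values from the lecture.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a k x k conv (stride 1, 'same' padding) on an h x w map."""
    return h * w * c_in * c_out * k * k

# Hypothetical sizes: 28x28 feature map, 192 input channels, 32 output channels.
H, W, C_IN, C_OUT = 28, 28, 192, 32
C_MID = 16  # assumed channel count after the 1x1 reduction

# 5x5 conv applied directly to all 192 channels.
direct = conv_macs(H, W, C_IN, C_OUT, 5)

# 1x1 conv first (channels 192 -> 16, spatial size unchanged), then the 5x5 conv.
bottleneck = conv_macs(H, W, C_IN, C_MID, 1) + conv_macs(H, W, C_MID, C_OUT, 5)

print(direct, bottleneck, round(direct / bottleneck, 1))
```

With these assumed numbers the bottlenecked branch needs roughly a tenth of the multiply-accumulates, which is why the 1×1 conv sits before the larger filters.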

(2) Overall structure

- Stem network

- Overall, a stack of Inception modules

- Auxiliary classifiers are added partway through the stack : prevent gradient vanishing

- The final output classifier is a single FC layer

Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.

2) ResNet

(1) Does performance improve as depth increases?

He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

- The training error of the 56-layer network is higher than that of the 20-layer network

- This is an optimization problem, not overfitting (if it were overfitting, the deeper network's training error would be lower)

(2) Proposes the residual block, which learns the residual part

- With n residual blocks, the skip connections give O(2^n) possible paths from input to output
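The two ideas above can be sketched in a few lines of plain Python: a block that outputs x + F(x), and a brute-force count of the routes through n such blocks. The `zero_branch` function is a hypothetical stand-in for the learned conv layers.

```python
from itertools import product

def residual_block(x, f):
    """y = x + F(x): the branch f only has to model the residual."""
    return [xi + fi for xi, fi in zip(x, f(x))]

def count_paths(n_blocks):
    """Every block can be passed via the skip or via the branch; enumerate all choices."""
    return sum(1 for _ in product(("skip", "branch"), repeat=n_blocks))

# Hypothetical branch that outputs zeros: the block then reduces to the identity,
# so adding such a block cannot make the representation worse.
def zero_branch(x):
    return [0.0] * len(x)

x = [1.0, -2.0, 3.0]
y = residual_block(x, zero_branch)

print(y, count_paths(3), count_paths(10))
```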

He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

(3) Overall structure

- initialization conv layer

- Stack 3×3 conv blocks (residual blocks)

- Down-sampling with stride 2

- The final output classifier is a single FC layer

3) DenseNet

- Features are concatenated along the channel axis

- Each layer receives not only the output of the layer right before it but also the outputs of earlier layers, so upper layers can re-reference the features of lower layers
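Because the features are concatenated rather than summed, the number of input channels grows with every layer in a dense block. A small bookkeeping sketch, with a hypothetical starting width and growth rate:

```python
def dense_block_channels(c_in, growth_rate, n_layers):
    """Input channels seen by each layer when all earlier outputs are concatenated."""
    seen = []
    channels = c_in
    for _ in range(n_layers):
        seen.append(channels)
        channels += growth_rate  # concatenation appends new channels instead of adding values
    return seen, channels

# Hypothetical block: 64 input channels, each layer contributes 32 new channels.
per_layer_inputs, block_output = dense_block_channels(64, 32, 4)

print(per_layer_inputs, block_output)
```

A residual block with the same layers would keep the channel count fixed, since addition does not change the shape.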

4) SENet

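- squeeze : removes the spatial information (global average pooling) and obtains the per-channel distribution

- excitation : passes the squeezed vector through an FC stage (two FC layers in the original SENet paper) to generate attention scores that take the relationships between channels into account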
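A minimal pure-Python sketch of squeeze and excitation on a tiny feature map. It assumes the two-FC bottleneck (ReLU, then sigmoid) described in the SENet paper; the weights below are made-up numbers for illustration, not trained values.

```python
import math

def squeeze(fmap):
    """Global average pooling: fmap[c][i][j] -> one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def fc(vec, weights, bias):
    """Plain fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def excitation(z, w1, b1, w2, b2):
    """FC -> ReLU -> FC -> sigmoid, giving one attention score per channel."""
    hidden = [max(0.0, h) for h in fc(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in fc(hidden, w2, b2)]

def se_block(fmap, w1, b1, w2, b2):
    """Rescale every channel of fmap by its attention score."""
    scores = excitation(squeeze(fmap), w1, b1, w2, b2)
    scaled = [[[v * s for v in row] for row in ch] for ch, s in zip(fmap, scores)]
    return scaled, scores

# Two channels of size 2x2; hypothetical weights with a bottleneck of width 1.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1, b1 = [[1.0, 1.0]], [0.0]
w2, b2 = [[0.0], [1.0]], [0.0, 0.0]

scaled, scores = se_block(fmap, w1, b1, w2, b2)
print(squeeze(fmap), scores)
```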

5) EfficientNet

- Unlike the earlier approaches that scale one dimension at a time, width scaling (GoogLeNet Inception module, DenseNet), depth scaling (ResNet), or resolution scaling, it uses compound scaling to build a network that is deep, wide, and high-resolution at the same time
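Compound scaling can be written as a small formula: depth, width, and resolution are all scaled by a single coefficient φ. The sketch below uses the base values reported in the EfficientNet paper (α = 1.2, β = 1.1, γ = 1.15); these come from the paper rather than from these notes, so treat them as an assumption here.

```python
# Base multipliers for depth, width, resolution (assumed from the EfficientNet paper).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    """Depth, width, and resolution multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

def flops_multiplier(phi):
    """FLOPs grow roughly with depth * width^2 * resolution^2."""
    d, w, r = compound_scale(phi)
    return d * w ** 2 * r ** 2

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(phi, round(d, 2), round(w, 2), round(r, 2), round(flops_multiplier(phi), 2))
```

The bases are chosen so that α·β²·γ² is close to 2, meaning each step of φ roughly doubles the compute.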


[4] Semantic segmentation

1. Semantic segmentation

1) Image classification classifies per image -> semantic segmentation classifies per pixel

2) Used in medical imaging, autonomous driving, photography, etc.

 

2. Semantic segmentation architectures

1) Fully Convolutional Networks(FCN)

- end-to-end structure : a neural network that is differentiable from input to output

- Fully connected layer : outputs a fixed-dimensional vector and discards spatial information -> image classification

- Fully convolutional layer : classifies at every spatial location -> semantic segmentation

2) Interpreting fully connected layers as 1×1 convolutions

- Fully connected layer : flattens the image into one long vector, so spatial information is not taken into account

- Fully convolutional layer : takes the channel vector at each position of the image separately and produces a feature map value for each position

- Because the input image is down-sampled along the way, the resulting feature map is small
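The 1×1 convolution view can be sketched directly: the same FC weights are applied to the channel vector at every position, so the output keeps the spatial layout. The feature map and weights below are hypothetical values for illustration.

```python
def fc(vec, weights, bias):
    """Fully connected layer on one channel vector."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def conv1x1(fmap, weights, bias):
    """fmap[i][j] is the channel vector at position (i, j); reuse the same FC everywhere."""
    return [[fc(pixel, weights, bias) for pixel in row] for row in fmap]

# Hypothetical 2x2 feature map with 3 channels, mapped to 2 class scores per position.
fmap = [
    [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]],
    [[1.0, 1.0, 1.0], [2.0, 0.0, 2.0]],
]
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
bias = [0.0, 0.5]

score_map = conv1x1(fmap, weights, bias)
print(score_map)
```

Flattening the whole map before the FC layer would instead give a single score vector, with no way to tell which position produced it.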

3) Upsampling

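- Without down-sampling, the receptive field is small, which makes it hard to capture the overall context of the image; so the network down-samples first and then upsamples to recover the resolution

- transposed convolution : outputs are summed wherever they overlap, so the kernel size and stride have to be chosen so that the overlap is not uneven (otherwise checkerboard artifacts appear)

- upsample and convolution : interpolate with nearest-neighbor (NN) or bilinear, then apply a convolution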
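A 1D sketch of both upsampling options, to show where the overlap problem comes from. The kernels are hypothetical; the point is only the pattern of the output.

```python
def transposed_conv1d(x, kernel, stride):
    """Each input value stamps a scaled copy of the kernel; overlapping stamps are summed."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for k, wk in enumerate(kernel):
            out[i * stride + k] += xi * wk
    return out

def nearest_upsample1d(x, scale):
    """Nearest-neighbor upsampling: repeat every value `scale` times."""
    return [xi for xi in x for _ in range(scale)]

def conv1d_same(x, kernel):
    """Ordinary conv with zero padding so the output length equals the input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(padded[i + k] * kernel[k] for k in range(len(kernel))) for i in range(len(x))]

x = [1.0, 1.0, 1.0]

# Kernel size 3 with stride 2: neighbouring stamps overlap on every other output.
uneven = transposed_conv1d(x, [1.0, 1.0, 1.0], stride=2)

# Kernel size 2 with stride 2: no overlap, so a constant input stays constant.
even = transposed_conv1d(x, [1.0, 1.0], stride=2)

# Upsample first, then smooth with an ordinary convolution.
smooth = conv1d_same(nearest_upsample1d(x, 2), [0.25, 0.5, 0.25])

print(uneven, even, smooth)
```

In `uneven` a constant input produces an alternating output, the 1D version of the checkerboard pattern; `smooth` is constant everywhere except at the zero-padded borders.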

4) Characteristics by resolution

- high-resolution : low-level, detail, local

- low-resolution : semantic, holistic, global

5) U-Net

Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.

- Fully convolutional networks

- Contracting path (down-sampling) : reduces the resolution and increases the number of channels

- Expanding path (up-sampling) : increases the resolution and reduces the number of channels
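The resolution and channel bookkeeping of the two paths can be traced with a short sketch. It assumes each stage halves or doubles both quantities exactly, which holds for padded convolutions; the original U-Net uses unpadded convolutions, so its actual sizes differ. The starting size, width, and depth are hypothetical.

```python
def unet_shapes(size, channels, depth):
    """Track (resolution, channels) through a U-Net-like contracting and expanding path."""
    contracting = []
    for _ in range(depth):
        contracting.append((size, channels))
        size, channels = size // 2, channels * 2  # halve resolution, double channels
    bottleneck = (size, channels)
    expanding = []
    for _ in range(depth):
        size, channels = size * 2, channels // 2  # double resolution, halve channels
        expanding.append((size, channels))
    return contracting, bottleneck, expanding

# Hypothetical configuration: 256x256 input features, 64 channels, 4 stages.
contracting, bottleneck, expanding = unet_shapes(256, 64, 4)

print(contracting)
print(bottleneck)
print(expanding)
```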

6) DeepLab

- Key feature of DeepLab v1 : Conditional Random Fields (CRFs) as post-processing

- Key feature of DeepLab v2 : Dilated convolution (atrous convolution), which takes a wider receptive field into account

- Key features of DeepLab v3+ : Dilated convolution, Atrous spatial pyramid pooling
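A 1D sketch of dilated convolution, showing how the dilation rate widens the receptive field without adding weights. The rates 6, 12, and 18 in the last line are illustrative values for a pyramid of parallel branches, not taken from these notes.

```python
def effective_kernel_size(k, rate):
    """Span covered by a size-k kernel whose taps are `rate` positions apart."""
    return k + (k - 1) * (rate - 1)

def dilated_conv1d(x, kernel, rate):
    """'Valid' 1D convolution where consecutive kernel taps skip rate - 1 inputs."""
    span = effective_kernel_size(len(kernel), rate)
    return [
        sum(w * x[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(x) - span + 1)
    ]

x = [float(v) for v in range(10)]
kernel = [1.0, 1.0, 1.0]

dense = dilated_conv1d(x, kernel, rate=1)    # taps at i, i+1, i+2
dilated = dilated_conv1d(x, kernel, rate=2)  # taps at i, i+2, i+4

print(dense[0], dilated[0])
print([effective_kernel_size(3, r) for r in (1, 6, 12, 18)])
```

Both calls use the same three weights, but the dilated one sees five input positions per output instead of three.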


🚀 Today's retrospective

The day started with mentoring! Since team building has effectively begun, we talked about the direction for levels 2 and 3. Our mentor showed us the GitHub repositories of the previous cohort, and seeing the projects they did gave me a rough sense of how a project is run.. What I was most relieved about: I had been struggling just to settle on a topic, but it looks like company-linked topics will be offered, so if they are, I should definitely pick one. The problem definition is clear from the start, and if I understand the company's needs and carry it out successfully(?), there is a good chance it leads to a job.. There still seem to be two weeks left, but I can't help feeling anxious. After mentoring I watched one lecture, actually ate lunch today, watched more lectures, did the assignment.. basically the same day on repeat. Now that we are into domain theory, a single lecture contains several papers; it would be nice to have time to read them all, but between lectures and assignments my body can't keep up..(please hold on 0-0) In the afternoon there was a master class, and it was good to see Ahn Subin again after a long time..(a feeling of familiarity) Being good at the field you love is truly a blessing. I was reminded once again of how important visualization is!
