Attention Is All You Need (2017)

Original Paper - Attention Is All You Need (https://arxiv.org/abs/1706.03762)

Abstract

The dominant sequence transduction models consist of an encoder and a decoder built from complex recurrent or convolutional neural networks. The best-performing models also connect the encoder and decoder through an attention mechanism.

This paper introduces the Transformer, a new and simple network architecture based solely on attention mechanisms, dispensing with recurrent and convolutional structures entirely. Compared on machine translation tasks, it is easier to parallelize and reaches better quality with less training.
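The abstract does not spell out what "attention" computes. As a rough illustration, the sketch below implements scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, the operation defined in Section 3.2.1 of the paper. It is a minimal pure-Python sketch, not the authors' implementation: the function names `softmax` and `scaled_dot_product_attention` and the toy matrices `Q`, `K`, `V` are assumptions made for this example, and masking, multiple heads, and learned projections are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector per query: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy example with made-up numbers: 2 positions, 2-dimensional vectors.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
result = scaled_dot_product_attention(Q, K, V)
```

Every query is scored against all keys independently of the other queries, so nothing in the loop depends on the result of a previous position. That independence is what makes the computation easier to parallelize than a recurrent network, which must process positions one after another.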

-Numerical results omitted-

The paper also shows that the Transformer generalizes well to other tasks.
