
Attention Is All You Need (2017)

Original Paper - Attention Is All You Need (https://arxiv.org/abs/1706.03762)

Abstract

The dominant sequence transduction models include an encoder and a decoder and are built from complex recurrent or convolutional neural networks. Even the best-performing models connect the encoder and decoder through an attention mechanism.

This paper removes the recurrent and convolutional structure entirely and introduces the Transformer, a new and simple network architecture based solely on attention mechanisms. Compared on machine translation tasks, it is more parallelizable and reaches better quality with less training.

- Numerical results omitted -

The paper also shows that the Transformer generalizes well to other tasks.
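
As a rough illustration of what an attention mechanism computes, the following is a minimal pure-Python sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, the form defined in the original paper. The function names, the list-of-lists representation, and the toy inputs are assumptions made for this example; they do not come from this page, and the sketch leaves out multi-head projections, masking, and batching.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of vectors.

    Returns (outputs, weights): one output vector and one row of
    attention weights per query.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [
            sum(wj * v[i] for wj, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy example (illustrative values): 2 queries over 3 key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs, weights = scaled_dot_product_attention(Q, K, V)
```

Every query is compared against all keys independently of the other queries, so nothing here depends on processing the sequence step by step. This is consistent with the abstract's claim that the architecture is easier to parallelize than recurrent models.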


Last updated 3 years ago
