Chapter 3 / Artificial Neural Net

Binary Classification

Tensors

To create a tensor:

import torch

x = torch.tensor([[1,2,3],[4,5,6],[7,8,9]])

Members of a tensor
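As a small illustration, a few tensor members can be inspected as follows. The selection here (size(), shape, ndimension(), dtype) is an assumption about which members are meant, not a complete list.

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

size = x.size()          # torch.Size([3, 3]); same information as x.shape
rank = x.ndimension()    # number of dimensions: 2
dtype = x.dtype          # integer literals default to torch.int64

print(size, x.shape, rank, dtype)
```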

Manipulations that can be applied to a tensor
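A minimal sketch of some shape manipulations, assuming that unsqueeze, squeeze, and view are among the ones intended here:

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

x_unsqueezed = torch.unsqueeze(x, 0)       # add a leading dimension: (1, 3, 3)
x_squeezed = torch.squeeze(x_unsqueezed)   # drop size-1 dimensions: (3, 3)
x_flat = x.view(9)                         # view the same 9 elements as a vector

print(x_unsqueezed.shape, x_squeezed.shape, x_flat.shape)
```

None of these calls changes the number of elements; they only change how the elements are arranged into dimensions.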

Operations
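As a hedged example of tensor arithmetic, the sketch below uses a matrix product followed by an element-wise addition. The names w, x, and b and their values are made up for illustration.

```python
import torch

w = torch.ones(5, 3)                                     # shape (5, 3)
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # shape (3, 2)
b = torch.zeros(5, 2)                                    # shape (5, 2)

wx = torch.mm(w, x)      # matrix product, shape (5, 2)
result = wx + b          # element-wise addition

print(result)
```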

Autograd

Automatic Implementation of Gradient Descent

Applying gradients: if requires_grad is set to True when a tensor is declared, as follows, the gradient is stored in tensor.grad after backpropagation.

w = torch.tensor(1.0, requires_grad=True)

Example (backward())

a = (w*3)**2
a.backward()   # backpropagate to compute da/dw
print(w.grad)  # print the derivative
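The value can be checked by hand: a = (3w)^2 = 9w^2, so da/dw = 18w, which is 18 at w = 1. A minimal self-contained check (the variable analytic is introduced only for this comparison):

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
a = (w * 3) ** 2            # a = 9 * w**2
a.backward()                # fills w.grad with da/dw

analytic = 18 * w.item()    # da/dw = 18 * w, derived by hand
print(w.grad.item(), analytic)
```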

Simple GD

lr = 0.8 # learning rate
random_tensor = torch.randn(10000, dtype = torch.float)

# weird_function, dist_loss, and broken_img are assumed to be defined elsewhere
for i in range(20000): # iterate
    random_tensor.requires_grad_(True) # turn gradient tracking on
    hypothesis = weird_function(random_tensor)
    loss = dist_loss(hypothesis,broken_img) # compute the loss
    loss.backward() # backpropagation

    with torch.no_grad(): # scope that runs without gradient tracking
        random_tensor = random_tensor - lr*random_tensor.grad
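The loop above depends on functions defined elsewhere, so here is a runnable sketch of the same update pattern on a toy problem. Everything specific to it is a stand-in: toy_function, target, the squared-error loss, and the learning rate are hypothetical choices, not the ones from the original example.

```python
import torch

torch.manual_seed(0)

target = torch.tensor([1.0, -2.0, 3.0])   # hypothetical value to recover

def toy_function(t):
    # hypothetical stand-in for the forward function: a fixed scaling
    return 2.0 * t

observed = toy_function(target)           # plays the role of the observed data

lr = 0.1
estimate = torch.randn(3)

for i in range(50):
    estimate.requires_grad_(True)                          # track gradients for this step
    loss = ((toy_function(estimate) - observed) ** 2).sum()  # squared-error loss
    loss.backward()                                        # backpropagation
    with torch.no_grad():                                  # update without tracking
        estimate = estimate - lr * estimate.grad

print(estimate)
```

Because the update inside torch.no_grad() creates a new tensor, the gradient starts empty on each iteration, which is why this pattern does not need an explicit reset of estimate.grad.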

ANN

Each layer multiplies the previous layer's vector by a weight matrix, adds a bias, and then applies an activation function to introduce nonlinearity.

torch.nn.Linear(in_features, out_features) computes wx+b, torch.nn.ReLU() applies ReLU, and torch.nn.Sigmoid() applies the sigmoid activation function.
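A minimal sketch of these three modules applied in sequence. The layer sizes and the random input are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)

linear = torch.nn.Linear(2, 4)     # wx + b: 2 input features -> 4 outputs
relu = torch.nn.ReLU()
sigmoid = torch.nn.Sigmoid()

x = torch.randn(3, 2)              # batch of 3 samples, 2 features each
hidden = relu(linear(x))           # negative values are clipped to 0
prob = sigmoid(hidden)             # values squashed into the range (0, 1)

print(hidden.shape, prob)
```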

torch.nn.BCELoss() creates an object that computes the Binary Cross Entropy loss, and torch.optim.SGD(parameters, lr) creates an optimizer that updates the given parameters.

model.eval() and model.train() switch the model to evaluation (inference) mode and training mode, respectively.
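The mode matters for layers such as Dropout and BatchNorm, which behave differently during training and evaluation. The sketch below adds a Dropout layer purely to make the difference visible; that layer is an assumption for illustration, not part of the chapter's model.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(2, 4),
    torch.nn.Dropout(p=0.5),   # behaves differently in train vs. eval mode
)

model.train()
in_train_mode = model.training      # True after train()

model.eval()
in_eval_mode = model.training       # False after eval()

x = torch.ones(1, 2)
with torch.no_grad():
    # in eval mode Dropout is a no-op, so repeated calls give identical outputs
    same = torch.equal(model(x), model(x))

print(in_train_mode, in_eval_mode, same)
```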

optimizer.zero_grad() resets the gradients to 0 so that new gradients can be computed. Without it, backward() would add to the gradients left over from the previous step.
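A small sketch of why the reset is needed: calling backward() twice without clearing accumulates the gradient. The scalar parameter w and the expression (w*3)**2 are reused here only as a convenient toy case.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

((w * 3) ** 2).backward()
first = w.grad.item()          # gradient of 9*w**2 at w = 1

((w * 3) ** 2).backward()      # no zero_grad(): gradients add up
accumulated = w.grad.item()

optimizer.zero_grad()          # clear before the next backward pass
((w * 3) ** 2).backward()
fresh = w.grad.item()

print(first, accumulated, fresh)
```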

After switching to train mode, repeat the following steps: feed data to the model to compute the output, compute the loss, backpropagate the loss, and let the optimizer update the parameters using the gradients stored on them.
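The steps above can be sketched as a short training loop. The four-point XOR data set, the layer sizes, the learning rate, and the number of epochs are all assumptions chosen to keep the example small; they are not the chapter's actual data or settings.

```python
import torch

torch.manual_seed(0)

# Four XOR points and their labels (toy data, assumed for illustration)
x = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = torch.tensor([[0.0], [1.0], [1.0], [0.0]])

model = torch.nn.Sequential(
    torch.nn.Linear(2, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 1),
    torch.nn.Sigmoid(),
)
criterion = torch.nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

model.train()                      # switch to training mode
losses = []
for epoch in range(1000):
    optimizer.zero_grad()          # clear old gradients
    output = model(x)              # forward pass
    loss = criterion(output, y)    # compute the loss
    loss.backward()                # backpropagation
    optimizer.step()               # update the parameters
    losses.append(loss.item())

model.eval()                       # switch back to evaluation mode
print(losses[0], losses[-1])
```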

The ANN in Chapter 3 can be viewed as an XOR-style binary classifier. The classes cannot be separated by a single linear boundary, so multiple layers are used.

BCE

Binary Cross Entropy is a special case of Cross Entropy (when the probability distribution is described by a single scalar value and the classification is binary). Cross Entropy measures the difference in information content between two different probability distributions (here, the distribution of the data and the distribution of the predictions) defined over the same event space (here the sample space can be seen as having two elements, so the event space is {O, X}).

The model in Chapter 3 predicts the probability of belonging to one of the two classes, so it outputs a single probability value, and the probability mass function can be seen as assigning y and 1-y to the two classes. BCE is then

\mathrm{BCE}(y, \hat y) = -\left[ y \log \hat y + (1 - y) \log (1 - \hat y) \right]

where y is the true value (the given data) and \hat y is the predicted value.
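As a hedged check, the sketch below computes BCE by hand, assuming the standard per-sample form -[y log(y_hat) + (1 - y) log(1 - y_hat)] averaged over samples, and compares it with torch.nn.BCELoss. The label and prediction values are made up.

```python
import torch

y = torch.tensor([1.0, 0.0, 1.0])          # true labels
y_hat = torch.tensor([0.9, 0.2, 0.6])      # predicted probabilities (made-up values)

# per-sample BCE, averaged over the batch
manual = -(y * torch.log(y_hat) + (1 - y) * torch.log(1 - y_hat)).mean()
builtin = torch.nn.BCELoss()(y_hat, y)     # default reduction is the mean

print(manual.item(), builtin.item())
```

Note the argument order of BCELoss: the prediction comes first and the target second.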

Reuse Parameter/Model

After training, the model's parameters can be saved and reused later. Save them with torch.save(model.state_dict(), 'filename'), then restore them with model.load_state_dict(torch.load('filename')).
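A minimal round-trip sketch of saving and restoring parameters through state_dict(). The single Linear layer and the temporary file path are placeholders for illustration.

```python
import os
import tempfile
import torch

model = torch.nn.Linear(2, 1)

# temporary path used only for this sketch
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.state_dict(), path)           # save only the parameters

restored = torch.nn.Linear(2, 1)               # must have the same architecture
restored.load_state_dict(torch.load(path))     # copy the saved parameters in

same = all(
    torch.equal(p, q)
    for p, q in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```

Since only the parameters are stored, the model object has to be constructed with the same architecture before load_state_dict is called.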
