Chapter 4 / Deep Neural Net

Multi-Class Classification

Using the Fashion MNIST dataset, we build a Deep Neural Net model that classifies fashion items (10 classes).

Dataset

Uses the Fashion MNIST dataset.

A dataset of 70,000 28x28-pixel grayscale images of fashion items, divided into 10 categories.

It has the same format as MNIST, which splits 28x28 digit images into 10 classes; the dataset is loaded from torchvision.

torchvision, utils

Build a transformer with transform = transforms.Compose([list of transforms]); when loading a dataset from torchvision.datasets, pass it to the transform keyword and each sample is converted by the transforms in list order.

trainset = datasets.FashionMNIST(root='./.data/', train=True, download=True, transform=transform)
testset = datasets.FashionMNIST(root='./.data/', train=False, download=True, transform=transform)
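The datasets are then typically wrapped in a DataLoader from torch.utils.data to get shuffled mini-batches. A sketch with a stand-in TensorDataset (the batch size of 4 is an arbitrary choice for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for trainset: 8 fake 1x28x28 images with integer labels
data = TensorDataset(torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,)))
loader = DataLoader(data, batch_size=4, shuffle=True)

images, labels = next(iter(loader))  # one mini-batch
# images.shape == (4, 1, 28, 28); labels.shape == (4,)
```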

์‚ฌ์šฉ๋œ ๊ตฌ๋ฌธ๋“ค

Model

๋‹ค์Œ ์ฝ”๋“œ๋กœ CUDA ํ™•์ธ

USE_CUDA = torch.cuda.is_available()
DEVICE = torch.device('cuda' if USE_CUDA else 'cpu')

๋ชจ๋ธ์˜ ๊ตฌ์กฐ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ œ์ž‘

Calling model.to(DEVICE) moves the model so that CUDA can be used when available.

The model is built by subclassing torch.nn.Module. Using nn.Module makes it intuitive to create custom models.

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(784, 256)  # 28x28 input flattened to 784
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)   # 10 output classes

    def forward(self, x):
        x = x.view(-1, 784)       # flatten each image
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)           # raw logits; cross_entropy applies log-softmax itself
        return x
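For reference, the same architecture can be expressed with nn.Sequential and paired with an optimizer; the SGD optimizer and lr=0.01 here are assumed choices for illustration, not necessarily the chapter's exact settings:

```python
import torch
import torch.nn as nn
import torch.optim as optim

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Same layers as the Net class, written as a Sequential pipeline
model = nn.Sequential(
    nn.Flatten(),                         # equivalent of x.view(-1, 784)
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 10),
).to(DEVICE)
optimizer = optim.SGD(model.parameters(), lr=0.01)  # assumed learning rate

out = model(torch.randn(2, 1, 28, 28).to(DEVICE))  # forward pass on 2 fake images
# out.shape == (2, 10)
```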

Optimization

๋‹ค์Œ ์ฝ”๋“œ๋Š” train, evaluate ๋ถ€๋ถ„์•„๋‹ค.

def train(model, train_loader, optimizer):
    model.train() # switch to training mode
    for batch_idx, (data, target) in enumerate(train_loader): # iterate over all mini-batches
        data, target = data.to(DEVICE), target.to(DEVICE) # move the batch to the device
        optimizer.zero_grad() # reset gradients
        output = model(data) # forward propagation
        loss = F.cross_entropy(output, target) # compute the loss
        loss.backward() # compute gradients
        optimizer.step() # update the parameters

def evaluate(model, test_loader):
    model.eval() # switch to evaluation mode
    test_loss = 0
    correct = 0
    with torch.no_grad(): # scope with gradient tracking disabled
        for data, target in test_loader:
            data, target = data.to(DEVICE), target.to(DEVICE)
            output = model(data)
            test_loss += F.cross_entropy(output, target, reduction='sum').item() # reduction='sum' sums the loss over the batch
            pred = output.max(1, keepdim=True)[1] # index of the max logit (argmax)
            correct += pred.eq(target.view_as(pred)).sum().item() # view_as matches shapes; count 1 per correct prediction, then sum
    test_loss /= len(test_loader.dataset)
    test_accuracy = 100. * correct / len(test_loader.dataset)
    return test_loss, test_accuracy
    
for epoch in range(1,EPOCHS+1):
    train(model,train_loader,optimizer) #train
    test_loss, test_accuracy = evaluate(model,test_loader) #test
    print(f'{epoch} Test Loss: {test_loss:.4f}, Accuracy: {test_accuracy:.2f}')

ํ•™์Šต ๊ณผ์ •์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  1. Move data and target to the GPU (or CPU)

  2. Zero out the optimizer's gradients

  3. Compute the model's output for data

  4. Compute the loss between the output and target

  5. Backpropagate the loss to assign gradients

  6. Apply the computed gradients to the parameters registered with the optimizer

ํ‰๊ฐ€ ๋ฐฉ๋ฒ•์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  1. Move data and target to the GPU (or CPU)

  2. Compute the model's output for data

  3. Compute the loss between the output and target, summed over every batch

  4. Take the argmax of the output and count how many predictions match the targets

Data Augmentation

Randomly flipping images increases the effective size of the dataset so the model can train better. In this chapter, RandomHorizontalFlip is added to the transform to augment the data.

Dropout

๊ณผ์ ํ•ฉ์€ ์ ์€ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด ์˜ˆ์ธก์ด ํ•™์Šต ์˜ค์ฐจ๋ฅผ ์ค„์ด๋Š” ๋ฐ ๊ณผํ•˜๊ฒŒ ํ•™์Šตํ•˜๊ณ  ์‹ค์ œ ์ผ๋ฐ˜์  ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด์„  ์ผ๋ฐ˜ํ™”ํ•˜์ง€ ๋ชปํ•˜๋Š” ๊ฒฝ์šฐ๋ฅผ ๋งํ•œ๋‹ค. train loss๋Š” ๊ณ„์† ์ค„์–ด๋“ค์ง€๋งŒ, validation loss๋Š” ์ฆ๊ฐ€ํ•˜๋Š” ์‹œ์ ์ด ์žˆ๋Š”๋ฐ, ์ด ์‹œ์ ์—์„œ ํ•™์Šต์„ ์ข…๋ฃŒํ•ด์•ผ ์ ์ ˆํ•œ ์˜ˆ์ธก์„ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค.

dropout์€ ์‹ ๊ฒฝ๋ง์—์„œ ๋‹ค์Œ layer๋กœ ์ด๋™ํ•  ๋•Œ ์ผ์ • ํ™•๋ฅ ๋กœ node๊ฐ€ ์—†๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ์ด๋™ํ•œ๋‹ค. ๊ณ„์‚ฐํ•œ ๊ฒฐ๊ณผ์—์„œ ์ผ๋ถ€ node์˜ ๊ฒฐ๊ณผ๋ฅผ ์ผ์ • ํ™•๋ฅ ๋กœ ์ œ๊ฑฐํ•ด ๊ณผ์ ํ•ฉ์„ ๋ฐฉ์ง€ํ•œ๋‹ค.

It can be applied simply by passing activations through the dropout function.
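A sketch with torch.nn.functional.dropout: with p=0.5 and training=True, roughly half the elements are zeroed and the survivors are scaled by 1/(1-p)=2 so the expected value is unchanged; with training=False it is the identity. Inside a nn.Module's forward you would pass training=self.training so model.train()/model.eval() control it.

```python
import torch
import torch.nn.functional as F

x = torch.ones(1000)

train_out = F.dropout(x, p=0.5, training=True)   # values are 0.0 or 2.0
eval_out  = F.dropout(x, p=0.5, training=False)  # identity at evaluation time
```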
