
Precourse PyTorch Basic Tensor Manipulation

1. Shape convention

# 2d
|t| = (batch size, dim)

# 3d
|t| = (batch size, width, height) # IMAGE
|t| = (batch size, length, dim) # NLP
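For example (a minimal sketch; the sizes below are arbitrary and only for illustration):

import torch

x_2d  = torch.zeros(32, 128)      # (batch size, dim)
x_img = torch.zeros(32, 28, 28)   # (batch size, width, height)
x_nlp = torch.zeros(32, 20, 300)  # (batch size, length, dim)
print(x_2d.shape, x_img.shape, x_nlp.shape)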

2. Tensors can be handled much like NumPy arrays

import torch
import numpy as np

# slicing
t = torch.FloatTensor([1,2,3,4])

t[0:2] # [1., 2.]
t[2:]  # [3., 4.]

# Broadcasting -> applied automatically, so be careful!
m1 = torch.FloatTensor([[1,1]])
m2 = torch.FloatTensor([[2,2]])
m1 + m2 # [[3., 3.]]

m2 = torch.FloatTensor([[2]])
m1 + m2 # [[3, 3]]

m2 = torch.FloatTensor([[3], [4]])
m1 + m2 # [[4, 4], [5, 5]]

# View (Reshape)
t = np.array([[[0,1,2],
               [3,4,5]],
               
              [[6,7,8],
               [9,10,11]]])
ft = torch.FloatTensor(t) # (2, 2, 3)

ft.view([-1, 3]) # (4, 3)
ft.view([-1, 1, 3]) # (4, 1, 3)

# squeeze: removes a dimension of size 1 / unsqueeze: inserts a dimension of size 1 at the desired position
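# A small sketch with assumed example values (not from the lecture):
s = torch.FloatTensor([[0], [1], [2]]) # (3, 1)
s.squeeze()      # (3,)  -> [0., 1., 2.]
s.unsqueeze(0)   # (1, 3, 1)
s.unsqueeze(-1)  # (3, 1, 1)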
# concat
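# cat joins tensors along an existing dimension (example values assumed):
a = torch.FloatTensor([[1, 2], [3, 4]])
b = torch.FloatTensor([[5, 6], [7, 8]])
torch.cat([a, b], dim=0) # (4, 2): [[1,2],[3,4],[5,6],[7,8]]
torch.cat([a, b], dim=1) # (2, 4): [[1,2,5,6],[3,4,7,8]]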
# stacking
x = torch.FloatTensor([1, 2])
y = torch.FloatTensor([3, 4])
z = torch.FloatTensor([5, 6])

torch.stack([x,y,z]) # [[1,2], [3,4], [5,6]]
torch.stack([x,y,z], dim=1) # [[1, 3, 5], [2, 4, 6]]

# ones_like(x), zeros_like(x)
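# Same shape (and dtype/device) as x, filled with ones or zeros:
torch.ones_like(x)  # [1., 1.]
torch.zeros_like(x) # [0., 0.]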

3. Linear regression

import torch
import torch.optim as optim

x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[2], [4], [6]])

W = torch.zeros(1, requires_grad=True) # parameters to be learned
b = torch.zeros(1, requires_grad=True)

optimizer = optim.SGD([W, b], lr=0.001)

epochs = 100
for e in range(1, 1+epochs):
   hyp = x_train*W + b
   cost = torch.mean((hyp - y_train)**2)

   optimizer.zero_grad()
   cost.backward()
   optimizer.step()

   if e % 5 == 0:
      print(cost.item())
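
# After training, inspect the learned parameters; since the data follows y = 2x,
# W should approach 2 and b should approach 0 (given enough epochs and a suitable lr).
print(W.item(), b.item())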

4. A deeper look

  • cost function
    • MSE (mean squared error), MAE (mean absolute error), etc.
    • $ {\partial cost \over \partial W}=\nabla W$
      • To reduce the cost, subtract the gradient of W scaled by a constant (the learning rate)
      • $ W := W - \alpha \nabla W $
      • $ \alpha : $ Learning rate
      • $ \nabla W :$ Gradient
  • W -= lr * gradient (in torch) — see the sketch below
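
A minimal sketch of this manual update (without optim.SGD), reusing the data from section 3; this is an illustration under those assumptions, not the lecture's exact code:

import torch

x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[2], [4], [6]])

W = torch.zeros(1, requires_grad=True)
lr = 0.01

for epoch in range(100):
   hyp = x_train * W
   cost = torch.mean((hyp - y_train) ** 2)

   cost.backward()          # fills W.grad with d(cost)/dW
   with torch.no_grad():
      W -= lr * W.grad      # W := W - alpha * grad(W)
   W.grad.zero_()           # reset the gradient before the next step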