Backpropagation - Implementing Simple Layers

Multiplication Layer

Every layer is implemented with two common methods (a shared interface): forward() and backward().
forward() handles forward propagation, and backward() handles backpropagation.

# 5.4.1 Multiplication layer

class MulLayer:
    def __init__(self):
        self.x = None
        self.y = None
    
    def forward(self, x, y):
        self.x = x
        self.y = y
        out = x * y
        
        return out
    
    def backward(self, dout):
        dx = dout * self.y  # swap x and y: dx uses y, dy uses x
        dy = dout * self.x
        
        return dx, dy

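forward() takes x and y as arguments, multiplies them, and returns the product. It also stores both inputs, because backward() needs them.
backward() multiplies the upstream derivative (dout) by the forward-pass inputs in 'swapped' order (dout times y for dx, dout times x for dy) and passes the results downstream.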
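
One way to convince yourself that the 'swapped' rule is right is to compare backward() against a numerical derivative. The following is a minimal sketch; the helper numerical_grad and the sample values 3.0 and 4.0 are assumptions made for illustration and are not part of the original post.

```python
class MulLayer:
    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        # Remember the inputs; backward() needs them.
        self.x = x
        self.y = y
        return x * y

    def backward(self, dout):
        dx = dout * self.y  # d(x*y)/dx = y
        dy = dout * self.x  # d(x*y)/dy = x
        return dx, dy


def numerical_grad(f, v, h=1e-4):
    # Hypothetical helper: central difference approximation of f'(v).
    return (f(v + h) - f(v - h)) / (2 * h)


x, y = 3.0, 4.0
layer = MulLayer()
layer.forward(x, y)
dx, dy = layer.backward(1.0)

# Numerical derivatives of x*y with respect to each input.
num_dx = numerical_grad(lambda v: v * y, x)
num_dy = numerical_grad(lambda v: x * v, y)

print(abs(dx - num_dx) < 1e-6, abs(dy - num_dy) < 1e-6)
```

If both comparisons print True, the analytic gradients from backward() agree with the numerical estimates to within the chosen tolerance.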

Apple example

apple = 100
apple_num = 2
tax = 1.1

# Layers
mul_apple_layer = MulLayer()
mul_tax_layer = MulLayer()

# Forward pass
apple_price = mul_apple_layer.forward(apple, apple_num)  # x=100, y=2 -> 200
price = mul_tax_layer.forward(apple_price, tax)  # 200 * 1.1

print("Forward:", price)  # approximately 220

# Backward pass
dprice = 1
dapple_price, dtax = mul_tax_layer.backward(dprice)
dapple, dapple_num = mul_apple_layer.backward(dapple_price)

print("Backward:", dapple, dapple_num, dtax)  # approximately 2.2, 110, 200

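backward() is called in the reverse order of the forward() calls.
The argument backward() receives is the derivative with respect to that layer's forward-pass output.
For example, the multiplication layer mul_apple_layer outputs apple_price during the forward pass.
During the backward pass, it receives dapple_price, the derivative with respect to apple_price, as its argument.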
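
The values printed by the apple example can also be worked out by hand with the chain rule. The derivation below is a sketch using the numbers from the example (apple = 100, apple_num = 2, tax = 1.1); the variable names mirror the code.

```latex
\begin{align*}
\text{price} &= \text{apple\_price} \cdot \text{tax}, \qquad
\text{apple\_price} = \text{apple} \cdot \text{apple\_num} \\
\frac{\partial\,\text{price}}{\partial\,\text{apple\_price}} &= \text{tax} = 1.1 \\
\frac{\partial\,\text{price}}{\partial\,\text{tax}} &= \text{apple\_price} = 200 \\
\frac{\partial\,\text{price}}{\partial\,\text{apple}}
  &= \frac{\partial\,\text{price}}{\partial\,\text{apple\_price}} \cdot \text{apple\_num}
   = 1.1 \cdot 2 = 2.2 \\
\frac{\partial\,\text{price}}{\partial\,\text{apple\_num}}
  &= \frac{\partial\,\text{price}}{\partial\,\text{apple\_price}} \cdot \text{apple}
   = 1.1 \cdot 100 = 110
\end{align*}
```

Each line corresponds to one backward() call: the first two come from mul_tax_layer, and the last two come from mul_apple_layer, which receives 1.1 as its dout.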

Addition Layer

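The addition layer multiplies the upstream derivative (dout) by 1, so it passes dout downstream unchanged. Because the derivative does not depend on the inputs, the layer has no need to store x and y.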

# Addition layer
class AddLayer:
    def __init__(self):
        pass
    
    def forward(self, x, y):
        out = x + y
        return out
    
    def backward(self, dout):
        dx = dout * 1
        dy = dout * 1
        return dx, dy
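
As a quick sanity check, the sketch below runs the addition layer on its own and confirms that both downstream derivatives equal dout. The input values 2.0 and 5.0 and the upstream value 1.5 are arbitrary choices for illustration.

```python
class AddLayer:
    def forward(self, x, y):
        return x + y

    def backward(self, dout):
        # d(x+y)/dx = 1 and d(x+y)/dy = 1, so dout passes through unchanged.
        dx = dout * 1
        dy = dout * 1
        return dx, dy


add_layer = AddLayer()
out = add_layer.forward(2.0, 5.0)
dx, dy = add_layer.backward(1.5)

print(out, dx, dy)  # 7.0 1.5 1.5
```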

Example: 2 apples and 3 oranges

apple = 100
apple_num = 2
orange = 150
orange_num = 3
tax = 1.1

# Layers
mul_apple_layer = MulLayer()
mul_orange_layer = MulLayer()
add_apple_orange_layer = AddLayer()
mul_tax_layer = MulLayer()


# Forward pass
apple_price = mul_apple_layer.forward(apple, apple_num)  # x=100, y=2 -> 200
orange_price = mul_orange_layer.forward(orange, orange_num)  # x=150, y=3 -> 450
all_price = add_apple_orange_layer.forward(apple_price, orange_price)  # 200 + 450 = 650
price = mul_tax_layer.forward(all_price, tax)  # 650 * 1.1

# Backward pass (reverse order of the forward calls)
dprice = 1
dall_price, dtax = mul_tax_layer.backward(dprice)
dapple_price, dorange_price = add_apple_orange_layer.backward(dall_price)
dorange, dorange_num = mul_orange_layer.backward(dorange_price)
dapple, dapple_num = mul_apple_layer.backward(dapple_price)


print("Forward:", price)  # approximately 715
print("Backward:", dapple_num, dapple, dorange, dorange_num, dtax)  # approximately 110, 2.2, 3.3, 165, 650