Pytorch_RNN_long_sequence

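Handling Long Sequences with an RNN

Reference: the "Deep Learning for Everyone" (모두를 위한 딥러닝) PyTorch lectures

The RNN intro post walked through the hello example.
Here we use the same one-hot encoding approach on a sentence longer than hello.
First, we build a dict that maps each character in the sentence to an index and use it in the steps that follow.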

sentence = ("if you want to build a ship, don't drum up people together to "
            "collect wood and don't assign them tasks and work, but rather "
            "teach them to long for the endless immensity of the sea.")

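# Collect the unique characters and map each one to an index.
# Note: set ordering is arbitrary, so the indices can differ between runs.
char_set = list(set(sentence))
char_dic = {c: i for i, c in enumerate(char_set)}
dic_size = len(char_dic)  # number of distinct characters, used below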
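
As a quick check of this mapping, here is a minimal sketch on a much shorter string. The names `demo_text`, `demo_chars`, and `demo_dic` are made up for illustration, and `sorted` is used only so the index order is reproducible; the code above uses `list(set(...))`, whose order is arbitrary.

```python
# Hypothetical short input, used only for illustration.
demo_text = "hello"

# sorted() fixes the index order so the result is the same on every run.
demo_chars = sorted(set(demo_text))
demo_dic = {c: i for i, c in enumerate(demo_chars)}

# Encode the string as indices, then decode it back through demo_chars.
encoded = [demo_dic[c] for c in demo_text]
decoded = ''.join(demo_chars[i] for i in encoded)

print(demo_chars)
print(encoded)
print(decoded)
```

The list of characters acts as the index-to-character lookup and the dict as the character-to-index lookup, which is why the training loop later can turn predicted indices back into text with `char_set`.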

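We pick a window size and slice the sentence into windows of that length to use as the input data X. For Y, we use the window of the same length shifted one character to the right.

For example, with a window size of 5, the first pair is
X = if yo --> Y = f you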
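
The shift-by-one pairing can be checked with a minimal sketch. The names `demo_text`, `window`, and `pairs` are illustrative, and the short input string is just the beginning of the sentence.

```python
demo_text = "if you want"   # short illustrative input
window = 5                  # window size from the example above

pairs = []
for i in range(len(demo_text) - window):
    x = demo_text[i:i + window]           # input window
    y = demo_text[i + 1:i + window + 1]   # same window shifted right by one
    pairs.append((x, y))

for x, y in pairs[:3]:
    print(repr(x), '-->', repr(y))
```

Each Y window shares all but its last character with its X window, so the model is asked to predict the next character at every position.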

Sliding this window over the whole sentence produces the X and Y data.

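import numpy as np
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
sequence_length = 10  # window size; an assumed example value, not given in this post

x_data = []
y_data = []

for i in range(0, len(sentence) - sequence_length):
    x = sentence[i: i + sequence_length]
    y = sentence[i + 1: i + sequence_length + 1]

    # Convert each character to its index.
    x_data.append([char_dic[c] for c in x])
    y_data.append([char_dic[c] for c in y])

# One-hot encode the inputs: (num_windows, sequence_length, dic_size).
# Stacking into a single ndarray first avoids the slow list-of-arrays conversion.
x_one_hot = np.array([np.eye(dic_size)[x] for x in x_data])

X = torch.FloatTensor(x_one_hot).to(device)
Y = torch.LongTensor(y_data).to(device)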
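
The one-hot step works because indexing an identity matrix with a list of indices picks out one row per index, and row k of the identity matrix is exactly the one-hot vector for class k. Here is a minimal sketch with an assumed vocabulary size of 4 and made-up index windows (`demo_size` and `demo_x` are illustrative names):

```python
import numpy as np

demo_size = 4                     # assumed vocabulary size
demo_x = [[1, 0, 2], [0, 2, 3]]   # two hypothetical index windows of length 3

# np.eye(demo_size)[x] selects one identity-matrix row per index in x.
one_hot = np.array([np.eye(demo_size)[x] for x in demo_x])

print(one_hot.shape)   # (num_windows, window_length, vocabulary_size)
print(one_hot[0])
```

Taking `argmax` over the last axis recovers the original indices, which is the same operation the training loop applies to the model outputs.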

We now have the data for predicting the labels Y from the one-hot encoded X.
Unlike the earlier example, the model here uses a two-layer RNN followed by a fully connected layer.

class Net(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, layers):
        super(Net, self).__init__()
        self.rnn = torch.nn.RNN(input_dim, hidden_dim, num_layers=layers, batch_first=True)
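        # Setting num_layers stacks that many RNN layers on top of each other.
        # The fc layer maps hidden_dim to hidden_dim, so hidden_dim must equal
        # the number of classes (dic_size) for the loss computed later.
        self.fc = torch.nn.Linear(hidden_dim, hidden_dim, bias=True)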

    def forward(self, x):
        x, _status = self.rnn(x)
        x = self.fc(x)
        return x

hidden_size = dic_size  # must match dic_size: the loss reshapes outputs to (-1, dic_size)
net = Net(dic_size, hidden_size, 2).to(device)
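
To see what shapes flow through such a model, here is a minimal sketch that builds the same two pieces as the Net class above with illustrative sizes. The value `demo_dim = 8` and the batch of zeros are assumptions chosen only to make the shapes easy to read.

```python
import torch

demo_dim = 8   # assumed vocabulary size, used for both input and hidden size

# Same layer types and arguments as in the Net class above.
rnn = torch.nn.RNN(demo_dim, demo_dim, num_layers=2, batch_first=True)
fc = torch.nn.Linear(demo_dim, demo_dim, bias=True)

x = torch.zeros(3, 5, demo_dim)   # (batch, sequence_length, input_dim)
out, h_n = rnn(x)                 # out: per-step outputs of the top layer
logits = fc(out)                  # Linear is applied to the last dimension

print(out.shape)      # (batch, sequence_length, hidden_dim)
print(h_n.shape)      # (num_layers, batch, hidden_dim)
print(logits.shape)   # (batch, sequence_length, hidden_dim)
```

With `batch_first=True` the batch dimension comes first in the input and in `out`, while the final hidden state `h_n` keeps the layer dimension first.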

Using this model, we print the predicted sentence at every step to watch it take shape as training proceeds.

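learning_rate = 0.1  # assumed example value, not given in this post

criterion = torch.nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)

for i in range(100):
    optimizer.zero_grad()
    outputs = net(X)  # (num_windows, sequence_length, dic_size)
    # Flatten so that every character position counts as one classification sample.
    loss = criterion(outputs.view(-1, dic_size), Y.view(-1))
    loss.backward()
    optimizer.step()

    # Most likely character index at each position of each window.
    results = outputs.argmax(dim=2)
    predict_str = ''

    for j, result in enumerate(results):
        if j == 0:
            # First window: keep all of its characters.
            predict_str += ''.join([char_set[t] for t in result])
        else:
            # Later windows: only the last character is new.
            predict_str += char_set[result[-1]]

    print(predict_str)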

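The loop treats j == 0 separately because once the first prediction has produced sequence_length characters, every later result repeats the previous one except for its last character.
For example, with a sequence_length of 5, if the result for j == 0 is “f you” and the next result is “ you ”, the two already share the characters “ you”.
So we take the whole result when j is 0, and after that we append only the last value of each result to predict_str. Because the results are predictions of Y, the printed string starts from the second character of the sentence.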
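
The stitching rule can be tried on plain strings, without a model. This is a minimal sketch; the helper name `stitch` and the sample windows are made up for illustration, and the windows stand in for already decoded predictions.

```python
def stitch(windows):
    """Join overlapping windows that are each shifted by one character."""
    text = ''
    for j, w in enumerate(windows):
        if j == 0:
            text += w        # first window: keep every character
        else:
            text += w[-1]    # later windows: only the last character is new
    return text

# Hypothetical decoded predictions for a window size of 5.
demo_windows = ["f you", " you ", "you w", "ou wa"]
print(stitch(demo_windows))
```

Early in training the overlapping parts of neighboring windows may disagree, and this rule simply trusts the last position of each later window.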

Full Code

# Python, Pytorch
