WMT German-to-English News Translation Using the Sockeye Toolkit

In this tutorial we will train a German-to-English Sockeye model on a dataset from the 2017 Conference on Machine Translation (WMT).

Setup
Sockeye expects tokenized data as input. The data used in this tutorial has already been tokenized, but keep this requirement in mind for any other dataset you want to use with Sockeye. In addition to tokenization, we will split words into subwords using Byte Pair Encoding (BPE), with a tool called subword-nmt. Run the following commands to set up the tool:
git clone https://github.com/rsennrich/subword-nmt.git
export PYTHONPATH=$(pwd)/subword-nmt:$PYTHONPATH
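
To give a feel for what BPE does to the data, here is a minimal toy sketch of the merge-learning idea: start from characters and repeatedly merge the most frequent adjacent symbol pair. This is an illustration only, not the subword-nmt implementation. The function names (learn_bpe, merge_pair, apply_bpe), the "</w>" end-of-word marker, the tie-breaking rule, and the tiny word-frequency table are all assumptions made for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: frequency} dict."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself
        # so that the result is deterministic (an arbitrary choice for this toy).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def apply_bpe(word, merges):
    """Segment a single word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


if __name__ == "__main__":
    # Hypothetical toy corpus statistics, not taken from the WMT data.
    toy_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    toy_merges = learn_bpe(toy_freqs, num_merges=3)
    print(toy_merges)
    print(apply_bpe("widest", toy_merges))
```

Frequent character sequences end up as single symbols, while rare words stay split into smaller pieces, which keeps the vocabulary small without producing unknown words. For the actual tutorial data, use subword-nmt rather than this sketch.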

We will visualize training progress using TensorBoard. Install it using:
pip install tensorboard

For the full tutorial, see: https://awslabs.github.io/sockeye/tutorials/wmt.html