Subword-nmt Basic Use

1. Generate the BPE model and dictionary
subword-nmt learn-joint-bpe-and-vocab \
--input corpus.path -s 30000 \
--output en.bpe \
--write-vocabulary dict.en.txt

2. Segment the corpus according to the BPE model
subword-nmt apply-bpe -c en.bpe < corpus.path > corpus.bpe
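
To make the segmented output easier to read: subword-nmt marks a split inside a word with the separator "@@" followed by a space (its default). The sketch below shows, on a made-up line, how the segmentation can be undone with sed. The file names sample.bpe and sample.txt are hypothetical and used only for illustration.

```shell
# Toy BPE-segmented line; "@@ " marks a split inside a word
# (subword-nmt's default separator). The sentence is invented.
printf 'the low@@ est pri@@ ce\n' > sample.bpe

# Remove the separators to restore the original words.
sed -E 's/(@@ )|(@@ ?$)//g' sample.bpe > sample.txt

cat sample.txt
```

The same sed expression can be applied to model output after translation to recover plain text.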

3. Use Fairseq to train the model based on the dictionary and corpus
3.1 Split the corpus into training, validation, and test sets:
sed -n 1,1000000p corpus.bpe > train.en 
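
The command above only produces the training set. A minimal sketch of the full three-way split follows, run on a 10-line toy file so it works on its own. The sizes in TRAIN and VALID and the output names valid.en and test.en are assumptions chosen for illustration; on the real corpus TRAIN would be 1000000 as in the command above, and the toy corpus.bpe line would be dropped.

```shell
# Toy stand-in for corpus.bpe so the sketch runs on its own (10 lines).
# Skip this line when working with the real segmented corpus.
seq 1 10 | sed 's/^/sentence /' > corpus.bpe

TRAIN=6   # hypothetical size; the step above uses 1000000
VALID=2   # hypothetical size of the validation set

# Lines 1..TRAIN go to the training set.
sed -n "1,${TRAIN}p" corpus.bpe > train.en
# The next VALID lines go to the validation set.
sed -n "$((TRAIN + 1)),$((TRAIN + VALID))p" corpus.bpe > valid.en
# Everything that remains goes to the test set.
sed -n "$((TRAIN + VALID + 1)),\$p" corpus.bpe > test.en

wc -l train.en valid.en test.en
```

For a translation pair, the target-side file (zh here) would need to be split with the same line ranges so the sentences stay aligned.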

3.2 Run the preprocessing script
python $FILE/preprocess.py \
--source-lang en --target-lang zh \
--trainpref $DATA/train --validpref $DATA/valid \
--destdir $DATA/preprocess \
--srcdict dict.en.txt \
--tgtdict dict.zh.txt
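
Before running the preprocessing step, it can help to confirm that the source and target files have the same number of lines, since a translation pair is expected to be line-aligned. The sketch below uses two tiny made-up files; on real data, train.en and train.zh would be the files under $DATA and the two printf lines would be dropped.

```shell
# Toy parallel files so the check runs on its own (contents are invented).
printf 'hello world\ngood morning\n' > train.en
printf '你好 世界\n早上 好\n' > train.zh

# Count lines on each side; tr strips the padding some wc versions add.
n_en=$(wc -l < train.en | tr -d ' ')
n_zh=$(wc -l < train.zh | tr -d ' ')

if [ "$n_en" -eq "$n_zh" ]; then
  echo "line counts match: $n_en"
else
  echo "line counts differ: en=$n_en zh=$n_zh"
fi
```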