Subword-nmt Basic Use

1. Generate the BPE model and dictionary

subword-nmt learn-joint-bpe-and-vocab \
    --input corpus.path -s 30000 \
    --output en.bpe \
    --write-vocabulary dict.en.txt

2. Segment the corpus according to the BPE model

subword-nmt apply-bpe -c en.bpe < corpus.path > corpus.bpe
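
In the segmented file, every subword piece that does not end a word carries a trailing @@ marker. After translation these markers usually have to be removed again. A minimal sketch, based on the sed one-liner suggested in the subword-nmt README; the function name remove_bpe is hypothetical, and -E is used here instead of the README's -r:

```shell
# Undo BPE segmentation by deleting the "@@ " continuation markers.
# remove_bpe is a hypothetical helper name, not part of subword-nmt.
remove_bpe() {
    sed -E 's/(@@ )|(@@ ?$)//g'
}

# Example on a single segmented line
echo "low@@ er new@@ est" | remove_bpe   # prints: lower newest
```

For a whole file, the same helper can be used as a filter, for example remove_bpe < output.bpe > output.txt (both file names are placeholders).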

3. Use Fairseq to train the model based on the dictionary and corpus
3.1 Split the corpus into training, validation, and test sets:

sed -n 1,1000000p corpus.bpe > train.en
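
The command above only produces the training file. The validation and test files can be cut from the following line ranges in the same way. A possible sketch, in which the helper name split_corpus and the split sizes are assumptions rather than part of the original recipe:

```shell
# Hypothetical helper: split a BPE-segmented corpus by line ranges.
# usage: split_corpus FILE N_TRAIN N_VALID LANG
# Writes train.LANG, valid.LANG and test.LANG into the current directory.
split_corpus() {
    file=$1; n_train=$2; n_valid=$3; lang=$4
    valid_end=$((n_train + n_valid))

    # Lines 1..n_train go to the training set
    sed -n "1,${n_train}p" "$file" > "train.$lang"
    # The next n_valid lines go to the validation set
    sed -n "$((n_train + 1)),${valid_end}p" "$file" > "valid.$lang"
    # Everything after that goes to the test set
    sed -n "$((valid_end + 1)),\$p" "$file" > "test.$lang"
}
```

For example, split_corpus corpus.bpe 1000000 5000 en would keep the first 1,000,000 lines for training and the next 5,000 for validation (the 5,000 is an arbitrary illustration). For a parallel corpus, the same line ranges must be applied to the target-language file so that sentence pairs stay aligned.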

3.2 Run the preprocessing script

python $FILE/preprocess.py \
    --source-lang en --target-lang zh \
    --trainpref $DATA/train --validpref $DATA \
    --destdir $DATA/preprocess \
    --srcdict dict.en.txt \
    --tgtdict dict.zh.txt
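
Fairseq's preprocessing appends the language code to each prefix, so --trainpref $DATA/train expects $DATA/train.en and $DATA/train.zh, and both files must have the same number of lines. A small pre-flight check can catch a missing or misaligned file before preprocessing starts. The helper name check_parallel below is hypothetical, and the check is only a sketch:

```shell
# Hypothetical pre-flight check for one split of a parallel corpus.
# usage: check_parallel PREFIX SRC_LANG TGT_LANG
# Returns non-zero if a file is missing or the line counts differ.
check_parallel() {
    prefix=$1; src=$2; tgt=$3

    # Both language sides must exist
    for lang in "$src" "$tgt"; do
        if [ ! -f "$prefix.$lang" ]; then
            echo "missing: $prefix.$lang" >&2
            return 1
        fi
    done

    # Sentence pairs are aligned by line, so the counts must match
    n_src=$(wc -l < "$prefix.$src" | tr -d ' ')
    n_tgt=$(wc -l < "$prefix.$tgt" | tr -d ' ')
    if [ "$n_src" -ne "$n_tgt" ]; then
        echo "line count mismatch: $prefix.$src=$n_src $prefix.$tgt=$n_tgt" >&2
        return 1
    fi
}
```

It could be called as check_parallel "$DATA/train" en zh before running the preprocessing command.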
