How to create dictionary dict.lg.txt in MASS supNMT #172

@Ashmari

I tried MASS unsupNMT first and then supNMT, but with supNMT I get the error below. I am also not clear on how to create dict.lg.txt.
Do we need to create the data directory manually, as described in the instructions?

I get this error after running generate_enzh_data.sh:

Namespace(alignfile=None, cpu=False, criterion='cross_entropy', dataset_impl='cached', destdir='data//processed/', fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=True, optimizer='nag', padding_factor=8, seed=1, source_lang='en', srcdict='data//mono//dict.en.txt', target_lang=None, task='cross_lingual_lm', tbmf_wrapper=False, tensorboard_logdir='', testpref=None, tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, trainpref='data//mono//train', user_dir=None, validpref='data//mono//valid', workers=20)
Traceback (most recent call last):
  File "/home/ashmari/anaconda3/envs/MassN/bin/fairseq-preprocess", line 8, in <module>
    sys.exit(cli_main())
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq_cli/preprocess.py", line 267, in cli_main
    main(args)
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq_cli/preprocess.py", line 80, in main
    src_dict = task.load_dictionary(args.srcdict)
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq/tasks/cross_lingual_lm.py", line 82, in load_dictionary
    return MaskedLMDictionary.load(filename)
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq/data/dictionary.py", line 181, in load
    raise fnfe
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq/data/dictionary.py", line 175, in load
    with open(f, 'r', encoding='utf-8') as fd:
FileNotFoundError: [Errno 2] No such file or directory: 'data//mono//dict.en.txt'
mv: cannot stat 'data//processed//train.en-None.en.bin': No such file or directory
mv: cannot stat 'data//processed//train.en-None.en.idx': No such file or directory
mv: cannot stat 'data//processed//valid.en-None.en.bin': No such file or directory
mv: cannot stat 'data//processed//valid.en-None.en.idx': No such file or directory
Namespace(alignfile=None, cpu=False, criterion='cross_entropy', dataset_impl='cached', destdir='data//processed/', fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=True, optimizer='nag', padding_factor=8, seed=1, source_lang='zh', srcdict='data//mono//dict.zh.txt', target_lang=None, task='cross_lingual_lm', tbmf_wrapper=False, tensorboard_logdir='', testpref=None, tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, trainpref='data//mono//train', user_dir=None, validpref='data//mono//valid', workers=20)
Traceback (most recent call last):
  File "/home/ashmari/anaconda3/envs/MassN/bin/fairseq-preprocess", line 8, in <module>
    sys.exit(cli_main())
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq_cli/preprocess.py", line 267, in cli_main
    main(args)
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq_cli/preprocess.py", line 80, in main
    src_dict = task.load_dictionary(args.srcdict)
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq/tasks/cross_lingual_lm.py", line 82, in load_dictionary
    return MaskedLMDictionary.load(filename)
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq/data/dictionary.py", line 181, in load
    raise fnfe
  File "/home/ashmari/anaconda3/envs/MassN/lib/python3.7/site-packages/fairseq/data/dictionary.py", line 175, in load
    with open(f, 'r', encoding='utf-8') as fd:
FileNotFoundError: [Errno 2] No such file or directory: 'data//mono//dict.zh.txt'

What is going wrong? Please help.
Thank you
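
For reference, the file that the dictionary loader fails to open in the traceback is a plain-text vocabulary with one `token count` pair per line. Below is a minimal sketch of producing such a file, assuming the training text is already tokenized (for example after BPE) and split on whitespace. The helper name `build_dict_file` and its `min_count` parameter are hypothetical and not part of MASS or fairseq. If training starts from a pre-trained checkpoint, the dictionary presumably has to be the one that matches that checkpoint rather than a freshly built one.

```python
from collections import Counter
from pathlib import Path


def build_dict_file(corpus_path, dict_path, min_count=1):
    """Write a 'token count' vocabulary file from a whitespace-tokenized corpus.

    Hypothetical helper for illustration; returns the number of entries written.
    """
    counts = Counter()
    with open(corpus_path, "r", encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())

    # Most frequent tokens first; ties broken alphabetically for a stable order.
    entries = [
        (token, count)
        for token, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        if count >= min_count
    ]

    # Create the target directory (e.g. data/mono) if it does not exist yet.
    Path(dict_path).parent.mkdir(parents=True, exist_ok=True)
    with open(dict_path, "w", encoding="utf-8") as out:
        for token, count in entries:
            out.write(f"{token} {count}\n")
    return len(entries)


# Example call, using the paths from the log above as an assumption:
# build_dict_file("data/mono/train.en", "data/mono/dict.en.txt")
```

The same call would be repeated for the other language (`train.zh` to `dict.zh.txt`) before rerunning generate_enzh_data.sh.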
