Hello,

I am training a new `contentvec` model in order to replace the framework's `hubert` model with the newly trained `contentvec`. However, when I tried to run the checkpoint produced by training with the code in the current repository on the `fairseq` system, the following error occurred and inference was not possible:
```
INFO:__main__:Extracting hubert acoustic features...
Traceback (most recent call last):
  File "~/fairseq/examples/textless_nlp/gslm/speech2unit/clustering/quantize_with_kmeans.py", line 141, in <module>
    main(args, logger)
  File "~/fairseq/examples/textless_nlp/gslm/speech2unit/clustering/quantize_with_kmeans.py", line 98, in main
    features_batch = get_features(
  File "~/fairseq/examples/textless_nlp/gslm/speech2unit/pretrained/utils.py", line 73, in get_features
    generator, num_files = get_feature_iterator(
  File "~/fairseq/examples/textless_nlp/gslm/speech2unit/pretrained/utils.py", line 58, in get_feature_iterator
    reader = feature_reader_cls(
  File "~/fairseq/examples/textless_nlp/gslm/speech2unit/pretrained/hubert_feature_reader.py", line 23, in __init__
    ) = fairseq.checkpoint_utils.load_model_ensemble_and_task(
  File "~/fairseq/fairseq/checkpoint_utils.py", line 461, in load_model_ensemble_and_task
    task = tasks.setup_task(cfg.task, from_checkpoint=True)
  File "~/fairseq/fairseq/tasks/__init__.py", line 44, in setup_task
    task is not None
AssertionError: Could not infer task type from {'_name': 'contentvec_pretraining', 'data': '~/contentvec/metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': '~/contentvec/label', 'label_rate': 50, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'crop': True, 'pad_audio': False, 'spk2info': '~/contentvec/metadata/output.dict'}. Available argparse tasks: dict_keys(['hubert_pretraining', 'speech_unit_modeling', 'translation', 'multilingual_translation', 'semisupervised_translation', 'audio_pretraining', 'nlu_finetuning', 'translation_lev', 'audio_finetuning', 'audio_classification', 'legacy_masked_lm', 'sentence_prediction', 'sentence_prediction_adapters', 'translation_from_pretrained_xlm', 'translation_from_pretrained_bart', 'denoising', 'speech_dlm_task', 'cross_lingual_lm', 'sentence_ranking', 'language_modeling', 'masked_lm', 'multilingual_language_modeling', 'speech_to_text', 'text_to_speech', 'multilingual_denoising', 'online_backtranslation', 'simul_speech_to_text', 'simul_text_to_text', 'multilingual_masked_lm', 'translation_multi_simple_epoch', 'frm_text_to_speech', 'speech_to_speech', 'span_masked_lm', 'dummy_lm', 'dummy_masked_lm', 'dummy_mt']). Available hydra tasks: dict_keys(['hubert_pretraining', 'speech_unit_modeling', 'translation', 'audio_pretraining', 'nlu_finetuning', 'translation_lev', 'audio_finetuning', 'audio_classification', 'sentence_prediction', 'sentence_prediction_adapters', 'translation_from_pretrained_xlm', 'denoising', 'speech_dlm_task', 'language_modeling', 'masked_lm', 'multilingual_language_modeling', 'multilingual_denoising', 'simul_text_to_text', 'span_masked_lm', 'dummy_lm', 'dummy_masked_lm'])
```
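For context, here is a simplified sketch (not fairseq's actual implementation) of the registry pattern behind the `AssertionError` above: fairseq resolves the task name stored in a checkpoint's config through a registry populated by a `@register_task` decorator, so a checkpoint whose config names `contentvec_pretraining` can only load if some imported module has registered a task under that name:

```python
# Minimal sketch of a name-to-class task registry (not fairseq's actual code).
# Checkpoints store a task name; loading fails when no class was registered
# under that name, which is what the AssertionError above reports.
TASK_REGISTRY = {}

def register_task(name):
    """Decorator that records a task class under the given name."""
    def wrapper(cls):
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("hubert_pretraining")
class HubertPretrainingTask:
    pass

def setup_task(task_name):
    task_cls = TASK_REGISTRY.get(task_name)
    assert task_cls is not None, (
        f"Could not infer task type from {task_name!r}. "
        f"Available tasks: {list(TASK_REGISTRY)}"
    )
    return task_cls()

setup_task("hubert_pretraining")  # registered above, so this succeeds
try:
    setup_task("contentvec_pretraining")  # never registered -> AssertionError
except AssertionError as err:
    print(err)
```

If this mirrors what fairseq does, the failure would be a registration problem rather than a broken checkpoint: the error message shows `contentvec_pretraining` in the checkpoint config but not in either of fairseq's available-task lists.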
As a check, I tried the pretrained models provided with the system: the normal version (`checkpoint_best_500.pt`) produced exactly the same error as above, while the legacy version (`checkpoint_best_500_legacy.pt`) worked fine.
Is there a way to solve this problem? (What code should I run to perform inference with the model I trained?)
And do you know how to train a `contentvec` model that only contains the representation modules (a.k.a. the legacy model)?