A simple baseline for Natural Language Inference in the clinical domain using the MedNLI dataset. Includes simplified CBOW and InferSent models from the corresponding paper.
- Clone this repo: `git clone https://github.com/jgc128/mednli_baseline.git`
- Install NumPy: `pip install numpy==1.15.2`
- Install PyTorch v0.4.1: `pip install http://download.pytorch.org/whl/cu92/torch-0.4.1-cp36-cp36m-linux_x86_64.whl` (see https://pytorch.org/ for details)
- Install requirements: `pip install -r requirements.txt`
- Create the `./data` directory inside the cloned repository
  - Create the `./data/cache` directory
- Download MedNLI: https://jgc128.github.io/mednli/
  - Extract the content of the `mednli_data.zip` archive into the `./data/mednli` dir (`unzip -d data/mednli mednli_data.zip`)
- Download word embeddings (see the table below) and put the `*.pickled` files into the `./data/word_embeddings/` dir (`wget -P data/word_embeddings/ https://mednli.blob.core.windows.net/shared/word_embeddings/mimic.fastText.no_clean.300d.pickled`)
- Download pre-trained models (see below) and put the `*.pkl` and the `*.pt` files into the `./data/models/` dir
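
The directory layout from the setup steps can also be created from Python. This is a minimal sketch, assuming it is called with the root of the cloned repository; the helper name `create_data_dirs` is hypothetical and not part of the repo.

```python
from pathlib import Path

# Directories named in the setup steps, relative to the repository root.
DATA_DIRS = [
    "data/cache",
    "data/mednli",
    "data/word_embeddings",
    "data/models",
]


def create_data_dirs(root="."):
    """Create the data directories if they are missing and return their paths."""
    created = []
    for rel in DATA_DIRS:
        path = Path(root) / rel
        # parents=True also creates ./data; exist_ok=True makes reruns harmless.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_data_dirs()` from the repository root leaves existing directories untouched, so it is safe to run more than once.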
| Word Embedding | Link |
|---|---|
| glove | glove.840B.300d.pickled |
| mimic | mimic.fastText.no_clean.300d.pickled |
| bio_asq | bio_asq.no_clean.300d.pickled |
| wiki_en | wiki_en.fastText.300d.pickled |
| wiki_en_mimic | wiki_en_mimic.fastText.no_clean.300d.pickled |
| glove_bio_asq | glove_bio_asq.no_clean.300d.pickled |
| glove_bio_asq_mimic | glove_bio_asq_mimic.no_clean.300d.pickled |
| Model | Embeddings | MedNLI Dev accuracy | Files |
|---|---|---|---|
| CBOW | mimic | 0.670 | model spec / model weights |
| InferSent | glove | 0.743 | model spec / model weights |
| InferSent | mimic | 0.783 | model spec / model weights |
| InferSent | wiki_en | 0.763 | model spec / model weights |
| InferSent | wiki_en_mimic | 0.774 | model spec / model weights |
| InferSent | glove_bio_asq_mimic | 0.770 | model spec / model weights |
Run the `predict.py` file with three arguments:
- Path to the model specification file (`*.pkl`)
- Input file in the `jsonl` format (see `mli_dev_v1.jsonl`) or the `\t`-separated premise and hypothesis (see `test_input.txt`)
- Output file (`.csv`) to save predicted probabilities of each of the three classes (contradiction, entailment, and neutral)
Notes:
- The model weights file (`*.pt`) should be located in the same dir as the model specification file (`*.pkl`)
- In case of the `jsonl` format the sentences are taken from the `sentence1_binary_parse` and `sentence2_binary_parse` fields, where `sentence1` is the premise and `sentence2` is the hypothesis. All other fields are optional
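
To make the two input formats concrete, here is a minimal sketch of a reader that returns premise/hypothesis pairs from either kind of file. It is not the code used by `predict.py`: the function name `read_pairs` is hypothetical, choosing the format by file extension is an assumption, and the field values are returned exactly as stored, without any handling of parse brackets.

```python
import json


def read_pairs(path):
    """Return a list of (premise, hypothesis) tuples from a jsonl or tab-separated file."""
    # Assumption: a ".jsonl" extension means one JSON object per line;
    # anything else is treated as "premise<TAB>hypothesis" per line.
    is_jsonl = str(path).endswith(".jsonl")
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            if is_jsonl:
                record = json.loads(line)
                premise = record["sentence1_binary_parse"]
                hypothesis = record["sentence2_binary_parse"]
            else:
                premise, hypothesis = line.split("\t", 1)
            pairs.append((premise, hypothesis))
    return pairs
```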
Example command to run the prediction:
`python predict.py data/models/mednli.infersent.mimic.128.saek2t5q.pkl data/input_test.txt data/predictions_test.csv`
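
Once the predictions file exists, the per-class probabilities can be turned into labels. The sketch below is illustrative only: it assumes each data row holds exactly three numeric columns in the order contradiction, entailment, neutral, which is not confirmed here and should be checked against a real output file. The name `read_predicted_labels` is hypothetical.

```python
import csv

# Class order as listed in the README; whether the CSV columns follow this
# order is an assumption.
CLASSES = ("contradiction", "entailment", "neutral")


def read_predicted_labels(csv_path):
    """Return the most probable class name for each row of three probabilities."""
    labels = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) != 3:
                continue  # not a row of three class probabilities
            try:
                probs = [float(value) for value in row]
            except ValueError:
                continue  # skip a header or any other non-numeric row
            best = max(range(3), key=lambda i: probs[i])
            labels.append(CLASSES[best])
    return labels
```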
Run the train.py file. The options are set in the config.py file. Command-line interface is coming soon!
By default, the model specification and the model weights are saved in the ./data/models dir.
To run a traditional feature-based system, run the `train_feature_based.py` file.
This system achieves 0.523 accuracy on the dev set using a gradient boosting classifier
with features based on word overlaps, tf-idf similarities, word embedding similarities, and BLEU scores.
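
As an illustration of what a word-overlap feature can look like, here is a minimal sketch in plain Python. It is not taken from `train_feature_based.py`, which may compute its features differently; the whitespace tokenizer and the names `tokenize` and `overlap_features` are assumptions made for this example.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def overlap_features(premise, hypothesis):
    """Return simple word-overlap features for a premise/hypothesis pair."""
    p_tokens = set(tokenize(premise))
    h_tokens = set(tokenize(hypothesis))
    shared = p_tokens & h_tokens
    union = p_tokens | h_tokens
    return {
        # Jaccard similarity between the two token sets
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Fraction of hypothesis tokens that also occur in the premise
        "hypothesis_coverage": len(shared) / len(h_tokens) if h_tokens else 0.0,
        # Raw count of shared tokens
        "shared_count": len(shared),
    }
```

Feature dictionaries like this one would be computed per sentence pair and passed, together with the other feature groups, to the classifier.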
Romanov, A., & Shivade, C. (2018). Lessons from Natural Language Inference in the Clinical Domain. arXiv preprint arXiv:1808.06752.
https://arxiv.org/abs/1808.06752
@article{romanov2018lessons,
title = {Lessons from Natural Language Inference in the Clinical Domain},
url = {http://arxiv.org/abs/1808.06752},
abstract = {State of the art models using deep neural networks have become very good in learning an accurate mapping from inputs to outputs. However, they still lack generalization capabilities in conditions that differ from the ones encountered during training. This is even more challenging in specialized, and knowledge intensive domains, where training data is limited. To address this gap, we introduce {MedNLI} - a dataset annotated by doctors, performing a natural language inference task ({NLI}), grounded in the medical history of patients. We present strategies to: 1) leverage transfer learning using datasets from the open domain, (e.g. {SNLI}) and 2) incorporate domain knowledge from external data and lexical sources (e.g. medical terminologies). Our results demonstrate performance gains using both strategies.},
journaltitle = {{arXiv}:1808.06752 [cs]},
author = {Romanov, Alexey and Shivade, Chaitanya},
urldate = {2018-08-27},
date = {2018-08-21},
eprinttype = {arxiv},
eprint = {1808.06752},
}