Predicting Remaining Useful Life (RUL) of turbofan engines using Long Short-Term Memory networks.
A PyTorch implementation of an LSTM-based baseline model for predictive maintenance using NASA's C-MAPSS (Commercial Modular Aero-Propulsion System Simulation) dataset. This project demonstrates an end-to-end deep learning pipeline for prognostics and health management (PHM) — from raw sensor data to actionable RUL predictions.
- Overview
- Dataset
- Architecture
- Results
- Visualizations
- Getting Started
- Project Structure
- Key Design Decisions
- Future Improvements
- References
Predictive maintenance aims to predict when equipment will fail, enabling maintenance to be scheduled just in time — avoiding both unexpected breakdowns (costly, dangerous) and unnecessary early maintenance (wasteful).
This project tackles the Remaining Useful Life (RUL) prediction problem: given a sequence of sensor readings from a turbofan engine, predict how many operational cycles remain before failure.
LSTMs are a natural fit for this problem because:
- Temporal dependencies: Engine degradation is a gradual process. The LSTM's memory mechanism captures how sensor patterns evolve over time.
- Variable-length handling: Different engines have different lifecycle lengths, but LSTMs process sequences regardless of length.
- Proven track record: LSTMs remain a strong baseline for time-series regression, especially in industrial applications where simple, well-understood models are easier to validate and deploy.
The NASA C-MAPSS FD001 dataset contains run-to-failure simulations of 100 turbofan engines:
| Property | Value |
|---|---|
| Training engines | 100 (complete run-to-failure) |
| Test engines | 100 (truncated trajectories) |
| Sensors | 21 (temperature, pressure, speed, etc.) |
| Operational settings | 3 (altitude, Mach number, TRA) |
| Operating conditions | 1 (sea level) |
| Fault modes | 1 (HPC degradation) |
Each engine starts healthy and degrades until failure. The training set provides complete trajectories; the test set provides partial trajectories where we must predict the remaining cycles.
RUL Target Engineering: Following the standard approach in the literature (Heimes, 2008), we apply a piecewise-linear RUL cap at 125 cycles. This reflects the practical reality that an engine with 300 cycles remaining is just as "healthy" as one with 200 — the model only needs to detect the degradation phase.
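A minimal sketch of the capping step, assuming a NumPy pipeline (the function name is illustrative, not necessarily what `data_loader.py` uses):

```python
import numpy as np

def compute_capped_rul(cycle: np.ndarray, max_cycle: int, cap: int = 125) -> np.ndarray:
    """Piecewise-linear RUL target (Heimes, 2008): flat at `cap`, then linear decay to 0."""
    rul = max_cycle - cycle          # cycles remaining at each time step
    return np.minimum(rul, cap)      # clip the healthy early phase at `cap`

# Example: an engine that fails at cycle 200
cycles = np.arange(1, 201)
rul = compute_capped_rul(cycles, max_cycle=200)
```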
Input (batch, 30, 16)
│
▼
┌──────────────┐
│ LSTM Layer 1 │ hidden_dim=64
│ (dropout=0.3)│
└──────┬───────┘
│
┌──────▼───────┐
│ LSTM Layer 2 │ hidden_dim=64
│ (dropout=0.3)│
└──────┬───────┘
│
▼ (last hidden state)
┌──────────────┐
│ Dropout │ p=0.3
└──────┬───────┘
│
┌──────▼───────┐
│ FC (64→32) │ + ReLU
│ Dropout(0.3) │
│ FC (32→1) │ → RUL prediction
└──────────────┘
| Component | Detail |
|---|---|
| Input features | 16 (14 sensors + 2 operational settings) |
| Sequence length | 30 cycles |
| LSTM layers | 2 (stacked) |
| Hidden dimension | 64 |
| Dropout | 0.3 |
| Total parameters | 56,385 |
| Optimizer | Adam (lr=0.001) |
| LR Scheduler | ReduceLROnPlateau (factor=0.5, patience=5) |
| Loss function | Asymmetric MSE (Penalty=3.0) |
| Gradient clipping | max_norm=1.0 |
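The architecture and hyperparameters above can be sketched in PyTorch roughly as follows (class and variable names are illustrative, not necessarily those in `model.py`; the parameter count matches the 56,385 in the table):

```python
import torch
import torch.nn as nn

class LSTMRegressor(nn.Module):
    """2-layer stacked LSTM -> dropout -> FC head, mirroring the diagram above."""
    def __init__(self, n_features=16, hidden_dim=64, num_layers=2, dropout=0.3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_dim, num_layers=num_layers,
                            batch_first=True, dropout=dropout)
        self.head = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 32), nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(32, 1),
        )

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)             # out: (batch, seq_len, hidden_dim)
        return self.head(out[:, -1, :])   # regress from the last hidden state

model = LSTMRegressor()
y = model(torch.randn(8, 30, 16))         # (batch=8, window=30, features=16)
```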
| Metric | Value |
|---|---|
| RMSE | 16.78 cycles |
| MAE | 13.13 cycles |
| PHM08 Score | 382.41 |
| Mean Error Bias | -2.08 cycles (Conservative) |
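For context, the PHM08 score sums an asymmetric exponential penalty over the test set: with d = predicted − actual, early predictions (d < 0) cost exp(−d/13) − 1 and late ones cost exp(d/10) − 1. A sketch (function name is illustrative):

```python
import numpy as np

def phm08_score(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Asymmetric exponential score from the PHM08 challenge (lower is better)."""
    d = y_pred - y_true                       # positive d = late (risky) prediction
    return float(np.sum(np.where(d < 0,
                                 np.exp(-d / 13) - 1,    # early: milder penalty
                                 np.exp(d / 10) - 1)))   # late: steeper penalty
```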
| Method | RMSE (FD001) |
|---|---|
| MLP (Heimes, 2008) | 25.01 |
| SVR (Benkedjouh et al.) | 20.96 |
| CNN (Babu et al., 2016) | 18.45 |
| LSTM (this repo) | 16.78 |
| Deep LSTM (Zheng et al.) | 16.14 |
| DA-RNN (Khorasgani et al.) | 15.62 |
Our simple 2-layer LSTM achieves competitive results with just ~56K parameters and 30 seconds of training on Apple Silicon.
The model converges smoothly with no significant overfitting. The learning rate scheduler reduces the LR at epoch ~38 when the validation loss plateaus.
Points close to the diagonal indicate accurate predictions. The color encodes the absolute error — most predictions fall within the ±10 cycle tolerance band.
The error distribution shows a slight negative, i.e. conservative, bias (mean: -2.08 cycles): the model tends to slightly underestimate RUL rather than overestimate it. This behavior is induced by a custom Asymmetric MSE Loss that penalizes risky late predictions (overestimations) 3x more heavily than early predictions.
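A minimal sketch of such an asymmetric loss, using the 3.0 penalty from the hyperparameter table (the actual implementation in `train.py` may differ):

```python
import torch

def asymmetric_mse(pred: torch.Tensor, target: torch.Tensor,
                   penalty: float = 3.0) -> torch.Tensor:
    """MSE that weights overestimates (pred > target, i.e. risky late calls) by `penalty`."""
    err = pred - target
    weights = torch.where(err > 0, torch.full_like(err, penalty), torch.ones_like(err))
    return (weights * err ** 2).mean()
```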
Side-by-side comparison of actual vs. predicted RUL for individual engines in the test set.
This shows how sensor readings evolve over the full lifecycle of Engine 1. You can clearly see gradual changes in several sensors (e.g., sensor_11, sensor_15) as the engine approaches failure.
- Python 3.9+
- pip
# Clone the repository
git clone https://github.com/tomaspmz/lstm-predictive-maintenance.git
cd lstm-predictive-maintenance
# Create a virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

# Run the full pipeline
python main.py

This will:
- ⬇ Download the C-MAPSS FD001 dataset automatically
- ⚙️ Preprocess the data (normalize, compute RUL, create sequences)
- 🏗️ Build a 2-layer LSTM model
- 🚀 Train for 50 epochs with early stopping
- 📈 Evaluate on the test set
- 🎨 Generate all visualizations in `results/figures/`
The dataset (~1.5 MB) is downloaded automatically on first run.
├── main.py # Entry point — runs the full pipeline
├── requirements.txt # Python dependencies
├── .gitignore
│
├── src/
│ ├── __init__.py
│ ├── data_loader.py # Data download, preprocessing, sequence creation
│ ├── model.py # LSTM architecture definition
│ ├── train.py # Training loop, evaluation, metrics
│ └── visualize.py # Publication-quality dark-themed visualizations
│
├── data/ # Downloaded dataset (git-ignored)
│ ├── train_FD001.txt
│ ├── test_FD001.txt
│ └── RUL_FD001.txt
│
└── results/
├── lstm_rul_model.pth # Saved model checkpoint
└── figures/ # Generated visualizations
├── model_summary.png
├── training_curves.png
├── predictions_scatter.png
├── error_distribution.png
├── engine_predictions.png
├── sensor_degradation.png
└── rul_timeline.png
FD001 is the simplest C-MAPSS subset (single operating condition, single fault mode). It's the standard starting point for benchmarking — get a strong baseline here first, then scale to FD002–FD004 which introduce multi-condition and multi-fault complexity.
We drop 7 of the 21 sensors that are constant or near-constant across all engines (sensors 1, 5, 6, 10, 16, 18, 19). These carry no degradation signal and only add noise. This is a well-known preprocessing step in the C-MAPSS literature.
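A sketch of this filtering step, assuming the columns are named `sensor_1` … `sensor_21` (the repo's actual column names may differ):

```python
import pandas as pd

# Sensors that are (near-)constant in FD001 and carry no degradation signal
DROP_SENSORS = [1, 5, 6, 10, 16, 18, 19]

def drop_flat_sensors(df: pd.DataFrame) -> pd.DataFrame:
    """Remove constant/near-constant sensor columns (column naming is assumed)."""
    return df.drop(columns=[f"sensor_{i}" for i in DROP_SENSORS])

df = pd.DataFrame({f"sensor_{i}": [0.0] for i in range(1, 22)})
kept = drop_flat_sensors(df)
```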
The sliding window size of 30 cycles balances:
- Too short (e.g., 10): insufficient temporal context for the LSTM
- Too long (e.g., 50+): fewer training samples, increased memory, diminishing returns
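The windowing itself can be sketched as follows for a single engine's trajectory (names are illustrative; each window is labeled with the RUL at its last cycle):

```python
import numpy as np

def make_windows(features: np.ndarray, rul: np.ndarray, window: int = 30):
    """Slide a fixed-length window over one engine's trajectory.

    features: (n_cycles, n_features), rul: (n_cycles,)
    Returns X: (n_windows, window, n_features), y: (n_windows,).
    """
    X = np.stack([features[i:i + window]
                  for i in range(len(features) - window + 1)])
    y = rul[window - 1:]          # target = RUL at the end of each window
    return X, y

# Example: a 100-cycle engine with RUL counting down 99 -> 0
X, y = make_windows(np.random.rand(100, 16), np.arange(99, -1, -1, dtype=float))
```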
Without capping, the model wastes capacity trying to distinguish between RUL values like 200 vs. 300 — both indicate a healthy engine. The cap at 125 focuses the model on the critical degradation phase.
We deliberately use a unidirectional (not bidirectional) LSTM because in real-world deployment, we only have access to past sensor data, not future readings.
- Attention mechanism — Add temporal attention to help the model focus on critical degradation windows
- Bidirectional for offline analysis — For post-hoc analysis (not real-time), bidirectional LSTMs could improve accuracy
- Multi-task learning — Jointly predict RUL and fault type
- FD002–FD004 subsets — Extend to multi-condition and multi-fault scenarios
- Transformer baseline — Compare with a Transformer encoder for sequence modeling
- Loss tuning — Sweep the asymmetric penalty factor (currently 3.0), or train directly against the PHM08 scoring function
- Uncertainty quantification — Monte Carlo dropout or ensemble methods for confidence intervals
- Saxena, A., et al. (2008). Damage propagation modeling for aircraft engine run-to-failure simulation. PHM 2008.
- Heimes, F. O. (2008). Recurrent neural networks for remaining useful life estimation. PHM 2008.
- Zheng, S., et al. (2017). Long short-term memory network for remaining useful life estimation. ICICIP.
- Ramasso, E., & Saxena, A. (2014). Performance benchmarking and analysis of prognostic methods for CMAPSS datasets. International Journal of Prognostics and Health Management.
MIT License — see LICENSE for details.