🛩️ LSTM Predictive Maintenance — NASA C-MAPSS

Predicting Remaining Useful Life (RUL) of turbofan engines using Long Short-Term Memory networks.

A PyTorch implementation of an LSTM-based baseline model for predictive maintenance using NASA's C-MAPSS (Commercial Modular Aero-Propulsion System Simulation) dataset. This project demonstrates an end-to-end deep learning pipeline for prognostics and health management (PHM) — from raw sensor data to actionable RUL predictions.

Model Summary



Overview

Predictive maintenance aims to predict when equipment will fail, enabling maintenance to be scheduled just in time — avoiding both unexpected breakdowns (costly, dangerous) and unnecessary early maintenance (wasteful).

This project tackles the Remaining Useful Life (RUL) prediction problem: given a sequence of sensor readings from a turbofan engine, predict how many operational cycles remain before failure.

Why LSTM?

LSTMs are a natural fit for this problem because:

  • Temporal dependencies: Engine degradation is a gradual process. The LSTM's memory mechanism captures how sensor patterns evolve over time.
  • Variable-length handling: Different engines have different lifecycle lengths, but LSTMs process sequences regardless of length.
  • Proven track record: LSTMs remain a strong, well-understood baseline for time-series regression, especially in industrial applications where a simple, predictable model is valuable.

Dataset

The NASA C-MAPSS FD001 dataset contains run-to-failure simulations of 100 turbofan engines:

| Property | Value |
|---|---|
| Training engines | 100 (complete run-to-failure) |
| Test engines | 100 (truncated trajectories) |
| Sensors | 21 (temperature, pressure, speed, etc.) |
| Operational settings | 3 (altitude, Mach number, TRA) |
| Operating conditions | 1 (sea level) |
| Fault modes | 1 (HPC degradation) |

Each engine starts healthy and degrades until failure. The training set provides complete trajectories; the test set provides partial trajectories where we must predict the remaining cycles.

RUL Target Engineering: Following the standard approach in the literature (Heimes, 2008), we apply a piecewise-linear RUL cap at 125 cycles. This reflects the practical reality that an engine with 300 cycles remaining is just as "healthy" as one with 200 — the model only needs to detect the degradation phase.
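The capped target can be computed per engine in a few lines (a NumPy sketch; the repo's `data_loader.py` may implement this differently):

```python
import numpy as np

def capped_rul(cycle, max_cycle, cap=125):
    """Piecewise-linear RUL target: linear countdown to failure, clipped at `cap`."""
    return np.minimum(max_cycle - cycle, cap)

# For an engine that fails at cycle 300, cycles 1-175 all get RUL = 125;
# only the final degradation phase carries a decreasing target.
cycles = np.arange(1, 301)
rul = capped_rul(cycles, max_cycle=300)
```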

RUL Timeline


Architecture

Input (batch, 30, 16)
        │
        ▼
┌───────────────┐
│ LSTM Layer 1  │  hidden_dim=64
│ (dropout=0.3) │
└───────┬───────┘
        │
┌───────▼───────┐
│ LSTM Layer 2  │  hidden_dim=64
│ (dropout=0.3) │
└───────┬───────┘
        │
        ▼  (last hidden state)
┌───────────────┐
│   Dropout     │  p=0.3
└───────┬───────┘
        │
┌───────▼───────┐
│  FC (64→32)   │  + ReLU
│ Dropout(0.3)  │
│  FC (32→1)    │  → RUL prediction
└───────────────┘
| Component | Detail |
|---|---|
| Input features | 16 (14 sensors + 2 operational settings) |
| Sequence length | 30 cycles |
| LSTM layers | 2 (stacked) |
| Hidden dimension | 64 |
| Dropout | 0.3 |
| Total parameters | 56,385 |
| Optimizer | Adam (lr=0.001) |
| LR scheduler | ReduceLROnPlateau (factor=0.5, patience=5) |
| Loss function | Asymmetric MSE (penalty=3.0) |
| Gradient clipping | max_norm=1.0 |
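The architecture can be reproduced as a compact PyTorch sketch (the class name `LSTMRegressor` is illustrative, not necessarily what the repo's `model.py` uses); summing its parameters recovers the 56,385 figure:

```python
import torch
import torch.nn as nn

class LSTMRegressor(nn.Module):
    """Two stacked LSTM layers (16 -> 64) followed by an FC head (64 -> 32 -> 1)."""
    def __init__(self, n_features=16, hidden_dim=64, n_layers=2, dropout=0.3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_dim, num_layers=n_layers,
                            batch_first=True, dropout=dropout)
        self.head = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 32), nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(32, 1),
        )

    def forward(self, x):                # x: (batch, 30, 16)
        out, _ = self.lstm(x)            # out: (batch, 30, 64)
        return self.head(out[:, -1, :])  # last hidden state -> (batch, 1)

model = LSTMRegressor()
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 56385, matching the table
```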

Results

Test Set Performance (FD001)

| Metric | Value |
|---|---|
| RMSE | 16.78 cycles |
| MAE | 13.13 cycles |
| PHM08 Score | 382.41 |
| Mean error bias | -2.08 cycles (conservative) |
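The PHM08 score is the asymmetric exponential penalty from the 2008 data challenge (Saxena et al., 2008). A minimal sketch of the standard formulation:

```python
import numpy as np

def phm08_score(y_true, y_pred):
    """PHM08 challenge score: exponential penalty on the error d = pred - true.
    Late predictions (d > 0) use a steeper exponent (d/10) than early ones
    (-d/13), so over-estimating RUL costs more. Lower is better; 0 is perfect."""
    d = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sum(np.where(d < 0, np.exp(-d / 13) - 1, np.exp(d / 10) - 1)))
```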

Comparison with Literature

| Method | RMSE (FD001) |
|---|---|
| MLP (Heimes, 2008) | 25.01 |
| SVR (Benkedjouh et al.) | 20.96 |
| CNN (Babu et al., 2016) | 18.45 |
| **LSTM (this repo)** | **16.78** |
| Deep LSTM (Zheng et al.) | 16.14 |
| DA-RNN (Khorasgani et al.) | 15.62 |

Our simple 2-layer LSTM achieves competitive results with just ~56K parameters and 30 seconds of training on Apple Silicon.


Visualizations

Training Convergence

The model converges smoothly with no significant overfitting. The learning rate scheduler reduces the LR at epoch ~38 when the validation loss plateaus.
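The training-loop machinery listed in the architecture table (Adam, ReduceLROnPlateau with factor 0.5 / patience 5, gradient clipping at max_norm 1.0) fits together roughly as follows; this is a hypothetical sketch using a stand-in linear model and synthetic data, not the repo's actual `train.py`:

```python
import torch

# Stand-in model: the repo's LSTM would slot in here unchanged.
model = torch.nn.Linear(16, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.5, patience=5)

x, y = torch.randn(32, 16), torch.randn(32, 1)
for epoch in range(10):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    # Clip exploding gradients before the optimizer step.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    opt.step()
    # In the real pipeline this would be the *validation* loss; the scheduler
    # halves the LR after `patience` epochs without improvement.
    sched.step(loss.item())
```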

Training Curves

Predicted vs. Actual RUL

Points close to the diagonal indicate accurate predictions. The color encodes the absolute error — most predictions fall within the ±10 cycle tolerance band.

Predictions Scatter

Error Distribution

The error distribution shows a slight negative, conservative bias (mean: -2.08 cycles), meaning the model tends to slightly under-estimate RUL to stay safe. This is achieved using a custom Asymmetric MSE Loss that penalizes risky late predictions (overestimations) 3x more heavily than early predictions.
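One minimal way to implement such a loss (a sketch, not necessarily the repo's exact formulation) weights the squared error by the penalty factor whenever the prediction is late:

```python
import torch

def asymmetric_mse(pred, target, late_penalty=3.0):
    """Squared error weighted `late_penalty`x when RUL is over-estimated
    (pred > target) -- the risky case where a failing engine looks healthy."""
    err = pred - target
    weight = torch.where(err > 0, torch.full_like(err, late_penalty),
                         torch.ones_like(err))
    return (weight * err ** 2).mean()
```

Under this loss, being 2 cycles late costs as much as being about 3.5 cycles early, which nudges the trained model toward the conservative bias seen above.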

Error Distribution

Per-Engine Predictions

Side-by-side comparison of actual vs. predicted RUL for individual engines in the test set.

Engine Predictions

Sensor Degradation Heatmap

This shows how sensor readings evolve over the full lifecycle of Engine 1. You can clearly see gradual changes in several sensors (e.g., sensor_11, sensor_15) as the engine approaches failure.

Sensor Degradation


Getting Started

Prerequisites

  • Python 3.9+
  • pip

Installation

# Clone the repository
git clone https://github.com/tomaspmz/lstm-predictive-maintenance.git
cd lstm-predictive-maintenance

# Create a virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running the Pipeline

python main.py

This will:

  1. Download the C-MAPSS FD001 dataset automatically
  2. ⚙️ Preprocess the data (normalize, compute RUL, create sequences)
  3. 🏗️ Build a 2-layer LSTM model
  4. 🚀 Train for 50 epochs with early stopping
  5. 📈 Evaluate on the test set
  6. 🎨 Generate all visualizations in results/figures/

The dataset (~1.5 MB) is downloaded automatically on first run.


Project Structure

├── main.py                 # Entry point — runs the full pipeline
├── requirements.txt        # Python dependencies
├── .gitignore
│
├── src/
│   ├── __init__.py
│   ├── data_loader.py      # Data download, preprocessing, sequence creation
│   ├── model.py            # LSTM architecture definition
│   ├── train.py            # Training loop, evaluation, metrics
│   └── visualize.py        # Publication-quality dark-themed visualizations
│
├── data/                   # Downloaded dataset (git-ignored)
│   ├── train_FD001.txt
│   ├── test_FD001.txt
│   └── RUL_FD001.txt
│
└── results/
    ├── lstm_rul_model.pth  # Saved model checkpoint
    └── figures/            # Generated visualizations
        ├── model_summary.png
        ├── training_curves.png
        ├── predictions_scatter.png
        ├── error_distribution.png
        ├── engine_predictions.png
        ├── sensor_degradation.png
        └── rul_timeline.png

Key Design Decisions

Why FD001?

FD001 is the simplest C-MAPSS subset (single operating condition, single fault mode). It's the standard starting point for benchmarking — get a strong baseline here first, then scale to FD002–FD004 which introduce multi-condition and multi-fault complexity.

Feature Selection

We drop 7 of the 21 sensors that are constant or near-constant across all engines (sensors 1, 5, 6, 10, 16, 18, 19). These carry no degradation signal and only add noise. This is a well-known preprocessing step in the C-MAPSS literature.
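In pandas this step is a one-liner; the column names below are assumptions for illustration, since the raw C-MAPSS files are unlabeled and the repo's `data_loader.py` may name them differently:

```python
import pandas as pd

# Sensors that are constant or near-constant across all FD001 engines.
FLAT_SENSORS = [f"sensor_{i}" for i in (1, 5, 6, 10, 16, 18, 19)]

def drop_flat_sensors(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the 7 uninformative sensors, keeping the 14 that show degradation."""
    return df.drop(columns=FLAT_SENSORS)
```

The same set can be found data-drivenly by thresholding each column's standard deviation, which is how these sensor indices were identified in the first place.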

Sequence Length = 30

The sliding window size of 30 cycles balances:

  • Too short (e.g., 10): insufficient temporal context for the LSTM
  • Too long (e.g., 50+): fewer training samples, increased memory, diminishing returns
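The windowing itself can be sketched per engine as follows (`make_sequences` is a hypothetical helper, not necessarily the repo's API):

```python
import numpy as np

def make_sequences(features, rul, window=30):
    """Slide a fixed-length window over one engine's cycle history.
    features: (n_cycles, n_features) array. Each window is labeled with the
    RUL at its *last* cycle, so predictions always use the newest data."""
    X = np.stack([features[i:i + window]
                  for i in range(len(features) - window + 1)])
    y = np.asarray(rul)[window - 1:]
    return X, y
```

An engine with 100 cycles therefore yields 71 training windows, which is why longer windows shrink the effective training set.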

Piecewise-Linear RUL Cap (125)

Without capping, the model wastes capacity trying to distinguish between RUL values like 200 vs. 300 — both indicate a healthy engine. The cap at 125 focuses the model on the critical degradation phase.

Unidirectional LSTM

We deliberately use a unidirectional (not bidirectional) LSTM because in real-world deployment, we only have access to past sensor data, not future readings.


Future Improvements

  • Attention mechanism — Add temporal attention to help the model focus on critical degradation windows
  • Bidirectional for offline analysis — For post-hoc analysis (not real-time), bidirectional LSTMs could improve accuracy
  • Multi-task learning — Jointly predict RUL and fault type
  • FD002–FD004 subsets — Extend to multi-condition and multi-fault scenarios
  • Transformer baseline — Compare with a Transformer encoder for sequence modeling
  • Uncertainty quantification — Monte Carlo dropout or ensemble methods for confidence intervals

References

  1. Saxena, A., et al. (2008). Damage propagation modeling for aircraft engine run-to-failure simulation. PHM 2008.
  2. Heimes, F. O. (2008). Recurrent neural networks for remaining useful life estimation. PHM 2008.
  3. Zheng, S., et al. (2017). Long short-term memory network for remaining useful life estimation. ICICIP.
  4. Ramasso, E., & Saxena, A. (2014). Performance benchmarking and analysis of prognostic methods for CMAPSS datasets. International Journal of Prognostics and Health Management.

License

MIT License — see LICENSE for details.
