
🛡️ SentraAI - Multi-Layered Defence Framework Against Prompt Injection and Obfuscation Attacks

This project implements a multi-layered security architecture to protect Large Language Model (LLM) applications against prompt injection, obfuscation, and misuse attacks.

The framework processes incoming user messages through three core layers:

🔹 Layer 1 – Sanitization and Deobfuscation

Cleans the input by stripping away obfuscation, encoding tricks, and manipulative patterns. The sanitized version is logged for traceability and sent forward for further analysis.
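As an illustration, a minimal sanitization pass might normalize Unicode (collapsing homoglyph and width tricks), strip zero-width characters used to split trigger words, and decode embedded base64 payloads. This is a hypothetical sketch, not the project's actual implementation; the threshold and patterns are assumptions.

```python
import base64
import binascii
import re
import unicodedata

# Zero-width characters commonly used to break up trigger words.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

def sanitize(message: str) -> str:
    """Deobfuscate a user message (illustrative sketch)."""
    # Normalize Unicode so homoglyph/fullwidth tricks collapse to canonical forms.
    text = unicodedata.normalize("NFKC", message)
    # Delete zero-width characters.
    text = text.translate(ZERO_WIDTH)

    # Decode long base64-looking tokens if they yield printable ASCII
    # (a common payload-smuggling trick); otherwise leave them alone.
    def try_decode(match: re.Match) -> str:
        token = match.group(0)
        try:
            decoded = base64.b64decode(token, validate=True).decode("ascii")
            return decoded if decoded.isprintable() else token
        except (binascii.Error, UnicodeDecodeError, ValueError):
            return token

    text = re.sub(r"[A-Za-z0-9+/]{16,}={0,2}", try_decode, text)
    # Collapse whitespace used to pad or hide patterns.
    return re.sub(r"\s+", " ", text).strip()
```

The sanitized string, not the raw input, is what the next layer analyzes, so later pattern matching is not defeated by trivial encoding.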

🔹 Layer 2 – Prompt Injection Detection

Analyzes the cleaned input to detect whether it contains prompt injection patterns. Messages deemed malicious are blocked and logged.
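A simple version of this layer can be sketched as signature matching over the sanitized text. The patterns below are illustrative assumptions; a production detector would typically combine such signatures with a trained classifier.

```python
import re
from dataclasses import dataclass

# Hypothetical injection signatures (case-insensitive).
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) (instructions|rules)",
    r"disregard (the )?system prompt",
    r"reveal (your )?(system prompt|instructions)",
    r"you are now (?:an? )?\w+",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in INJECTION_PATTERNS]

@dataclass
class Verdict:
    blocked: bool
    reason: str = ""  # the matched pattern, for the audit log

def detect_injection(sanitized: str) -> Verdict:
    """Return a block/allow verdict for a sanitized message."""
    for pattern in COMPILED:
        if pattern.search(sanitized):
            return Verdict(blocked=True, reason=pattern.pattern)
    return Verdict(blocked=False)
```

Blocked verdicts carry the matched pattern so the log entry explains why a message was rejected.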

🔹 LLM Response Generation & Layer 3 – Behavioural Monitoring

Safe messages are passed to the LLM for response generation. The generated response is then validated as well, to ensure the LLM did not inadvertently produce harmful or unexpected output. Flagged responses are blocked and logged.
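Output-side monitoring can be as simple as screening the response for leak markers before delivery. The marker list here is a made-up example of what such a check might look for, not the framework's actual rule set.

```python
# Hypothetical markers of a response leaking internal instructions or secrets.
LEAK_MARKERS = ("system prompt:", "my instructions are", "api key")

def validate_response(response: str) -> bool:
    """Return True if the LLM response is safe to deliver to the user."""
    lowered = response.lower()
    return not any(marker in lowered for marker in LEAK_MARKERS)
```

A response failing this check would be suppressed and written to the audit log instead of being returned to the user.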

All activities, including malicious attempts and flagged responses, are stored in a database for auditing and further analysis.
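A minimal audit store could look like the following, assuming a local SQLite database (the actual framework may use a different backend); each layer records a timestamped event with the action taken.

```python
import sqlite3
from datetime import datetime, timezone

def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create (or open) the audit database with a single event table."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS audit_log (
               ts     TEXT NOT NULL,  -- UTC ISO-8601 timestamp
               layer  TEXT NOT NULL,  -- e.g. 'layer1', 'layer2', 'layer3'
               action TEXT NOT NULL,  -- e.g. 'sanitized', 'blocked'
               detail TEXT            -- matched pattern, notes, etc.
           )"""
    )
    return db

def log_event(db: sqlite3.Connection, layer: str, action: str, detail: str = "") -> None:
    """Append one audit event."""
    db.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), layer, action, detail),
    )
    db.commit()
```

Keeping every decision (including allowed traffic, if desired) in one table makes after-the-fact review and compliance reporting straightforward.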


🔐 Key Goals

  • Prevent prompt injections and obfuscated attacks at the input level
  • Detect and block manipulated or adversarial prompts
  • Monitor and validate LLM-generated responses for misuse
  • Maintain logs for traceability, observability, and compliance
