Support H-CoT: Hijacking the Chain-of-Thought to Jailbreak Reasoning Models #897

@LifeHackerBee

Description

Is your feature request related to a problem? Please describe.

I recently learned about the jailbreak method “H-CoT: Hijacking the Chain-of-Thought Safety Reasoning Mechanism to Jailbreak Large Reasoning Models”, which has been reported to bypass the safety mechanisms of several large reasoning models, including OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking. It would be great to have this method implemented here.

References:

  1. https://github.com/dukeceicenter/jailbreak-reasoning-openai-o1o3-deepseek-r1
  2. https://maliciouseducator.org/

Describe the solution you'd like

Could you add explicit support for, or an integration of, the H-CoT jailbreak method in this repository?

Labels

  * enhancement: New feature or request
  * help wanted: Extra attention is needed