This Project submitted to Philly Codefest 2026
Project: Prompt Siege
Project Type: Advanced
Location: G4
Prompt Siege is a red-team testing harness for Inhibitor-enabled agents. It runs a repeatable suite of adversarial prompts across prompt injection, privacy, unsafe advice, and goal-misalignment scenarios, then logs what was blocked, what slipped through, and where defenses held strongest. Instead of vague safety claims, it gives teams concrete evidence, reproducible attack cases, and clear recommendations to harden agent behavior before deployment.
Prompt Siege is a red-team testing harness for Inhibitor-enabled agents. It executes a structured, repeatable suite of adversarial prompts spanning prompt injection, privacy leakage, unsafe advice, and goal-misalignment scenarios. The system logs which attacks are blocked, which succeed, and where defenses perform strongest. Rather than relying on vague safety claims, Prompt Siege provides concrete evidence, reproducible test cases, and actionable recommendations to strengthen agent reliability before deployment.
Python, Inhibitor, OpenAI
Selected Prizes
-
Build either:
An innovative agent powered by Inhibitor, or
A rigorous red-team gauntlet against an Inhibitor-enabled agent/system
This generalized challenge supports both offensive assurance testing and constructive product innovation. Teams can choose one track or complete both for a stronger submission.
Track options
Track A Build with Inhibitor (Innovation Track)
Create an original agent use case that demonstrates how Inhibitor improves trust, safety, or controllability in real workflows.
Examples:
Trust-aware copilots for regulated domains
Multi-agent orchestrators with policy-aware gating
Real-time moderation or escalation assistants
Track B Red Team with Inhibitor (Gauntlet Track)
Run systematic adversarial testing against an Inhibitor-enabled agent/system and document exploitable patterns, failed attacks, and defense recommendations.
Examples:
Prompt injection and role-confusion suites
Obfuscation/jailbreak variants
Cross-turn memory poisoning attempts