
Security at LPU Speed
Protect your Groq LPU deployments with Wardstone Guard. Add security to the world's fastest inference without sacrificing speed.
Ultra-fast inference means a successful attack can do its damage before anyone has time to react: at 1200+ tokens per second, harmful output is generated almost instantly.
Groq's LPU inference is optimized purely for speed and includes no built-in safety filtering.
Low latency lets attackers refine prompt injections interactively, in real time.
Real-time applications have little tolerance for added security latency, but Wardstone's sub-30ms checks are negligible compared to a typical network round-trip.
Install the Wardstone SDK alongside the Groq client.
Screen prompts before they reach Groq's ultra-fast inference, as sketched below.
Validate streaming responses as they arrive from Groq; a streaming sketch follows the curl example further down.
Wardstone's sub-30ms latency preserves Groq's speed advantage.
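
As an illustration of the prompt-screening step, here is a minimal Python sketch. It assumes the official Groq Python client (pip install groq) and the requests library, and calls the Wardstone detection endpoint shown in the curl example below; the is_prompt_safe and guarded_completion helpers are illustrative names, not part of a documented SDK.

import os
import requests
from groq import Groq

WARDSTONE_URL = "https://api.wardstone.ai/v1/detect"  # endpoint from the curl example below

def is_prompt_safe(text: str) -> bool:
    """Screen a user prompt with Wardstone before it reaches Groq."""
    resp = requests.post(
        WARDSTONE_URL,
        headers={"Authorization": f"Bearer {os.environ['WARDSTONE_API_KEY']}"},
        json={"text": text},
        timeout=5,
    )
    resp.raise_for_status()
    # Response shape from the curl example: {"prompt_attack": {"detected": false, ...}}
    return not resp.json()["prompt_attack"]["detected"]

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def guarded_completion(user_message: str) -> str:
    """Block flagged prompts; otherwise forward to Groq and return the completion."""
    if not is_prompt_safe(user_message):
        return "Request blocked by security policy."
    completion = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": user_message}],
    )
    return completion.choices[0].message.content

In production you would typically also log blocked requests and retain Wardstone's full detection response for auditing.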
Groq offers competitive pricing for blazing-fast inference, and Wardstone adds security while preserving that speed advantage.
# Step 1: Check user input with Wardstone
curl -X POST "https://api.wardstone.ai/v1/detect" \
  -H "Authorization: Bearer YOUR_WARDSTONE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "User message here"}'

# Response: { "prompt_attack": { "detected": false, ... } }

# Step 2: If safe, send to Groq
curl -X POST "https://api.groq.com/openai/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_GROQ_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "User message here"}]
  }'

# Step 3: Check the Groq response with Wardstone before returning it to the user

Wardstone Guard protects all Groq models with the same comprehensive security coverage. Whether you're using the latest releases or legacy models still in production, every API call is protected.
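
Groq responses are often consumed as a token stream, so output-side checks can run on the accumulated text as chunks arrive rather than after the full completion. The sketch below is one possible approach, not a documented Wardstone pattern: it assumes the same /v1/detect endpoint and response shape shown above also apply to model output, and the check_text and stream_with_validation helpers are illustrative names.

import os
import requests
from groq import Groq

WARDSTONE_URL = "https://api.wardstone.ai/v1/detect"  # same endpoint as the curl example above

def check_text(text: str) -> bool:
    """Return True if Wardstone flags the text (assumed response shape)."""
    resp = requests.post(
        WARDSTONE_URL,
        headers={"Authorization": f"Bearer {os.environ['WARDSTONE_API_KEY']}"},
        json={"text": text},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()["prompt_attack"]["detected"]

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def stream_with_validation(user_message: str, check_every: int = 400):
    """Stream a Groq completion, validating the accumulated text every ~400 characters."""
    stream = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": user_message}],
        stream=True,
    )
    buffer = ""
    last_checked = 0
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        buffer += delta
        # Validate periodically so a harmful response is cut off mid-stream,
        # not after the full (very fast) generation has already been delivered.
        if len(buffer) - last_checked >= check_every:
            if check_text(buffer):
                yield "\n[Response stopped by security policy.]"
                return
            last_checked = len(buffer)
        yield delta
    # Final check on the complete response before the stream closes.
    if check_text(buffer):
        yield "\n[Response flagged by security policy.]"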
Try Wardstone Guard in the playground to see detection in action.