Capability-Based Guardrails Integration Guide

This guide explains how to integrate capability-based guardrails into the Vell AgentCore system.

Status: IMPLEMENTED

As of 2026-02-20, per-capability guardrail resolution and self-healing are wired into the runtime:

  • AgentOrchestrator::executeCapability() resolves guardrails per step via resolveGuardrailForCapability()
  • AgentOrchestrator::attemptGroundedRetry() implements self-healing on guardrail intervention
  • StreamService::getGuardrailConfig() accepts an optional $capabilitySlug for capability-aware resolution

Overview

Capability-based guardrails let you apply different AWS Bedrock Guardrails depending on which capability the agent is executing, rather than relying on agent-level settings alone. This gives granular control over content safety.

Use Cases

  • Marketplace Awareness: Use Enterprise tier (strictest grounding, prevent hallucinations about AWS Marketplace)
  • Content Generation: Use Prod tier (balanced safety and creativity)
  • Social Media Publishing: Use Dev tier (lighter filtering for creative content)
  • Deal Influence Tracking: Use Enterprise tier (block sensitive deal IDs and PII)

Architecture

Runtime Guardrail Resolution (5-Layer Model)

┌─────────────────────────────────────────────────────────┐
│ Layer 1: Capability Resolution                          │
│   CapabilityRegistry → identifies what's being executed │
├─────────────────────────────────────────────────────────┤
│ Layer 2: Guardrail Resolution                           │
│   AgentOrchestrator::resolveGuardrailForCapability()    │
│   Priority: Agent > Capability > Platform > None        │
│   Resolves: guardrail_id + version + trace flag         │
├─────────────────────────────────────────────────────────┤
│ Layer 3: Bedrock Invocation                             │
│   BedrockRuntimeService.invokeModel() with guardrail    │
├─────────────────────────────────────────────────────────┤
│ Layer 4: Self-Healing                                   │
│   AgentOrchestrator::attemptGroundedRetry()             │
│   On INTERVENED: retry with grounding context           │
│   On 2nd failure: return structured error, don't pass   │
│   ungrounded content to user                            │
├─────────────────────────────────────────────────────────┤
│ Layer 5: Output Delivery                                │
│   StreamService (chat) or AgentExecution (workflow)     │
└─────────────────────────────────────────────────────────┘

Guardrail Resolution Flow

Agent Request
    ↓
AgentOrchestrator.execute()
    ↓
WorkflowPlanner.plan() → Generate workflow steps
    ↓
For Each Step:
    ├─ Get capability slug (e.g., "marketplace_awareness")
    ├─ resolveGuardrailForCapability(agent, capabilitySlug):
    │   ├─ Priority 1: Agent guardrail_enabled? → Use agent's guardrail
    │   ├─ Priority 2: CapabilityGuardrailMapping → Use capability guardrail
    │   ├─ Priority 3: Platform default → Use settings guardrail
    │   └─ Priority 4: None
    ├─ Pass _guardrail config to handler via parameters
    ├─ Execute capability
    └─ On guardrail intervention → attemptGroundedRetry()
        ├─ Retry with grounding instructions
        ├─ On success → return grounded result
        └─ On 2nd failure → return structured error (never pass ungrounded output)
    ↓
BedrockRuntimeService.invokeModel($model, $prompt, $options)

Grounded vs Non-Grounded Capabilities

22 of 34 capabilities require grounding (outputs must be verifiable against source data). 12 are either creative generation (grounding would be too restrictive) or utility operations.

Capabilities                                                         | Grounded? | Guardrail Tier | Self-Healing
---------------------------------------------------------------------+-----------+----------------+----------------
marketplace_awareness, deal_influence_tracking, partner_intelligence | Yes       | Enterprise     | Full retry
generate_text, seo_content_optimize, analyze_content                 | Yes       | Prod           | Full retry
generate_image, publish_to_social_media                              | No        | Dev/None       | No retry
sync_marketplace_listings, fetch_external_url                        | No        | None           | N/A (pure data)

StreamService (Chat Path)

StreamService::getGuardrailConfig() now accepts an optional capability slug:

  • When called with a capability context, it resolves per-capability guardrails first
  • It falls back to the platform default if no capability mapping exists
  • Chat without capability context still uses platform default behavior
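
The fallback order above might look roughly like the following. This is a minimal sketch, not the actual StreamService code: the function name resolveChatGuardrail() and its two array parameters (standing in for the capability mapping table and the platform settings) are hypothetical, as are the guardrail IDs.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for StreamService::getGuardrailConfig().
 *
 * @param array<string, array{guardrail_id: string, guardrail_version: string}> $capabilityMappings
 * @param array{guardrail_id: string, guardrail_version: string}|null $platformDefault
 * @return array{guardrail_id: string, guardrail_version: string}|null
 */
function resolveChatGuardrail(
    array $capabilityMappings,
    ?array $platformDefault,
    ?string $capabilitySlug = null
): ?array {
    // With a capability context, a capability mapping wins if one exists.
    if ($capabilitySlug !== null && isset($capabilityMappings[$capabilitySlug])) {
        return $capabilityMappings[$capabilitySlug];
    }

    // No capability context, or no mapping for it: platform default (may be null).
    return $platformDefault;
}

// Made-up IDs for illustration only.
$mappings = [
    'marketplace_awareness' => ['guardrail_id' => 'gr-enterprise', 'guardrail_version' => '1'],
];
$default = ['guardrail_id' => 'gr-platform', 'guardrail_version' => '1'];

$withCapability = resolveChatGuardrail($mappings, $default, 'marketplace_awareness');
$unmapped       = resolveChatGuardrail($mappings, $default, 'generate_image');
$plainChat      = resolveChatGuardrail($mappings, $default);
```

Keeping the slug optional means existing chat callers that pass no capability keep their current behavior.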

Bedrock AgentCore (Memory) - No Impact

BedrockAgentService handles session memory via Bedrock's agent memory API. It is orthogonal to guardrails. Guardrail enforcement happens at the invocation layer (BedrockRuntimeService), not the memory layer. No changes needed.

Implementation Details

Key Files

  • AgentOrchestrator::resolveGuardrailForCapability() - Priority chain resolution
  • AgentOrchestrator::attemptGroundedRetry() - Self-healing on intervention
  • StreamService::getGuardrailConfig() - Capability-aware chat guardrails
  • CapabilityGuardrailMapping::getGuardrailForCapability() - DB lookup with caching
  • BedrockRuntimeService::invokeModel() - Guardrail passthrough to Bedrock API
  • CapabilityGuardrailController::AUDIT_RECOMMENDATIONS - Risk/tier/grounding audit data

Self-Healing Pattern

When Bedrock's contextual grounding filter blocks output:

  1. Detection: BedrockRuntimeService returns guardrails_triggered > 0
  2. Retry: attemptGroundedRetry() adds explicit grounding instructions to the prompt
  3. Success path: Retried output passes grounding → delivered to user
  4. Failure path: Retry also blocked → structured error returned, no ungrounded content reaches user

This prevents the "fitness output" scenario where the model drifts off-topic and ungrounded content reaches the user without being caught.
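
The four steps above can be sketched as a retry-once wrapper. This is an illustrative sketch under stated assumptions, not the real attemptGroundedRetry(): the function name invokeWithGroundedRetry(), the shape of the result array, the error code string, and the wording of the grounding instructions are all hypothetical. Only the guardrails_triggered > 0 check comes from the description above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical retry-once wrapper around a model call.
 *
 * @param callable(string): array{content: string, guardrails_triggered: int} $invoke
 * @return array{success: bool, content: ?string, error: ?string, retried: bool}
 */
function invokeWithGroundedRetry(callable $invoke, string $prompt, string $sourceContext): array
{
    // Step 1 (detection): a first call that triggers no guardrail is returned as-is.
    $first = $invoke($prompt);
    if ($first['guardrails_triggered'] === 0) {
        return ['success' => true, 'content' => $first['content'], 'error' => null, 'retried' => false];
    }

    // Step 2 (retry): prepend explicit grounding instructions and the source data.
    $groundedPrompt = "Answer using ONLY the source material below. "
        . "If the answer is not in the source, say so.\n\n"
        . "SOURCE:\n{$sourceContext}\n\nTASK:\n{$prompt}";
    $second = $invoke($groundedPrompt);

    // Step 3 (success path): the retried output passed the guardrail.
    if ($second['guardrails_triggered'] === 0) {
        return ['success' => true, 'content' => $second['content'], 'error' => null, 'retried' => true];
    }

    // Step 4 (failure path): structured error only; blocked content is never returned.
    return ['success' => false, 'content' => null, 'error' => 'guardrail_blocked_after_retry', 'retried' => true];
}

// Fake invoker for demonstration: it "passes" only when the prompt carries a SOURCE section.
$calls = 0;
$fakeInvoke = function (string $prompt) use (&$calls): array {
    $calls++;
    $grounded = str_contains($prompt, 'SOURCE:');
    return [
        'content' => $grounded ? 'grounded answer' : 'drifted answer',
        'guardrails_triggered' => $grounded ? 0 : 1,
    ];
};

$result = invokeWithGroundedRetry($fakeInvoke, 'Summarize listing fees', 'Example source text.');
```

Capping the loop at a single retry bounds both latency and guardrail cost, and the failure branch deliberately carries no model output.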

Integration Steps (Reference)

1. AgentOrchestrator (DONE)

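The executeCapability() method in AgentOrchestrator.php now resolves a guardrail for each step and passes the result to Bedrock. In simplified form (the step-level method appears here as executeCapabilityStep()):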

use App\Models\CapabilityGuardrailMapping;

// Inside the method that executes a capability step
protected function executeCapabilityStep(Agent $agent, array $step): array
{
    $capabilitySlug = $step['capability'];

    // Assumption: the prompt and base options travel with the step.
    // Adjust these two lines to wherever your workflow stores them.
    $prompt = $step['prompt'] ?? '';
    $options = $step['options'] ?? [];

    // Determine which guardrail to use (priority order):
    // 1. Agent-level guardrail (if enabled)
    // 2. Capability-level guardrail (from mapping)
    // 3. Platform default (from settings)
    // 4. None
    $guardrailOptions = $this->resolveGuardrailForCapability($agent, $capabilitySlug);

    // Pass to Bedrock. Resolved guardrail keys override same-named keys in $options.
    $response = $this->bedrockRuntimeService->invokeModel(
        $agent->ai_model,
        $prompt,
        array_merge($options, $guardrailOptions)
    );

    return $response;
}

protected function resolveGuardrailForCapability(Agent $agent, string $capabilitySlug): array
{
    // Priority 1: Agent-level guardrail
    if ($agent->guardrail_enabled && $agent->guardrail_id) {
        return [
            'guardrail_id' => $agent->guardrail_id,
            'guardrail_version' => $agent->guardrail_version ?? 'DRAFT',
            'guardrail_trace' => $agent->guardrail_trace_enabled ?? false,
        ];
    }

    // Priority 2: Capability-level guardrail
    $capabilityGuardrail = CapabilityGuardrailMapping::getGuardrailForCapability($capabilitySlug);
    if ($capabilityGuardrail) {
        return [
            'guardrail_id' => $capabilityGuardrail['guardrail_id'],
            'guardrail_version' => $capabilityGuardrail['guardrail_version'],
            'guardrail_trace' => false, // Capability-level doesn't have trace setting
        ];
    }

    // Priority 3: Platform default
    $defaultGuardrailId = setting('bedrock_default_guardrail_id');
    if ($defaultGuardrailId) {
        return [
            'guardrail_id' => $defaultGuardrailId,
            'guardrail_version' => setting('bedrock_default_guardrail_version', 'DRAFT'),
            'guardrail_trace' => false,
        ];
    }

    // Priority 4: None
    return [];
}

2. BedrockRuntimeService (ALREADY DONE)

BedrockRuntimeService::invokeModel() already supports guardrail passthrough. When $options['guardrail_id'] is set, it adds guardrailIdentifier and guardrailVersion to the Bedrock API call. It also detects amazonBedrockGuardrailAction === INTERVENED and returns guardrails_triggered in the response, which triggers the self-healing layer.
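
As a rough illustration of that passthrough, the sketch below builds the argument array for an InvokeModel call and checks a decoded response body for an intervention. Both helper names (buildInvokeModelArgs(), guardrailIntervened()) are hypothetical and are not part of BedrockRuntimeService. The request keys follow the Bedrock InvokeModel API; the response key is checked under both the hyphenated form used in InvokeModel response bodies and the camel-case form used above, since which one the service sees after decoding is an assumption here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: translate resolved guardrail options into InvokeModel arguments.
 *
 * @param array<string, mixed> $body    Model-specific request payload
 * @param array<string, mixed> $options Output of guardrail resolution (may be empty)
 * @return array<string, string>
 */
function buildInvokeModelArgs(string $modelId, array $body, array $options): array
{
    $args = [
        'modelId'     => $modelId,
        'contentType' => 'application/json',
        'accept'      => 'application/json',
        'body'        => json_encode($body, JSON_THROW_ON_ERROR),
    ];

    // Guardrail fields are only added when an ID was resolved.
    if (!empty($options['guardrail_id'])) {
        $args['guardrailIdentifier'] = (string) $options['guardrail_id'];
        $args['guardrailVersion']    = (string) ($options['guardrail_version'] ?? 'DRAFT');
        if (!empty($options['guardrail_trace'])) {
            $args['trace'] = 'ENABLED';
        }
    }

    return $args;
}

/**
 * Hypothetical helper: did the guardrail intervene on this decoded response body?
 *
 * @param array<string, mixed> $decodedBody
 */
function guardrailIntervened(array $decodedBody): bool
{
    $action = $decodedBody['amazon-bedrock-guardrailAction']
        ?? $decodedBody['amazonBedrockGuardrailAction']
        ?? 'NONE';

    return $action === 'INTERVENED';
}

// 'example-model-id' is a placeholder, not a real model identifier.
$args = buildInvokeModelArgs(
    'example-model-id',
    ['prompt' => 'Hello'],
    ['guardrail_id' => 'abc123xyz', 'guardrail_version' => '1', 'guardrail_trace' => true]
);
```

An empty options array (Priority 4: None) produces a plain call with no guardrail fields at all.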

3. Test the Integration

  1. Deploy CloudFormation stack (if not already deployed):

    cd infrastructure/cloudformation
    ./deploy-guardrails.sh

  2. Sync guardrails to database:

    • Go to Admin → Bedrock Guardrails
    • Click "Sync from AWS"
    • Verify 3 guardrails appear (dev, prod, enterprise)

  3. Create capability mappings:

    • Go to Admin → Capability Guardrails
    • Assign guardrails to capabilities:
      • marketplace_awareness → vellocity-marketplace-trust-enterprise
      • generate_text → vellocity-marketplace-trust-prod
      • publish_to_social_media → vellocity-marketplace-trust-dev

  4. Create test agent:

    • Go to Content Manager → Agents → Create Agent
    • Name: "Test Capability Guardrails Agent"
    • Select capabilities: generate_text, marketplace_awareness
    • Do NOT enable agent-level guardrails (we want to test capability-level)
    • Save

  5. Execute test task:

    Task: "Generate a blog post about AWS Marketplace pricing strategies,
    then analyze the latest AWS Marketplace trends"

  6. Verify guardrails are applied:

    • Check Laravel logs (storage/logs/laravel.log)
    • Look for guardrail invocation logs
    • Verify different guardrails were used for different capabilities:
      • generate_text step should use prod guardrail
      • marketplace_awareness step should use enterprise guardrail

  7. Test guardrail blocking:

    • Try input with sensitive data (email, phone, AWS account ID)
      • Enterprise tier should block, prod/dev should anonymize
    • Try ungrounded claims ("AWS guarantees unlimited support")
      • All tiers should block based on word filters

Configuration Management

Platform Default Guardrail

Set in Admin → Settings → Tools → Bedrock Guardrails:

  • Default Guardrail ID: Select from deployed guardrails
  • Default Version: DRAFT or numbered version
  • Enable by Default: Auto-enable for new agents
  • Require Guardrails: Force all agents to use guardrails

Agent-Level Guardrails

Set in Agent → Create/Edit → Advanced Settings → Bedrock Guardrails:

  • Enable Guardrails: Checkbox
  • Select Guardrail: Dropdown (loads from /api/guardrails/available)
  • Version: DRAFT, 1, 2, 3
  • Enable Trace Logging: For debugging

Capability-Level Guardrails

Set in Admin → Capability Guardrails:

  • View all agent capabilities
  • Assign guardrail per capability
  • Set priority (for multiple mappings)
  • Add description (why this guardrail?)

API Endpoints

List Available Guardrails

GET /api/guardrails/available

Response:

{
  "success": true,
  "guardrails": [
    {
      "guardrail_id": "abc123xyz",
      "name": "vellocity-marketplace-trust-enterprise",
      "scope": "platform",
      "version": "1",
      "status": "active"
    }
  ]
}
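
A client consuming this response could filter it down to usable guardrails as follows. This is a sketch only: the helper name activeGuardrailNames() is hypothetical, and the second entry in the sample payload (including its "inactive" status value) is invented to show the filter, not taken from the API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: map guardrail_id => name for active guardrails.
 *
 * @return array<string, string>
 */
function activeGuardrailNames(string $json): array
{
    $payload = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    // Treat a missing or false "success" flag as "nothing available".
    if (($payload['success'] ?? false) !== true) {
        return [];
    }

    $names = [];
    foreach ($payload['guardrails'] ?? [] as $guardrail) {
        if (($guardrail['status'] ?? '') === 'active') {
            $names[$guardrail['guardrail_id']] = $guardrail['name'];
        }
    }

    return $names;
}

// Sample payload: first entry mirrors the response above, second is made up.
$json = <<<'JSON'
{
  "success": true,
  "guardrails": [
    {"guardrail_id": "abc123xyz", "name": "vellocity-marketplace-trust-enterprise",
     "scope": "platform", "version": "1", "status": "active"},
    {"guardrail_id": "def456uvw", "name": "example-inactive-guardrail",
     "scope": "platform", "version": "1", "status": "inactive"}
  ]
}
JSON;

$names = activeGuardrailNames($json);
```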

Get Guardrail for Capability

use App\Models\CapabilityGuardrailMapping;

$guardrail = CapabilityGuardrailMapping::getGuardrailForCapability('generate_text');
// Returns: ['guardrail_id' => '...', 'guardrail_version' => '...'] or null
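
The lookup is described under Key Files as a DB lookup with caching. One way such caching could work is sketched below with a plain in-memory memo. The class name CachedGuardrailLookup and the loader closure (standing in for the database query) are hypothetical, and the real model may well use Laravel's cache layer instead.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical per-request memo in front of a slower lookup.
 */
final class CachedGuardrailLookup
{
    /** @var array<string, array<string, string>|null> */
    private array $cache = [];

    /** @param \Closure(string): ?array $loader Stand-in for the database query */
    public function __construct(private \Closure $loader)
    {
    }

    /** @return array<string, string>|null */
    public function get(string $capabilitySlug): ?array
    {
        // array_key_exists (not isset) so a cached "no mapping" (null) is not re-queried.
        if (!array_key_exists($capabilitySlug, $this->cache)) {
            $this->cache[$capabilitySlug] = ($this->loader)($capabilitySlug);
        }

        return $this->cache[$capabilitySlug];
    }
}

// Fake loader that counts how often the "database" is hit.
$dbHits = 0;
$lookup = new CachedGuardrailLookup(function (string $slug) use (&$dbHits): ?array {
    $dbHits++;
    return $slug === 'generate_text'
        ? ['guardrail_id' => 'abc123xyz', 'guardrail_version' => '1']
        : null;
});

$first  = $lookup->get('generate_text');
$second = $lookup->get('generate_text');    // served from the memo
$none   = $lookup->get('generate_image');   // no mapping: null
$none2  = $lookup->get('generate_image');   // null is cached too
```

Callers should always handle the null case, since an unmapped capability falls through to the platform default.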

Troubleshooting

Guardrails Not Applied

  1. Check guardrail sync:

    • Admin → Guardrails → Sync from AWS
    • Verify status is active

  2. Check capability mapping:

    • Admin → Capability Guardrails
    • Verify mapping exists and is enabled

  3. Check agent settings:

    • If agent has guardrail_enabled = true, it overrides capability mappings
    • Disable agent-level guardrails to use capability-level

  4. Check Bedrock credentials:

    • .env file has correct AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    • Settings → Tools → Bedrock Access Model is configured

Guardrails Blocking Too Much

  1. Use lower tier:

    • Enterprise → Prod → Dev (descending strictness)

  2. Adjust grounding thresholds:

    • Edit CloudFormation template
    • Lower Threshold values (0.0 = most permissive, 1.0 = strictest)
    • Redeploy stack
    • Sync in admin panel

  3. Remove blocked words:

    • Edit WordsConfig in CloudFormation
    • Redeploy and sync

Performance Issues

  1. Use versioned guardrails (not DRAFT):

    • DRAFT requires AWS to fetch latest config on every call
    • Version 1, 2, 3 are cached by AWS

  2. Check guardrail usage in admin:

    • Admin → Guardrails shows usage_count and last_used_at
    • High usage may indicate over-application

  3. Consider capability-level exemptions:

    • Remove mappings for low-risk capabilities
    • E.g., analyze_content may not need strict guardrails

Cost Optimization

AWS Bedrock Guardrails pricing (as of 2025):

  • $0.75 per 1,000 content units (text)
  • $1.00 per 1,000 content units (images)

Tips:

  1. Use capability mappings sparingly: Only for high-risk capabilities
  2. Use lower tiers where appropriate: Dev tier costs the same but allows more content through
  3. Monitor usage: Admin → Guardrails shows usage counts
  4. Consider agent-level override: For trusted agents, disable guardrails entirely
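
To put the text rate above in concrete terms, the sketch below estimates spend for a batch of guarded requests. It assumes one text unit covers up to 1,000 characters and that each request is rounded up to whole units; both assumptions, and the rate itself, should be checked against the current AWS pricing page. The function name estimateTextGuardrailCost() is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical estimator for guardrail text charges.
 *
 * @param int   $requests             Number of guarded model calls
 * @param int   $charsPerRequest      Characters evaluated per call
 * @param float $ratePerThousandUnits USD per 1,000 text units (figure quoted above)
 * @param int   $charsPerUnit         Assumed size of one text unit
 */
function estimateTextGuardrailCost(
    int $requests,
    int $charsPerRequest,
    float $ratePerThousandUnits = 0.75,
    int $charsPerUnit = 1000
): float {
    // Each request is billed in whole units, so round up per request.
    $unitsPerRequest = (int) ceil($charsPerRequest / $charsPerUnit);
    $totalUnits = $requests * $unitsPerRequest;

    return $totalUnits / 1000 * $ratePerThousandUnits;
}

// 1,000 requests of 2,500 characters: 3 units each, 3,000 units in total.
$cost = estimateTextGuardrailCost(1000, 2500);
printf("Estimated cost: $%.2f\n", $cost);
```

Under these assumptions the example works out to 3,000 units × $0.75 / 1,000 = $2.25. Note that a grounded retry evaluates a second, longer prompt, so self-healing capabilities can cost more than double on blocked calls.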

Best Practices

  1. Start with platform default: Set a baseline guardrail in settings
  2. Assign capability-level for exceptions: Only map capabilities that need different rules
  3. Use agent-level for VIPs: For specific agents (e.g., internal tools), override with looser guardrails
  4. Enable trace mode during development: Debug guardrail behavior
  5. Version guardrails in production: Use numbered versions (not DRAFT)
  6. Document mappings: Use the description field to explain why each mapping exists
  7. Review guardrail logs regularly: Check storage/logs/laravel.log for interventions

Related Documentation

  • /docs/BEDROCK_GUARDRAILS.md - Main guardrails documentation
  • /infrastructure/cloudformation/README.md - CloudFormation deployment guide
  • AWS Bedrock Guardrails: https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html

Support

For issues:

  • Deployment problems: Check CloudFormation stack events in AWS Console
  • Integration bugs: Review Laravel logs at storage/logs/laravel.log
  • Guardrail behavior: Enable trace mode and check logs