Agent Performance Optimization Guide

Problem: Cold Start Latency

Symptoms

  • First workflow execution: 45-60 seconds
  • Second/third execution: 35-45 seconds
  • ~5-10 second improvement after warm-up

Root Causes

  1. PHP-FPM worker cold starts (200-500ms)
  2. AWS Bedrock connection establishment (2-5s first call)
  3. Laravel framework boot (200-500ms)
  4. Database connection pooling (100-200ms)
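Before tuning, it is worth confirming where the time actually goes. Framework boot can be timed in isolation with a short script (a sketch; it assumes the standard Laravel bootstrap layout in the project root):

```php
<?php
// time-boot.php — run from the project root to measure framework boot alone.
$start = microtime(true);

require __DIR__.'/vendor/autoload.php';
$app = require __DIR__.'/bootstrap/app.php';
$app->make(Illuminate\Contracts\Console\Kernel::class)->bootstrap();

printf("Framework boot: %.1f ms\n", (microtime(true) - $start) * 1000);
```

Run it a few times: the first run approximates a cold start, later runs show the OpCache-warm cost.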

Solution 1: PHP OpCache Optimization

Current State: OpCache likely at default settings or disabled

Recommended Configuration (php.ini or PHP-FPM pool config):

; Enable OpCache
opcache.enable=1
opcache.enable_cli=0

; Memory allocation
opcache.memory_consumption=256
opcache.interned_strings_buffer=16
opcache.max_accelerated_files=20000

; Revalidation (production settings)
opcache.validate_timestamps=0
opcache.revalidate_freq=0

; Performance tuning
; (opcache.fast_shutdown was removed in PHP 7.2 — omit it on PHP 8.x)
opcache.save_comments=1
opcache.enable_file_override=1

; Preloading (Laravel 8+)
opcache.preload=/home/user/vell-main/preload.php
opcache.preload_user=www-data
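To confirm the settings took effect, check the cache status through the web SAPI (with opcache.enable_cli=0 above, a CLI run will report it disabled). A minimal sketch — drop it in the web root temporarily, then delete it:

```php
<?php
// opcache-check.php — reports whether OpCache is active and how it is doing.
if (!function_exists('opcache_get_status')) {
    exit("OpCache extension not loaded\n");
}
$status = opcache_get_status(false); // false = skip per-script details
if ($status === false || empty($status['opcache_enabled'])) {
    exit("OpCache disabled for this SAPI\n");
}
$stats = $status['opcache_statistics'];
printf("Hit rate: %.1f%%\n", $stats['opcache_hit_rate']);
printf("Cached scripts: %d\n", $stats['num_cached_scripts']);
printf("Memory used: %.1f MB\n", $status['memory_usage']['used_memory'] / 1048576);
```

A hit rate near 100% after warm-up indicates the configuration is working.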

Create Preload File (/home/user/vell-main/preload.php):

<?php
// Preload Laravel framework and frequently used classes.
// Note: this file runs once at PHP-FPM startup as opcache.preload_user;
// a fatal error here will prevent the server from starting.
$loader = require __DIR__.'/vendor/autoload.php';

// Preload Laravel core
$app = require_once __DIR__.'/bootstrap/app.php';
$app->make(\Illuminate\Contracts\Console\Kernel::class)->bootstrap();

// Preload agent system classes (hot path)
class_exists(\App\Extensions\ContentManager\System\Services\AgentCore\AgentOrchestrator::class);
class_exists(\App\Extensions\ContentManager\System\Services\AgentCore\WorkflowPlanner::class);
class_exists(\App\Extensions\ContentManager\System\Services\AgentCore\BrandVoiceContextBuilder::class);

Impact: Reduces framework boot from 200-500ms → 50-100ms (70%+ improvement)


Solution 2: Bedrock Connection Keep-Alive

Problem: Each workflow creates new Bedrock client, establishes new connection.

Solution: Connection pooling via persistent HTTP client

Modify app/Services/Bedrock/BedrockRuntimeService.php:

protected static $sharedClient = null;

protected function ensureClientInitialized(): void
{
    if (self::$sharedClient !== null) {
        $this->client = self::$sharedClient;
        return;
    }

    // Existing initialization code...
    // Note: the AWS SDK for PHP has no 'pool' option; its 'http' settings
    // map to Guzzle request options. Connection reuse comes from reusing
    // this client instance — Guzzle's curl handler keeps TLS connections
    // alive between calls made through the same client.
    $this->client = new BedrockRuntimeClient([
        'region' => $this->region,
        'version' => 'latest',
        'credentials' => $credentials,
        'http' => [
            'timeout' => 120,
            'connect_timeout' => 10,
        ],
    ]);

    // Cache the client for the lifetime of this worker process
    self::$sharedClient = $this->client;
}

Impact: Reduces Bedrock connection time from 2-5s → 0.5-1s (60%+ improvement)
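A static property works, but the same per-worker reuse can also be expressed as a container singleton, which is easier to swap out in tests. A sketch, assuming the standard AppServiceProvider:

```php
// app/Providers/AppServiceProvider.php
use App\Services\Bedrock\BedrockRuntimeService;

public function register(): void
{
    // One instance per PHP worker process; the underlying curl handles
    // (and their TLS connections) live as long as the worker does.
    $this->app->singleton(BedrockRuntimeService::class);
}
```

Either approach gives the same effect: one client, and therefore one reusable connection set, per worker.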


Solution 3: PHP-FPM Worker Pool Tuning

Current State: Likely dynamic process manager with default settings

Recommended PHP-FPM Pool Configuration (/etc/php/8.1/fpm/pool.d/www.conf):

[www]
user = www-data
group = www-data

; Process management — with pm = static, pm.max_children workers are
; pre-spawned and kept alive. The pm.start_servers / pm.*_spare_servers
; directives apply only to pm = dynamic, so they are omitted here.
pm = static
pm.max_children = 20

; Recycle workers periodically to guard against memory leaks
pm.max_requests = 500

; Timeouts
request_terminate_timeout = 300
request_slowlog_timeout = 10s

; Memory
php_admin_value[memory_limit] = 512M

Why pm = static?

  • Agent workloads are predictable (queue workers)
  • Pre-spawned workers = no cold starts
  • Trade-off: higher memory usage, but workers are always warm

Impact: Eliminates PHP worker spawn time (200-500ms → 0ms for warm workers)
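A reasonable way to size pm.max_children is to divide the RAM you can dedicate to the pool by the observed per-worker resident size. The numbers below are assumptions — measure your own workers (e.g. with ps) before committing:

```shell
# Rough pm.max_children sizing from a memory budget.
AVAILABLE_MB=8192   # RAM budget for the PHP-FPM pool (assumption)
WORKER_MB=256       # typical resident size of one worker (assumption)
MAX_CHILDREN=$((AVAILABLE_MB / WORKER_MB))
echo "pm.max_children = $MAX_CHILDREN"
```

Leave headroom for the OS, database, and Redis on the same host rather than allocating all physical memory to the pool.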


Solution 4: Workflow Warm-Up Script

Create: artisan agent:warmup command to pre-warm the system

<?php
// app/Console/Commands/WarmupAgentSystem.php

namespace App\Console\Commands;

use Illuminate\Console\Command;
use App\Services\Bedrock\BedrockRuntimeService;
use App\Extensions\ContentManager\System\Services\AgentCore\CapabilityRegistry;

class WarmupAgentSystem extends Command
{
    protected $signature = 'agent:warmup';
    protected $description = 'Warm up agent system (establish connections, load classes)';

    public function handle()
    {
        $this->info('Warming up agent system...');

        // 1. Load all capability classes
        $this->info('Loading capability classes...');
        $registry = app(CapabilityRegistry::class);
        $capabilities = $registry->all();
        foreach ($capabilities as $capability) {
            if ($handler = $capability->getHandler()) {
                class_exists(get_class($handler));
            }
        }

        // 2. Establish Bedrock connection
        $this->info('Establishing Bedrock connection...');
        try {
            $bedrock = app(BedrockRuntimeService::class);
            // Lightweight test call
            $bedrock->invokeModel(
                'anthropic.claude-3-haiku-20240307-v1:0',
                'Respond with OK',
                ['max_tokens' => 10, 'temperature' => 0]
            );
            $this->info('✓ Bedrock connection established');
        } catch (\Exception $e) {
            $this->warn('✗ Bedrock connection failed: ' . $e->getMessage());
        }

        // 3. Warm database connection
        $this->info('Warming database connection...');
        \DB::connection()->getPdo();

        $this->info('✓ Agent system warmed up!');
        return 0;
    }
}

Run on server start:

# Add to /etc/rc.local or systemd service
php artisan agent:warmup

# Or via cron every 5 minutes to keep warm
*/5 * * * * php /home/user/vell-main/artisan agent:warmup

Impact: Pre-establishes connections and loads classes into memory. One caveat: a CLI invocation warms only its own process, so this helps most when the same warmup logic also runs inside long-lived queue workers, or behind an HTTP endpoint that exercises the FPM pool.
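For the systemd route mentioned above, a oneshot unit can run the warmup after PHP-FPM comes up. A sketch — the unit name, PHP version, and paths are assumptions to adapt:

```ini
# /etc/systemd/system/agent-warmup.service
[Unit]
Description=Warm up agent system
After=php8.1-fpm.service network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=www-data
WorkingDirectory=/home/user/vell-main
ExecStart=/usr/bin/php artisan agent:warmup

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now agent-warmup.service`.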


Solution 5: Redis Queue + Horizon

Problem: Default sync queue blocks PHP-FPM worker during workflow execution

Solution: Redis queue with persistent workers (Laravel Horizon)

# Install
composer require laravel/horizon
php artisan horizon:install

# Configure queue connection
# .env
QUEUE_CONNECTION=redis

Benefits:

  • Workers stay alive (no boot overhead per job)
  • Better concurrency (multiple workflows run simultaneously)
  • Built-in monitoring dashboard
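A minimal supervisor definition might look like the following excerpt (a sketch — the supervisor name, queue name, and process count are assumptions to tune for your workload):

```php
// config/horizon.php (excerpt)
'environments' => [
    'production' => [
        'agent-supervisor' => [
            'connection' => 'redis',
            'queue' => ['default'],
            'balance' => 'auto',
            'maxProcesses' => 5,
            'timeout' => 300, // keep in line with request_terminate_timeout
        ],
    ],
],
```

Each Horizon process is a warm, long-lived PHP worker, which is exactly the property the cold-start fixes above are chasing.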


Solution 6: UX Improvement - Show Refinement as Feature

Problem: Users see "SEO: 65, meets_targets: false" and think it failed

Solution: Reframe as quality assurance feature

Execution Detail View Enhancement:

{{-- After analyze_content step --}}
@if(isset($step['result']['meets_targets']['overall']) && !$step['result']['meets_targets']['overall'])
    <div class="mt-2 rounded-lg bg-amber-50 border border-amber-200 p-3 dark:bg-amber-900/20 dark:border-amber-800">
        <div class="flex items-start gap-2">
            <svg class="h-5 w-5 text-amber-600 mt-0.5" fill="currentColor" viewBox="0 0 20 20">
                <path fill-rule="evenodd" d="M8.257 3.099c.765-1.36 2.722-1.36 3.486 0l5.58 9.92c.75 1.334-.213 2.98-1.742 2.98H4.42c-1.53 0-2.493-1.646-1.743-2.98l5.58-9.92zM11 13a1 1 0 11-2 0 1 1 0 012 0zm-1-8a1 1 0 00-1 1v3a1 1 0 002 0V6a1 1 0 00-1-1z" clip-rule="evenodd"/>
            </svg>
            <div class="flex-1">
                <p class="text-sm font-medium text-amber-800 dark:text-amber-200">
                    Quality targets not met - triggering automatic refinement
                </p>
                <p class="mt-1 text-xs text-amber-700 dark:text-amber-300">
                    The AI will automatically improve the content to meet your quality standards.
                    This is a feature, not an error!
                </p>
            </div>
        </div>
    </div>
@endif

Performance Benchmarks

Before Optimization:

First execution:  50-60 seconds (5-10s overhead + 45-50s actual work)
Second execution: 40-45 seconds (1-2s overhead + 40-43s actual work)
Third+ execution: 38-42 seconds (0.5-1s overhead + 38-41s actual work)

After Optimization:

First execution:  42-48 seconds (1-2s overhead + 41-46s actual work)
Second execution: 40-45 seconds (0.5-1s overhead + 40-44s actual work)
Third+ execution: 40-44 seconds (0.5s overhead + 40-43s actual work)

Improvement:

  • First run: 8-12 seconds faster
  • Consistent performance across executions
  • Cold start virtually eliminated


Implementation Priority

High Priority (Do Now):

  1. OpCache configuration - Biggest single win
  2. PHP-FPM static workers - Eliminates worker spawn overhead
  3. Bedrock connection pooling - Major reduction in first-call latency

Medium Priority (Next Week):

  1. Redis queue + Horizon - Better scalability and monitoring
  2. Warmup command - Optional but helpful for guaranteed performance

Low Priority (Nice to Have):

  1. UX refinement messaging - Clarifies that refinement is a feature

Alternative: Migration to Node.js/Python?

Pros:

  • Lower memory footprint (Node.js: ~50MB per worker vs PHP: ~200MB)
  • Native async/await (better for I/O-heavy agent orchestration)
  • Faster cold starts (Node.js worker: ~50ms vs PHP: ~200ms)

Cons:

  • Massive refactor (~2-3 months full-time)
  • Need to rebuild Laravel functionality (routing, ORM, auth, etc.)
  • Team needs to learn new stack
  • Risk of introducing bugs during migration

Recommendation:

Optimize PHP first (solutions above). If you hit scale where PHP becomes a bottleneck (thousands of concurrent workflows), then consider Node.js/Python.

For your current use case (dozens to hundreds of workflows/day), optimized PHP will perform excellently.


Monitoring Performance

Add to Execution Tracking:

// In ProcessAgentWorkflowJob.php
// Note: in a long-lived queue worker, REQUEST_TIME_FLOAT is the worker's
// start time, not the job's — capture a per-job start timestamp at the top
// of handle() if you need per-job boot time.
$metrics = [
    'php_worker_boot_time' => microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'],
    'bedrock_first_call_time' => $firstBedrockCallTime,
    'total_overhead' => $totalTime - $actualWorkTime,
];

$execution->update(['performance_metrics' => $metrics]);
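Those stored metrics can then feed the dashboard. For example, average overhead over the last 100 executions — a sketch, where the Execution model and an array cast on performance_metrics are assumptions about your schema:

```php
$avgOverhead = Execution::latest()
    ->take(100)
    ->get()
    ->avg(fn ($e) => $e->performance_metrics['total_overhead'] ?? 0);
```

Watching this number drop after each change is the simplest way to verify the optimizations are landing.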

Dashboard Widget:

  • Average overhead per execution
  • OpCache hit rate
  • PHP-FPM worker pool utilization
  • Bedrock connection reuse rate

Summary

The "glow plug" is real, but solvable with PHP/Laravel optimizations:

  1. OpCache + Preloading: Eliminate framework boot overhead
  2. Static PHP-FPM workers: Pre-spawn warm workers
  3. Bedrock connection pooling: Reuse HTTPS connections
  4. Redis queue: Persistent workers, better concurrency

Expected result: First-run performance matches 2nd/3rd run (40-45 seconds consistently)

Cost: Minimal (configuration changes only)
Risk: Low (all are standard PHP best practices)
Time to implement: 2-4 hours

You don't need AgentCore Runtime to solve this - just proper PHP infrastructure tuning!