Hybrid Knowledge Base Implementation Guide¶
OpenAI + Bedrock Knowledge Base Support¶
Date: 2025-11-17
Status: Ready for Deployment
Impact: Estimated 70%+ cost reduction for Knowledge Base queries
Overview¶
This implementation adds dual Knowledge Base provider support to the agent system, allowing agents to use either:
- OpenAI Local RAG (existing) - Documents stored locally with OpenAI embeddings
- Amazon Bedrock Knowledge Bases (new) - AWS-managed S3-backed knowledge bases
Key Benefits:
- 🎯 70%+ cost reduction with Bedrock KB (Titan embeddings vs OpenAI)
- 🔄 Zero code changes for existing agents (backward compatible)
- 🔌 Partner-owned KBs - Use existing AWS Bedrock Knowledge Bases
- 🚀 Better performance - AWS regional deployment
- 💰 No data duplication - Partners keep docs in their own S3
What Changed¶
1. New Service: BedrockKnowledgeBaseService¶
Location: app/Services/Bedrock/BedrockKnowledgeBaseService.php
Provides integration with Amazon Bedrock Knowledge Bases:
use App\Services\Bedrock\BedrockKnowledgeBaseService;
// Initialize service (uses existing BYOC credential patterns)
$service = new BedrockKnowledgeBaseService($user, 'us-east-1');
// Retrieve documents from KB
$results = $service->retrieve(
knowledgeBaseId: 'KB-ABC123',
query: 'AWS Marketplace pricing',
numberOfResults: 5,
minScore: 0.7
);
// OR use integrated retrieve + generate
$response = $service->retrieveAndGenerate(
knowledgeBaseId: 'KB-ABC123',
prompt: 'Explain our AWS Marketplace pricing strategy',
modelArn: 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
);
Features:
- Multi-credential support (platform, user_keys, user_role BYOC)
- IAM role assumption for cross-account access
- Error handling and logging
- Connection testing
- Result formatting and citation parsing
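To make the "result formatting" feature concrete, here is a minimal sketch of how a raw Bedrock Retrieve response could be flattened and filtered by score. The input keys (`retrievalResults`, `content.text`, `location.s3Location.uri`, `score`) follow the Retrieve API response; the function name and the output keys are hypothetical and may not match what BedrockKnowledgeBaseService actually returns.

```php
<?php
// Hypothetical helper, not the service's real implementation.
// Flattens a Retrieve API response and drops chunks below $minScore.
function formatRetrievalResults(array $response, float $minScore = 0.0): array
{
    $formatted = [];
    foreach ($response['retrievalResults'] ?? [] as $result) {
        $score = (float) ($result['score'] ?? 0.0);
        if ($score < $minScore) {
            continue; // skip low-relevance chunks
        }
        $formatted[] = [
            'text'   => $result['content']['text'] ?? '',
            'source' => $result['location']['s3Location']['uri'] ?? null,
            'score'  => $score,
        ];
    }
    // Sort so the highest-scoring chunk comes first
    usort($formatted, fn (array $a, array $b) => $b['score'] <=> $a['score']);
    return $formatted;
}
```

Filtering on the caller's side like this keeps the threshold adjustable per agent without changing the query sent to Bedrock.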
2. Enhanced Agent Table¶
Migration: app/Extensions/ContentManager/database/migrations/2025_11_17_000012_add_knowledge_base_provider_to_agents_table.php
New Columns:
kb_provider VARCHAR(20) DEFAULT 'openai' -- 'openai' or 'bedrock'
bedrock_kb_id VARCHAR(100) NULL -- Bedrock KB ID if using Bedrock
bedrock_kb_region VARCHAR(20) NULL -- AWS region for Bedrock KB
Run Migration:
php artisan migrate
3. Updated QueryKnowledgeBaseCapability¶
Location: app/Extensions/ContentManager/System/Services/Capabilities/QueryKnowledgeBaseCapability.php
Changes:
- Automatically detects provider from agent configuration
- Routes to appropriate search method (OpenAI or Bedrock)
- Returns unified result format for both providers
- Logs provider used for monitoring
How It Works:
// Agent determines provider
$provider = $agent->kb_provider ?? 'openai';
if ($provider === 'bedrock') {
// Use Bedrock Knowledge Base
$results = $this->searchBedrockKnowledgeBase(...);
} else {
// Use OpenAI local RAG (existing)
$results = $this->searchKnowledgeBase(...);
}
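The routing decision above can be isolated into a small, testable function. The sketch below uses a plain array in place of the Agent model, and the function name, return shape, and the `us-east-1` region default are assumptions for illustration. The error message mirrors the "No Bedrock KB configured for agent" issue listed under Troubleshooting.

```php
<?php
// Illustrative only: $agent is a plain array standing in for the Agent model.
function resolveKbTarget(array $agent): array
{
    $provider = $agent['kb_provider'] ?? 'openai';

    if ($provider !== 'bedrock') {
        // null or any other value keeps the existing OpenAI local RAG path
        return ['provider' => 'openai'];
    }
    if (empty($agent['bedrock_kb_id'])) {
        throw new InvalidArgumentException('No Bedrock KB configured for agent');
    }
    return [
        'provider' => 'bedrock',
        'kb_id'    => $agent['bedrock_kb_id'],
        // Assumed default; the real capability may resolve the region differently
        'region'   => $agent['bedrock_kb_region'] ?? 'us-east-1',
    ];
}
```

Resolving the target before running the search makes a misconfigured agent fail in one place with a clear message.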
Deployment Steps¶
Step 1: Run Migration¶
php artisan migrate
Expected Output:
Migrating: 2025_11_17_000012_add_knowledge_base_provider_to_agents_table
Migrated: 2025_11_17_000012_add_knowledge_base_provider_to_agents_table
Step 2: Verify Installation¶
# Check table structure
php artisan tinker
>>> DB::select("DESCRIBE ext_content_manager_agents");
# Should show kb_provider, bedrock_kb_id, bedrock_kb_region columns
Step 3: Test with OpenAI (Existing Behavior)¶
No changes needed! All existing agents continue to use OpenAI local RAG by default.
// Existing agents automatically use kb_provider='openai' (default)
// No configuration changes needed
Step 4: Configure Bedrock KB (Optional)¶
For partners who want to use Bedrock Knowledge Bases:
// Update agent settings
$agent = Agent::find($agentId);
$agent->kb_provider = 'bedrock';
$agent->bedrock_kb_id = 'KB-ABC123DEFGH';
$agent->bedrock_kb_region = 'us-east-1';
$agent->save();
Or via database:
UPDATE ext_content_manager_agents
SET kb_provider = 'bedrock',
bedrock_kb_id = 'KB-ABC123DEFGH',
bedrock_kb_region = 'us-east-1'
WHERE id = 1;
Step 5: Test Bedrock Integration¶
use App\Services\Bedrock\BedrockKnowledgeBaseService;
use App\Models\User;
// Test connection
$user = User::find(1);
$service = new BedrockKnowledgeBaseService($user, 'us-east-1');
$canConnect = $service->testConnection('KB-ABC123DEFGH');
echo $canConnect ? "✅ Connected!" : "❌ Failed";
// Test retrieval
$results = $service->retrieve(
'KB-ABC123DEFGH',
'test query',
5
);
print_r($results);
Configuration Options¶
Agent Configuration¶
Example 1: OpenAI Local RAG (Default)
{
"name": "Sales Content Generator",
"capabilities": ["generate_text", "query_knowledge_base"],
"kb_provider": "openai"
}
- Uses documents stored in the pdf_data table
- Requires file uploads via AI File Chat
- Embeddings via OpenAI text-embedding-ada-002
Example 2: Bedrock Knowledge Base
{
"name": "Partner Content Generator",
"capabilities": ["generate_text", "query_knowledge_base"],
"kb_provider": "bedrock",
"bedrock_kb_id": "KB-PARTNER-123",
"bedrock_kb_region": "us-east-1"
}
Credential Access Patterns¶
The Bedrock KB service uses the same credential patterns as BedrockRuntimeService:
1. Platform Credentials (Default)
Uses platform's AWS credentials. Good for testing or platform-managed KBs.
2. User Keys (User-Provided Credentials)
// User provides their own AWS credentials
$user->aws_access_key_id = encrypt('AKIAIOSFODNN7EXAMPLE');
$user->aws_secret_access_key = encrypt('wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY');
$user->aws_region = 'us-east-1';
$user->save();
Setting: bedrock_access_model = 'user_keys'
3. User Role (BYOC - Bring Your Own Cloud)
// User assumes an IAM role in their AWS account
$user->bedrock_role_arn = 'arn:aws:iam::123456789012:role/VellBedrockAccess';
$user->bedrock_external_id = 'unique-external-id-123';
$user->aws_region = 'us-east-1';
$user->save();
Setting: bedrock_access_model = 'user_role'
This is the recommended approach for partners with existing Bedrock KBs.
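A quick pre-flight check can catch incomplete credential setups before any AWS call is made. The sketch below lists which user fields are missing for a given `bedrock_access_model`. The field names come from the examples above; the function name and the literal `'platform'` value for the default mode are assumptions.

```php
<?php
// Hypothetical pre-flight check; $user is a plain array standing in for the User model.
function missingCredentialFields(array $user, string $accessModel): array
{
    $required = match ($accessModel) {
        'platform'  => [], // platform credentials live in server config, not on the user
        'user_keys' => ['aws_access_key_id', 'aws_secret_access_key', 'aws_region'],
        'user_role' => ['bedrock_role_arn', 'bedrock_external_id', 'aws_region'],
        default     => throw new InvalidArgumentException("Unknown access model: {$accessModel}"),
    };

    // Keep only the required fields that are absent or empty
    return array_values(array_filter($required, fn (string $field) => empty($user[$field])));
}
```

An empty return value means the user record has everything the chosen access model needs.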
Partner Setup Guide¶
For Partners with Existing Bedrock KBs¶
Step 1: Get Your Knowledge Base ID
Via AWS Console:
1. Go to AWS Bedrock Console → Knowledge Bases
2. Copy your Knowledge Base ID (e.g., KB-ABC123DEFGH)
Via AWS CLI:
aws bedrock-agent list-knowledge-bases --region us-east-1
Step 2: Configure IAM Permissions
The platform's IAM role (or user's assumed role) needs:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"bedrock:Retrieve",
"bedrock:RetrieveAndGenerate"
],
"Resource": [
"arn:aws:bedrock:us-east-1:ACCOUNT_ID:knowledge-base/*"
]
}
]
}
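If you prefer to grant access to a single Knowledge Base rather than every KB in the account, the policy can be generated with the specific ARN filled in. This is a minimal sketch; the function name is hypothetical, and the ARN layout follows the `knowledge-base/*` pattern shown in the policy above.

```php
<?php
// Hypothetical generator for a policy scoped to one Knowledge Base.
function buildKbPolicy(string $region, string $accountId, string $kbId): string
{
    $policy = [
        'Version'   => '2012-10-17',
        'Statement' => [[
            'Effect'   => 'Allow',
            'Action'   => ['bedrock:Retrieve', 'bedrock:RetrieveAndGenerate'],
            'Resource' => ["arn:aws:bedrock:{$region}:{$accountId}:knowledge-base/{$kbId}"],
        ]],
    ];

    return json_encode($policy, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
}

echo buildKbPolicy('us-east-1', '123456789012', 'KB-ABC123DEFGH'), PHP_EOL;
```

Scoping the resource this way means adding a second KB requires a policy update, which is usually the safer trade-off for partner-owned data.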
For BYOC (Recommended):
Deploy the CloudFormation template (if not already done) and update the role permissions to include Bedrock Knowledge Base actions.
Step 3: Configure Your Agent
Update agent to use Bedrock KB:
$agent->kb_provider = 'bedrock';
$agent->bedrock_kb_id = 'YOUR-KB-ID';
$agent->bedrock_kb_region = 'us-east-1';
$agent->save();
Step 4: Test
Run an agent execution with query_knowledge_base capability:
$execution = AgentExecution::create([
'agent_id' => $agent->id,
'user_id' => $user->id,
'task_description' => 'Generate competitive email',
'context' => [
'task_type' => 'competitive_email',
'competitor' => 'Acme Corp',
],
]);
// Agent will automatically query Bedrock KB for relevant context
For Partners Creating New Bedrock KBs¶
Step 1: Create S3 Bucket
aws s3 mb s3://my-company-knowledge-base --region us-east-1
Step 2: Upload Documents
# Upload PDFs, docs, markdown files, etc.
aws s3 sync ./documents/ s3://my-company-knowledge-base/docs/
Step 3: Create Knowledge Base
Via AWS Console:
1. Go to Bedrock → Knowledge Bases → Create
2. Name: "My Company KB"
3. Data source: S3
4. S3 URI: s3://my-company-knowledge-base/docs/
5. Embedding model: Amazon Titan Embeddings G1 - Text
6. Vector database: Quick create (OpenSearch Serverless)
7. Sync schedule: Hourly or Daily
8. Create
Via AWS CLI:
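# Create knowledge base
# Note: a VECTOR knowledge base also needs --storage-configuration pointing at your
# vector store (for example an OpenSearch Serverless collection); add it before running.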
aws bedrock-agent create-knowledge-base \
--name "My Company KB" \
--role-arn "arn:aws:iam::ACCOUNT_ID:role/BedrockKBRole" \
--knowledge-base-configuration '{
"type": "VECTOR",
"vectorKnowledgeBaseConfiguration": {
"embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1"
}
}' \
--region us-east-1
# Add S3 data source
aws bedrock-agent create-data-source \
--knowledge-base-id KB-ABC123 \
--name "S3 Docs" \
--data-source-configuration '{
"type": "S3",
"s3Configuration": {
"bucketArn": "arn:aws:s3:::my-company-knowledge-base",
"inclusionPrefixes": ["docs/"]
}
}' \
--region us-east-1
# Start ingestion
aws bedrock-agent start-ingestion-job \
--knowledge-base-id KB-ABC123 \
--data-source-id DS-ABC123 \
--region us-east-1
Step 4: Wait for Ingestion
aws bedrock-agent get-ingestion-job \
--knowledge-base-id KB-ABC123 \
--data-source-id DS-ABC123 \
--ingestion-job-id JOB-ABC123 \
--region us-east-1
Status should change from IN_PROGRESS to COMPLETE.
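Waiting for ingestion is a polling loop, which can be sketched independently of the AWS SDK. In the example below the status lookup is passed in as a callable, so the function name, parameters, and defaults are all illustrative. The terminal statuses checked (`COMPLETE`, `FAILED`, `STOPPED`) are ingestion job status values.

```php
<?php
// Illustrative polling loop; $fetchStatus is any callable returning the job status string,
// for example a wrapper around the get-ingestion-job call shown above.
function waitForIngestion(callable $fetchStatus, int $maxAttempts = 30, int $sleepSeconds = 10): string
{
    $status = 'UNKNOWN';
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $status = $fetchStatus();
        if (in_array($status, ['COMPLETE', 'FAILED', 'STOPPED'], true)) {
            return $status; // terminal state reached
        }
        if ($attempt < $maxAttempts) {
            sleep($sleepSeconds); // still running, wait before the next check
        }
    }
    throw new RuntimeException("Ingestion still {$status} after {$maxAttempts} checks");
}
```

Callers should treat `FAILED` and `STOPPED` as errors; only `COMPLETE` means the documents are searchable.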
Step 5: Test Retrieval
aws bedrock-agent-runtime retrieve \
--knowledge-base-id KB-ABC123 \
--retrieval-query text="AWS Marketplace best practices" \
--region us-east-1
Cost Comparison¶
Scenario: 10,000 Documents, 50,000 Queries/Month¶
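OpenAI Local RAG:
- Document indexing: $40 (one-time, text-embedding-ada-002)
- Query embeddings: $50/month (50K queries at $0.0001/1K tokens, which implies about 10K tokens per query)
- Storage: Included in DB
- Total: $50/month recurring, plus the one-time $40 ($90 in the first month)
Bedrock Knowledge Base:
- Document indexing: $8 (one-time, Titan embeddings)
- Query embeddings: $4/month (50K queries at $0.00008/1K tokens, which implies about 1K tokens per query)
- S3 storage: $2.30/month (100GB @ $0.023/GB)
- OpenSearch Serverless: $8/month (vector storage)
- Total: $14.30/month recurring, plus the one-time $8 ($22.30 in the first month)
Savings: about 75% in the first month ($90 → $22.30) and about 71% in later months ($50 → $14.30). The two query-embedding lines assume different token volumes per query, so recompute with your own workload.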
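The comparison can be recomputed for your own workload with a few lines of arithmetic. The sketch below separates one-time indexing from recurring monthly costs, using the line items quoted above as sample input. The function name and array layout are illustrative.

```php
<?php
// Illustrative calculator: each provider is ['one_time' => float, 'monthly' => [item => float]].
function monthlySavings(array $baseline, array $candidate): array
{
    $recurring  = fn (array $c) => array_sum($c['monthly']);
    $firstMonth = fn (array $c) => $c['one_time'] + array_sum($c['monthly']);
    // Percentage saved when moving from $from to $to, rounded to one decimal
    $pct = fn (float $from, float $to) => round((1 - $to / $from) * 100, 1);

    return [
        'first_month_pct' => $pct($firstMonth($baseline), $firstMonth($candidate)),
        'recurring_pct'   => $pct($recurring($baseline), $recurring($candidate)),
    ];
}

// Sample figures taken from the scenario above
$openai  = ['one_time' => 40.0, 'monthly' => ['query_embeddings' => 50.0]];
$bedrock = ['one_time' => 8.0, 'monthly' => ['query_embeddings' => 4.0, 's3' => 2.30, 'opensearch' => 8.0]];

print_r(monthlySavings($openai, $bedrock));
```

With the sample figures this gives roughly 75% savings in the first month and roughly 71% on recurring costs. Swap in your own query volume and storage prices to see how sensitive the result is.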
Per-Query Costs¶
| Provider | Embedding Cost (per 1K tokens) | Generation Cost (per 1K tokens) | Estimated Total/Query |
|---|---|---|---|
| OpenAI | $0.0001 | $0.015 (gpt-4) | ~$0.0002 |
| Bedrock | $0.00008 | $0.003 (Claude Sonnet) | ~$0.00004 |
By these estimates, Bedrock is about 80% cheaper per query.
Monitoring and Logging¶
All Knowledge Base queries are logged with provider information:
// OpenAI query
[Query Knowledge Base] Querying knowledge base {
"query": "AWS Marketplace pricing",
"category": "sales",
"top_results": 5,
"provider": "openai"
}
// Bedrock query
[Query Knowledge Base] Querying knowledge base {
"query": "AWS Marketplace pricing",
"category": null,
"top_results": 5,
"provider": "bedrock"
}
[Query Knowledge Base] Searching Bedrock KB {
"kb_id": "KB-ABC123",
"region": "us-east-1",
"query": "AWS Marketplace pricing"
}
[Query Knowledge Base] Knowledge base search completed {
"provider": "bedrock",
"results_found": 3,
"has_context": true
}
Monitor with:
tail -f storage/logs/laravel.log | grep "Query Knowledge Base"
Troubleshooting¶
Issue: "Bedrock credentials not configured"¶
Cause: Missing AWS credentials
Fix:
1. Check that .env has the platform's AWS credentials set
2. Or set bedrock_access_model to user_keys or user_role and configure user credentials
Issue: "No Bedrock KB configured for agent"¶
Cause: Agent missing bedrock_kb_id
Fix: Set the Knowledge Base ID on the agent:
$agent->bedrock_kb_id = 'YOUR-KB-ID';
$agent->save();
Issue: AccessDeniedException¶
Cause: IAM permissions missing
Fix: Add Bedrock permissions to IAM role/user:
{
"Effect": "Allow",
"Action": [
"bedrock:Retrieve",
"bedrock:RetrieveAndGenerate"
],
"Resource": "*"
}
Issue: Bedrock KB returns empty results¶
Possible Causes:
1. KB not yet ingested (wait for ingestion job to complete)
2. Query doesn't match any documents
3. minScore threshold too high
Debug:
# Test retrieval directly
aws bedrock-agent-runtime retrieve \
--knowledge-base-id KB-ABC123 \
--retrieval-query text="test" \
--region us-east-1
Issue: Knowledge base search fails silently¶
Expected Behavior: By design, KB failures return empty results rather than throwing exceptions. This allows agents to continue even if KB is unavailable.
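The fail-soft behavior described above amounts to catching any error around the search call, logging it, and returning an empty result. The sketch below shows that pattern with a callable in place of the real search method. The function name and result keys are hypothetical, loosely modeled on the log fields shown under Monitoring and Logging.

```php
<?php
// Illustrative fail-soft wrapper; $search is any callable that returns an array of results.
function searchWithFallback(callable $search, string $provider): array
{
    try {
        $results = $search();
        return [
            'provider'    => $provider,
            'results'     => $results,
            'has_context' => count($results) > 0,
        ];
    } catch (Throwable $e) {
        // Log and continue so the agent can still run without KB context
        error_log("[Query Knowledge Base] Search failed ({$provider}): " . $e->getMessage());
        return ['provider' => $provider, 'results' => [], 'has_context' => false];
    }
}
```

Because the failure is swallowed, the log line is the only signal that something went wrong, which is why checking the logs is the first debugging step.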
Check Logs:
grep "Query Knowledge Base" storage/logs/laravel.log | tail -n 50
Advanced Usage¶
1. Multiple Knowledge Bases per Partner¶
// Sales KB
$salesAgent->bedrock_kb_id = 'KB-SALES-123';
// Technical KB
$techAgent->bedrock_kb_id = 'KB-TECH-456';
// Compliance KB
$complianceAgent->bedrock_kb_id = 'KB-COMP-789';
2. Cross-Account Access (BYOC)¶
The partner deploys a CloudFormation stack in their AWS account and grants the platform cross-account access through a role trust policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::PLATFORM_ACCOUNT:root"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "unique-external-id-123"
}
}
}
]
}
Platform assumes role to access partner's KB:
$user->bedrock_role_arn = 'arn:aws:iam::PARTNER_ACCOUNT:role/VellBedrockAccess';
$user->bedrock_external_id = 'unique-external-id-123';
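On the platform side, these two values become the inputs to an STS AssumeRole request. The sketch below only builds and validates the request parameters; it does not call AWS. The parameter keys (`RoleArn`, `RoleSessionName`, `ExternalId`, `DurationSeconds`) are the standard AssumeRole parameters, while the function name, session name, and one-hour duration are assumptions.

```php
<?php
// Illustrative builder for STS AssumeRole parameters; performs no network calls.
function buildAssumeRoleParams(string $roleArn, string $externalId, string $sessionName = 'kb-access'): array
{
    // Expect a concrete role ARN with a 12-digit account ID, not a placeholder
    if (!preg_match('/^arn:aws:iam::\d{12}:role\/[\w+=,.@\/-]+$/', $roleArn)) {
        throw new InvalidArgumentException("Not an IAM role ARN: {$roleArn}");
    }

    return [
        'RoleArn'         => $roleArn,
        'RoleSessionName' => $sessionName,
        'ExternalId'      => $externalId, // must match the sts:ExternalId condition in the trust policy
        'DurationSeconds' => 3600,        // assumed session length
    ];
}
```

Validating the ARN early turns a placeholder such as `PARTNER_ACCOUNT` left in the configuration into an immediate, readable error instead of an AccessDenied response from AWS.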
3. RetrieveAndGenerate (Integrated Approach)¶
For even better performance, use Bedrock's integrated retrieve + generate:
use App\Services\Bedrock\BedrockKnowledgeBaseService;
$service = new BedrockKnowledgeBaseService($user, 'us-east-1');
$response = $service->retrieveAndGenerate(
knowledgeBaseId: 'KB-ABC123',
prompt: 'Generate a competitive email against Acme Corp',
modelArn: 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
);
echo $response['generated_text'];
print_r($response['citations']);
This combines retrieval and Claude generation in one API call, which can reduce latency and cost by removing the separate generation round trip.
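When surfacing citations to users, it is often enough to list the distinct source documents. The sketch below walks the raw RetrieveAndGenerate response shape (`citations[].retrievedReferences[].location.s3Location.uri`) and returns unique S3 URIs. The function name is hypothetical, and the service's own `citations` array may already be reshaped, in which case the key paths would differ.

```php
<?php
// Illustrative: extracts distinct S3 source URIs from a raw RetrieveAndGenerate response.
function extractCitationSources(array $response): array
{
    $uris = [];
    foreach ($response['citations'] ?? [] as $citation) {
        foreach ($citation['retrievedReferences'] ?? [] as $reference) {
            $uri = $reference['location']['s3Location']['uri'] ?? null;
            if ($uri !== null) {
                $uris[$uri] = true; // array keys de-duplicate repeated sources
            }
        }
    }
    return array_keys($uris);
}
```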
Migration Strategy¶
Phase 1: Soft Launch (Week 1)¶
- ✅ Deploy code (backward compatible)
- ✅ Run migration
- ✅ Test with internal Bedrock KB
- ✅ Monitor logs for errors
Phase 2: Partner Pilot (Weeks 2-3)¶
- ✅ Identify 3-5 partners with existing Bedrock KBs
- ✅ Help them configure agents
- ✅ Gather feedback
- ✅ Document common issues
Phase 3: General Availability (Week 4+)¶
- ✅ Announce to all partners
- ✅ Provide setup documentation
- ✅ Offer migration assistance (OpenAI → Bedrock)
- ✅ Build cost calculator tool
Phase 4: Bedrock-First (Optional, 3-6 months)¶
- ✅ Make Bedrock the default for new agents
- ✅ Provide migration tools
- ✅ Eventually deprecate OpenAI local RAG (if unused)
Rollback Plan¶
If issues arise, rollback is simple:
- No code rollback needed - Feature is backward compatible
- Agents automatically use OpenAI when kb_provider is 'openai' or null
- Migration is reversible with Laravel's standard rollback command:
php artisan migrate:rollback
Success Metrics¶
Track these metrics to measure success:
- Cost Savings
  - Embedding API costs (OpenAI vs Bedrock)
  - Total KB operation costs
- Adoption
  - Number of agents using Bedrock KB
  - Number of partners with configured Bedrock KBs
- Performance
  - Query latency (OpenAI vs Bedrock)
  - Result quality (similarity scores)
- Reliability
  - Error rates by provider
  - Fallback occurrences
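The performance and reliability metrics can be derived from per-query records grouped by provider. The sketch below assumes each record carries `provider`, `ok`, and `latency_ms` fields. That record shape is hypothetical and would need to be mapped from whatever your logging pipeline actually stores.

```php
<?php
// Illustrative aggregation; the record fields (provider, ok, latency_ms) are assumed.
function summarizeByProvider(array $records): array
{
    $totals = [];
    foreach ($records as $record) {
        $provider = $record['provider'];
        $totals[$provider] ??= ['queries' => 0, 'errors' => 0, 'latency_total_ms' => 0];
        $totals[$provider]['queries']++;
        $totals[$provider]['errors'] += $record['ok'] ? 0 : 1;
        $totals[$provider]['latency_total_ms'] += $record['latency_ms'];
    }

    $summary = [];
    foreach ($totals as $provider => $t) {
        $summary[$provider] = [
            'queries'        => $t['queries'],
            'error_rate'     => $t['errors'] / $t['queries'],
            'avg_latency_ms' => $t['latency_total_ms'] / $t['queries'],
        ];
    }
    return $summary;
}
```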
Next Steps¶
Immediate¶
- ✅ Run migration: php artisan migrate
- ✅ Test with existing agents (should work unchanged)
- ✅ Test Bedrock KB integration (if you have a KB)
Short-term¶
- Identify partner pilot candidates
- Create partner onboarding documentation
- Build cost comparison calculator
Long-term¶
- Add UI for configuring KB provider
- Build KB creation wizard for partners
- Add multi-KB support per agent
- Explore Bedrock Agents integration
Support¶
Questions?
- Check logs: storage/logs/laravel.log
- Review Bedrock docs: https://docs.aws.amazon.com/bedrock/
- Contact platform team
Issues?
- Create GitHub issue with logs and agent configuration
- Include provider type (openai/bedrock)
- Include error messages from logs
Document Version: 1.0
Status: Ready for Production
Estimated Cost Savings: 70-80%
Backward Compatible: Yes
Breaking Changes: None