Amazon Bedrock Knowledge Bases Integration Strategy

Leveraging Partner-Owned AWS Knowledge Bases

Date: 2025-11-17
Status: Proposed Enhancement


Executive Summary

Many AWS partners already have Amazon Bedrock Knowledge Bases set up in their AWS accounts, typically connected to S3 buckets containing product documentation, technical guides, and sales materials. This document outlines how to integrate partner-owned Bedrock Knowledge Bases into the agent system as an alternative to (or in addition to) the OpenAI-based implementation.


Why Bedrock Knowledge Bases?

Current State (OpenAI)

  • ✅ Working implementation with OpenAI embeddings
  • ❌ Requires separate file uploads to OpenAI
  • ❌ Additional API costs for embeddings
  • ❌ Partners can't leverage existing Bedrock KBs
  • ❌ Data stored outside AWS ecosystem

Future State (Bedrock KB)

  • ✅ Partners already have KBs configured
  • ✅ No data duplication needed
  • ✅ Native AWS integration (S3, IAM, CloudWatch)
  • ✅ Lower costs (Bedrock embeddings cheaper than OpenAI)
  • ✅ Better performance (regional deployment)
  • ✅ Uses Claude models natively (your preferred LLM)

Architecture Comparison

Current: OpenAI-based Knowledge Base

User uploads PDF
Stored locally → Chunked → OpenAI Ada-002 embedding → pdf_data table
Agent queries → OpenAI embedding API → Cosine similarity → Top N results
Injected into Claude prompt via Bedrock

Costs:

  • $0.0001 per 1K tokens (embedding query)
  • $0.0004 per 1K tokens (document indexing)

Proposed: Bedrock Knowledge Base

Partner has existing S3 bucket with docs
Bedrock KB sync (one-time setup) → Titan/Cohere embeddings → Knowledge Base
Agent queries → Bedrock RetrieveAndGenerate API → Top N results
Directly integrated with Claude in same Bedrock call

Costs:

  • $0.00008 per 1K tokens (Titan embeddings): 80% cheaper than the $0.0004 indexing rate above, 20% cheaper than the $0.0001 query rate
  • No document storage costs on our side (uses partner's S3)


Implementation Options

Option 1: Hybrid (OpenAI + Bedrock)

Support both OpenAI and Bedrock Knowledge Bases simultaneously.

Agent Settings (knowledge_base_provider is "bedrock" or "openai"; bedrock_kb_id is only needed when using Bedrock):

{
  "knowledge_base_provider": "bedrock",
  "bedrock_kb_id": "ABCD1234",
  "aws_region": "us-east-1",
  "capabilities": ["query_knowledge_base"]
}
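
As a rough illustration of how these settings might be checked before any routing happens, here is a minimal sketch. The function name `resolveKnowledgeBaseProvider` is hypothetical, and falling back to OpenAI when no KB ID is configured is an assumption, not existing platform behavior.

```php
<?php

/**
 * Hypothetical helper: decide which knowledge base provider an agent should use.
 * Falls back to "openai" when Bedrock is requested but no KB ID is configured.
 */
function resolveKnowledgeBaseProvider(array $settings): string
{
    $provider = strtolower($settings['knowledge_base_provider'] ?? 'openai');

    if (!in_array($provider, ['openai', 'bedrock'], true)) {
        throw new InvalidArgumentException("Unknown knowledge base provider: {$provider}");
    }

    if ($provider === 'bedrock' && empty($settings['bedrock_kb_id'])) {
        // Assumption: degrade gracefully rather than fail the request.
        return 'openai';
    }

    return $provider;
}
```

Rejecting unknown provider names early keeps a typo in the agent settings from silently selecting the wrong backend.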

Benefits:

  • Flexibility for partners who don't have Bedrock KBs yet
  • Gradual migration path
  • Best of both worlds

Option 2: Bedrock-Only (Future)

Replace OpenAI implementation entirely with Bedrock.

Benefits:

  • Simplified architecture
  • Lower costs
  • Single cloud provider (AWS)
  • Better for partners already on AWS


Technical Implementation

1. Bedrock Knowledge Base Service

Create new service: app/Services/Bedrock/BedrockKnowledgeBaseService.php

<?php

namespace App\Services\Bedrock;

use Aws\BedrockAgentRuntime\BedrockAgentRuntimeClient;
use Aws\Exception\AwsException;

class BedrockKnowledgeBaseService
{
    protected BedrockAgentRuntimeClient $client;

    public function __construct(string $region = 'us-east-1')
    {
        $this->client = new BedrockAgentRuntimeClient([
            'region' => $region,
            'version' => 'latest',
            'credentials' => [
                'key' => config('services.bedrock.key'),
                'secret' => config('services.bedrock.secret'),
            ],
        ]);
    }

    /**
     * Query a Bedrock Knowledge Base
     *
     * @param string $knowledgeBaseId The KB ID from AWS
     * @param string $query The search query
     * @param int $maxResults Number of results to return
     * @return array Retrieved documents with scores
     */
    public function retrieve(
        string $knowledgeBaseId,
        string $query,
        int $maxResults = 5
    ): array {
        try {
            $result = $this->client->retrieve([
                'knowledgeBaseId' => $knowledgeBaseId,
                'retrievalQuery' => [
                    'text' => $query,
                ],
                'retrievalConfiguration' => [
                    'vectorSearchConfiguration' => [
                        'numberOfResults' => $maxResults,
                    ],
                ],
            ]);

            return $this->formatRetrievalResults($result['retrievalResults']);
        } catch (AwsException $e) {
            \Log::error('Bedrock KB retrieval failed', [
                'error' => $e->getMessage(),
                'kb_id' => $knowledgeBaseId,
                'query' => $query,
            ]);

            return [];
        }
    }

    /**
     * Retrieve AND generate using Bedrock's integrated approach
     * This combines retrieval + Claude generation in a single API call
     */
    public function retrieveAndGenerate(
        string $knowledgeBaseId,
        string $prompt,
        string $modelArn,
        ?string $sessionId = null, // pass a previous session_id to continue a conversation
        int $maxResults = 5
    ): array {
        $request = [
            'input' => [
                'text' => $prompt,
            ],
            'retrieveAndGenerateConfiguration' => [
                'type' => 'KNOWLEDGE_BASE',
                'knowledgeBaseConfiguration' => [
                    'knowledgeBaseId' => $knowledgeBaseId,
                    'modelArn' => $modelArn, // e.g., arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0
                    'retrievalConfiguration' => [
                        'vectorSearchConfiguration' => [
                            'numberOfResults' => $maxResults,
                        ],
                    ],
                ],
            ],
        ];

        if ($sessionId !== null) {
            $request['sessionId'] = $sessionId;
        }

        // Optional encryption: only send sessionConfiguration when a KMS key is configured,
        // otherwise a null kmsKeyArn would be passed to the API.
        $kmsKeyArn = config('services.bedrock.kms_key');
        if ($kmsKeyArn) {
            $request['sessionConfiguration'] = ['kmsKeyArn' => $kmsKeyArn];
        }

        try {
            $result = $this->client->retrieveAndGenerate($request);

            return [
                'generated_text' => $result['output']['text'],
                'citations' => $result['citations'] ?? [],
                'session_id' => $result['sessionId'],
            ];
        } catch (AwsException $e) {
            \Log::error('Bedrock retrieve and generate failed', [
                'error' => $e->getMessage(),
                'kb_id' => $knowledgeBaseId,
            ]);

            throw $e;
        }
    }

    protected function formatRetrievalResults(array $results): array
    {
        return array_map(function ($result) {
            return [
                'content' => $result['content']['text'] ?? '',
                'score' => $result['score'] ?? null, // score may be absent from a result
                'source_file' => $result['location']['s3Location']['uri'] ?? 'Unknown',
                'metadata' => $result['metadata'] ?? [],
            ];
        }, $results);
    }
}
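
To make the mapping in `formatRetrievalResults()` concrete, the sketch below runs a standalone copy of it over a hand-written sample in the shape of the Retrieve API's `retrievalResults` list. The sample values (bucket name, text, scores) are made up, and the defensive defaults for missing fields are an assumption.

```php
<?php

// Standalone copy of the mapping used by formatRetrievalResults(), for illustration only.
function formatRetrievalResults(array $results): array
{
    return array_map(function (array $result): array {
        return [
            'content' => $result['content']['text'] ?? '',
            'score' => $result['score'] ?? null,
            'source_file' => $result['location']['s3Location']['uri'] ?? 'Unknown',
            'metadata' => $result['metadata'] ?? [],
        ];
    }, $results);
}

// Made-up sample in the shape of a Retrieve response's "retrievalResults" list.
$sample = [
    [
        'content' => ['text' => 'Listing fees are billed monthly.'],
        'location' => [
            'type' => 'S3',
            's3Location' => ['uri' => 's3://example-bucket/docs/pricing.pdf'],
        ],
        'score' => 0.82,
    ],
    [
        // No location on this result, so source_file falls back to "Unknown".
        'content' => ['text' => 'Contact support for private offers.'],
        'score' => 0.41,
    ],
];

$formatted = formatRetrievalResults($sample);
```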

2. Enhanced QueryKnowledgeBaseCapability

Update to support both providers:

protected function searchKnowledgeBase(
    array $queryEmbedding,
    int $userId,
    ?int $teamId,
    ?string $category,
    int $topN,
    float $minSimilarity,
    string $provider = 'openai',  // NEW: provider selection
    string $queryText = ''        // NEW: raw query text, needed by the Bedrock path
): array {
    if ($provider === 'bedrock') {
        // Bedrock embeds the query itself, so pass the text rather than $queryEmbedding
        return $this->searchBedrockKnowledgeBase($queryText, $userId, $teamId, $category, $topN);
    }

    // Existing OpenAI implementation...
}

protected function searchBedrockKnowledgeBase(
    string $query,
    int $userId,
    ?int $teamId,
    ?string $category,
    int $topN
): array {
    // Get the agent's configured Bedrock KB ID
    $agent = $this->getCurrentAgent();
    $kbId = $agent->settings['bedrock_kb_id'] ?? null;

    if (!$kbId) {
        $this->log('No Bedrock KB configured for agent');
        return [];
    }

    $bedrockKB = new BedrockKnowledgeBaseService(
        $agent->settings['aws_region'] ?? 'us-east-1'
    );

    $results = $bedrockKB->retrieve($kbId, $query, $topN);

    // Format to match OpenAI response structure.
    // $category must be imported with use(); closures do not capture it automatically.
    return array_map(function ($result) use ($category) {
        return [
            'content' => $result['content'],
            'similarity' => $result['score'],
            'source_file' => basename($result['source_file']),
            'category' => $category ?? 'bedrock-kb',
        ];
    }, $results);
}
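
The Bedrock path above accepts no `$minSimilarity`, so low-scoring chunks are returned as-is. One possible way to keep the two providers' behavior aligned is to post-filter the formatted results, as in this minimal sketch. The helper name `filterBySimilarity` is hypothetical, and Bedrock relevance scores may not be directly comparable to OpenAI cosine similarities, so the threshold would likely need tuning per provider.

```php
<?php

/**
 * Hypothetical post-filter: keep results at or above a minimum score,
 * sorted best-first and capped at $topN entries.
 */
function filterBySimilarity(array $results, float $minSimilarity, int $topN): array
{
    $kept = array_filter(
        $results,
        fn (array $r): bool => ($r['similarity'] ?? 0.0) >= $minSimilarity
    );

    // Sort descending by similarity; usort also reindexes the array from 0.
    usort(
        $kept,
        fn (array $a, array $b): int => ($b['similarity'] ?? 0.0) <=> ($a['similarity'] ?? 0.0)
    );

    return array_slice($kept, 0, $topN);
}
```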

3. Alternative: Direct RetrieveAndGenerate

To avoid a separate retrieval round trip, use Bedrock's integrated RetrieveAndGenerate call:

// In GenerateTextCapability.php

if ($this->shouldUseBedrockKB($execution)) {
    // Use Bedrock's integrated retrieve + generate, in the agent's configured region
    $region = $agent->settings['aws_region'] ?? 'us-east-1';
    $bedrockKB = new BedrockKnowledgeBaseService($region);

    $result = $bedrockKB->retrieveAndGenerate(
        $agent->settings['bedrock_kb_id'],
        $this->buildPrompt($parameters, $context),
        // Keep the model ARN's region in line with the client region
        "arn:aws:bedrock:{$region}::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
    );

    // Already includes KB context + generation in one call!
    return [
        'success' => true,
        'content' => $result['generated_text'],
        'citations' => $result['citations'],
        'credits_used' => $this->calculateBedrockCredits($result),
    ];
}
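
The `citations` value returned above is a nested structure: each citation pairs a part of the generated text with the references retrieved for it. A minimal sketch of flattening it into a de-duplicated list of source URIs follows. The helper name `extractCitationSources` is hypothetical, and the sample data is made up to mirror the documented response shape.

```php
<?php

/**
 * Hypothetical helper: collect the unique S3 URIs referenced by a
 * RetrieveAndGenerate "citations" list.
 */
function extractCitationSources(array $citations): array
{
    $uris = [];

    foreach ($citations as $citation) {
        foreach ($citation['retrievedReferences'] ?? [] as $reference) {
            $uri = $reference['location']['s3Location']['uri'] ?? null;
            if ($uri !== null) {
                $uris[$uri] = true; // keyed by URI to de-duplicate
            }
        }
    }

    return array_keys($uris);
}

// Made-up sample: two citations that share one source document.
$citations = [
    ['retrievedReferences' => [
        ['location' => ['s3Location' => ['uri' => 's3://example-bucket/docs/a.pdf']]],
        ['location' => ['s3Location' => ['uri' => 's3://example-bucket/docs/b.pdf']]],
    ]],
    ['retrievedReferences' => [
        ['location' => ['s3Location' => ['uri' => 's3://example-bucket/docs/a.pdf']]],
    ]],
];

$sources = extractCitationSources($citations);
```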

Partner Setup Guide

For Partners with Existing Bedrock KBs

Step 1: Get Your Knowledge Base ID

aws bedrock-agent list-knowledge-bases --region us-east-1

Step 2: Configure Agent

In the agent configuration UI:

{
  "name": "Partner Content Writer",
  "capabilities": ["generate_text", "query_knowledge_base"],
  "settings": {
    "knowledge_base_provider": "bedrock",
    "bedrock_kb_id": "ABCD1234EFGH",
    "aws_region": "us-east-1"
  }
}

Step 3: Grant IAM Permissions

The application's IAM role needs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:Retrieve",
        "bedrock:RetrieveAndGenerate"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1:ACCOUNT_ID:knowledge-base/ABCD1234EFGH"
      ]
    }
  ]
}
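
When onboarding several partners, the policy above could be generated per knowledge base rather than hand-edited. The sketch below is one way to do that. The function name `buildKnowledgeBasePolicy` and the account ID are hypothetical. The optional second statement reflects an assumption that RetrieveAndGenerate also needs `bedrock:InvokeModel` on the chosen foundation model, which should be verified against current AWS documentation.

```php
<?php

/**
 * Hypothetical helper: build the IAM policy document for one knowledge base.
 * When $modelArn is given, a second statement allows invoking that model
 * (an assumption about what RetrieveAndGenerate needs; verify before use).
 */
function buildKnowledgeBasePolicy(
    string $region,
    string $accountId,
    string $kbId,
    ?string $modelArn = null
): array {
    $statements = [[
        'Effect' => 'Allow',
        'Action' => ['bedrock:Retrieve', 'bedrock:RetrieveAndGenerate'],
        'Resource' => ["arn:aws:bedrock:{$region}:{$accountId}:knowledge-base/{$kbId}"],
    ]];

    if ($modelArn !== null) {
        $statements[] = [
            'Effect' => 'Allow',
            'Action' => ['bedrock:InvokeModel'],
            'Resource' => [$modelArn],
        ];
    }

    return ['Version' => '2012-10-17', 'Statement' => $statements];
}

// Placeholder account ID and KB ID, matching the example policy above.
$policy = buildKnowledgeBasePolicy('us-east-1', '123456789012', 'ABCD1234EFGH');
$policyJson = json_encode($policy, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
```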

For Partners Creating New Bedrock KBs

Step 1: Create S3 Bucket

aws s3 mb s3://my-partner-knowledge-base

Step 2: Upload Documents

aws s3 sync ./documents/ s3://my-partner-knowledge-base/docs/

Step 3: Create Knowledge Base via Console

  1. Go to AWS Bedrock Console → Knowledge Bases
  2. Click "Create knowledge base"
  3. Choose S3 as data source
  4. Select Titan Embeddings model
  5. Configure sync schedule (hourly/daily)
  6. Copy the Knowledge Base ID

Step 4: Test Retrieval

aws bedrock-agent-runtime retrieve \
  --knowledge-base-id ABCD1234EFGH \
  --retrieval-query text="AWS Marketplace best practices" \
  --region us-east-1


Migration Strategy

Phase 1: Dual Support (Current + Bedrock)

  • ✅ Keep existing OpenAI implementation
  • ✅ Add Bedrock KB support as optional
  • ✅ Agent setting: choose provider
  • Timeline: 2 weeks

Phase 2: Partner Onboarding

  • ✅ Documentation for partners with existing KBs
  • ✅ Migration tooling (OpenAI → Bedrock)
  • ✅ Cost comparison calculator
  • Timeline: 1 month

Phase 3: Bedrock-First (Optional)

  • ✅ Make Bedrock the default
  • ✅ OpenAI as fallback
  • ✅ Eventually deprecate OpenAI if unused
  • Timeline: 3-6 months

Cost Comparison

Scenario: 10,000 Documents, 50,000 Queries/Month

OpenAI:

  • Document indexing: $40 (one-time)
  • Query embeddings: $50/month
  • Storage: Included in DB
  • Total: $90 in the first month ($40 one-time + $50), then $50/month

Bedrock KB:

  • Document indexing: $8 (one-time)
  • Query embeddings: $4/month (Titan)
  • S3 storage: $2.30/month (100GB)
  • Total: $14.30 in the first month ($8 one-time + $6.30), then $6.30/month

Savings: about 84% in the first month and about 87% on recurring costs with Bedrock.
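
The comparison can be reproduced from the line items above with a small calculation that separates the one-time indexing cost from recurring costs. This is a minimal sketch: the function names are hypothetical, and the inputs are simply the figures quoted in this scenario, not measured prices.

```php
<?php

/**
 * Sum cost line items into a first-month total (including one-time indexing)
 * and an ongoing monthly total.
 */
function summarizeCosts(float $oneTimeIndexing, float $monthlyQueries, float $monthlyStorage): array
{
    $ongoing = $monthlyQueries + $monthlyStorage;

    return [
        'first_month' => round($oneTimeIndexing + $ongoing, 2),
        'ongoing' => round($ongoing, 2),
    ];
}

/** Percentage saved by $alternative relative to $baseline, to one decimal place. */
function savingsPercent(float $baseline, float $alternative): float
{
    return round(($baseline - $alternative) / $baseline * 100, 1);
}

// Figures from the scenario: indexing (one-time), query embeddings, storage.
$openai = summarizeCosts(40.00, 50.00, 0.00);
$bedrock = summarizeCosts(8.00, 4.00, 2.30);

$firstMonthSavings = savingsPercent($openai['first_month'], $bedrock['first_month']);
$ongoingSavings = savingsPercent($openai['ongoing'], $bedrock['ongoing']);
```

The 84% figure corresponds to the first-month totals ($90 vs $14.30); on recurring costs alone ($50 vs $6.30) the same inputs give roughly 87%.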


Advanced Features

1. Multi-KB Support

Partners can configure multiple KBs:

{
  "bedrock_kbs": {
    "sales": "KB-SALES-123",
    "technical": "KB-TECH-456",
    "compliance": "KB-COMP-789"
  },
  "kb_routing": {
    "competitive_email": "sales",
    "api_documentation": "technical",
    "marketplace_listing": "compliance"
  }
}
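
A minimal sketch of how this routing table might be resolved at query time is shown below. The function name `resolveKnowledgeBaseId` and the optional default key are assumptions, and the configuration is the example above decoded from JSON.

```php
<?php

/**
 * Hypothetical resolver: map a content type to a knowledge base ID using
 * the "kb_routing" and "bedrock_kbs" settings shown above.
 * Returns null when no route (and no usable default) exists.
 */
function resolveKnowledgeBaseId(array $settings, string $contentType, ?string $defaultKey = null): ?string
{
    $kbKey = $settings['kb_routing'][$contentType] ?? $defaultKey;

    if ($kbKey === null) {
        return null;
    }

    return $settings['bedrock_kbs'][$kbKey] ?? null;
}

$settings = json_decode('{
  "bedrock_kbs": {
    "sales": "KB-SALES-123",
    "technical": "KB-TECH-456",
    "compliance": "KB-COMP-789"
  },
  "kb_routing": {
    "competitive_email": "sales",
    "api_documentation": "technical",
    "marketplace_listing": "compliance"
  }
}', true);

$kbForApiDocs = resolveKnowledgeBaseId($settings, 'api_documentation');
```

Returning null instead of throwing lets the caller decide whether an unrouted content type should skip retrieval or fall back to another provider.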

2. Cross-Partner Knowledge Sharing

For co-sell scenarios:

{
  "shared_kbs": [
    {
      "partner_id": "partner-a",
      "kb_id": "KB-PARTNER-A",
      "permission": "read"
    },
    {
      "partner_id": "partner-b",
      "kb_id": "KB-PARTNER-B",
      "permission": "read"
    }
  ]
}
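
Before querying a shared KB, the agent would need to check that the entry actually grants read access. The sketch below shows one way to filter the `shared_kbs` list. The helper name `readableSharedKbIds` is hypothetical, and treating any permission other than "read" as no access is an assumption.

```php
<?php

/**
 * Hypothetical helper: list the KB IDs an agent may read from the
 * "shared_kbs" setting, optionally restricted to one partner.
 */
function readableSharedKbIds(array $settings, ?string $partnerId = null): array
{
    $ids = [];

    foreach ($settings['shared_kbs'] ?? [] as $entry) {
        if (($entry['permission'] ?? '') !== 'read') {
            continue; // assumption: only an explicit "read" grants access
        }
        if ($partnerId !== null && ($entry['partner_id'] ?? null) !== $partnerId) {
            continue;
        }
        if (isset($entry['kb_id'])) {
            $ids[] = $entry['kb_id'];
        }
    }

    return $ids;
}

$shared = [
    'shared_kbs' => [
        ['partner_id' => 'partner-a', 'kb_id' => 'KB-PARTNER-A', 'permission' => 'read'],
        ['partner_id' => 'partner-b', 'kb_id' => 'KB-PARTNER-B', 'permission' => 'read'],
    ],
];

$allReadable = readableSharedKbIds($shared);
```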

3. Real-time Sync

Bedrock KBs can be re-synced from S3 on a schedule:

  • Partners update docs in S3
  • KB re-indexes on the next sync (hourly/daily)
  • Agents pick up the latest content after each sync
  • No manual upload needed


Implementation Checklist

Backend

  • Install AWS SDK for PHP: composer require aws/aws-sdk-php
  • Create BedrockKnowledgeBaseService class
  • Update QueryKnowledgeBaseCapability for dual provider support
  • Add agent settings: bedrock_kb_id, aws_region, kb_provider
  • Implement provider routing logic
  • Add IAM role configuration docs

Database

  • Add columns to ext_content_manager_agents table:
    ALTER TABLE ext_content_manager_agents
    ADD COLUMN kb_provider VARCHAR(20) DEFAULT 'openai',
    ADD COLUMN bedrock_kb_id VARCHAR(100) NULL,
    ADD COLUMN aws_region VARCHAR(20) DEFAULT 'us-east-1';
    

Configuration

  • Add Bedrock credentials to .env:
    AWS_BEDROCK_KEY=
    AWS_BEDROCK_SECRET=
    AWS_BEDROCK_REGION=us-east-1
    
  • Add to config/services.php:
    'bedrock' => [
        'key' => env('AWS_BEDROCK_KEY'),
        'secret' => env('AWS_BEDROCK_SECRET'),
        'region' => env('AWS_BEDROCK_REGION', 'us-east-1'),
    ]
    

Testing

  • Unit tests for BedrockKnowledgeBaseService
  • Integration tests with actual Bedrock KB
  • Performance benchmarks (Bedrock vs OpenAI)
  • Cost tracking

Documentation

  • Partner setup guide (existing KB)
  • Partner setup guide (new KB)
  • IAM permissions documentation
  • Migration guide (OpenAI → Bedrock)
  • Troubleshooting guide

Benefits Summary

For Vell (Platform Owner)

  • ✅ Lower infrastructure costs (84% savings)
  • ✅ Better AWS partner story
  • ✅ Reduced dependency on OpenAI
  • ✅ Faster performance (regional deployment)
  • ✅ Enterprise-grade security (AWS IAM)

For Partners

  • ✅ Leverage existing AWS investments
  • ✅ No data duplication
  • ✅ S3-based document management (familiar)
  • ✅ Auto-sync from S3 (no manual uploads)
  • ✅ Better compliance (data stays in AWS)
  • ✅ Lower costs

For End Users

  • ✅ Faster query responses (regional)
  • ✅ More accurate results (partner's own KB)
  • ✅ Up-to-date content (auto-sync)
  • ✅ Better citations (S3 URIs)

Example Use Case

Partner: Acme Corp (AWS Advanced Partner)

Current Setup:

  • S3 bucket: s3://acme-product-docs/ (500 PDFs, 2GB)
  • Bedrock KB: KB-ACME-PROD-123
  • Auto-syncs daily at 2 AM UTC
  • Documents: Product specs, API docs, case studies, battle cards

Integration:

{
  "agent": {
    "name": "Acme Sales Assistant",
    "capabilities": ["generate_text", "query_knowledge_base"],
    "settings": {
      "knowledge_base_provider": "bedrock",
      "bedrock_kb_id": "KB-ACME-PROD-123",
      "aws_region": "us-east-1"
    }
  }
}

Result:

  • Agents automatically reference Acme's docs
  • No need to upload to Vell platform
  • Content updates in S3 → available in agents next day
  • 84% lower KB costs
  • Faster responses (AWS regional)


Next Steps

  1. Validate Interest - Survey partners: Who has existing Bedrock KBs?
  2. Pilot Program - 3-5 partners test Bedrock KB integration
  3. Build MVP - Implement BedrockKnowledgeBaseService
  4. Documentation - Partner setup guides
  5. Full Rollout - Make available to all partners

Questions to Consider:

  • Should we support other AWS KB providers (Kendra, OpenSearch)?
  • Should we build a KB creation wizard for partners who don't have one?
  • Should we offer Bedrock KB hosting as a managed service?


Document Version: 1.0
Status: Proposed
Estimated Effort: 2-3 weeks for MVP