Bedrock Model Invocation Optimization
AWS Bedrock is a managed service that provides access to foundation models from multiple providers, each with different performance and cost characteristics. This script analyzes Bedrock model invocations, identifies usage patterns, and generates cost-optimization recommendations based on invocation frequency and token usage.
Benefits of Invocation Optimization
- Cost Efficiency: By monitoring Bedrock model usage, you can identify underutilized models or determine if high-usage models would benefit from provisioned throughput, helping to reduce operational costs.
- Improved Resource Allocation: Knowing the frequency and scale of model invocations can aid in better budgeting and resource management.
- Enhanced Usage Insights: Analyzing invocation patterns helps you understand which models deliver the most value, allowing for targeted optimizations or replacements where needed.
Model Pricing (Per 1,000 Tokens or Per Image)
This pricing information can help calculate cost savings based on usage patterns.
- Amazon Titan Models:
  - amazon.titan-text-express-v1: $0.002
  - amazon.titan-text-express-v2: $0.0025
  - amazon.titan-text-express-v3: $0.003
  - amazon.titan-llama-v1: $0.0015
- AI21 Labs Models:
  - ai21.jurassic-2-mid: $0.0125
  - ai21.jurassic-2-ultra: $0.0188
  - ai21.jamba-1.5-large: $0.002
  - ai21.jamba-1.5-mini: $0.0002
  - ai21.jamba-instruct: $0.0005
- Anthropic Models:
  - anthropic.claude-3-opus: $0.015
  - anthropic.claude-3-sonnet: $0.003
  - anthropic.claude-3-haiku: $0.00025
  - anthropic.claude-2.1: $0.008
- Cohere Models:
  - cohere.command-r: $0.0015
  - cohere.command-r+: $0.002
- Meta Models:
  - meta.llama-3-8b: $0.0015
  - meta.llama-3-70b: $0.002
- Mistral AI Models:
  - mistral.mistral-8x7b-instruct: $0.0015
  - mistral.mistral-7b-instruct: $0.001
- Stability AI Model:
  - stability.stable-diffusion-xl-1.0: $0.02 per image
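In a script, this pricing can be represented as a simple lookup table keyed by model ID. The sketch below (in Python) copies the values listed above; adjust them as AWS pricing changes in your region:

```python
# Price per 1,000 tokens (the Stability AI entry is priced per image).
# Values mirror the pricing list above; update as AWS pricing changes.
MODEL_PRICING = {
    "amazon.titan-text-express-v1": 0.002,
    "amazon.titan-text-express-v2": 0.0025,
    "amazon.titan-text-express-v3": 0.003,
    "amazon.titan-llama-v1": 0.0015,
    "ai21.jurassic-2-mid": 0.0125,
    "ai21.jurassic-2-ultra": 0.0188,
    "ai21.jamba-1.5-large": 0.002,
    "ai21.jamba-1.5-mini": 0.0002,
    "ai21.jamba-instruct": 0.0005,
    "anthropic.claude-3-opus": 0.015,
    "anthropic.claude-3-sonnet": 0.003,
    "anthropic.claude-3-haiku": 0.00025,
    "anthropic.claude-2.1": 0.008,
    "cohere.command-r": 0.0015,
    "cohere.command-r+": 0.002,
    "meta.llama-3-8b": 0.0015,
    "meta.llama-3-70b": 0.002,
    "mistral.mistral-8x7b-instruct": 0.0015,
    "mistral.mistral-7b-instruct": 0.001,
    "stability.stable-diffusion-xl-1.0": 0.02,  # per image
}
```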
Optimization Strategy
Detecting Log Groups
- Objective: Identify CloudWatch log groups associated with AWS Bedrock to monitor model invocations.
- Method: Search for log groups containing "bedrock" in their names to narrow down relevant log groups.
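A minimal sketch of this discovery step, assuming boto3 and a case-insensitive substring match on the log-group name:

```python
import boto3

def find_bedrock_log_groups(region="us-east-1"):
    """Return the names of CloudWatch log groups containing 'bedrock'."""
    logs = boto3.client("logs", region_name=region)
    groups = []
    # describe_log_groups is paginated, so walk every page before filtering.
    for page in logs.get_paginator("describe_log_groups").paginate():
        for group in page["logGroups"]:
            if "bedrock" in group["logGroupName"].lower():
                groups.append(group["logGroupName"])
    return groups
```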
Analyzing Invocation Logs
- Objective: For each model invocation, capture input/output token counts and timestamps to understand usage patterns and calculate total costs.
- Method: Retrieve logs from CloudWatch, parse JSON-formatted messages, and calculate the total token count for each invocation to estimate costs.
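The parsing step might look like the sketch below. The field names (modelId, input.inputTokenCount, output.outputTokenCount) follow the Bedrock invocation-log schema, but verify them against your own log entries:

```python
import json
from collections import defaultdict

def collect_invocations(logs_client, log_group, start_ms, end_ms):
    """Tally invocation counts and total tokens per model from a Bedrock log group."""
    usage = defaultdict(lambda: {"invocations": 0, "tokens": 0})
    paginator = logs_client.get_paginator("filter_log_events")
    for page in paginator.paginate(logGroupName=log_group,
                                   startTime=start_ms, endTime=end_ms):
        for event in page["events"]:
            try:
                record = json.loads(event["message"])
            except json.JSONDecodeError:
                continue  # skip events that are not JSON-formatted
            model_id = record.get("modelId", "unknown")
            tokens = (record.get("input", {}).get("inputTokenCount", 0)
                      + record.get("output", {}).get("outputTokenCount", 0))
            usage[model_id]["invocations"] += 1
            usage[model_id]["tokens"] += tokens
    return usage
```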
Usage Recommendations
- Provisioned Throughput: Models with more than 100 invocations per day are candidates for provisioned throughput, which can reduce costs for high-usage scenarios.
- On-Demand Pricing: Models with low invocation frequency (e.g., fewer than one invocation per day) should remain on on-demand pricing.
- Review Low-Usage Models: Models with minimal activity should be reviewed to determine if they are still required.
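These thresholds reduce to a simple rule. A minimal sketch, assuming the average daily invocation count has already been computed:

```python
def recommend(avg_daily_invocations):
    """Map an average daily invocation count onto the thresholds above."""
    if avg_daily_invocations > 100:
        return "High usage: consider provisioned throughput."
    if avg_daily_invocations < 1:
        return "Minimal usage: stay on on-demand pricing and review whether the model is still needed."
    return "Moderate usage: on-demand pricing is appropriate."
```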
Calculating Potential Savings
For each invocation:
- Calculate the current cost based on token usage and the model's pricing.
- Determine if transitioning to provisioned throughput or reducing usage can yield cost savings.
- Generate recommendations for each model based on average daily invocation counts.
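As a sketch, the cost side of this calculation can reuse the per-1,000-token pricing table shown earlier (the per-image Stability AI entry would be multiplied by image count instead):

```python
def estimate_cost(model_id, total_tokens, pricing):
    """Estimate on-demand cost from total tokens and a per-1,000-token price table."""
    price_per_1k = pricing.get(model_id)
    if price_per_1k is None:
        return None  # no pricing data for this model
    return (total_tokens / 1000.0) * price_per_1k

# Example: estimate_cost("anthropic.claude-3-haiku", 250_000, MODEL_PRICING) -> 0.0625
```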
Implementation Strategy
- Log Fetching: Use a CloudWatch Logs client to gather all relevant Bedrock model invocation logs.
- Data Analysis: Analyze invocation frequency, total token usage, and calculate the associated costs.
- Recommendations: Generate recommendations for provisioned throughput, on-demand pricing, or usage review based on invocation frequency.
- Saving Findings: Optionally, save findings to a DynamoDB table for future reference and tracking.
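A sketch of the optional DynamoDB step is shown below; the table name and key attributes (model_id, analysis_date) are illustrative and should match your own table definition:

```python
import boto3
from datetime import datetime, timezone
from decimal import Decimal

def save_findings(findings, table_name="bedrock-optimization-findings"):
    """Write one item per model to DynamoDB (table name is illustrative)."""
    table = boto3.resource("dynamodb").Table(table_name)
    analysis_date = datetime.now(timezone.utc).isoformat()
    for model_id, data in findings.items():
        table.put_item(Item={
            "model_id": model_id,               # assumed partition key
            "analysis_date": analysis_date,     # assumed sort key
            "invocations": data["invocations"],
            "total_tokens": data["tokens"],
            "estimated_cost": Decimal(str(data.get("cost", 0))),
            "recommendation": data.get("recommendation", ""),
        })
```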
By regularly monitoring and analyzing AWS Bedrock model invocations, you can optimize costs and usage patterns, ensuring that model resources are aligned with actual demand.