Bedrock Model Invocation Optimization

AWS Bedrock is a fully managed service that provides access to foundation models from multiple providers, each with different performance and cost characteristics. This script analyzes Bedrock model invocations, identifies usage patterns, and provides cost-optimization recommendations based on invocation frequency and token usage.

Benefits of Invocation Optimization

  1. Cost Efficiency: By monitoring Bedrock model usage, you can identify underutilized models or determine if high-usage models would benefit from provisioned throughput, helping to reduce operational costs.
  2. Improved Resource Allocation: Knowing the frequency and scale of model invocations can aid in better budgeting and resource management.
  3. Enhanced Usage Insights: Analyzing invocation patterns helps you understand which models deliver the most value, allowing for targeted optimization or replacement where needed.

Model Pricing (Per 1,000 Tokens or Per Image)

This pricing information can be used to estimate current costs and potential savings from observed usage; a sketch of how it might be encoded in the script follows the list below.

  • Amazon Titan Models:
      • amazon.titan-text-express-v1: $0.002
      • amazon.titan-text-express-v2: $0.0025
      • amazon.titan-text-express-v3: $0.003
      • amazon.titan-llama-v1: $0.0015

  • AI21 Labs Models:
      • ai21.jurassic-2-mid: $0.0125
      • ai21.jurassic-2-ultra: $0.0188
      • ai21.jamba-1.5-large: $0.002
      • ai21.jamba-1.5-mini: $0.0002
      • ai21.jamba-instruct: $0.0005

  • Anthropic Models:
      • anthropic.claude-3-opus: $0.015
      • anthropic.claude-3-sonnet: $0.003
      • anthropic.claude-3-haiku: $0.00025
      • anthropic.claude-2.1: $0.008

  • Cohere Models:
      • cohere.command-r: $0.0015
      • cohere.command-r+: $0.002

  • Meta Models:
      • meta.llama-3-8b: $0.0015
      • meta.llama-3-70b: $0.002

  • Mistral AI Models:
      • mistral.mistral-8x7b-instruct: $0.0015
      • mistral.mistral-7b-instruct: $0.001

  • Stability AI Model:
      • stability.stable-diffusion-xl-1.0: $0.02 per image
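
To make the cost math concrete, the pricing table above might be encoded in the script as a plain dictionary keyed by model ID. This is a minimal sketch: the MODEL_PRICING name and the estimate_cost helper are illustrative (not part of any AWS API), and the values simply mirror the list above, so verify them against current Bedrock pricing before relying on them.

```python
# Hypothetical representation of the pricing table above: price per 1,000
# tokens (or per image for Stability AI), keyed by model ID.
MODEL_PRICING = {
    "amazon.titan-text-express-v1": 0.002,
    "amazon.titan-text-express-v2": 0.0025,
    "amazon.titan-text-express-v3": 0.003,
    "amazon.titan-llama-v1": 0.0015,
    "ai21.jurassic-2-mid": 0.0125,
    "ai21.jurassic-2-ultra": 0.0188,
    "ai21.jamba-1.5-large": 0.002,
    "ai21.jamba-1.5-mini": 0.0002,
    "ai21.jamba-instruct": 0.0005,
    "anthropic.claude-3-opus": 0.015,
    "anthropic.claude-3-sonnet": 0.003,
    "anthropic.claude-3-haiku": 0.00025,
    "anthropic.claude-2.1": 0.008,
    "cohere.command-r": 0.0015,
    "cohere.command-r+": 0.002,
    "meta.llama-3-8b": 0.0015,
    "meta.llama-3-70b": 0.002,
    "mistral.mistral-8x7b-instruct": 0.0015,
    "mistral.mistral-7b-instruct": 0.001,
    "stability.stable-diffusion-xl-1.0": 0.02,  # per image, not per 1,000 tokens
}


def estimate_cost(model_id: str, total_tokens: int) -> float:
    """Estimate the cost of one invocation from its total token count."""
    return (total_tokens / 1000) * MODEL_PRICING.get(model_id, 0.0)
```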

Optimization Strategy

Detecting Log Groups

  • Objective: Identify CloudWatch log groups associated with AWS Bedrock to monitor model invocations.
  • Method: Search for log groups containing "bedrock" in their names to narrow down relevant log groups.
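
A minimal sketch of this discovery step, assuming the boto3 CloudWatch Logs client; the find_bedrock_log_groups helper name is illustrative:

```python
import boto3


def find_bedrock_log_groups(region: str = "us-east-1") -> list[str]:
    """Return names of CloudWatch log groups that appear to be Bedrock-related."""
    logs = boto3.client("logs", region_name=region)
    groups: list[str] = []
    # Page through all log groups and keep those with "bedrock" in the name.
    for page in logs.get_paginator("describe_log_groups").paginate():
        for group in page["logGroups"]:
            if "bedrock" in group["logGroupName"].lower():
                groups.append(group["logGroupName"])
    return groups
```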

Analyzing Invocation Logs

  • Objective: For each model invocation, capture input/output token counts and timestamps to understand usage patterns and calculate total costs.
  • Method: Retrieve logs from CloudWatch, parse JSON-formatted messages, and calculate the total token count for each invocation to estimate costs.
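
The parsing step might look like the sketch below. The boto3 filter_log_events paginator is standard, but the JSON field names (modelId, input.inputTokenCount, output.outputTokenCount) are assumptions about how model invocation logging is configured in your account; adjust them to match your log schema.

```python
import json

import boto3


def collect_invocations(log_group: str, region: str = "us-east-1") -> list[dict]:
    """Parse Bedrock invocation log events into per-invocation records."""
    logs = boto3.client("logs", region_name=region)
    records: list[dict] = []
    for page in logs.get_paginator("filter_log_events").paginate(logGroupName=log_group):
        for event in page["events"]:
            try:
                message = json.loads(event["message"])
            except json.JSONDecodeError:
                continue  # skip events that are not JSON-formatted
            # Assumed field names; adjust to your invocation-log schema.
            input_tokens = message.get("input", {}).get("inputTokenCount", 0)
            output_tokens = message.get("output", {}).get("outputTokenCount", 0)
            records.append({
                "model_id": message.get("modelId", "unknown"),
                "total_tokens": input_tokens + output_tokens,
                "timestamp": event["timestamp"],  # epoch milliseconds
            })
    return records
```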

Usage Recommendations

  • Provisioned Throughput: Models averaging more than 100 invocations per day are candidates for provisioned throughput, which can reduce costs for high-usage scenarios (see the sketch after this list).
  • On-Demand Pricing: Models invoked infrequently (e.g., less than once per day on average) are better served by on-demand pricing.
  • Review Low-Usage Models: Models with minimal activity should be reviewed to determine if they are still required.
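
These thresholds might map to a small helper like the sketch below; the 100/day and 1/day cutoffs mirror the list above and should be tuned to your own workload:

```python
def recommend(avg_daily_invocations: float) -> str:
    """Map an average daily invocation count to a usage recommendation."""
    if avg_daily_invocations > 100:
        return "Consider provisioned throughput"
    if avg_daily_invocations < 1:
        return "Review whether this model is still required"
    return "Keep on-demand pricing"
```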

Calculating Potential Savings

For each invocation:

  • Calculate the current cost based on token usage and the model's pricing.
  • Determine whether transitioning to provisioned throughput or reducing usage would yield cost savings.
  • Generate recommendations for each model based on average daily invocation counts.
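
Putting these pieces together, a per-model summary might look like the sketch below; it reuses the hypothetical pricing table, the parsed invocation records, and the recommend helper from the earlier sketches:

```python
from collections import defaultdict


def summarize_costs(records: list[dict], pricing: dict, days: int) -> dict:
    """Aggregate invocations, tokens, and estimated cost per model, then attach a recommendation."""
    summary: dict = defaultdict(lambda: {"invocations": 0, "tokens": 0, "cost": 0.0})
    for rec in records:
        stats = summary[rec["model_id"]]
        stats["invocations"] += 1
        stats["tokens"] += rec["total_tokens"]
        stats["cost"] += (rec["total_tokens"] / 1000) * pricing.get(rec["model_id"], 0.0)
    for stats in summary.values():
        stats["avg_daily_invocations"] = stats["invocations"] / max(days, 1)
        stats["recommendation"] = recommend(stats["avg_daily_invocations"])
    return dict(summary)
```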

Implementation Strategy

  1. Log Fetching: Use a CloudWatch Logs client to gather all relevant Bedrock model invocation logs.
  2. Data Analysis: Analyze invocation frequency, total token usage, and calculate the associated costs.
  3. Recommendations: Generate recommendations for provisioned throughput, on-demand pricing, or usage review based on invocation frequency.
  4. Saving Findings: Optionally, save findings to a DynamoDB table for future reference and tracking.
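
Step 4 might be sketched as follows, assuming a DynamoDB table with model_id as the partition key and analyzed_at as the sort key; the table name and key schema are illustrative only:

```python
from datetime import datetime, timezone
from decimal import Decimal

import boto3


def save_findings(summary: dict, table_name: str = "bedrock-optimization-findings") -> None:
    """Write one item per model to DynamoDB so findings can be tracked over time."""
    table = boto3.resource("dynamodb").Table(table_name)
    analyzed_at = datetime.now(timezone.utc).isoformat()
    for model_id, stats in summary.items():
        table.put_item(Item={
            "model_id": model_id,        # assumed partition key
            "analyzed_at": analyzed_at,  # assumed sort key
            "invocations": stats["invocations"],
            "total_tokens": stats["tokens"],
            # DynamoDB requires Decimal instead of float for numeric attributes.
            "estimated_cost": Decimal(str(round(stats["cost"], 6))),
            "recommendation": stats["recommendation"],
        })
```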

By regularly monitoring and analyzing AWS Bedrock model invocations, you can optimize costs and usage patterns, ensuring that model resources are aligned with actual demand.