
SageMaker

SageMaker, Amazon Web Services' machine learning platform, offers a comprehensive suite of tools for data scientists to experiment with datasets and to train and develop models.

However, SageMaker resources — particularly GPU-backed endpoints and notebook instances — can become extremely costly when left running without active use. A single forgotten ml.p3.2xlarge endpoint costs over $2,700/month.
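The monthly figure above follows from simple arithmetic on an hourly rate. The sketch below assumes an on-demand rate of $3.825/hour for ml.p3.2xlarge and a 730-hour average month; both values are assumptions for illustration, and actual rates vary by region and change over time.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)


def monthly_cost(hourly_rate_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost in USD of an instance left running for the given number of hours."""
    return round(hourly_rate_usd * hours, 2)


# Assumed on-demand rate for an ml.p3.2xlarge endpoint; check current pricing.
ASSUMED_P3_2XLARGE_HOURLY_USD = 3.825

print(monthly_cost(ASSUMED_P3_2XLARGE_HOURLY_USD))
```

At that assumed rate the result is roughly $2,790 for a month of uninterrupted runtime.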

We actively detect idle SageMaker resources across three resource types (endpoints, notebook instances, and Studio apps) to help ML teams and FinOps practitioners reclaim this waste.

Endpoints

SageMaker inference endpoints carry the highest cost risk. We analyze CloudWatch metrics over an extended observation period to determine whether an endpoint is receiving real inference traffic. Only endpoints with no meaningful activity are flagged, so that active production models are not reported as idle.
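As a rough illustration of this kind of check (not necessarily how the product implements it), the sketch below sums the CloudWatch `Invocations` metric for one endpoint variant over a lookback window. The 14-day window, the zero-invocation threshold, the default variant name, and the helper names are all assumptions; `cloudwatch` is expected to be a boto3 CloudWatch client, for example `boto3.client("cloudwatch")`.

```python
from datetime import datetime, timedelta, timezone


def total_invocations(cloudwatch, endpoint_name, variant_name="AllTraffic",
                      lookback_days=14):
    """Sum the Invocations metric for one endpoint variant over the window."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=lookback_days)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/SageMaker",
        MetricName="Invocations",
        Dimensions=[
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        StartTime=start,
        EndTime=end,
        Period=86400,          # one datapoint per day
        Statistics=["Sum"],
    )
    return sum(dp["Sum"] for dp in resp.get("Datapoints", []))


def is_endpoint_idle(cloudwatch, endpoint_name, variant_name="AllTraffic",
                     lookback_days=14, max_invocations=0):
    """Hypothetical rule: idle if invocations in the window do not exceed the threshold."""
    count = total_invocations(cloudwatch, endpoint_name, variant_name, lookback_days)
    return count <= max_invocations
```

A nonzero `max_invocations` could be used to ignore health checks or occasional test calls; what counts as "meaningful activity" is a policy choice this sketch leaves to the caller.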

Notebooks

SageMaker Notebook Instances left in InService state are billed continuously, even when no one is actively using them. We detect notebooks that have been running idle for an extended period without any modification or interaction, flagging those that are likely forgotten by their owners.
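One simple way to approximate this is sketched below, under stated assumptions: it lists `InService` notebook instances through the SageMaker `list_notebook_instances` API and keeps those whose `LastModifiedTime` is older than a cutoff. The 7-day threshold and the function name are hypothetical. `LastModifiedTime` reflects changes to the instance itself, not activity inside Jupyter, so it is only a proxy for interaction.

```python
from datetime import datetime, timedelta, timezone


def find_stale_notebooks(sagemaker, idle_days=7, now=None):
    """Return names of InService notebooks not modified within idle_days.

    `sagemaker` is expected to be a boto3 SageMaker client.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=idle_days)
    stale = []
    kwargs = {"StatusEquals": "InService"}
    while True:
        page = sagemaker.list_notebook_instances(**kwargs)
        for nb in page.get("NotebookInstances", []):
            # boto3 returns timezone-aware datetimes for LastModifiedTime
            if nb["LastModifiedTime"] < cutoff:
                stale.append(nb["NotebookInstanceName"])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token  # follow pagination
    return stale
```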

Apps

SageMaker Studio Apps (JupyterServer, KernelGateway, Canvas, etc.) run on dedicated compute instances that are billed while active. We surface all running apps with their instance types, uptime, and estimated costs so teams can identify and shut down forgotten Studio sessions.
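A report like this could be assembled along the following lines. This is a minimal sketch, assuming the input is the `Apps` list returned by the SageMaker `list_apps` API and that hourly rates are supplied by the caller; the function name, the output field names, and any rates passed in are hypothetical. Uptime is measured from `CreationTime`, so the cost is an estimate rather than a billing figure.

```python
from datetime import datetime, timezone


def summarize_running_apps(apps, hourly_rates, now=None):
    """Build one summary row per InService Studio app.

    apps:         list of app dicts, e.g. the "Apps" list from list_apps
    hourly_rates: mapping of instance type -> assumed USD per hour
    """
    now = now or datetime.now(timezone.utc)
    rows = []
    for app in apps:
        if app.get("Status") != "InService":
            continue  # skip deleted, failed, or pending apps
        instance_type = app.get("ResourceSpec", {}).get("InstanceType", "unknown")
        uptime_hours = (now - app["CreationTime"]).total_seconds() / 3600
        rate = hourly_rates.get(instance_type)
        rows.append({
            "app": app["AppName"],
            "type": app["AppType"],
            "instance_type": instance_type,
            "uptime_hours": round(uptime_hours, 1),
            # None when no rate is known for this instance type
            "estimated_cost_usd": None if rate is None else round(rate * uptime_hours, 2),
        })
    return rows
```

Sorting the rows by `estimated_cost_usd` would put the most expensive forgotten sessions first.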