Top 100 Developer Tooling and Productivity SaaS Ideas to Launch in 2026 to Scale to $10,000 Monthly Recurring Revenue (MRR)

Leveraging AI for Automated Code Review and Refactoring

The sheer volume of code produced daily necessitates efficient, automated quality assurance. AI-powered tools can analyze code for common anti-patterns, security vulnerabilities, and performance bottlenecks, offering actionable refactoring suggestions. This isn’t about replacing human review but augmenting it, freeing up senior engineers for complex architectural decisions.

Consider a SaaS that integrates with GitHub/GitLab webhooks. Upon a `push` or `pull_request` event, it triggers an AI analysis. The AI model, trained on vast code repositories and best practices, would parse the diff, identify potential issues, and generate a detailed report. For a $10k MRR target, focus on niche languages or specific frameworks where existing tools are less mature.
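
Before any analysis runs, the webhook endpoint should confirm that a delivery really came from the code host. The following is a minimal sketch, assuming GitHub-style deliveries where the `X-Hub-Signature-256` header carries an HMAC-SHA256 of the raw request body; the secret value and the `should_analyze` helper are hypothetical names introduced for illustration.

```python
import hashlib
import hmac
import json

# Events that should trigger an analysis run (taken from the text above).
ANALYZED_EVENTS = {"push", "pull_request"}


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style X-Hub-Signature-256 header against the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, signature_header)


def should_analyze(event_name: str, secret: bytes, body: bytes,
                   signature_header: str) -> bool:
    """Return True when the delivery is authentic and is an event we analyze."""
    if not verify_signature(secret, body, signature_header):
        return False
    return event_name in ANALYZED_EVENTS


if __name__ == "__main__":
    secret = b"hypothetical-webhook-secret"  # placeholder, not a real secret
    body = json.dumps({"ref": "refs/heads/main"}).encode()
    good_signature = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    print(should_analyze("push", secret, body, good_signature))
    print(should_analyze("push", secret, body, "sha256=deadbeef"))
```

The signature must be computed over the raw bytes of the request, so a real handler would verify before parsing the JSON payload.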

Technical Implementation Sketch: Python-based Analysis Engine

A core component would be a Python service that uses the standard-library ast module for abstract syntax tree parsing and, potentially, a fine-tuned transformer model (e.g., from Hugging Face) for semantic analysis. Running the service on AWS Lambda or Google Cloud Functions lets processing scale with webhook volume.

import ast
import json
from typing import List, Dict, Any

class CodeAnalyzer:
    def __init__(self, code_string: str):
        self.code_string = code_string
        self.tree = ast.parse(code_string)
        self.issues: List[Dict[str, Any]] = []

    def analyze(self):
        self.check_for_unused_variables()
        self.check_for_long_functions()
        # Add more checks here...
        return self.issues

    def check_for_unused_variables(self):
        # Maps each assigned name to the line of its earliest assignment.
        # Note: this is a scope-insensitive heuristic over the whole module.
        defined_vars: Dict[str, int] = {}
        used_vars = set()

        for node in ast.walk(self.tree):
            if isinstance(node, ast.Assign):
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        # ast.walk is not in source order, so keep the smallest line
                        first_line = defined_vars.get(target.id, target.lineno)
                        defined_vars[target.id] = min(first_line, target.lineno)
            elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                used_vars.add(node.id)

        for var_name, line in sorted(defined_vars.items(), key=lambda item: item[1]):
            if var_name not in used_vars:
                self.issues.append({
                    "type": "UnusedVariable",
                    "message": f"Variable '{var_name}' is assigned but never used.",
                    "line": line
                })

    def check_for_long_functions(self, max_lines: int = 50):
        for node in ast.walk(self.tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Crude line count estimation
                start_line = node.lineno
                end_line = node.end_lineno if hasattr(node, 'end_lineno') else start_line
                num_lines = end_line - start_line + 1
                if num_lines > max_lines:
                    self.issues.append({
                        "type": "LongFunction",
                        "message": f"Function '{node.name}' is too long ({num_lines} lines).",
                        "line": start_line
                    })

# Example Usage (within a webhook handler)
# code_content = "def my_func(a, b):\n    c = a + b\n    d = 10 # unused\n    return c\n"
# analyzer = CodeAnalyzer(code_content)
# found_issues = analyzer.analyze()
# print(json.dumps(found_issues, indent=2))
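
Because the webhook flow described earlier works on diffs, a review bot usually reports only issues that fall on lines the change touched. The sketch below shows one way to do that, assuming a unified diff for a single file and the issue dictionaries produced by the analyzer above; `changed_lines` and `filter_issues` are hypothetical helper names.

```python
import re
from typing import Any, Dict, List, Set

# Matches a unified-diff hunk header such as "@@ -10,4 +12,6 @@"
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_lines(diff_text: str) -> Set[int]:
    """Return new-file line numbers added or modified by a single-file unified diff."""
    changed: Set[int] = set()
    old_left = new_left = 0  # lines still expected in the current hunk
    new_line = 0
    for line in diff_text.splitlines():
        if old_left <= 0 and new_left <= 0:
            # Outside a hunk: skip file headers until the next hunk header
            match = HUNK_HEADER.match(line)
            if match:
                old_left = int(match.group(1) or 1)  # an omitted count means 1
                new_line = int(match.group(2))
                new_left = int(match.group(3) or 1)
            continue
        if line.startswith("+"):
            changed.add(new_line)
            new_line += 1
            new_left -= 1
        elif line.startswith("-"):
            old_left -= 1  # removed lines do not exist in the new file
        elif line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        else:
            new_line += 1  # context line, present in both versions
            old_left -= 1
            new_left -= 1
    return changed


def filter_issues(issues: List[Dict[str, Any]], diff_text: str) -> List[Dict[str, Any]]:
    """Keep only issues whose 'line' falls on a changed line."""
    touched = changed_lines(diff_text)
    return [issue for issue in issues if issue.get("line") in touched]
```

Issues without a line number are dropped by this filter, which is one reason to record a line for every finding.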

Intelligent API Gateway for Microservices with Advanced Rate Limiting and Caching

As microservice architectures mature, managing inter-service communication becomes a significant challenge. An intelligent API gateway can provide centralized authentication, authorization, request routing, and crucially, sophisticated rate limiting and caching strategies tailored to individual endpoints and user tiers. This is a prime candidate for a $10k MRR SaaS targeting businesses with complex microservice deployments.

The core differentiator here would be dynamic, AI-driven rate limiting that adapts to traffic patterns and potential abuse, rather than static, pre-configured limits. Caching could be intelligent, invalidating cache entries based on downstream service responses or event streams.
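
To make the idea of adaptive limits concrete, here is a minimal token-bucket sketch in Python. It is an illustration under stated assumptions, not a production limiter: the tier names and numbers in `TIER_LIMITS` are invented, and `load_factor` stands in for whatever signal the "intelligent" layer would produce (for example, a value below 1.0 when backends are under pressure).

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical tiers: (sustained requests per second, burst capacity)
TIER_LIMITS: Dict[str, Tuple[float, float]] = {
    "free": (1.0, 5.0),
    "pro": (10.0, 50.0),
}


class TokenBucket:
    """Token bucket whose refill rate can be scaled per request."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable for testing
        self.tokens = capacity    # start with a full bucket
        self.updated = clock()

    def allow(self, load_factor: float = 1.0) -> bool:
        """Consume one token if available; load_factor < 1 slows the refill."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        refill = elapsed * self.rate * load_factor
        self.tokens = min(self.capacity, self.tokens + refill)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def bucket_for(tier: str, clock: Callable[[], float] = time.monotonic) -> TokenBucket:
    """Build a bucket from the (hypothetical) tier table."""
    rate, capacity = TIER_LIMITS[tier]
    return TokenBucket(rate=rate, capacity=capacity, clock=clock)
```

In a gateway, one bucket would be kept per client key (API key or user ID), and a rejected request would translate into an HTTP 429 response.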

Configuration Example: Nginx with Lua Scripting for Dynamic Rate Limiting

Leveraging Nginx with the lua-nginx-module allows for highly flexible, programmatic control over request handling. This enables complex logic for rate limiting based on various factors.

# nginx.conf snippet

# Define a shared memory zone for storing rate limit counters (http context)
# The size should be sufficient for your expected number of clients and rate limits
# 'rate_limit_zone' is the name of the zone, '10m' is its size
lua_shared_dict rate_limit_zone 10m;

# Search path for Lua libraries such as lua-resty-redis
lua_package_path "/path/to/lua/libs/?.lua;;";

server {
    listen 80;
    server_name api.example.com;

    location / {
        # Access the rate limit zone
        access_by_lua_block {
            local redis = require "resty.redis"
            local cjson = require "cjson.safe" -- decode() returns nil instead of raising on bad JSON

            local DEFAULT_LIMIT, DEFAULT_PERIOD = 100, 60 -- Default: 100 requests per minute

            -- Fetch the dynamic rate limit from Redis or another service
            -- This function would implement the "intelligent" part, e.g., based on user tier, API endpoint, current traffic
            local function get_dynamic_rate_limit(client_ip)
                local r = redis:new()
                r:set_timeout(1000) -- 1 sec
                local ok, err = r:connect("127.0.0.1", 6379)
                if not ok then
                    -- Fall back to the default limit rather than failing the request
                    ngx.log(ngx.ERR, "failed to connect to redis: ", err)
                    return DEFAULT_LIMIT, DEFAULT_PERIOD
                end

                -- Example: Fetch rate limit for a specific API key or user ID (if available in headers)
                -- For simplicity, using IP here. In production, use authenticated user ID.
                local rate_key = "rate_limit:" .. client_ip
                local rate_config_json, get_err = r:get(rate_key)
                if not rate_config_json then
                    ngx.log(ngx.ERR, "failed to read rate limit from redis: ", get_err)
                end
                r:set_keepalive(10000, 100) -- return the connection to the pool

                -- A missing key comes back as ngx.null, which is truthy in Lua
                if rate_config_json and rate_config_json ~= ngx.null then
                    local config = cjson.decode(rate_config_json)
                    -- Example config: {"limit": 100, "period": 60} -- 100 requests per 60 seconds
                    if config and config.limit and config.period then
                        return config.limit, config.period
                    end
                end

                -- Default rate limit if not found or malformed
                return DEFAULT_LIMIT, DEFAULT_PERIOD
            end

            local client_ip = ngx.var.remote_addr
            local limit, period = get_dynamic_rate_limit(client_ip)

            -- Nginx's built-in limit_req directive cannot take a per-request rate, and
            -- resty.limit.req (lua-resty-limit-traffic) fixes its rate when the limiter
            -- object is created. For per-client dynamic limits, a fixed-window counter
            -- kept in the lua_shared_dict is the simplest option.
            local counters = ngx.shared.rate_limit_zone
            local key = "count:" .. client_ip -- add the endpoint or user ID for a more granular key

            -- incr() creates a missing key with value 0 and a TTL of `period` seconds
            -- (the init_ttl argument requires a reasonably recent lua-nginx-module)
            local count, err = counters:incr(key, 1, 0, period)
            if not count then
                ngx.log(ngx.ERR, "failed to increment rate limit counter: ", err)
                return ngx.exit(500)
            end

            -- Response headers for the client (ngx.req.set_header would only change
            -- the request that is forwarded upstream)
            ngx.header["X-RateLimit-Limit"] = limit
            ngx.header["X-RateLimit-Remaining"] = math.max(limit - count, 0)

            if count > limit then
                ngx.log(ngx.INFO, "Rate limit exceeded for ", key)
                return ngx.exit(429)
            end

            ngx.log(ngx.INFO, "Dynamic rate limit: ", limit, " per ", period, "s for ", client_ip)
        }

        # Proxy pass to your backend services
        proxy_pass http://your_backend_service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Automated Infrastructure as Code (IaC) Drift Detection and Remediation

Cloud environments are dynamic, and manual changes or misconfigurations can lead to significant security risks and operational instability. A SaaS that continuously monitors IaC definitions (Terraform, CloudFormation, Pulumi) against the actual deployed infrastructure, detecting and optionally remediating drift, offers immense value. Targeting teams heavily invested in cloud-native architectures and compliance requirements can drive adoption.

The key is robust diffing capabilities and intelligent remediation strategies. For instance, detecting an unauthorized security group change and automatically reverting it or flagging it for immediate review. The $10k MRR can be achieved by offering tiered plans based on the number of monitored resources, frequency of scans, and complexity of remediation workflows.
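
The diffing step can be pictured as comparing two dictionaries: what the IaC code declares and what the cloud API reports. The sketch below is a simplified illustration with hypothetical resource IDs and attribute names; the `HIGH_SEVERITY_ATTRS` policy is an assumption, and a real product would normalize provider-specific attributes before comparing.

```python
from typing import Any, Dict, List

# Hypothetical policy: attributes whose drift warrants immediate review.
HIGH_SEVERITY_ATTRS = {"ingress", "egress", "policy", "public_access"}

Resource = Dict[str, Any]


def diff_resource(desired: Resource, actual: Resource) -> List[Dict[str, Any]]:
    """List attribute-level differences between declared and deployed state."""
    changes = []
    for key in sorted(set(desired) | set(actual)):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            changes.append({
                "attribute": key,
                "desired": want,
                "actual": have,
                "severity": "high" if key in HIGH_SEVERITY_ATTRS else "low",
            })
    return changes


def detect_drift(desired: Dict[str, Resource],
                 actual: Dict[str, Resource]) -> Dict[str, Any]:
    """Classify resources as missing, unmanaged, or drifted."""
    report: Dict[str, Any] = {"missing": [], "unmanaged": [], "drifted": {}}
    for resource_id in sorted(set(desired) | set(actual)):
        if resource_id not in actual:
            report["missing"].append(resource_id)      # declared but not deployed
        elif resource_id not in desired:
            report["unmanaged"].append(resource_id)    # deployed but not declared
        else:
            changes = diff_resource(desired[resource_id], actual[resource_id])
            if changes:
                report["drifted"][resource_id] = changes
    return report
```

A remediation workflow could then auto-revert only low-severity changes and route high-severity ones, such as an opened security group port, to a human for review.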

Workflow Example: Terraform State Comparison with Git Repository

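This involves comparing the infrastructure recorded in the Terraform state (local or in a remote state backend), which reflects the IaC code committed to Git, against what is actually deployed. Running terraform plan with a refresh surfaces drift in managed resources; tools like driftctl or custom scripts can also find unmanaged resources and automate the comparison.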

#!/bin/bash

# --- Configuration ---
TERRAFORM_DIR="./path/to/your/terraform/project"
GIT_REPO_DIR="./path/to/your/git/repo" # Where IaC code is checked out
REMOTE_STATE_BACKEND_CONFIG="$HOME/.terraformrc" # Terraform CLI config file (a quoted ~ is not expanded by the shell)
DRIFTCTL_CONFIG="./driftctl-config.yaml" # Optional driftctl configuration

# --- Pre-requisites ---
# Ensure Terraform is installed and configured for remote state access
# Ensure driftctl is installed: https://docs.driftctl.io/installation/
# Ensure Git is installed

# --- Script Logic ---

echo "INFO: Navigating to Terraform project directory..."
cd "$TERRAFORM_DIR" || { echo "ERROR: Could not change directory to $TERRAFORM_DIR"; exit 1; }

echo "INFO: Initializing Terraform..."
# Point Terraform at the CLI configuration file; backend settings are passed to init below
export TF_CLI_CONFIG_FILE="$REMOTE_STATE_BACKEND_CONFIG"
terraform init -input=false -backend-config="bucket=your-terraform-state-bucket" # Example backend config

echo "INFO: Generating Terraform plan to compare the configuration with real infrastructure..."
# Refresh is enabled by default; it is spelled out here because drift detection relies on it.
# (-refresh=false would skip the cloud API calls and hide drift.)
terraform plan -out=tfplan -refresh=true

echo "INFO: Running driftctl to detect drift..."
# driftctl compares the resources recorded in a Terraform state file with what
# actually exists in the cloud account.
# Alternatively, you could use 'terraform state pull' and parse the JSON yourself.

# Ensure driftctl has access to cloud credentials (e.g., via environment variables)
# Example: export AWS_ACCESS_KEY_ID=...
# Example: export AWS_SECRET_ACCESS_KEY=...

# Perform a scan against your cloud provider
# --from names the Terraform state to read, --to the provider to scan,
# and --output the report format and destination.
# The state key below is a placeholder; adjust it to your backend layout.
# Note: driftctl exits non-zero when it finds drift, so do not run this under 'set -e'.
echo "INFO: Scanning cloud infrastructure against the Terraform state..."
driftctl scan \
    --from "tfstate+s3://your-terraform-state-bucket/path/to/terraform.tfstate" \
    --to aws+tf \
    --output json://drift_report.json

echo "INFO: Analyzing drift report..."
# Parse the drift_report.json for actionable insights.
# This is where your SaaS logic would reside:
# - Identify drift types (unmanaged resources, missing resources, drifted resources)
# - Categorize by severity
# - Trigger alerts or automated remediation workflows

# Example: Count drifted resources
# The field names assume driftctl's JSON report layout (a top-level "summary" object);
# verify them against the version of driftctl you run.
DRIFTED_COUNT=$(jq '.summary.total_changed + .summary.total_unmanaged + .summary.total_missing' drift_report.json)

if [ "$DRIFTED_COUNT" -gt 0 ]; then
    echo "WARNING: Found $DRIFTED_COUNT drifted resources!"
    echo "Review drift_report.json for details."
    # In a SaaS, you'd send this to a notification service (Slack, email, etc.)
    # and potentially trigger a remediation playbook.
    # Example: jq -c '{message: "Infrastructure drift detected!", details: .summary}' drift_report.json | curl -X POST -H "Content-Type: application/json" -d @- YOUR_ALERTING_WEBHOOK
else
    echo "INFO: No infrastructure drift detected. Infrastructure is in sync with IaC."
fi

# Optional: Automated Remediation (Use with extreme caution!)
# This would involve parsing the report and executing terraform apply with specific changes,
# or using cloud provider APIs to revert unauthorized changes.
# Example (highly simplified, DO NOT use in production without extensive testing):
# if [ "$DRIFTED_COUNT" -gt 0 ]; then
#     echo "INFO: Attempting automated remediation..."
#     # This is a placeholder. Real remediation is complex.
#     # You might need to terraform import unmanaged resources (or delete them),
#     # and run terraform apply to revert drifted resources or recreate missing ones.
#     # For example, if an instance was created manually outside Terraform:
#     # terraform import aws_instance.my_instance i-0123456789abcdef0
#     # Then run 'terraform apply'
# fi

echo "INFO: Drift detection process completed."

Real-time Collaboration Platform for Distributed Development Teams

The shift to remote and hybrid work models has amplified the need for seamless collaboration. A SaaS offering real-time code sharing, collaborative debugging, integrated pair programming, and asynchronous communication channels specifically designed for developers can capture significant market share. Think of it as a specialized Slack/VS Code Live Share hybrid.

Monetization can be tiered based on the number of active users, features (e.g., advanced debugging tools, AI code suggestions during collaboration), and integration capabilities with existing VCS and CI/CD pipelines. Achieving $10k MRR requires a polished UX and robust, low-latency infrastructure.

Core Components: WebSockets and CRDTs for Real-time Synchronization

Implementing real-time collaborative editing typically involves WebSockets for persistent, low-latency communication and Conflict-free Replicated Data Types (CRDTs) to manage concurrent edits from multiple users without central locking. Libraries like Yjs in JavaScript are excellent for this.

// --- Frontend (Conceptual - using Yjs and a WebSocket provider) ---
import * as Y from 'yjs';
import { WebsocketProvider } from 'y-websocket'; // Or WebrtcProvider from 'y-webrtc' for P2P

// Create a Yjs document
const doc = new Y.Doc();

// Connect to a WebSocket server for synchronization
// Replace with your actual WebSocket server URL
const provider = new WebsocketProvider('ws://localhost:1234', 'my-shared-document-room', doc);

// Get a shared text type
const text = doc.getText('codemirror'); // 'codemirror' is the key for this shared text

// --- Initialize a Code Editor (e.g., CodeMirror, Monaco Editor) ---
// Assume 'editor' is an instance of your code editor
// editor.setValue(text.toString()); // Initial load

// --- Bind Yjs to the Editor ---
// This part involves listening to editor changes and updating Yjs,
// and listening to Yjs changes and updating the editor.
// Libraries like y-codemirror or y-monaco handle this binding.

// Example binding logic (simplified):
text.observe((event) => {
    // Local edits are already in the editor; only remote changes need applying.
    // Skipping them also avoids an editor -> Yjs -> editor feedback loop.
    if (event.transaction.local) return;
    // event.delta describes the change as retain/insert/delete operations.
    // Applying the delta, rather than resetting the whole content, keeps the
    // cursor position stable.
    // applyDeltaToEditor(editor, event.delta); // hypothetical helper
});

// Listen to editor changes and apply them to Yjs
// (naive full replacement; real bindings translate each change into a minimal insert/delete)
// editor.on('changes', () => {
//     doc.transact(() => {
//         text.delete(0, text.length); // Clear existing text
//         text.insert(0, editor.getValue()); // Insert new text
//     });
// });

// --- Backend (Conceptual - Node.js with ws library) ---
/*
// Assumes y-websocket v1, which ships its server helpers in bin/utils;
// later releases moved the server into a separate package.
const WebSocket = require('ws');
const { setupWSConnection } = require('y-websocket/bin/utils');

const wss = new WebSocket.Server({ port: 1234 });

wss.on('connection', (conn, req) => {
    // Optional: authenticate the request here before joining the room.
    // Persistence can be enabled through the YPERSISTENCE environment variable.
    setupWSConnection(conn, req);
});

console.log('Yjs WebSocket server started on ws://localhost:1234');
*/
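
To show why CRDTs converge without central locking, here is a toy sequence CRDT in Python. It is a teaching sketch only: Yjs uses its own, far more optimized algorithm, and the class and method names below are hypothetical. Each character gets a unique ID (logical clock, replica name) and remembers the character it was inserted after; deletions are recorded as tombstones. Because the rendered text is a deterministic function of those sets, replicas that have seen the same operations show the same text regardless of merge order.

```python
from typing import Dict, List, Optional, Set, Tuple

ElemId = Tuple[int, str]  # (logical clock, replica name)


class SequenceCRDT:
    def __init__(self, replica: str) -> None:
        self.replica = replica
        self.clock = 0
        # element id -> (id of the element it was inserted after, character)
        self.elems: Dict[ElemId, Tuple[Optional[ElemId], str]] = {}
        self.deleted: Set[ElemId] = set()  # tombstones

    def _ordered_ids(self) -> List[ElemId]:
        children: Dict[Optional[ElemId], List[ElemId]] = {}
        for elem_id, (parent, _) in self.elems.items():
            children.setdefault(parent, []).append(elem_id)
        ordered: List[ElemId] = []
        stack = sorted(children.get(None, []))  # pop() yields the newest id first
        while stack:
            elem_id = stack.pop()
            ordered.append(elem_id)
            stack.extend(sorted(children.get(elem_id, [])))
        return ordered

    def _visible_ids(self) -> List[ElemId]:
        return [i for i in self._ordered_ids() if i not in self.deleted]

    def insert(self, index: int, char: str) -> None:
        visible = self._visible_ids()
        parent = visible[index - 1] if index > 0 else None
        self.clock += 1
        self.elems[(self.clock, self.replica)] = (parent, char)

    def delete(self, index: int) -> None:
        self.deleted.add(self._visible_ids()[index])

    def merge(self, other: "SequenceCRDT") -> None:
        # Union of inserts and tombstones: commutative, associative, idempotent
        self.elems.update(other.elems)
        self.deleted |= other.deleted
        self.clock = max(self.clock, other.clock)

    def text(self) -> str:
        return "".join(self.elems[i][1] for i in self._visible_ids())
```

For example, if replica A appends "!" to "hello" while replica B concurrently prepends ">" and deletes "h", both replicas read ">ello!" after merging in either order.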

AI-Powered Test Case Generation and Optimization

Writing comprehensive test suites is time-consuming and often relies on developer intuition. An AI that analyzes application code, user stories, or even production logs to automatically generate relevant test cases (unit, integration, E2E) and optimize existing ones for better coverage and efficiency is a powerful offering. This targets QA teams and development leads focused on quality and speed.

The SaaS could integrate with CI/CD pipelines, providing test generation as a build step. Monetization tiers could be based on the volume of code analyzed, the number of test cases generated, and the sophistication of the AI models used (e.g., symbolic execution, model-based testing). A $10k MRR goal is achievable by focusing on specific testing frameworks or application types (e.g., API testing, UI testing for React apps).

Approach: Analyzing Code Structure and Dependencies for Test Generation

For unit test generation, the AI can parse function signatures, analyze control flow, and identify potential edge cases. For integration tests, it can examine API contracts and data flow between services. Python with libraries like unittest or pytest, and potentially graph-based analysis for dependencies, would be suitable.

import ast
import unittest
from typing import List, Dict, Any, Tuple

class TestCaseGenerator:
    def __init__(self, source_code: str):
        self.source_code = source_code
        self.tree = ast.parse(source_code)
        self.generated_tests: List[str] = []

    def generate_tests(self):
        for node in ast.walk(self.tree):
            if isinstance(node, ast.FunctionDef):
                # Focus on functions not starting with '_' (convention for private/internal)
                if not node.name.startswith('_'):
                    self.generate_unit_tests_for_function(node)
        return "\n\n".join(self.generated_tests)

    def generate_unit_tests_for_function(self, func_node: ast.FunctionDef):
        func_name = func_node.name
        class_name = self._find_enclosing_class(func_node)

        # Basic test case structure
        test_class_name = f"Test{class_name}{func_name.capitalize()}" if class_name else f"Test{func_name.capitalize()}"
        test_method_prefix = f"test_{func_name}"

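        # Analyze function signature for potential arguments
        arg_names = [arg.arg for arg in func_node.args.args]
        # Methods receive 'self'/'cls' implicitly, so don't pass them in generated calls
        if class_name and arg_names and arg_names[0] in ("self", "cls"):
            arg_names = arg_names[1:]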

        # --- Test Case Generation Logic ---
        # This is a simplified example. Real-world generation would involve:
        # - Analyzing return types and values
        # - Identifying conditional branches (if/else, loops)
        # - Generating edge cases (None, empty strings, zero, large numbers, etc.)
        # - Mocking dependencies

        test_cases: List[Tuple[Dict[str, Any], Any]] = [] # List of (args_dict, expected_output)

        # Example: Generate a basic test case with default/placeholder values
        # Plain strings here; repr() adds the quotes when the call is rendered below
        placeholder_args = {arg: f"{arg}_value" for arg in arg_names}
        test_cases.append((placeholder_args, "expected_result_placeholder"))

        # Example: Generate a test for a specific condition (if we could infer it)
        # if 'x' in arg_names and 'y' in arg_names:
        #     test_cases.append(({'x': 0, 'y': 0}, 0)) # Assuming a function like add(x, y)

        # --- Constructing the Test Class ---
        test_code = f"class {test_class_name}(unittest.TestCase):\n"

        if not test_cases:
            test_code += f"    def {test_method_prefix}_basic(self):\n"
            test_code += f"        # TODO: Implement test for {func_name}\n"
            test_code += f"        self.skipTest('Test not yet implemented')\n"
        else:
            for i, (args, expected) in enumerate(test_cases):
                args_str = ", ".join([f"{k}={repr(v)}" for k, v in args.items()])
                test_code += f"    def {test_method_prefix}_case_{i+1}(self):\n"
                if class_name:
                    test_code += f"        instance = {class_name}() # Assuming default constructor\n"
                    test_code += f"        result = instance.{func_name}({args_str})\n"
                else:
                    # Assuming a standalone function
                    test_code += f"        result = {func_name}({args_str})\n"
                test_code += f"        self.assertEqual(result, {repr(expected)})\n"

        self.generated_tests.append(test_code)

    def _find_enclosing_class(self, func_node: ast.FunctionDef) -> str | None:
        # Traverse up the AST to find the enclosing class definition
        # This is a simplified approach; a full AST walker might be needed for complex nesting
        for node in ast.walk(self.tree):
            if isinstance(node, ast.ClassDef):
                for body_item in node.body:
                    if body_item == func_node:
                        return node.name
        return None

# Example Usage:
# python_code = """
# class Calculator:
#     def add(self, x, y):
#         return x + y
#
#     def subtract(self, x, y):
#         return x - y
#
# def multiply(a, b):
#     return a * b
# """
#
# generator = TestCaseGenerator(python_code)
# tests = generator.generate_tests()
# print(tests)
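
The edge-case generation mentioned in the comments above (None, empty strings, zero, large numbers) can be sketched by reading type annotations with the standard-library inspect module. This is a minimal illustration, assuming the target function is importable and annotated; the `EDGE_VALUES` table and both helper names are hypothetical, and unannotated parameters simply receive None.

```python
import inspect
import itertools
from typing import Any, Callable, Dict, Iterator, List, Tuple

# Hypothetical boundary values per annotated type; extend per project.
EDGE_VALUES: Dict[Any, List[Any]] = {
    int: [0, 1, -1, 2**31 - 1],
    float: [0.0, -1.5, float("inf")],
    str: ["", "a", "   "],
    bool: [True, False],
    list: [[], [0]],
}

_NAMED_KINDS = (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)


def edge_case_inputs(func: Callable[..., Any], limit: int = 20) -> Iterator[Dict[str, Any]]:
    """Yield keyword-argument dicts combining boundary values for each parameter."""
    names: List[str] = []
    pools: List[List[Any]] = []
    for name, param in inspect.signature(func).parameters.items():
        if name in ("self", "cls") or param.kind not in _NAMED_KINDS:
            continue  # skip implicit receivers and *args/**kwargs
        names.append(name)
        pools.append(EDGE_VALUES.get(param.annotation, [None]))
    # The cartesian product grows quickly, so cap the number of cases
    for combo in itertools.islice(itertools.product(*pools), limit):
        yield dict(zip(names, combo))


def smoke_test(func: Callable[..., Any], limit: int = 20) -> List[Tuple[Dict[str, Any], str]]:
    """Call func on each edge case and record which inputs raise."""
    failures = []
    for kwargs in edge_case_inputs(func, limit):
        try:
            func(**kwargs)
        except Exception as exc:  # broad on purpose: any crash is a finding
            failures.append((kwargs, type(exc).__name__))
    return failures
```

A generator built this way produces inputs, not expected outputs; the assertions still have to come from a specification, a reference implementation, or a human reviewer.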

Cloud Cost Optimization and Anomaly Detection Platform

With cloud spending escalating, businesses are desperate for tools that provide granular visibility into costs, identify optimization opportunities, and flag anomalous spending patterns in real-time. A SaaS that aggregates data from AWS Cost Explorer, Azure Cost Management, and GCP Billing, applies AI to predict future spend, recommend resource rightsizing, and alert on unexpected spikes, can achieve significant MRR.

The $10k MRR target is attainable by focusing on specific cloud providers initially or offering advanced features like automated cost allocation tagging enforcement or reserved instance/savings plan optimization recommendations. Integration with cloud provider APIs is paramount.

Data Ingestion and Analysis Pipeline (Conceptual)

This involves setting up scheduled jobs to pull cost and usage data from cloud provider APIs, storing it in a data warehouse (e.g., Snowflake, BigQuery, or even PostgreSQL with partitioning), and running analytical models (statistical analysis, time-series forecasting) to detect trends and anomalies. Python with libraries like pandas, scikit-learn, and cloud SDKs is ideal.

import boto3
import pandas as pd
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta
from typing import Any, Dict, List
import json
# Assume a data warehouse connection is established (e.g., using SQLAlchemy)
# from sqlalchemy import create_engine

# --- Configuration ---
AWS_REGION = "us-east-1"
COST_EXPLORER_START_DATE = (datetime.now() - relativedelta(months=1)).strftime('%Y-%m-%d')
COST_EXPLORER_END_DATE = datetime.now().strftime('%Y-%m-%d')
COST_GRANULARITY = "DAILY" # "MONTHLY", "HOURLY"
METRICS = ["UnblendedCost", "UsageQuantity"]
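# Cost Explorer accepts at most two GroupBy entries per request
GROUP_BY = [
    {"Type": "DIMENSION", "Key": "SERVICE"},
    {"Type": "DIMENSION", "Key": "LINKED_ACCOUNT"},
    # {"Type": "TAG", "Key": "Environment"} # Example tag; swap it in for one of the dimensions above
]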

# --- AWS Cost Explorer Client ---
ce_client = boto3.client("ce", region_name=AWS_REGION)

# --- Data Ingestion Function ---
def ingest_aws_cost_data(start_date: str, end_date: str, granularity: str, group_by: List[Dict[str, Any]], metrics: List[str]) -> pd.DataFrame:
    """Fetches cost and usage data from AWS Cost Explorer."""
    all_results = []
    next_page_token = None

    while True:
        kwargs = {
            "TimePeriod": {"Start": start_date, "End": end_date},
            "Granularity": granularity,
            "Metrics": metrics,
            "GroupBy": group_by,
        }
        if next_page_token:
            kwargs["NextPageToken"] = next_page_token

        try:
            response = ce_client.get_cost_and_usage(**kwargs)
            all_results.extend(response.get("ResultsByTime", []))
            next_page_token = response.get("NextPageToken")

            if not next_page_token:
                break
        except Exception as e:
            print(f"Error fetching cost data: {e}")
            break

    # --- Data Processing ---
    # Each group's "Keys" is a list of plain strings in the same order as group_by
    group_keys = [g["Key"] for g in group_by]
    processed_data = []
    for time_period_result in all_results:
        time_period = time_period_result["TimePeriod"]["Start"]
        for group in time_period_result.get("Groups", []):
            dimensions = dict(zip(group_keys, group["Keys"]))
            metrics_data = group["Metrics"]
            row = {
                "Date": time_period,
                **dimensions,
                "UnblendedCost": float(metrics_data.get("UnblendedCost", {}).get("Amount", 0)),
                "UsageQuantity": float(metrics_data.get("UsageQuantity", {}).get("Amount", 0)),
            }
            processed_data.append(row)

    # Fix the column set so an empty result still yields the expected columns
    columns = ["Date", *group_keys, "UnblendedCost", "UsageQuantity"]
    df = pd.DataFrame(processed_data, columns=columns)
    # Ensure numeric columns are correctly typed
    df["UnblendedCost"] = pd.to_numeric(df["UnblendedCost"])
    df["UsageQuantity"] = pd.to_numeric(df["UsageQuantity"])
    return df

# --- Anomaly Detection (Conceptual) ---
def detect_anomalies(df: pd.DataFrame) -> pd.DataFrame:
    """Applies anomaly detection algorithms to cost data."""
    # Example: Simple threshold-based anomaly detection
    # In a real SaaS, use more sophisticated methods like Isolation Forest, ARIMA, etc.
    if df.empty:
        return df

    # Work on a sorted copy so period-over-period changes follow date order
    df = df.sort_values("Date").copy()
    grouped_cost = df.groupby(['SERVICE', 'LINKED_ACCOUNT'])['UnblendedCost']
    df['PreviousCost'] = grouped_cost.shift(1)
    df['CostChange'] = (df['UnblendedCost'] - df['PreviousCost']).fillna(0)

    # Define anomaly threshold (e.g., 3 standard deviations or a fixed percentage increase)
    # This needs to be dynamic and context-aware.
    anomaly_threshold_percentage = 0.50 # 50% increase over the previous period

    # Compare the change against the previous cost; the first row of each group
    # has no previous cost (NaN) and is never flagged
    anomalies = df[df['CostChange'] > df['PreviousCost'] * anomaly_threshold_percentage]

    # Add more sophisticated anomaly detection here...
    # from sklearn.ensemble import IsolationForest
    # model = IsolationForest(contamination='auto')
    # anomalies = df[model.fit_predict(df[['UnblendedCost']]) == -1]

    return anomalies

# --- Main Execution Flow ---
if __name__ == "__main__":
    print(f"Fetching AWS cost data from {COST_EXPLORER_START_DATE} to {COST_EXPLORER_END_DATE}...")
    cost_df = ingest_aws_cost_data(
        start_date=COST_EXPLORER_START_DATE,
        end_date=COST_EXPLORER_END_DATE,
        granularity=COST_GRANULARITY,
        group_by=GROUP_BY,
        metrics=METRICS
    )

    print("Cost data fetched successfully. First 5 rows:")
    print(cost_df.head())

    print("\nDetecting cost anomalies...")
    anomalies_df = detect_anomalies(cost_df)

    if not anomalies_df.empty:
        print(f"\nFound {len(anomalies_df)} potential cost anomalies:")
        print(anomalies_df)
        # In a SaaS, send alerts, generate reports, etc.
        # For example, send to a notification service:
        # notify_users(anomalies_df.to_dict('records'))
    else:
        print("\nNo significant cost anomalies detected.")

    # --- Data Storage (Conceptual) ---
    # Store cost_df and anomalies_df in your data warehouse
    # engine = create_engine("postgresql://user:password@host:port/database")
    # cost_df.to_sql("aws_cost_data", engine, if_exists="append", index=False)
    # anomalies_df.to_sql("cost_anomalies", engine, if_exists="append", index=False)
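
The "3 standard deviations" rule mentioned in the anomaly detection comments can be sketched with the standard library alone. The following is a minimal illustration, assuming one cost value per day for a single service; the function name, the 7-day window, and the `min_stdev` floor (which avoids division by zero on perfectly flat history) are assumptions rather than recommended settings.

```python
import statistics
from typing import List, Sequence


def rolling_zscore_anomalies(costs: Sequence[float], window: int = 7,
                             threshold: float = 3.0,
                             min_stdev: float = 1.0) -> List[int]:
    """Return indices whose cost deviates strongly from the trailing window."""
    anomalies: List[int] = []
    for i in range(window, len(costs)):
        history = costs[i - window:i]          # the `window` values before day i
        mean = statistics.fmean(history)
        spread = max(statistics.pstdev(history), min_stdev)
        z_score = abs(costs[i] - mean) / spread
        if z_score > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    daily_costs = [100, 102, 98, 101, 99, 100, 103, 250, 101]
    print(rolling_zscore_anomalies(daily_costs))
```

Note that a spike stays in the trailing window for the next few days and inflates the standard deviation, which can mask follow-up anomalies; robust statistics such as the median and median absolute deviation are a common refinement.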