# Defining Rules

> Step-by-step guide to creating automation rules in Rapydo

## Creating Automation Rules

Rapydo automation rules enable proactive database management by automatically responding to specific conditions. This guide walks you through the process of creating effective rules.

***

## Step 1: Identify What to Monitor

Before creating a rule, understand your database's normal behavior and identify what needs attention.

**Key questions to ask:**

* What query duration is acceptable for your workload?
* At what CPU/memory threshold does performance degrade?
* How close do you get to connection limits during peak hours?
* Are there specific users or databases requiring special monitoring?

**Common scenarios:**

* "Kill any query running longer than 5 minutes"
* "Alert when CPU exceeds 80% for more than 3 minutes"
* "Notify when connections reach 90% of the limit"
* "Terminate reporting queries exceeding 10 minutes"

***

## Step 2: Choose Your Rule Type

### Scout Rules

**Use for:** Query monitoring and automatic action

* Monitor long-running queries in real-time
* Automatically kill queries exceeding duration thresholds
* Filter by user, database, or query pattern
* Get alerts when specific queries are detected

**Example:** Kill queries from `analytics_user` running longer than 600 seconds

***

### Alert Rules

**Use for:** Metric monitoring and notifications

* Monitor CPU, memory, connections, IOPS
* Send alerts when thresholds are exceeded
* Multi-metric support (combine multiple conditions)
* Email and webhook notifications

**Example:** Alert when CPU > 80% for 5 consecutive checks

***

## Step 3: Define Triggers and Conditions

Specify the exact conditions that activate your rule.

### For Scout Rules

**Query Duration Trigger**

```
Condition: Query running longer than X seconds
Example: Duration > 300 seconds
```

**Optional Filters:**

* **Database**: Apply only to specific databases (e.g., `production_db`)
* **User**: Target specific users (e.g., `reporting_user`)
* **Query Pattern**: Match SQL text or patterns (e.g., `SELECT * FROM large_table`)
* **IP Address**: Filter by client IP address

**💡 Important:** Filters are optional but help target rules precisely and avoid impacting legitimate queries.
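
To illustrate how a duration trigger and its optional filters narrow down which queries a rule affects, here is a minimal Python sketch. It is not Rapydo's implementation: the `RunningQuery` and `ScoutTrigger` classes, their field names, and the substring-based pattern match are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunningQuery:
    # Hypothetical snapshot of a running query; field names are illustrative.
    user: str
    database: str
    sql: str
    client_ip: str
    duration_seconds: float


@dataclass
class ScoutTrigger:
    # Hypothetical trigger: a duration threshold plus optional filters.
    min_duration_seconds: float
    database: Optional[str] = None
    user: Optional[str] = None
    query_pattern: Optional[str] = None
    ip_address: Optional[str] = None


def matches(trigger: ScoutTrigger, query: RunningQuery) -> bool:
    """Return True when the query exceeds the duration and passes every filter that is set."""
    if query.duration_seconds <= trigger.min_duration_seconds:
        return False
    if trigger.database is not None and query.database != trigger.database:
        return False
    if trigger.user is not None and query.user != trigger.user:
        return False
    # Assumption: the pattern is treated as a plain substring of the SQL text.
    if trigger.query_pattern is not None and trigger.query_pattern not in query.sql:
        return False
    if trigger.ip_address is not None and query.client_ip != trigger.ip_address:
        return False
    return True


trigger = ScoutTrigger(min_duration_seconds=300, user="reporting_user")
slow = RunningQuery("reporting_user", "production_db", "SELECT * FROM large_table", "10.0.0.5", 420)
other = RunningQuery("app_user", "production_db", "SELECT * FROM large_table", "10.0.0.6", 420)

print(matches(trigger, slow))   # True
print(matches(trigger, other))  # False
```

Leaving a filter unset means it places no restriction, which is why an unfiltered rule applies to every query that crosses the duration threshold.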

***

### For Alert Rules

**Metrics**

Available metrics for monitoring:

**Resource Utilization:**

* **CPU Utilization (%)**: Processor usage across instances
* **Free Memory**: Available RAM (not percentage-based)
* **Read IOPS**: Read input/output operations per second
* **Write IOPS**: Write input/output operations per second
* **Connection Utilization (%)**: Percentage of maximum connections in use

**Query Performance:**

* **Max Query Duration**: Duration of the longest-running query

**Database Activity:**

* **Connections count**: Number of active connections
* **DB count**: Number of databases
* **Users count**: Number of connected users
* **Hosts count**: Number of client hosts with connections
* **Waits count**: Number of queries in wait state

**Operators:**

* Greater than (`>`)
* Greater than or equal (`>=`)
* Less than (`<`)
* Less than or equal (`<=`)

***

**Example single metric:**

```
Metric: CPU Utilization
Operator: Greater than
Value: 80%
```
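
The difference between the strict and inclusive operators matters at the boundary value. The sketch below maps the four operators to Python's standard `operator` module to show this; the `condition_met` helper is a hypothetical name used only for illustration.

```python
import operator

# The four comparison operators listed above, mapped to Python functions.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def condition_met(value: float, op: str, threshold: float) -> bool:
    """Evaluate one metric sample against a threshold."""
    return OPERATORS[op](value, threshold)


print(condition_met(85.0, ">", 80.0))   # True
print(condition_met(80.0, ">", 80.0))   # False: exactly 80 is not greater than 80
print(condition_met(80.0, ">=", 80.0))  # True
```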

***

**Multi-Metric Rules (Advanced)**

Combine multiple metrics with AND logic for sophisticated monitoring. All conditions must be true simultaneously for the rule to trigger.

**Example - High CPU AND High Connections:**

```
Metric 1: CPU Utilization > 80%
AND
Metric 2: Connections count > 100
```

→ Alert only when BOTH conditions are true simultaneously

**Example - Query Performance Issues:**

```
Metric 1: Max Query Duration > 60 seconds
AND
Metric 2: Waits count > 20
```

→ Alerts when slow queries correlate with high wait states

**💡 Important:** Multi-metric rules let you create precise conditions that reduce false alerts. For example, alerting on high CPU is more meaningful when combined with high connections.
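
The AND logic can be sketched in a few lines of Python. This is an illustration only: the `(metric, operator, threshold)` tuple format and the metric key names are assumptions, not Rapydo's rule format.

```python
import operator

OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

# Hypothetical condition format: (metric name, operator, threshold).
conditions = [
    ("cpu_utilization", ">", 80),
    ("connections_count", ">", 100),
]


def rule_triggers(sample: dict, conditions: list) -> bool:
    """AND logic: every condition must hold for the same sample."""
    return all(
        OPERATORS[op](sample[metric], threshold)
        for metric, op, threshold in conditions
    )


print(rule_triggers({"cpu_utilization": 91, "connections_count": 140}, conditions))  # True
print(rule_triggers({"cpu_utilization": 91, "connections_count": 60}, conditions))   # False
```

In the second call CPU is high but connections are not, so the rule stays quiet, which is the false-alert reduction described above.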

***

## Step 4: Specify Actions

Define what happens when trigger conditions are met.

### Scout Rule Actions

When a Scout Rule is triggered, you can execute one of these actions:

***

**Kill query**

* Terminates the specific query that triggered the rule
* **Use when:** Query exceeds acceptable duration or consumes excessive resources
* **Example:** Kill any query running longer than 300 seconds

***

**Kill connection**

* Terminates the entire database connection (closes all queries from that connection)
* **Use when:** A connection is causing persistent issues or needs to be forcibly closed
* **Example:** Kill connections from problematic clients or applications

***

**Kill idle connections**

* Terminates idle connections that aren't actively running queries
* **Use when:** Too many idle connections are consuming resources
* **Example:** Close connections that have been idle for more than 1 hour

***

**Rate limit**

* Kills connections when the number of simultaneous connections matching the trigger exceeds a defined limit
* **How it works:** If Rapydo detects more matching connections than the limit allows, it kills enough of them to bring the count back down to the limit
* **Use when:** You need to limit concurrent connections from specific users or databases
* **Example:** Limit `reporting_user` to a maximum of 5 concurrent connections. If 10 connections are detected, Rapydo kills 5 to reach the limit

**💡 Important:** Rate limit controls the NUMBER of simultaneous connections, not queries per second.
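
The arithmetic behind the rate limit is simple enough to sketch in Python. The function name is hypothetical, and the sketch only computes how many connections exceed the limit; which connections are selected for termination is not specified here.

```python
def connections_to_kill(matching_connections: int, limit: int) -> int:
    """Number of connections to terminate so that at most `limit` remain."""
    return max(0, matching_connections - limit)


print(connections_to_kill(10, 5))  # 5
print(connections_to_kill(3, 5))   # 0: under the limit, nothing is killed
```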

***

**RCA (Query Analysis)**

* Triggers automatic AI-powered query analysis for queries matching the trigger
* **Results are sent via email** with complete analysis and remediation plan
* **What you get:**
  * Root cause identification (missing indexes, inefficient joins, etc.)
  * Step-by-step remediation plan with SQL statements
  * Estimated performance impact
  * Table statistics and execution plan details
* **Use when:** You want to understand WHY queries are slow and get optimization recommendations
* **Common trigger:** Analyze any query running longer than a defined threshold (e.g., 60 seconds)

**💡 Important:** RCA goes beyond just identifying the problem—it provides complete solutions with implementation guidance.

**⚠️ Required:** A notification destination (email or webhook) must be configured when using RCA. The analysis report cannot be delivered without a valid notification target.

***

**No action (Notification Only)**

* Select **No action** as the action type, then enable notifications with your email or webhook
* Sends alert without taking any database action — queries continue running unaffected
* The event is logged in Rapydo for audit purposes
* **Use when:** You want visibility without automatic intervention
* **Example:** Monitor query patterns to build baselines before taking action

***

## Step 5: Configure Details

### Basic Settings

**Type**

* Alert rule (metric monitoring)
* Scout rule (query monitoring)

**Status**

* **Active**: Rule is monitoring and will execute actions
* **Disabled**: Rule is saved but inactive (useful for testing)

**DB Instances**

* **Select All**: Apply rule to all monitored instances
* **Specific Instances**: Choose individual instances
* **💡 Tip:** Start with specific instances, then expand to "Select All" after testing

***

### Alert Rule Parameters

**Samples to Trigger**
Number of consecutive checks that must meet the rule's condition before an alert is sent.

```
Example:
Samples: 5
Metric: CPU > 80%

Check 1: 85% ✓ (1/5)
Check 2: 88% ✓ (2/5)
Check 3: 82% ✓ (3/5)
Check 4: 90% ✓ (4/5)
Check 5: 87% ✓ (5/5) → ALERT SENT 🚨
```

**Why this matters:** Prevents false alerts from temporary spikes. Only alerts on sustained issues.
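
The consecutive-sample behavior can be modeled with a simple streak counter. The following Python sketch is an illustration under assumptions (the function name and the use of a strict "greater than" comparison are chosen for the example), not Rapydo's internal logic.

```python
def first_alert_check(samples: list, threshold: float, samples_to_trigger: int):
    """Return the 1-based check number at which the alert first fires, or None."""
    consecutive = 0
    for check_number, value in enumerate(samples, start=1):
        if value > threshold:
            consecutive += 1
        else:
            consecutive = 0  # a single check within the threshold resets the streak
        if consecutive >= samples_to_trigger:
            return check_number
    return None


print(first_alert_check([85, 88, 82, 90, 87], 80, 5))      # 5
print(first_alert_check([85, 88, 70, 90, 87, 91], 80, 5))  # None
```

In the second call the dip to 70% on the third check resets the streak, so six checks are not enough to reach five in a row.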

**Recommended values:**

* Volatile metrics (CPU, IOPS): 4-5 samples
* Critical conditions (e.g., Connection Utilization): 1-2 samples

***

**Notification Interval (minutes)**
Minimum time between repeated alerts for the same condition.

```
Example:
Notification Interval: 3 minutes
Condition: CPU still > 80%

Minute 0:  Alert sent 🚨
Minute 3:  Alert sent 🚨 (3 min passed)
Minute 6:  Alert sent 🚨 (3 min passed)
```

**Why this matters:** Prevents alert flooding while keeping you informed of ongoing issues.
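
A minimal Python sketch of this throttling, assuming the condition is evaluated once per minute (the function name and the one-minute granularity are assumptions for illustration):

```python
def alert_minutes(breach_minutes: list, interval_minutes: int) -> list:
    """Given the minutes at which the condition holds, return the minutes at which alerts are sent."""
    sent = []
    last_sent = None
    for minute in breach_minutes:
        # Send only if no alert has gone out yet or the interval has elapsed.
        if last_sent is None or minute - last_sent >= interval_minutes:
            sent.append(minute)
            last_sent = minute
    return sent


print(alert_minutes([0, 1, 2, 3, 4, 5, 6], 3))  # [0, 3, 6]
```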

***

## Complete Example: Scout Rule

**Scenario:** Analytics queries sometimes run for hours, impacting production.

**Goal:** Terminate analytics queries exceeding 10 minutes.

**Configuration:**

```
Type: Scout rule
Status: Active
DB instances: production_db

Trigger:
  Query Duration > 600 seconds

Filters:
  User: analytics_user
  Database: production_db

Action: Kill Query + Send Webhook

Webhook Destination: #database-alerts
```

**What happens:**

1. Rapydo monitors all queries from `analytics_user` on `production_db`
2. If a query runs longer than 10 minutes, it's automatically killed
3. Webhook notification sent to Slack #database-alerts with query details
4. Team is informed and can investigate root cause

***

## Complete Example: Alert Rule

**Scenario:** Production database occasionally experiences CPU spikes degrading performance.

**Goal:** Get alerted when CPU remains high for sustained periods.

**Configuration:**

```
Type: Alert rule
Status: Active
DB instances: Select All (production instances)

Metric 1:
  Type: CPU Utilization
  Operator: Greater than
  Value: 80%

Samples to Trigger: 5
Notification: Webhook
Webhook Destination: #database-ops
Notification Interval: 3 minutes
```

**What happens:**

1. Rapydo checks CPU every few minutes on all production instances
2. If CPU stays above 80% for 5 consecutive checks (\~15 minutes), a webhook alert is sent
3. While CPU remains high, alerts repeat every 3 minutes
4. If a check finds CPU at or below 80%, the counter resets and alerts stop
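
The interaction between Samples to Trigger and Notification Interval in this example can be simulated as follows. This is a sketch, not Rapydo's scheduler: the 3-minute check interval is an assumption (the page only says "every few minutes"), and the function and parameter names are hypothetical.

```python
def simulate(cpu_samples, threshold=80, samples_to_trigger=5,
             check_interval_min=3, notification_interval_min=3):
    """Return the minutes (from the first check) at which alerts would be sent."""
    alerts = []
    consecutive = 0
    last_alert = None
    for i, cpu in enumerate(cpu_samples):
        minute = i * check_interval_min
        if cpu > threshold:
            consecutive += 1
        else:
            # Condition cleared: reset the counter and stop repeating alerts.
            consecutive = 0
            last_alert = None
        if consecutive >= samples_to_trigger:
            if last_alert is None or minute - last_alert >= notification_interval_min:
                alerts.append(minute)
                last_alert = minute
    return alerts


print(simulate([85, 88, 82, 90, 87, 91, 89, 75, 86]))  # [12, 15, 18]
```

Under these assumptions the first alert fires on the fifth consecutive high check, repeats while CPU stays high, and stops once a check reads 75%.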

***

## Best Practices

✅ **Start conservative**: Higher thresholds, longer durations. Tighten after observing behavior.

✅ **Use descriptive names**: "Kill Analytics Queries >10min" not "Rule 1"

✅ **Leverage filters**: Target rules precisely to avoid impacting legitimate activity.

✅ **Test thoroughly**: Always validate in non-production before deploying.

✅ **Review regularly**: Audit rules monthly to ensure they're still relevant.

✅ **Avoid alert fatigue**: Don't create so many alerts that teams start ignoring them.

***

## Multi-Metric Examples

**Example 1: High CPU + High Connections**

```
Metric 1: CPU Utilization > 85%
AND
Metric 2: Connections count > 200

Notification: Webhook to #critical-alerts
```

→ Only alerts when database is both CPU-bound AND connection-saturated

***

**Example 2: Query Performance Degradation**

```
Metric 1: Max Query Duration > 30 seconds
AND
Metric 2: Waits count > 50

Notification: Webhook to #database-ops
```

→ Alerts when slow queries correlate with a high number of queries in wait state

***

## What's Next?

* [Scout Rules Reference](/automation/scout_triggers_and_actions) - Complete guide to Scout Rules
* [Alert Rules Reference](/automation/alerts_triggers_and_actions) - Complete guide to Alert Rules
* [Back to Automation Overview](/automation/introduction) - Return to main automation page