This document describes how to use AI assistance to monitor and troubleshoot your Cloud SQL resources. You can use the AI-assisted troubleshooting tools of Cloud SQL and Gemini Cloud Assist to troubleshoot slow queries and high database load.
Limitations
The following limitations apply to AI-assisted troubleshooting in Cloud SQL:
- AI-assisted troubleshooting isn't supported for the following Cloud SQL configurations:
  - Cloud SQL Enterprise edition instances
  - Instances that use the old network architecture for Cloud SQL
  - Instances inside a VPC Service Controls perimeter
  - Instances enabled with Access Transparency
  - Read replica instances
- Query anomaly detection is available only for Cloud SQL Enterprise Plus edition instances.
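The limitations above can be expressed as a small pre-flight check. The following is a minimal sketch: the `InstanceProfile` type and its field names are hypothetical and don't correspond to any Cloud SQL API, so you would populate them from whatever instance metadata you have available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceProfile:
    """Hypothetical summary of an instance; field names are illustrative only."""
    edition: str  # assumed values: "ENTERPRISE" or "ENTERPRISE_PLUS"
    uses_new_network_architecture: bool
    in_vpc_sc_perimeter: bool
    access_transparency_enabled: bool
    is_read_replica: bool


def unsupported_reasons(instance: InstanceProfile) -> list[str]:
    """Return the documented limitations that apply to the given instance."""
    reasons = []
    if instance.edition == "ENTERPRISE":
        reasons.append("Cloud SQL Enterprise edition instance")
    if not instance.uses_new_network_architecture:
        reasons.append("uses the old network architecture")
    if instance.in_vpc_sc_perimeter:
        reasons.append("inside a VPC Service Controls perimeter")
    if instance.access_transparency_enabled:
        reasons.append("Access Transparency is enabled")
    if instance.is_read_replica:
        reasons.append("read replica instance")
    return reasons
```

An empty list means that none of the listed configuration limitations apply to the instance.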
Before you begin
- Ensure that Gemini Cloud Assist is set up for your Cloud de Confiance user account and project.
After you set up Gemini Cloud Assist, you might need to wait five minutes to let the service propagate before you can enable AI-assisted troubleshooting in Cloud SQL.
- Ensure that your instance is a Cloud SQL Enterprise Plus edition instance.
- Ensure that your Cloud SQL instance is using the new network architecture.
- Enable query insights for Cloud SQL Enterprise Plus edition and Cloud SQL Enterprise edition.
Required roles and permissions
To get the permissions that you need to use AI-assisted troubleshooting, ask your administrator to grant you the following IAM roles on the project that hosts the Cloud SQL instance:
- Database insights viewer (roles/databaseinsights.viewer)
- Use Gemini Cloud Assist investigations: Gemini Cloud Assist Investigation Owner (roles/geminicloudassist.investigationOwner)
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the permissions required to use AI-assisted troubleshooting. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to use AI-assisted troubleshooting:
- databaseinsights.performanceIssues.detect
- databaseinsights.performanceIssues.investigate
You might also be able to get these permissions with custom roles or other predefined roles.
For more information about required roles and permissions for using Gemini Cloud Assist investigations, see Troubleshoot issues with Gemini Cloud Assist Investigations.
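If your administrator grants roles from the command line, the two roles listed in this section might be granted with `gcloud projects add-iam-policy-binding`. The following sketch only builds the command lines and doesn't run them. The project ID and member are placeholders, and it assumes that the gcloud CLI is installed and configured for your environment.

```python
import shlex

# Roles named in this section.
ROLES = (
    "roles/databaseinsights.viewer",
    "roles/geminicloudassist.investigationOwner",
)


def grant_commands(project_id: str, member: str) -> list[list[str]]:
    """Build one gcloud command (as an argument list) per required role."""
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ]
        for role in ROLES
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own project and principal.
    for command in grant_commands("my-project", "user:alex@example.com"):
        print(shlex.join(command))
```

Review the printed commands before running them, because granting roles at the project level affects every Cloud SQL instance in that project.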
Enable AI-assisted troubleshooting
When you enable AI-assisted troubleshooting for your Cloud SQL instance, Cloud SQL can analyze the performance of your databases and detect anomalies in the execution of your queries. When Cloud SQL detects anomalies in query performance or identifies high system load, AI-assisted troubleshooting helps you analyze the situation with evidence and provides recommendations.
To enable AI-assisted troubleshooting for your Cloud SQL instance, do the following:
- In the Cloud de Confiance console, go to the Cloud SQL Instances page.
- To open the Overview page of an instance, click the instance name.
- In the Configuration tile, click Edit configuration.
- In the Customize your instance section, expand Query insights.
- If not already selected, select Enable Query insights.
- For Cloud SQL Enterprise Plus edition only, if not already selected, select Enable Enterprise Plus features.
- For Cloud SQL Enterprise Plus edition only, select Enable AI-assisted troubleshooting. For Cloud SQL Enterprise edition instances, troubleshooting with AI assistance is only available if you enable Gemini Cloud Assist.
- Click Save.
- For the best results, wait 24 hours after you enable AI-assisted troubleshooting in the Cloud de Confiance console to let Cloud SQL build a baseline of the average performance of your instance, database, and queries.
If you enable query insights for Cloud SQL Enterprise Plus edition, then your instance requires a restart. If you enable AI-assisted troubleshooting only, then your instance doesn't require a restart. For more information about enabling query insights for Cloud SQL Enterprise Plus edition, see Use query insights to improve query performance.
Open Gemini Cloud Assist
To use Gemini Cloud Assist with Cloud SQL, do the following:
- In the Cloud de Confiance console, go to the Cloud SQL Instances page.
- To open the Overview page of an instance, click the instance name.
- In the navigation pane, select Query insights.
- To open the Cloud Assist panel, click Open or close Gemini Cloud Assist chat.
- In the Cloud Assist panel, enter a prompt that describes the information that you're interested in.
- After you enter the prompt, click Send prompt. Gemini returns a response to your prompt based on information from the last hour.
Troubleshoot slow queries
To use AI assistance with troubleshooting your slow queries, go to the Query insights dashboard for your Cloud SQL instance in the Cloud de Confiance console.
Top queries table
You can start troubleshooting slow queries with AI assistance in the Top queries table section of the Query insights dashboard.
Cloud SQL can help you identify which queries are performing slower than average during a specific detection time period. After you select a time range in the Query insights dashboard, Cloud SQL checks whether any queries are performing slower than average by using a detection time period of 24 hours before the end of your selected time range.
When you adjust the time range filter of the Database load chart, or any other filter such as database or user, Cloud SQL refreshes the Top queries table and reruns anomaly detection based on the new list of queries and an updated detection time period.
For Cloud SQL Enterprise Plus edition instances, when Cloud SQL detects that a query is running slower than expected, a Warning icon is displayed next to the query. When you click the icon, Gemini Cloud Assist helps analyze the query execution and offers observations about what might have caused the issue. Based on these observations, Gemini Cloud Assist generates a hypothesis that can help you address the issue.
To troubleshoot slow queries in the Top queries table in the Query insights dashboard, do the following:
- In the Cloud de Confiance console, go to the Cloud SQL Instances page.
- To open the Overview page of an instance, click the instance name.
- In the SQL navigation menu, click Query insights.
- In the Executed queries chart, use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days or a custom range.
- In the Top queries table, under the Queries tab, review the list of queries for your database.
- If a Warning icon appears next to a query's Avg execution time (ms) value, then Cloud SQL has detected an anomaly in the performance of that query. Cloud SQL checks for anomalies within the 24-hour time period that occurs before the end of your selected time range.
- Click the Warning icon.
- In the Query is slower than usual dialog, click New Investigation to start troubleshooting with AI assistance from Gemini Cloud Assist.
After about two minutes, the Investigation details pane opens with the following sections:
- Issue. A description of the issue being investigated, including the investigation’s start and stop time.
- Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
- Hypotheses. A list of AI-recommended actions to take to help address the slow running query.
If you want to see all investigations associated with the query, in the Query is slower than usual dialog, click View all investigations. The Gemini Cloud Assist page opens where you can view all currently running and previously completed investigations. You can filter the page by project or label, for example, to find the specific investigation you need.
Alternatively, to see all previous investigations, click the Notifications icon, then select a notification associated with any investigation to open the Gemini Cloud Assist page.
- Alternatively, if you want to investigate the latency of any query, complete the following steps:
- Identify the specific query you want to investigate.
- In the Actions column, click the Actions icon associated with that query.
- Select Investigate latency in the menu to run a Gemini Cloud Assist investigation.
Query details
You can also troubleshoot a slow query with AI assistance from the Query details page.
- In the Cloud de Confiance console, go to the Cloud SQL Instances page.
- To open the Overview page of an instance, click the instance name.
- Click Query insights to open the Query insights dashboard.
- In the Query insights dashboard, click the query in the Top queries table that you want to view. The Query details page appears.
- For Cloud SQL Enterprise Plus edition, if Cloud SQL detects an anomaly for the query, then one or more of the following indicators appears in the Query details page:
- A message on the details screen that says "This query is slower than usual" and an Investigate option.
- A message in the Query latency chart that says "Query slower than usual". If this message appears, then click the Investigate button to start troubleshooting with AI assistance from Gemini Cloud Assist.
After about two minutes, the Investigation details pane opens with the following sections:
- Issue. A description of the issue being investigated, including the investigation’s start and stop time.
- Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
- Hypotheses. A list of AI-recommended actions to take to help address the slow running query.
- Optional: Use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days or a custom range. When you adjust the Time range filter of the Query details page, or any other filter such as Database or User, Cloud SQL reruns anomaly detection.
- If Cloud SQL doesn't detect an anomaly for the query, then you can still run an analysis on the query by clicking the Investigate button in the Query latency card.
Analyze query latency
Using AI assistance, you can analyze and troubleshoot the details of your query latency.
Analysis time period
The analysis time period consists of the 24 hours that occur before the end of the time range that you select in the Database load chart of the Query insights dashboard or the Query details page. Cloud SQL uses this time period to compare baseline metrics with the metrics retrieved during the time period of the anomaly.
On the Query details page, for Cloud SQL Enterprise Plus edition, if Cloud SQL has detected an anomaly with the query, then after you select the query from the Query insights dashboard, Cloud SQL performs a baseline performance analysis for the query using the last 24 hours from the end of the anomaly. If Cloud SQL hasn't detected an anomaly with the query and runs anomaly detection on the query again, then Cloud SQL uses 48 hours before the end of the selected time range as the performance baseline for the analysis time period.
Detected anomaly period
The detected anomaly period is applicable to Cloud SQL Enterprise Plus edition instances only. The detected anomaly period represents a time period when Cloud SQL finds an anomalous change in query performance. Cloud SQL uses the baseline performance measured for the query during the analysis time period.
If Cloud SQL detects multiple anomalies for a query within a selected time period, then Cloud SQL uses the last detected anomaly.
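The time arithmetic described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, anomalies are modeled as simple `(start, end)` tuples, and "last detected anomaly" is assumed to mean the anomaly with the latest end time.

```python
from datetime import datetime, timedelta
from typing import Optional

ANALYSIS_WINDOW = timedelta(hours=24)

# An anomaly is modeled as a (start, end) pair; this shape is an assumption.
Anomaly = tuple[datetime, datetime]


def analysis_period(selected_end: datetime) -> tuple[datetime, datetime]:
    """Return the 24 hours that occur before the end of the selected time range."""
    return selected_end - ANALYSIS_WINDOW, selected_end


def anomaly_to_investigate(anomalies: list[Anomaly]) -> Optional[Anomaly]:
    """Pick the most recent anomaly by end time, or None if there are none."""
    return max(anomalies, key=lambda anomaly: anomaly[1], default=None)
```

For example, if the selected time range ends at noon on May 2, the analysis time period in this sketch runs from noon on May 1 to noon on May 2.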
Blocked active queries
If a specific active query is blocked or running much longer than expected, it can block other dependent queries.
Cloud SQL gives you the option to terminate specific long-running or blocked active queries.
For more information, see Blocked active queries.
Examples of query performance prompts
You can also use Gemini Cloud Assist to enter prompts to help you improve the performance of your queries. Gemini Cloud Assist answers questions for the selected Cloud SQL instance and database.
| Prompt | Type of response |
|---|---|
| What are the top queries by latency in my database? | |
| What is the slowest query in this database instance? | Guidance on how to identify the slowest query by latency. |
Troubleshoot high database load
By accessing the Query insights dashboard in the Cloud de Confiance console, you can analyze your database and troubleshoot events when your system experiences a higher database load than average. Cloud SQL uses the 24 hours of data that occurs prior to your selected time range to calculate the expected load of your database. You can look into the reasons for the higher load events and analyze the evidence behind reduced performance. Cloud SQL also provides recommendations for optimizing your database to improve performance.
To use AI assistance with troubleshooting high database load, go to the Instance Overview page or the Query insights dashboard in the Cloud de Confiance console.
Instance overview page
Troubleshoot high database load with AI assistance in the Instance overview page by using the following steps:
- In the Cloud de Confiance console, go to the Cloud SQL Instances page.
- To open the Overview page of an instance, click the instance name.
- In the Overview page, from the Chart menu, select a metric for the database. You can select any metric, for example, CPU utilization.
- Optional: To select a specific analysis time period, use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days or a custom range.
You can zoom in to specific sections of the chart where you notice areas of high load that you want to analyze. For example, an area of high load might display CPU utilization levels closer to 100%. To zoom in, you can click and select a portion of the chart.
- Click the Investigate performance button to start troubleshooting high database load with AI assistance from Gemini Cloud Assist.
After about two minutes, the Investigation details pane opens with the following sections:
- Issue. A description of the issue being investigated, including the investigation’s start and stop time.
- Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
- Hypotheses. A list of AI-recommended actions to take to help address the slow running query.
Query insights dashboard
Troubleshoot high database load with AI assistance in the Query insights dashboard using the following steps:
- In the Cloud de Confiance console, go to the Cloud SQL Instances page.
- To open the Overview page of an instance, click the instance name.
- Click Query insights to open the Query insights dashboard.
- Optional: Use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days or a custom range.
You can zoom in to specific sections of the chart where you notice areas of higher database load by query execution time. To zoom in, you can click and select a portion of the chart.
- In the Database load chart, click the Investigate performance button to start troubleshooting high database load with AI assistance from Gemini Cloud Assist.
After about two minutes, the Investigation details pane opens with the following sections:
- Issue. A description of the issue being investigated, including the investigation’s start and stop time.
- Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
- Hypotheses. A list of AI-recommended actions to take to help address the slow running query.
Analyze high database load
Using AI assistance, you can analyze and troubleshoot the details of your database load.
Analysis time period
Cloud SQL analyzes your database for the time period that you select in your database load chart from the Query insights dashboard or the Instance overview page. If you select a time period of less than 24 hours, then Cloud SQL analyzes the entire time period. If you select a time period greater than 24 hours, then Cloud SQL selects only the last 24 hours of the time period for analysis.
To calculate the baseline performance analysis of your database, Cloud SQL includes 24 hours of a baseline time period in its analysis time period. If your selected time period occurs on a day other than Monday, then Cloud SQL uses a baseline time period of the 24 hours previous to your selected time period. If your selected time period occurs on a Monday, then Cloud SQL uses a baseline time period of the 7th day previous to your selected time period.
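The window selection described above can be sketched as follows. This is one possible reading rather than the documented implementation. It assumes that the weekday is taken from the start of the analysis window, that the non-Monday baseline is the 24 hours immediately before that start, and that the Monday baseline is a 24-hour window that begins seven days before that start. The function names are hypothetical.

```python
from datetime import datetime, timedelta

DAY = timedelta(hours=24)
MONDAY = 0  # datetime.weekday() returns 0 for Monday


def analysis_window(selected_start: datetime, selected_end: datetime):
    """Keep the whole selection if it is at most 24 hours, else its last 24 hours."""
    if selected_end - selected_start > DAY:
        return selected_end - DAY, selected_end
    return selected_start, selected_end


def baseline_window(analysis_start: datetime):
    """Return a 24-hour baseline window (assumed interpretation, see the text above)."""
    # Assumption: on Mondays, step back a full week instead of one day.
    offset = timedelta(days=7) if analysis_start.weekday() == MONDAY else DAY
    baseline_start = analysis_start - offset
    return baseline_start, baseline_start + DAY
```

With these assumptions, a selection that starts on a Wednesday at 06:00 is compared with the 24 hours from Tuesday 06:00 to Wednesday 06:00, and a selection that starts on a Monday at 06:00 is compared with the 24 hours that begin on the previous Monday at 06:00.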
Metrics analysis
When Cloud SQL starts the analysis, Cloud SQL checks for significant changes in the various metrics, including but not limited to the following:
- Queries per second (QPS)
- CPU
- Memory
- Disk I/O
Cloud SQL compares the baseline aggregated data for your database with the performance data of your analysis time window. If Cloud SQL detects a significant change in a key metric, then Cloud SQL indicates a possible situation with your database. The identified situation might explain a root cause for the high load on your database over the selected time period.
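To make the idea of a baseline comparison concrete, the following sketch flags metrics whose average changes by more than a relative threshold between the baseline and the analysis window. The comparison method, the 50% default threshold, and the input format are all assumptions for illustration; the source doesn't specify how Cloud SQL decides that a change is significant.

```python
from statistics import mean


def significant_changes(
    baseline: dict[str, list[float]],
    analysis: dict[str, list[float]],
    threshold: float = 0.5,  # hypothetical: flag changes larger than 50%
) -> dict[str, float]:
    """Return the relative change of each metric that exceeds the threshold."""
    flagged = {}
    for name, baseline_samples in baseline.items():
        base = mean(baseline_samples)
        current = mean(analysis[name])
        if base == 0:
            continue  # relative change is undefined; skipped in this sketch
        change = (current - base) / base
        if abs(change) > threshold:
            flagged[name] = change
    return flagged
```

For example, if average CPU utilization rises from 0.40 in the baseline to 0.90 in the analysis window while QPS rises only from 100 to 110, this sketch flags CPU and leaves QPS unflagged.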
Recommendations
When Gemini Cloud Assist completes analysis, the Hypotheses section of the Investigation details pane lists actionable insights to help remediate the issue.
For some situations, based on the analysis, there might not be a recommendation.
Examples of system performance prompts
You can also use Gemini Cloud Assist to enter prompts to gather information about your system performance. Gemini Cloud Assist answers questions for the selected Cloud SQL instance.
| Prompt | Type of response |
|---|---|
| How many error log entries are there for this database instance in the last 7 days? | Summary of log entries grouped by their severity type. Gemini scopes the response by the time range filter selected in the instance performance chart. |
| What was the CPU utilization for this database instance around 2 PM today? | Metrics results in percentage range for CPU utilization within the time interval. |
Troubleshoot connectivity issues
You can start troubleshooting connectivity issues by using Gemini Cloud Assist or by initiating an investigation when connection errors occur. AI assistance evaluates several sources to identify why a client might encounter issues when trying to connect to a Cloud SQL database.
Investigate connectivity issues
To use AI assistance with troubleshooting connectivity issues, do the following:
- In the Cloud de Confiance console, go to the Cloud SQL Instances page.
- To open the Overview page of an instance, click the instance name.
- In the Resolve database issues with AI-assisted troubleshooting pane, click Explore investigations.
- In the Investigation options window, look for the Connection usage section.
- Optional: Select a specific analysis time period using the Time range filter, either 1 hour, 6 hours, 1 day, 7 days, or a custom range.
- Click Investigate.
Gemini initiates an automated analysis of your instance metadata, logs, and networking configuration. After the analysis is complete, the Investigation details pane displays the following sections:
- Issue: A summary of the connectivity failure, including affected resources and timestamps.
- Observations: Evidence gathered from signals such as when a database has reached its max_connections limit or active concurrent connections cross-referenced with instance metadata. Evidence can be used to determine whether a traffic spike or unclosed sessions might be the cause of instance downtime.
- Hypotheses: AI-generated root causes and remediation steps.
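As a rough illustration of the kind of signal described in the Observations section, the following sketch compares sampled active connection counts against the max_connections limit. The function name and input shape are hypothetical, and this isn't how Gemini Cloud Assist gathers its evidence.

```python
def connection_pressure(active_samples: list[int], max_connections: int) -> dict:
    """Summarize how close sampled active connections came to the limit."""
    peak = max(active_samples)
    return {
        "peak": peak,
        "utilization": peak / max_connections,
        "limit_reached": peak >= max_connections,
    }
```

A result where `limit_reached` is true suggests that new client connections might have been refused during the sampled period, which is consistent with either a traffic spike or sessions that were never closed.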
Examples of connectivity issue prompts
You can also use Gemini Cloud Assist to troubleshoot connectivity issues between a client and your Cloud SQL instance.
| Prompt | Type of response |
|---|---|
| Why am I seeing connection errors? | Gemini evaluates connections to your database and recommends improvements such as enabling managed connection pooling. |
Get index recommendations
You can obtain index recommendations from Cloud SQL in query insights. For more information about obtaining index recommendations, see Use index advisor.
Examples of index recommendation prompts
Use Gemini Cloud Assist to get more information about how to use indexes in your databases. Gemini Cloud Assist answers questions for the selected Cloud SQL instance.
| Prompt | Type of response |
|---|---|
| Show index recommendations for queries run in the last 7 days. | Guidance on the types of queries that can benefit from an index. |
Monitor active queries
Use the Query insights dashboard to monitor active queries, and if necessary, terminate long-running processes. For more information, see Monitor active queries.
Examples of active query prompts
Use Gemini Cloud Assist to find out more information about queries that cause high latency or CPU load. Gemini Cloud Assist answers questions for the selected Cloud SQL instance.
| Prompt | Type of response |
|---|---|
| What are the top queries currently running in my database? | Guidance on how to find the longest running and most resource-intensive queries. |
What's next
- Learn how to write better prompts.
- Learn how to use the Gemini Cloud Assist panel.
- Read Use Gemini for AI assistance and development.
- Learn how and when Gemini for Cloud de Confiance uses your data.
- Optimize high CPU usage
- Optimize high memory usage
- Use system insights to improve system performance
- Optimize queries with high memory usage
- Use index advisor
- Monitor active queries