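Some or all of the information on this page might not apply to Trusted Cloud by S3NS.
Troubleshoot concurrent operations
This page helps you resolve errors caused by concurrent operations in Google Kubernetes Engine (GKE).
This page is for application developers who want to understand why a deployment failed (for example, with a Cluster is running incompatible operation error). It is also for platform admins and operators who want to diagnose and resolve these concurrent operation errors at the cluster or node pool level. For more information about the common roles and example tasks that we reference in Trusted Cloud by S3NS content, see Common GKE user roles and tasks.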
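Understand concurrent operation errors
In GKE, a cluster operation is an action that modifies the state of your cluster's control plane or its node pools. GKE manages these operations, which can be initiated either by you or by GKE for maintenance purposes. Common cluster operations include the following:
- Creating or deleting the cluster.
- Upgrading the cluster's control plane version.
- Creating, updating, resizing, or deleting node pools.
- Modifying cluster-level settings, such as enabling or disabling features.
- Automatic control plane repairs initiated by GKE.
When you perform operations on your clusters, you might see error messages similar to the following: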
Cluster is running incompatible operation OPERATION_NAME
Cluster is currently being created, deleted, updated or repaired and cannot be updated
Operation OPERATION_NAME is currently ACTIONING cluster CLUSTER_NAME. Please wait and try again once it is done
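These errors can include the following values:
- OPERATION_NAME: the unique ID of an operation that's already running on your cluster. Use this name to track the status of the existing operation that's blocking your new operation from starting.
- ACTIONING: the action that's being performed on the cluster. For example, Creating or Updating.
- CLUSTER_NAME: the name of the cluster that the operation targets.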
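These errors occur because GKE limits how many operations can run at the same time, which prevents conflicts. Generally, GKE permits only one cluster-level operation, or one operation per node pool, to run concurrently. GKE also performs its own automatic actions, such as control plane upgrades. These actions count toward the limit and can temporarily block you from starting a new task.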
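When you script around these errors, it can help to pull the blocking operation's name out of the message. The following is a minimal sketch, not an official tool. The function name blocking_operation_from_error is hypothetical, and the sketch assumes that operation names start with operation-, as in the sample output on this page.

```shell
# Hypothetical helper: print the blocking operation name found in a GKE
# concurrent-operation error message, or print nothing if the message
# does not contain one.
# Assumption: operation names begin with "operation-", as in the sample
# output shown on this page.
blocking_operation_from_error() {
  # grep -o prints only the matching text; head keeps the first match.
  printf '%s\n' "$1" | grep -Eo 'operation-[A-Za-z0-9-]+' | head -n 1
}
```

If the function prints nothing, the message did not name an operation, and you need to list the cluster's unfinished operations instead.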
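Resolve concurrent operation errors
If you receive an error that indicates that another operation is in progress, identify the ongoing task and wait for it to finish:
If you don't know the name of the blocking operation, list all ongoing and pending operations for your cluster: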
gcloud container operations list \
--location=LOCATION \
--filter '(targetLink~/clusters/CLUSTER_NAME$ OR targetLink~/clusters/CLUSTER_NAME/) AND status!=DONE' \
--format json
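Replace the following:
- LOCATION: the Compute Engine region or zone of the cluster (for example, us-central1 or us-central1-a), depending on whether your cluster is regional or zonal.
- CLUSTER_NAME: the name of the cluster with the failing operation.
The output is similar to the following: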
{
"name": "operation-0978307200000-00112233-4455-6677-8899-aabbccddeeff",
"operationType": "UPDATE_CLUSTER",
"selfLink": "https://container.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/operation-0978307200000-00112233-4455-6677-8899-aabbccddeeff",
"startTime": "2001-01-01T00:00:00.000000000Z",
"status": "RUNNING",
"targetLink": "https://container.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/clusters/CLUSTER_NAME/nodePools/NODE_POOL_NAME",
"zone": "LOCATION"
}
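In the output, review the name field for each operation. This value is the name of the operation that's blocking your new operation from starting. You need this value for the next step.
For more information about the other fields in the output, see the API documentation for projects.locations.operations.
Wait for the operation to complete: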
gcloud container operations wait OPERATION_NAME \
--location=LOCATION
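Replace OPERATION_NAME with the name of a blocking operation from an error message or the preceding step.
This command actively monitors the operation and exits when the operation is complete.
After the blocking operation has a status of DONE, retry the operation that caused the error.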
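The steps above can be combined in a small script. The following is a sketch under stated assumptions, not a supported workflow. wait_then_retry is a hypothetical helper name. It uses --format 'value(name)' so that the list command prints one operation name per line, and it assumes that the gcloud CLI is installed and authenticated.

```shell
# Sketch: wait for every unfinished operation on a cluster, then run a command.
# Usage: wait_then_retry LOCATION CLUSTER_NAME COMMAND [ARGS...]
# wait_then_retry is a hypothetical name, not part of the gcloud CLI.
wait_then_retry() {
  location=$1
  cluster=$2
  shift 2

  # Same filter as the list command on this page.
  filter="(targetLink~/clusters/${cluster}\$ OR targetLink~/clusters/${cluster}/) AND status!=DONE"

  # value(name) prints only the operation names, one per line.
  for op in $(gcloud container operations list \
      --location="$location" \
      --filter "$filter" \
      --format 'value(name)'); do
    # Block until this operation finishes; stop if waiting fails.
    gcloud container operations wait "$op" --location="$location" || return 1
  done

  # Retry the command that originally failed.
  "$@"
}
```

To use the sketch, pass the command that originally failed as the trailing arguments, for example wait_then_retry LOCATION CLUSTER_NAME YOUR_COMMAND. A new operation can still start between the wait and the retry, so the retried command might need another attempt.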
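What's next
- If you can't find a solution to your problem in the documentation, see Get support for further help, including advice on the following topics:
  - Opening a support case by contacting Cloud Customer Care.
  - Getting support from the community by asking questions on StackOverflow and using the google-kubernetes-engine tag to search for similar issues. You can also join the #kubernetes-engine Slack channel for more community support.
  - Opening bugs or feature requests by using the public issue tracker.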
Last updated (UTC): 2025-08-08.