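Simulate a zone outage for a regional MIG

To test whether your regional managed instance group (MIG) is overprovisioned enough to survive a zone outage, you can use the following example to simulate a zonal failure.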

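Before you begin

To use the command-line examples in this guide, install the Google Cloud CLI.
If you haven't already, set up authentication. Authentication verifies your identity for access to Cloud de Confiance by S3NS services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
Select the tab for how you plan to use the samples on this page:
gcloud
1. Install the Google Cloud CLI, and then sign in to the gcloud CLI with your federated identity. After signing in, initialize the Google Cloud CLI by running the following command:
  gcloud init
  Note: If you previously installed the gcloud CLI, run gcloud components update to make sure that you have the latest version.
2. Set a default region and zone.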
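
As a minimal sketch of step 2, the defaults can be set with `gcloud config set`. The function name and the example region and zone in the usage note are illustrative assumptions; use the location of your own MIG.

```shell
# Set the default Compute Engine region and zone for the gcloud CLI.
# Arguments: region, zone.
set_compute_defaults() {
  gcloud config set compute/region "$1"
  gcloud config set compute/zone "$2"
}
```

For example, `set_compute_defaults europe-west1 europe-west1-b` would match the zone used later on this page.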
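REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
For more information, see "Authenticate for using REST" in the Cloud de Confiance authentication documentation.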

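Simulate a zone outage using a script

This script stops and starts Apache as the default scenario. If this doesn't apply to your application, replace the commands that stop and start Apache with your own failure and recovery scenario.

Deploy and run this script continuously on every VM in the group. You can do this by adding the script to the instance template, or by including the script in a custom image and using that image in the instance template.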

```
#!/usr/bin/env bash

# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -o nounset
set -o errexit
set -o pipefail

# Fetch a value from the metadata server at the given URL.
function GetMetadata() {
  curl -s "$1" -H "Metadata-Flavor: Google"
}

PROJECT_METADATA_URL="http://metadata.google.internal/computeMetadata/v1/project/attributes"
INSTANCE_METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance"
ZONE=$(GetMetadata "$INSTANCE_METADATA_URL/zone" | cut -d '/' -f 4)
INSTANCE_NAME=$(hostname)

# We keep track of the state to make sure failure and recovery is triggered only once.
STATE="healthy"
while true; do
  if [[ "$ZONE" = "$(GetMetadata "$PROJECT_METADATA_URL/failed_zone")" ]] && \
     [[ "$INSTANCE_NAME" = *"$(GetMetadata "$PROJECT_METADATA_URL/failed_instance_names")"* ]]; then
    if [[ "$STATE" = "healthy" ]]; then
      STATE="failure"
      # Do something to simulate failure here.
      echo "STARTING A FAILURE"
      /etc/init.d/apache2 stop
    fi
  else
    if [[ "$STATE" = "failure" ]] ; then
      STATE="healthy"
      # Do something to recover here.
      echo "RECOVERING FROM FAILURE"
      /etc/init.d/apache2 start
    fi
  fi
  sleep 5
done
```
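
One way to add the script above to an instance template is to attach it as a startup script. The following is a sketch under assumptions, not the only approach: the template name `outage-test-template`, the local file name `simulate_outage.sh`, and the function name are hypothetical, and a real template would usually need further flags such as the machine type and image.

```shell
# Hypothetical names; replace them with your own.
TEMPLATE_NAME="outage-test-template"
SCRIPT_FILE="simulate_outage.sh"

# Create an instance template whose VMs run the script at boot.
create_outage_template() {
  gcloud compute instance-templates create "$TEMPLATE_NAME" \
    --metadata-from-file=startup-script="$SCRIPT_FILE"
}
```

Because the script loops forever, you might prefer a startup script that launches it in the background; that choice is left open here.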

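Simulate a zone failure by setting these two project metadata fields:
- failed_zone: Sets the zone where you want to simulate the outage (limits the failure to a single zone).
- failed_instance_names: Chooses the VMs to take offline by name (limits the failure to VMs whose names contain this string).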
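
How the two fields combine can be restated as a small standalone check. This is an illustrative rewrite of the script's condition, not part of the script itself; the function name `should_fail` is an assumption.

```shell
# Succeeds (exit status 0) when a VM should simulate a failure.
# Arguments: VM zone, VM name, failed_zone value, failed_instance_names value.
should_fail() {
  # The VM's zone must equal failed_zone exactly.
  [ "$1" = "$3" ] || return 1
  # The VM's name must contain failed_instance_names as a substring.
  case "$2" in
    *"$4"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With these rules, a VM named `base-instance-name-x7k2` in `europe-west1-b` is affected when `failed_zone` is `europe-west1-b` and `failed_instance_names` is `base-instance-name-`, while the same VM in another zone is not.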
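You can set this metadata using the gcloud CLI. For example, the following command sets the zone outage to the europe-west1-b zone and affects VMs whose names contain base-instance-name-, such as VMs whose names start with that prefix: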
```
gcloud compute project-info add-metadata --metadata failed_zone='europe-west1-b',failed_instance_names='base-instance-name-'
```
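
To confirm what the script sees, you can query the same metadata endpoint from any VM in the group. The sketch below reuses the URL and header from the script; the function name `show_project_metadata` is an assumption for illustration.

```shell
# Same project metadata endpoint that the simulation script polls.
PROJECT_METADATA_URL="http://metadata.google.internal/computeMetadata/v1/project/attributes"

# Print the current value of a project metadata key, as seen from this VM.
# Argument: metadata key, for example failed_zone.
show_project_metadata() {
  curl -s "$PROJECT_METADATA_URL/$1" -H "Metadata-Flavor: Google"
}
```

Running `show_project_metadata failed_zone` on a VM prints the value that the script compares against the VM's own zone.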

After you finish simulating the outage, recover from the failure by removing the metadata keys:

```
gcloud compute project-info remove-metadata --keys failed_zone,failed_instance_names
```
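
For a time-boxed drill, the commands that set and remove the metadata can be combined so that recovery is not forgotten. This is a sketch only: the function name and the duration argument are assumptions, and it does not handle interruption (for example, stopping it during the wait leaves the metadata set).

```shell
# Simulate an outage in one zone for a fixed time, then recover.
# Arguments: zone, VM name substring, duration in seconds.
run_outage_drill() {
  # Start the simulated outage.
  gcloud compute project-info add-metadata \
    --metadata "failed_zone=$1,failed_instance_names=$2"
  # Keep the outage active for the requested duration.
  sleep "$3"
  # End the simulated outage.
  gcloud compute project-info remove-metadata \
    --keys failed_zone,failed_instance_names
}
```

For example, `run_outage_drill europe-west1-b base-instance-name- 600` would keep the simulated outage active for ten minutes.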

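Here are some suggestions for failure scenarios that you can run with this script:

- Stop your application completely to see how the MIG responds.
- Make your VMs return as "unhealthy" on load balancing health checks.
- Modify iptables to block some of the traffic to and from the VM.
- Shut down the VMs. By default, the regional MIG recreates them shortly afterward, but as long as the metadata values are set, each new VM shuts itself down as soon as the script runs. This results in a crash loop.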
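
As one illustration of the iptables scenario, the commands that stop and start Apache in the script could be swapped for functions like these. The port number and function names are assumptions; the rule drops inbound TCP traffic to the application port, so health checks that probe that port would likely fail as well.

```shell
# Hypothetical replacements for the "apache2 stop" and "apache2 start" lines.
# APP_PORT is an assumed application port; adjust it for your service.
APP_PORT=80

# Drop inbound TCP traffic to the application port to simulate a failure.
simulate_failure() {
  iptables -A INPUT -p tcp --dport "$APP_PORT" -j DROP
}

# Delete the same rule to recover.
recover_from_failure() {
  iptables -D INPUT -p tcp --dport "$APP_PORT" -j DROP
}
```

Call `simulate_failure` where the script stops Apache and `recover_from_failure` where it starts Apache. Both functions need root privileges.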

What's next

Learn how to build scalable and resilient web applications.
Learn about disaster recovery on Google Cloud Platform.
