Datadog 公式で Terraform を使った管理方法のブログが公開されていました。(多分2017/04/07公開)

Managing Datadog with Terraform

Datadog Provider は結構前から用意されていたようですが、触れたことが無かったので、ほぼDatadogブログの内容のままですが実際に使用してみます。

Datadog Provider
Datadog API Key 設定
- tfvars
- tf
- plan
実行例(Monitor)
AWS EC2 インスタンス起動と合わせて Monitor 作成
まとめ

Datadog Provider

Terraform 公式ドキュメントは以下です。

Provider: Datadog - Terraform by HashiCorp

管理できるリソースは今の所以下の4つです。

Downtime
Monitor
Timeboard
User

Datadog API Key 設定

Terraform はインストール済みの前提です。今回利用したバージョンは Terraform v0.9.2

tfvars

Datadog の API Key を設定した tfvars ファイルを作成します。

$ cat terraform.tfvars
datadog_api_key="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
datadog_app_key="YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY"

tf

tfvars から API Key を読み込みます。main.tfとして作成しました。

$ cat main.tf
# Variables
variable "datadog_api_key" {}
variable "datadog_app_key" {}

# Configure the Datadog provider
provider "datadog" {
  api_key = "${var.datadog_api_key}"
  app_key = "${var.datadog_app_key}"
}

plan

API Key 設定を行った後、リソース部分が空の状態で plan 実行した結果が以下になります。エラー出力されるようなら設定に何かしら誤りがあります。

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
# Variables
persisted to local or remote state storage.

No changes. Infrastructure is up-to-date.

This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, Terraform
doesn't need to do anything.

実行例(Monitor)

モニター定義を管理します。

パラメータ未指定での実行

パラメータを未指定で実行すると必須パラメータ不足のエラーとなります。

cat monitor.tf
# Monitors
resource "datadog_monitor" "cpumonitor" {
}
$ terraform plan
4 error(s) occurred:

* datadog_monitor.cpumonitors: "message": required field is not set
* datadog_monitor.cpumonitors: "name": required field is not set
* datadog_monitor.cpumonitors: "query": required field is not set
* datadog_monitor.cpumonitors: "type": required field is not set

必須パラメータ指定での実行

必須パラメータ name,type,message,query を指定してplan実行します。

$ cat monitor.tf
# Monitors
resource "datadog_monitor" "cpumonitor" {
  name = "cpu monitor"
  type = "metric alert"
  message = "CPU usage alert"
  query = "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"
}
$ terraform plan
・・・
+ datadog_monitor.cpumonitor
    include_tags:        "true"
    message:             "CPU usage alert"
    name:                "cpu monitor"
    new_host_delay:      "<computed>"
    notify_no_data:      "false"
    query:               "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"
    require_full_window: "true"
    type:                "metric alert"


Plan: 1 to add, 0 to change, 0 to destroy.

plan は問題無いので apply を実行します。

$ terraform apply
datadog_monitor.cpumonitor: Creating...
  include_tags:        "" => "true"
  message:             "" => "CPU usage alert"
  name:                "" => "cpu monitor"
  new_host_delay:      "" => "<computed>"
  notify_no_data:      "" => "false"
  query:               "" => "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"
  require_full_window: "" => "true"
  type:                "" => "metric alert"
datadog_monitor.cpumonitor: Creation complete (ID: XXXX732)

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

実行結果

事前にインポートを行っていない限りは、新規作成となります。

f:id:htnosm:20170409231844p:plain

$ terraform show
datadog_monitor.cpumonitor:
  id = XXXX732
  include_tags = true
  message = CPU usage alert
  name = cpu monitor
  notify_no_data = false
  query = avg(last_1m):avg:system.cpu.system{*} by {host} > 60
  require_full_window = true
  type = metric alert
$ cat terraform.tfstate
{
    "version": 3,
    "terraform_version": "0.9.2",
    "serial": 0,
    "lineage": "f751ef78-ced3-4035-896b-aa0008b760e3",
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {},
            "resources": {
                "datadog_monitor.cpumonitor": {
                    "type": "datadog_monitor",
                    "depends_on": [],
                    "primary": {
                        "id": "XXXX732",
                        "attributes": {
                            "id": "XXXX732",
                            "include_tags": "true",
                            "message": "CPU usage alert",
                            "name": "cpu monitor",
                            "notify_no_data": "false",
                            "query": "avg(last_1m):avg:system.cpu.system{*} by {host} \u003e 60",
                            "require_full_window": "true",
                            "type": "metric alert"
                        },
                        "meta": {},
                        "tainted": false
                    },
                    "deposed": [],
                    "provider": ""
                }
            },
            "depends_on": []
        }
    ]
}

変更無しの状態で確認

何も変更を行っていない状態で、更新が掛からない事を確認します。

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
No changes. Infrastructure is up-to-date.

This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, Terraform
doesn't need to do anything.
$ terraform apply
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

更新

閾値を追加して更新を行います。

$ cat monitor.tf
# Monitors
resource "datadog_monitor" "cpumonitor" {
  name = "cpu monitor"
  type = "metric alert"
  message = "CPU usage alert"
  query = "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"
  thresholds {
    ok = 20
    warning = 50
    critical = 60
  }
}
$ terraform plan
・・・
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
・・・

~ datadog_monitor.cpumonitor
    thresholds.%:        "0" => "3"
    thresholds.critical: "" => "60"
    thresholds.ok:       "" => "20"
    thresholds.warning:  "" => "50"


Plan: 0 to add, 1 to change, 0 to destroy.
$ terraform apply
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
datadog_monitor.cpumonitor: Modifying... (ID: XXXX732)
  thresholds.%:        "0" => "3"
  thresholds.critical: "" => "60"
  thresholds.ok:       "" => "20"
  thresholds.warning:  "" => "50"
datadog_monitor.cpumonitor: Modifications complete (ID: XXXX732)

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
・・・

更新結果

閾値設定されたことを確認できます。

f:id:htnosm:20170409231845p:plain

$ terraform show
datadog_monitor.cpumonitor:
  id = XXXX732
  escalation_message =
  include_tags = true
  locked = false
  message = CPU usage alert
  name = cpu monitor
  new_host_delay = 300
  no_data_timeframe = 0
  notify_audit = false
  notify_no_data = false
  query = avg(last_1m):avg:system.cpu.system{*} by {host} > 60
  renotify_interval = 0
  require_full_window = true
  silenced.% = 0
  tags.# = 0
  thresholds.% = 3
  thresholds.critical = 60.0
  thresholds.ok = 20.0
  thresholds.warning = 50.0
  timeout_h = 0
  type = metric alert

show結果を見ると、指定していないパラメータについても値が出力されています。
注意点としては、これらのデフォルト値は Datadog API ではなく、 Terraform provider 側で指定される事です。

Terraform でのデフォルト値は以下を参照
- Argument Reference

削除

Terraformで管理している設定の削除を実行します。

$ terraform destroy
Do you really want to destroy?
  Terraform will delete all your managed infrastructure.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
datadog_monitor.cpumonitor: Destroying... (ID: XXXX732)
datadog_monitor.cpumonitor: Destruction complete

Destroy complete! Resources: 1 destroyed.

f:id:htnosm:20170409231846p:plain

AWS EC2 インスタンス起動と合わせて Monitor 作成

他Providerと組み合わせる例として EC2インスタンスとの連携例がありました。

ec2.tf

$ cat ec2.tf
# Configure the AWS Provider
provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "ap-northeast-1"
}

resource "aws_instance" "base" {
  ami = "ami-859bbfe2" # Amazon Linux AMI 2017.03.0 (HVM), SSD Volume Type
  instance_type = "t2.micro"
}

resource "datadog_monitor" "cpumonitor" {
  name = "cpu monitor ${aws_instance.base.id}"
  type = "metric alert"
  message = "CPU usage alert"
  query = "avg(last_1m):avg:system.cpu.system{host:${aws_instance.base.id}} by {host} > 10"
  new_host_delay = 30
}

plan

$ terraform plan
・・・
+ aws_instance.base
    ami:                         "ami-859bbfe2"
    associate_public_ip_address: "<computed>"
    availability_zone:           "<computed>"
    ebs_block_device.#:          "<computed>"
    ephemeral_block_device.#:    "<computed>"
    instance_state:              "<computed>"
    instance_type:               "t2.micro"
    ipv6_addresses.#:            "<computed>"
    key_name:                    "<computed>"
    network_interface_id:        "<computed>"
    placement_group:             "<computed>"
    private_dns:                 "<computed>"
    private_ip:                  "<computed>"
    public_dns:                  "<computed>"
    public_ip:                   "<computed>"
    root_block_device.#:         "<computed>"
    security_groups.#:           "<computed>"
    source_dest_check:           "true"
    subnet_id:                   "<computed>"
    tenancy:                     "<computed>"
    vpc_security_group_ids.#:    "<computed>"

+ datadog_monitor.cpumonitor
    include_tags:        "true"
    message:             "CPU usage alert"
    name:                "cpu monitor ${aws_instance.base.id}"
    new_host_delay:      "30"
    notify_no_data:      "false"
    query:               "avg(last_1m):avg:system.cpu.system{host:${aws_instance.base.id}} by {host} > 10"
    require_full_window: "true"
    type:                "metric alert"


Plan: 2 to add, 0 to change, 0 to destroy.

apply

$ terraform apply
aws_instance.base: Creating...
  ami:                         "" => "ami-859bbfe2"
  associate_public_ip_address: "" => "<computed>"
  availability_zone:           "" => "<computed>"
  ebs_block_device.#:          "" => "<computed>"
  ephemeral_block_device.#:    "" => "<computed>"
  instance_state:              "" => "<computed>"
  instance_type:               "" => "t2.micro"
  ipv6_addresses.#:            "" => "<computed>"
  key_name:                    "" => "<computed>"
  network_interface_id:        "" => "<computed>"
  placement_group:             "" => "<computed>"
  private_dns:                 "" => "<computed>"
  private_ip:                  "" => "<computed>"
  public_dns:                  "" => "<computed>"
  public_ip:                   "" => "<computed>"
  root_block_device.#:         "" => "<computed>"
  security_groups.#:           "" => "<computed>"
  source_dest_check:           "" => "true"
  subnet_id:                   "" => "<computed>"
  tenancy:                     "" => "<computed>"
  vpc_security_group_ids.#:    "" => "<computed>"
aws_instance.base: Still creating... (10s elapsed)
aws_instance.base: Still creating... (20s elapsed)
aws_instance.base: Creation complete (ID: i-0XXXXXXXXXXX6f52e)
datadog_monitor.cpumonitor: Creating...
  include_tags:        "" => "true"
  message:             "" => "CPU usage alert"
  name:                "" => "cpu monitor i-0XXXXXXXXXXX6f52e"
  new_host_delay:      "" => "30"
  notify_no_data:      "" => "false"
  query:               "" => "avg(last_1m):avg:system.cpu.system{host:i-0XXXXXXXXXXX6f52e} by {host} > 10"
  require_full_window: "" => "true"
  type:                "" => "metric alert"
datadog_monitor.cpumonitor: Creation complete (ID: XXXX862)

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
・・・

EC2インスタンス起動

|        AZ        |      InstanceId       | InstanceType  |  State   |
+------------------+-----------------------+---------------+----------+
|  ap-northeast-1a |  i-0XXXXXXXXXXX6f52e  |  t2.micro     |  running |

インスタンスIDを指定した Monitor 作成

f:id:htnosm:20170409233431p:plain

WebUI上での手動更新を行う

変更点が無い状態であることを確認します。

$ terraform plan
・・・
aws_instance.base: Refreshing state... (ID: i-0XXXXXXXXXXX6f52e)
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX862)
No changes. Infrastructure is up-to-date.
・・・

DatadogのWebUI上でMonitorを更新します。

f:id:htnosm:20170409231848p:plain

再度planを実行します。

$ terraform plan -target datadog_monitor.cpumonitor
・・・
aws_instance.base: Refreshing state... (ID: i-0XXXXXXXXXXX6f52e)
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX862)
・・・
~ datadog_monitor.cpumonitor
    name:                "cpu monitor terraform-dd-test" => "cpu monitor i-0XXXXXXXXXXX6f52e"
    no_data_timeframe:   "2" => "0"
    thresholds.%:        "1" => "0"
    thresholds.critical: "10.0" => ""


Plan: 0 to add, 1 to change, 0 to destroy.

変更をしていないパラメータも変更有りと認識されるようになってしまいました。 Datadog Provider に限りませんが、意図しない更新には注意が必要です。

まとめ

Terraform でのホスト管理にDatadog監視設定も併せて設定できます。 Datadog上のリソースはID指定となっているため、他の設定に影響することも無く、使い勝手は良いと思います。
期間限定で起動するインスタンスで他の設定に影響を与えず、管理・更新する等で使えそうです。

vague memory

うろ覚えを無くしていこうともがき苦しむ人の備忘録

Terraform Datadog Provider を試してみる

Datadog Provider

Datadog API Key 設定

tfvars

tf

plan

実行例(Monitor)

パラメータ未指定での実行

必須パラメータ指定での実行

実行結果

変更無しの状態で確認

更新

更新結果

削除

AWS EC2 インスタンス起動と合わせて Monitor 作成

ec2.tf

plan

apply

WebUI上での手動更新を行う

まとめ