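Datadog has published an official blog post on managing Datadog with Terraform (probably published on 2017/04/07).
The Datadog provider seems to have been available for quite a while,
but I had never used it, so here I try it out myself, following the Datadog blog post almost as written.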
Datadog Provider
See the official Terraform documentation for the Datadog provider.
At the moment, the following four resources can be managed.
- Downtime
- Monitor
- Timeboard
- User
Datadog API Key Setup
This assumes Terraform is already installed. The version used here is Terraform v0.9.2.
tfvars
Create a tfvars file containing the Datadog API key and application key.
$ cat terraform.tfvars
datadog_api_key="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
datadog_app_key="YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY"
tf
Load the keys from tfvars. I created the file as main.tf.
$ cat main.tf
# Variables
variable "datadog_api_key" {}
variable "datadog_app_key" {}
# Configure the Datadog provider
provider "datadog" {
  api_key = "${var.datadog_api_key}"
  app_key = "${var.datadog_app_key}"
}
plan
After setting the keys, running plan with no resources defined produces the output below.
If an error is printed instead, something in the configuration is wrong.
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
No changes. Infrastructure is up-to-date.
This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, Terraform
doesn't need to do anything.
Example (Monitor)
Here we manage a monitor definition.
Running with no parameters
Running with no parameters specified fails with errors about missing required parameters.
$ cat monitor.tf
# Monitors
resource "datadog_monitor" "cpumonitor" {
}
$ terraform plan
4 error(s) occurred:
* datadog_monitor.cpumonitors: "message": required field is not set
* datadog_monitor.cpumonitors: "name": required field is not set
* datadog_monitor.cpumonitors: "query": required field is not set
* datadog_monitor.cpumonitors: "type": required field is not set
Running with the required parameters
Specify the required parameters name, type, message, and query, then run plan.
$ cat monitor.tf
# Monitors
resource "datadog_monitor" "cpumonitor" {
  name    = "cpu monitor"
  type    = "metric alert"
  message = "CPU usage alert"
  query   = "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"
}
$ terraform plan
・・・
+ datadog_monitor.cpumonitor
include_tags: "true"
message: "CPU usage alert"
name: "cpu monitor"
new_host_delay: "<computed>"
notify_no_data: "false"
query: "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"
require_full_window: "true"
type: "metric alert"
Plan: 1 to add, 0 to change, 0 to destroy.
The plan looks fine, so run apply.
$ terraform apply
datadog_monitor.cpumonitor: Creating...
include_tags: "" => "true"
message: "" => "CPU usage alert"
name: "" => "cpu monitor"
new_host_delay: "" => "<computed>"
notify_no_data: "" => "false"
query: "" => "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"
require_full_window: "" => "true"
type: "" => "metric alert"
datadog_monitor.cpumonitor: Creation complete (ID: XXXX732)
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Result
Unless the monitor was imported beforehand, it is created as a new resource.
$ terraform show
datadog_monitor.cpumonitor:
id = XXXX732
include_tags = true
message = CPU usage alert
name = cpu monitor
notify_no_data = false
query = avg(last_1m):avg:system.cpu.system{*} by {host} > 60
require_full_window = true
type = metric alert
$ cat terraform.tfstate
{
"version": 3,
"terraform_version": "0.9.2",
"serial": 0,
"lineage": "f751ef78-ced3-4035-896b-aa0008b760e3",
"modules": [
{
"path": [
"root"
],
"outputs": {},
"resources": {
"datadog_monitor.cpumonitor": {
"type": "datadog_monitor",
"depends_on": [],
"primary": {
"id": "XXXX732",
"attributes": {
"id": "XXXX732",
"include_tags": "true",
"message": "CPU usage alert",
"name": "cpu monitor",
"notify_no_data": "false",
"query": "avg(last_1m):avg:system.cpu.system{*} by {host} \u003e 60",
"require_full_window": "true",
"type": "metric alert"
},
"meta": {},
"tainted": false
},
"deposed": [],
"provider": ""
}
},
"depends_on": []
}
]
}
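The state above still has an empty "outputs" map. As a small addition, an output can expose the ID that Datadog assigned to the monitor. This is a minimal sketch assuming Terraform 0.9 interpolation syntax; the output name cpumonitor_id is my own choice and does not come from the Datadog blog.

```hcl
# Hypothetical output name; the value is the monitor ID stored in the state.
output "cpumonitor_id" {
  value = "${datadog_monitor.cpumonitor.id}"
}
```

After the next terraform apply, running terraform output cpumonitor_id should print the ID.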
Checking with no changes
With nothing changed, confirm that no update is applied.
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
No changes. Infrastructure is up-to-date.
This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, Terraform
doesn't need to do anything.
$ terraform apply
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Update
Add thresholds and update the monitor.
$ cat monitor.tf
# Monitors
resource "datadog_monitor" "cpumonitor" {
  name    = "cpu monitor"
  type    = "metric alert"
  message = "CPU usage alert"
  query   = "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"

  thresholds {
    ok       = 20
    warning  = 50
    critical = 60
  }
}
$ terraform plan
・・・
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
・・・
~ datadog_monitor.cpumonitor
thresholds.%: "0" => "3"
thresholds.critical: "" => "60"
thresholds.ok: "" => "20"
thresholds.warning: "" => "50"
Plan: 0 to add, 1 to change, 0 to destroy.
$ terraform apply
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
datadog_monitor.cpumonitor: Modifying... (ID: XXXX732)
thresholds.%: "0" => "3"
thresholds.critical: "" => "60"
thresholds.ok: "" => "20"
thresholds.warning: "" => "50"
datadog_monitor.cpumonitor: Modifications complete (ID: XXXX732)
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
・・・
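In the update above, the critical value 60 is written twice: once in the query and once in the thresholds block. The sketch below keeps the two in sync through a single variable. The variable name cpu_critical is hypothetical, and I have not run this variant against Datadog.

```hcl
# Hypothetical variable holding the critical threshold in one place.
variable "cpu_critical" {
  default = "60"
}

resource "datadog_monitor" "cpumonitor" {
  name    = "cpu monitor"
  type    = "metric alert"
  message = "CPU usage alert"

  # The query and the critical threshold both read the same variable.
  query = "avg(last_1m):avg:system.cpu.system{*} by {host} > ${var.cpu_critical}"

  thresholds {
    ok       = 20
    warning  = 50
    critical = "${var.cpu_critical}"
  }
}
```

Changing the default (or overriding it in terraform.tfvars) would then update both places in one plan.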
Update result
We can confirm that the thresholds have been set.
$ terraform show
datadog_monitor.cpumonitor:
id = XXXX732
escalation_message =
include_tags = true
locked = false
message = CPU usage alert
name = cpu monitor
new_host_delay = 300
no_data_timeframe = 0
notify_audit = false
notify_no_data = false
query = avg(last_1m):avg:system.cpu.system{*} by {host} > 60
renotify_interval = 0
require_full_window = true
silenced.% = 0
tags.# = 0
thresholds.% = 3
thresholds.critical = 60.0
thresholds.ok = 20.0
thresholds.warning = 50.0
timeout_h = 0
type = metric alert
Looking at the show output, values are also printed for parameters that were not specified.
Note that these defaults are set by the Terraform provider, not by the Datadog API.
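Because the defaults come from the provider, one option is to write the values you rely on into the configuration, so the monitor does not silently depend on them. The sketch below uses only attributes and values that appear in the show output above; which attributes are worth pinning is a judgment call on my part.

```hcl
resource "datadog_monitor" "cpumonitor" {
  name    = "cpu monitor"
  type    = "metric alert"
  message = "CPU usage alert"
  query   = "avg(last_1m):avg:system.cpu.system{*} by {host} > 60"

  # Values copied from the terraform show output, stated explicitly
  # instead of being left to the provider defaults.
  include_tags        = true
  notify_no_data      = false
  require_full_window = true
  new_host_delay      = 300
}
```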
Delete
Destroy the configuration managed by Terraform.
$ terraform destroy
Do you really want to destroy?
Terraform will delete all your managed infrastructure.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX732)
datadog_monitor.cpumonitor: Destroying... (ID: XXXX732)
datadog_monitor.cpumonitor: Destruction complete
Destroy complete! Resources: 1 destroyed.
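Creating a Monitor together with launching an AWS EC2 instance
As an example of combining the Datadog provider with another provider, the blog post showed an integration with an EC2 instance.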
ec2.tf
$ cat ec2.tf
# Configure the AWS Provider
provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "ap-northeast-1"
}

resource "aws_instance" "base" {
  ami           = "ami-859bbfe2" # Amazon Linux AMI 2017.03.0 (HVM), SSD Volume Type
  instance_type = "t2.micro"
}

resource "datadog_monitor" "cpumonitor" {
  name           = "cpu monitor ${aws_instance.base.id}"
  type           = "metric alert"
  message        = "CPU usage alert"
  query          = "avg(last_1m):avg:system.cpu.system{host:${aws_instance.base.id}} by {host} > 10"
  new_host_delay = 30
}
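The AWS provider block above references var.aws_access_key and var.aws_secret_key, so those variables have to be declared somewhere. A minimal sketch, assuming they are declared the same way as the Datadog keys in main.tf and that their values are added to terraform.tfvars:

```hcl
# Assumed declarations for the variables used by the AWS provider block.
# The values themselves would go into terraform.tfvars next to the Datadog keys.
variable "aws_access_key" {}
variable "aws_secret_key" {}
```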
plan
$ terraform plan
・・・
+ aws_instance.base
ami: "ami-859bbfe2"
associate_public_ip_address: "<computed>"
availability_zone: "<computed>"
ebs_block_device.#: "<computed>"
ephemeral_block_device.#: "<computed>"
instance_state: "<computed>"
instance_type: "t2.micro"
ipv6_addresses.#: "<computed>"
key_name: "<computed>"
network_interface_id: "<computed>"
placement_group: "<computed>"
private_dns: "<computed>"
private_ip: "<computed>"
public_dns: "<computed>"
public_ip: "<computed>"
root_block_device.#: "<computed>"
security_groups.#: "<computed>"
source_dest_check: "true"
subnet_id: "<computed>"
tenancy: "<computed>"
vpc_security_group_ids.#: "<computed>"
+ datadog_monitor.cpumonitor
include_tags: "true"
message: "CPU usage alert"
name: "cpu monitor ${aws_instance.base.id}"
new_host_delay: "30"
notify_no_data: "false"
query: "avg(last_1m):avg:system.cpu.system{host:${aws_instance.base.id}} by {host} > 10"
require_full_window: "true"
type: "metric alert"
Plan: 2 to add, 0 to change, 0 to destroy.
apply
$ terraform apply
aws_instance.base: Creating...
ami: "" => "ami-859bbfe2"
associate_public_ip_address: "" => "<computed>"
availability_zone: "" => "<computed>"
ebs_block_device.#: "" => "<computed>"
ephemeral_block_device.#: "" => "<computed>"
instance_state: "" => "<computed>"
instance_type: "" => "t2.micro"
ipv6_addresses.#: "" => "<computed>"
key_name: "" => "<computed>"
network_interface_id: "" => "<computed>"
placement_group: "" => "<computed>"
private_dns: "" => "<computed>"
private_ip: "" => "<computed>"
public_dns: "" => "<computed>"
public_ip: "" => "<computed>"
root_block_device.#: "" => "<computed>"
security_groups.#: "" => "<computed>"
source_dest_check: "" => "true"
subnet_id: "" => "<computed>"
tenancy: "" => "<computed>"
vpc_security_group_ids.#: "" => "<computed>"
aws_instance.base: Still creating... (10s elapsed)
aws_instance.base: Still creating... (20s elapsed)
aws_instance.base: Creation complete (ID: i-0XXXXXXXXXXX6f52e)
datadog_monitor.cpumonitor: Creating...
include_tags: "" => "true"
message: "" => "CPU usage alert"
name: "" => "cpu monitor i-0XXXXXXXXXXX6f52e"
new_host_delay: "" => "30"
notify_no_data: "" => "false"
query: "" => "avg(last_1m):avg:system.cpu.system{host:i-0XXXXXXXXXXX6f52e} by {host} > 10"
require_full_window: "" => "true"
type: "" => "metric alert"
datadog_monitor.cpumonitor: Creation complete (ID: XXXX862)
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
・・・
| AZ | InstanceId | InstanceType | State |
+------------------+-----------------------+---------------+----------+
| ap-northeast-1a | i-0XXXXXXXXXXX6f52e | t2.micro | running |
Manual update in the Web UI
First, confirm that there are no pending changes.
$ terraform plan
・・・
aws_instance.base: Refreshing state... (ID: i-0XXXXXXXXXXX6f52e)
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX862)
No changes. Infrastructure is up-to-date.
・・・
Update the monitor in the Datadog Web UI.
Then run plan again.
$ terraform plan -target datadog_monitor.cpumonitor
・・・
aws_instance.base: Refreshing state... (ID: i-0XXXXXXXXXXX6f52e)
datadog_monitor.cpumonitor: Refreshing state... (ID: XXXX862)
・・・
~ datadog_monitor.cpumonitor
name: "cpu monitor terraform-dd-test" => "cpu monitor i-0XXXXXXXXXXX6f52e"
no_data_timeframe: "2" => "0"
thresholds.%: "1" => "0"
thresholds.critical: "10.0" => ""
Plan: 0 to add, 1 to change, 0 to destroy.
Parameters that I did not change are now also detected as changed.
This is not limited to the Datadog provider, but watch out for unintended updates.
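One way to reduce this kind of diff may be to declare values that Datadog otherwise fills in by itself. In the plan above, Datadog holds thresholds.critical = 10.0, which matches the > 10 in the query, while the configuration has no thresholds block. The sketch below declares it explicitly. I have not verified that this removes the diff, and no_data_timeframe is left out because its interaction with notify_no_data is unclear to me.

```hcl
resource "datadog_monitor" "cpumonitor" {
  name           = "cpu monitor ${aws_instance.base.id}"
  type           = "metric alert"
  message        = "CPU usage alert"
  query          = "avg(last_1m):avg:system.cpu.system{host:${aws_instance.base.id}} by {host} > 10"
  new_host_delay = 30

  # Assumption: stating the critical threshold that matches the query keeps
  # the configuration aligned with what Datadog stores after a Web UI save.
  thresholds {
    critical = 10
  }
}
```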
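Summary
With Terraform, Datadog monitoring settings can be configured alongside host management.
Resources on Datadog are addressed by ID, so managing them this way does not affect other settings, which makes it quite usable.
It looks useful, for example, for managing and updating monitors for instances that run only for a limited time, without affecting other settings.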