A First Look at Terraform

Einic Yeo · September 14, 2021

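Introduction to Terraform

Terraform is an open-source resource orchestration tool released by HashiCorp in 2014. Almost every mainstream cloud provider now supports it, including Alibaba Cloud, Tencent Cloud, Huawei Cloud, AWS, Azure, and Baidu Cloud, and many companies build their infrastructure on Terraform.

Copyright notice: this article is licensed under CC 4.0 BY-SA. If you repost it, please include a link to the original source and this notice.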

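Background: Under the traditional operations model, launching a service meant going through a preparation phase of purchasing equipment, racking machines, building the network environment, and installing systems. With the rise of cloud computing, the major public cloud vendors all offer friendly web consoles, so with nothing more than a browser users can purchase cloud resources on demand and stand up a service architecture quickly. As architectures keep growing, however, so do the scale and variety of the resources being purchased. When a user needs to provision a large number of different resource types quickly, the many manual interactions across console pages actually slow provisioning down. Initializing a classic VPC network architecture in the Alibaba Cloud console, from creating the VPC and VSwitch, to creating the NAT gateway and elastic IP, to configuring routes, takes roughly 20 minutes or longer. And because the result of that work cannot be replicated, it has to be repeated by hand for every other region and every other cloud platform.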

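In fact, operations engineers care only about how resources are configured, not about the steps needed to create them. It is like ordering coffee: you only need to tell the server what you want and whether you want ice. Suppose you had a complete cloud resource purchase list that clearly recorded the type, specification, and quantity of every resource you need, together with the relationships between them. Suppose you could then complete the purchase in one step and, when business requirements change, simply edit the list to change the resources. Efficiency would improve considerably. In cloud computing this is called resource orchestration, and many cloud platforms already offer it, for example Alibaba Cloud ROS and AWS CloudFormation.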

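Defining cloud resources, services, or operational steps as code in templates, and using an orchestration engine to manage those resources automatically, is Infrastructure as Code (IaC), and it is the most efficient way to implement resource orchestration. However, having a separate orchestration service for each cloud brings a steep learning cost, poor code reuse, and complicated multi-cloud workflows. Each service can manage only its own vendor's platform and cannot manage resources across multiple clouds and multiple layers (such as IaaS and PaaS) in a unified way. Could a single orchestration tool with one shared syntax manage multiple clouds, Alibaba Cloud included? Terraform was created to solve exactly these problems.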

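Terraform Features and Role

Features

  • IaC: infrastructure as code, managing infrastructure with code
  • Execution plan: shows the operations that terraform apply will perform
  • Resource graph: builds a graph of all resources
  • Change automation: based on the execution plan and the resource graph, you can see exactly what will change and in what order

Summary: Terraform is used to initialize all kinds of infrastructure resources, supports multiple cloud platforms, and integrates with third-party services.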

Role

  • It wraps the APIs of different providers and abstracts them into Terraform's standard code structure
  • Users do not need to know the API details of each cloud vendor, which lowers the barrier to deployment

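Terraform Basic Architecture

Terraform is built on a plugin architecture, which makes it highly extensible and easy for developers to extend. Logically it has two layers: the core layer (Terraform Core) and the plugin layer (Terraform Provider).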

Core layer

The core layer is the terraform command-line tool itself, written in Go. It is responsible for:

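  1. Reading the .tf configuration and substituting variables
  2. Managing the resource state file
  3. Analyzing the relationships between resources and building the dependency graph
  4. Creating resources by walking the dependency graph: resources are created in dependency order, and resources that do not depend on one another are created in parallel (10 concurrent operations by default), which is why Terraform can manage cloud resources quickly and efficiently
  5. Calling the plugin layer over RPC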
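
The dependency handling in step 4 can be illustrated with a small configuration. This is only a sketch to be tried in a separate empty directory: it assumes the hashicorp/random and hashicorp/local providers, which this article does not otherwise use, and the resource names and file name are made up for illustration. The two random_pet resources do not reference each other, so Terraform is free to create them in parallel; local_file refers to both, so it is created only after they exist.

```hcl
terraform {
  required_providers {
    random = {
      source = "hashicorp/random"
    }
    local = {
      source = "hashicorp/local"
    }
  }
}

# These two resources do not reference each other,
# so Terraform may create them at the same time.
resource "random_pet" "first" {}

resource "random_pet" "second" {}

# This resource references both resources above (an implicit dependency),
# so it is created only after both of them exist.
resource "local_file" "summary" {
  filename = "${path.module}/pets.txt"
  content  = "${random_pet.first.id}\n${random_pet.second.id}\n"
}
```

The concurrency limit mentioned in step 4 can be changed with the -parallelism flag of terraform apply.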

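Plugin layer

The plugin layer is also written in Go. Terraform has more than 250 different plugins, which are responsible for:

  • Accepting RPC calls from the core layer
  • Carrying out the work of one specific service

There are two kinds of plugins: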

Provider

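A Provider integrates with an external API. The Alibaba Cloud Provider, for example, provides the ability to create, modify, and delete cloud resources on Alibaba Cloud. The plugin talks to the Alibaba Cloud API and provides a layer of abstraction on top of it, so that developers can orchestrate resources through Terraform without knowing the API details. It is responsible for:

  • Initialization and communication with the external API
  • Authenticating against the external API
  • Defining how cloud resources map to the external service

Some common providers: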

Alibaba Cloud: https://github.com/aliyun/terraform-provider-alicloud
Baidu Cloud: https://github.com/baidubce/terraform-provider-baiducloud
Tencent Cloud: https://github.com/tencentcloudstack/terraform-provider-tencentcloud
Huawei Cloud: https://github.com/huaweicloud/terraform-provider-huaweicloud
UCloud: https://github.com/ucloud/terraform-provider-ucloud
QingCloud: https://github.com/yunify/terraform-provider-qingcloud
AWS: https://github.com/hashicorp/terraform-provider-aws
Azure: https://github.com/terraform-providers/terraform-provider-azurerm
Google Cloud: https://github.com/hashicorp/terraform-provider-google

Provisioner

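A Provisioner runs scripts after a resource has been created or before it is destroyed. For example, the Puppet Provisioner can download, install, and configure the Puppet agent on a cloud virtual machine once that machine has been created.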
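
As a minimal sketch of how a provisioner is attached to a resource, the configuration below uses Terraform's built-in local-exec provisioner rather than the Puppet provisioner. It assumes the hashicorp/null provider, which this article does not otherwise use, and the resource name is hypothetical. It is meant to be tried in a separate empty directory.

```hcl
terraform {
  required_providers {
    null = {
      source = "hashicorp/null"
    }
  }
}

# A placeholder resource that exists only to carry the provisioners.
resource "null_resource" "provisioner_demo" {
  # Creation-time provisioner: runs once, on the machine running Terraform,
  # after this resource has been created.
  provisioner "local-exec" {
    command = "echo created"
  }

  # Destroy-time provisioner: runs when the resource is being destroyed,
  # before Terraform removes it.
  provisioner "local-exec" {
    when    = destroy
    command = "echo destroying"
  }
}
```

Provisioners do not run again on later applies unless the resource itself is recreated.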

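To make this easier to follow, here is a component architecture diagram found online that shows where each component sits:

[Figure: Terraform workflow]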

Installing Terraform

First, install Terraform. On Ubuntu:

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository -y "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install -y terraform

On CentOS:

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
sudo yum -y install terraform

On macOS:

brew tap hashicorp/tap
brew install hashicorp/tap/terraform

On Windows, the officially recommended package manager is choco. Download and install Chocolatey from https://chocolatey.org/, start PowerShell as administrator, and run:

choco install terraform

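If you would rather install it entirely by hand, download the executable for your operating system from the Terraform website (Terraform is written in Go and ships as a single executable), extract it to a location of your choice, and add that directory to your PATH environment variable.

Verifying Terraform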

terraform version
Terraform v0.13.5

terraform -help
Usage: terraform [-version] [-help] <command> [args]

The available commands for execution are listed below.
The most common, useful commands are shown first, followed by
less common or more advanced commands. If you're just getting
started with Terraform, stick with the common commands. For the
other commands, please read the help and docs before usage.
##...

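Terraform in Practice

Let's write a simple example using UCloud. Running it incurs a small charge; if you would rather not pay, just read through the workflow. There are many introductory Terraform articles, most of them based on AWS, and Alibaba Cloud and Tencent Cloud have introductory articles as well, so readers who do not want to register a UCloud account can look up an introduction for another public cloud. The focus of this tutorial series is on the later parts, so whichever public cloud you use, all you need here is the simplest hands-on experience of Terraform.

First, go to ucloud.cn and register a UCloud account, then log in to the console and obtain your PublicKey and PrivateKey. Next, visit https://accountv2.ucloud.cn/auth_manage/project to get the id of your default project.

Then create a clean, empty directory and, inside it, a main.tf file (tf stands for Terraform; most Terraform code lives in .tf files written in HCL. Terraform code can also be written in JSON, but we will stick to .tf for now):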

terraform {
  required_version = "~>0.13.5"
  required_providers {
    ucloud = {
      source  = "ucloud/ucloud"
      version = "~>1.22.0"
    }
  }
}

provider "ucloud" {
  public_key  = "JInqRnkSY8eAmxKFRxW9kVANYThg1pcvjD2Aw5f5p"
  private_key = "IlJn6GlmanYI1iDVEtrPyt5R9noAGz41B8q5TML7abqD8e4YjVdylwaKWdY61J5TcA"
  project_id  = "org-tgqbvi"
  region      = "cn-bj2"
}

data "ucloud_security_groups" "default" {
  type = "recommend_web"
}

data "ucloud_images" "default" {
  availability_zone = "cn-bj2-04"
  name_regex        = "^CentOS 6.5 64"
  image_type        = "base"
}

resource "ucloud_instance" "web" {
  availability_zone = "cn-bj2-04"
  image_id          = data.ucloud_images.default.images[0].id
  instance_type     = "n-basic-2"
  root_password     = "supersecret1234"
  name              = "tf-example-instance"
  tag               = "tf-example"
  boot_disk_type    = "cloud_ssd"

  security_group = data.ucloud_security_groups.default.security_groups[0].id

  delete_disks_with_instance = true

  user_data = <<EOF
#!/bin/bash
yum install -y nginx
service nginx start
EOF
}

resource "ucloud_eip" "web-eip" {
  internet_type = "bgp"
  charge_mode   = "bandwidth"
  charge_type   = "dynamic"
  name          = "web-eip"
}

resource "ucloud_eip_association" "web-eip-association" {
  eip_id      = ucloud_eip.web-eip.id
  resource_id = ucloud_instance.web.id
}

output "eip" {
  value = ucloud_eip.web-eip.public_ip
}

Be sure to change this part of the code:

provider "ucloud" {
  public_key  = "JInqRnkSY8eAmxKFRxW9kVANYThg1pcvjD2Aw5f5p"
  private_key = "IlJn6GlmanYI1iDVEtrPyt5R9noAGz41B8q5TML7abqD8e4YjVdylwaKWdY61J5TcA"
  project_id  = "org-tgqbvi"
  region      = "cn-bj2"
}

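Replace public_key, private_key, and project_id here with the access keys and project id you obtained earlier. The key and project id shown in the code have already been deleted by me. It must be stressed that hardcoding secrets in code like this is very bad practice and is done here only for the convenience of the demonstration. Never commit code containing your own secrets to a source control system, and remember to reset your keys after the exercise.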
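
One common way to keep the keys out of main.tf is to declare them as input variables and supply the values from outside the code. The sketch below assumes the variable names ucloud_public_key, ucloud_private_key, and ucloud_project_id, which are made up for this illustration. It would replace the provider block shown above rather than sit alongside it.

```hcl
# Hypothetical variable names; no default values, so nothing secret
# is stored in the code itself.
variable "ucloud_public_key" {
  type        = string
  description = "UCloud API public key"
}

variable "ucloud_private_key" {
  type        = string
  description = "UCloud API private key"
}

variable "ucloud_project_id" {
  type        = string
  description = "UCloud project id"
}

# Same provider arguments as in the example, now read from variables.
provider "ucloud" {
  public_key  = var.ucloud_public_key
  private_key = var.ucloud_private_key
  project_id  = var.ucloud_project_id
  region      = "cn-bj2"
}
```

Terraform reads the value of a variable named X from the environment variable TF_VAR_X, so the keys can be exported in the shell (for example as TF_VAR_ucloud_public_key) before running terraform plan, or placed in a .tfvars file that is kept out of version control.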

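The code is fairly simple. The terraform block at the top declares the Terraform version and the UCloud plugin version this code requires, and the provider block that follows supplies the key, project id, and other information needed to call the UCloud API.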
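
The ~> operator used in these version constraints allows only the rightmost version component to increase. As a sketch of what that means, the block below spells out the same constraints as explicit ranges. It is an equivalent rewrite for illustration, not a second block to add to main.tf.

```hcl
terraform {
  # "~>0.13.5" accepts 0.13.5 and later 0.13.x patch releases, but not 0.14.0.
  required_version = ">= 0.13.5, < 0.14.0"

  required_providers {
    ucloud = {
      source = "ucloud/ucloud"
      # "~>1.22.0" accepts 1.22.0 and later 1.22.x patch releases, but not 1.23.0.
      version = ">= 1.22.0, < 1.23.0"
    }
  }
}
```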

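The code that actually defines the cloud infrastructure is what follows, in three parts: data, resource, and output.

data means querying UCloud through a data model defined by the UCloud plugin. In the code, for example, we use data to look up the id of the official CentOS 6.5 x64 host image in the cn-bj2-04 availability zone, and the id of the official default security group for web servers (think of it as a firewall). That way we do not have to look up these ids by hand in the console and hardcode them.

resource stands for the resources to be created in the cloud. The example creates three: a host, an elastic public IP, and the binding between the host and the public IP.

When defining the host we specify key information such as the instance size and system disk type, and through user_data we define an initialization script to run on first boot. The script installs nginx on this CentOS server and starts it.

Finally, we declare an output named eip, whose value is the address of the elastic public IP we create.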
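
Outputs can expose other values Terraform knows about, not just the EIP. As a sketch, the blocks below could be added to the same main.tf. The output names are made up here, while the attributes they read (images[0].id, private_ip, public_ip) are the ones that appear in this example's code and plan output.

```hcl
# Id of the image selected by the data query.
output "image_id" {
  value = data.ucloud_images.default.images[0].id
}

# Private address of the host, known only after apply.
output "private_ip" {
  value = ucloud_instance.web.private_ip
}

# URL built from the elastic public IP.
output "web_url" {
  value = "http://${ucloud_eip.web-eip.public_ip}"
}
```

After an apply, running terraform output prints all declared outputs again without changing anything.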

Running the code is simple. Open a command line in the directory containing the code and run:

$ terraform init

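Terraform now initializes the working directory, downloading the UCloud plugin for your operating system from the official plugin registry. If all goes well, you should see: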

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Next, preview the changes the code is about to make:

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.ucloud_images.default: Refreshing state...
data.ucloud_security_groups.default: Refreshing state...

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # ucloud_eip.web-eip will be created
  + resource "ucloud_eip" "web-eip" {
      + bandwidth     = (known after apply)
      + charge_mode   = "bandwidth"
      + charge_type   = "dynamic"
      + create_time   = (known after apply)
      + expire_time   = (known after apply)
      + id            = (known after apply)
      + internet_type = "bgp"
      + ip_set        = (known after apply)
      + name          = "web-eip"
      + public_ip     = (known after apply)
      + remark        = (known after apply)
      + resource      = (known after apply)
      + status        = (known after apply)
      + tag           = "Default"
    }

  # ucloud_eip_association.web-eip-association will be created
  + resource "ucloud_eip_association" "web-eip-association" {
      + eip_id        = (known after apply)
      + id            = (known after apply)
      + resource_id   = (known after apply)
      + resource_type = (known after apply)
    }

  # ucloud_instance.web will be created
  + resource "ucloud_instance" "web" {
      + auto_renew                 = (known after apply)
      + availability_zone          = "cn-bj2-04"
      + boot_disk_size             = (known after apply)
      + boot_disk_type             = "cloud_ssd"
      + charge_type                = (known after apply)
      + cpu                        = (known after apply)
      + cpu_platform               = (known after apply)
      + create_time                = (known after apply)
      + data_disk_size             = (known after apply)
      + data_disk_type             = (known after apply)
      + delete_disks_with_instance = true
      + disk_set                   = (known after apply)
      + expire_time                = (known after apply)
      + id                         = (known after apply)
      + image_id                   = "uimage-awndwi"
      + instance_type              = "n-basic-2"
      + ip_set                     = (known after apply)
      + isolation_group            = (known after apply)
      + memory                     = (known after apply)
      + name                       = "tf-example-instance"
      + private_ip                 = (known after apply)
      + remark                     = (known after apply)
      + root_password              = (sensitive value)
      + security_group             = "firewall-jofwjzmw"
      + status                     = (known after apply)
      + subnet_id                  = (known after apply)
      + tag                        = "tf-example"
      + user_data                  = <<~EOT
            #!/bin/bash
            yum install -y nginx
            service nginx start
        EOT
      + vpc_id                     = (known after apply)
    }

Plan: 3 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + eip = (known after apply)

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

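The output tells us that the code will create 3 new resources, change 0, and destroy 0. A few of the resource attributes were given directly in the code or obtained through data queries, so their values are visible in the plan output. Most attributes are known only once the resource has actually been created, so they show as "(known after apply)".

Now let's run it: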

$ terraform apply
data.ucloud_images.default: Refreshing state...
data.ucloud_security_groups.default: Refreshing state...

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # ucloud_eip.web-eip will be created
  + resource "ucloud_eip" "web-eip" {
      + bandwidth     = (known after apply)
      + charge_mode   = "bandwidth"
      + charge_type   = "dynamic"
      + create_time   = (known after apply)
      + expire_time   = (known after apply)
      + id            = (known after apply)
      + internet_type = "bgp"
      + ip_set        = (known after apply)
      + name          = "web-eip"
      + public_ip     = (known after apply)
      + remark        = (known after apply)
      + resource      = (known after apply)
      + status        = (known after apply)
      + tag           = "Default"
    }

  # ucloud_eip_association.web-eip-association will be created
  + resource "ucloud_eip_association" "web-eip-association" {
      + eip_id        = (known after apply)
      + id            = (known after apply)
      + resource_id   = (known after apply)
      + resource_type = (known after apply)
    }

  # ucloud_instance.web will be created
  + resource "ucloud_instance" "web" {
      + auto_renew                 = (known after apply)
      + availability_zone          = "cn-bj2-04"
      + boot_disk_size             = (known after apply)
      + boot_disk_type             = "cloud_ssd"
      + charge_type                = (known after apply)
      + cpu                        = (known after apply)
      + cpu_platform               = (known after apply)
      + create_time                = (known after apply)
      + data_disk_size             = (known after apply)
      + data_disk_type             = (known after apply)
      + delete_disks_with_instance = true
      + disk_set                   = (known after apply)
      + expire_time                = (known after apply)
      + id                         = (known after apply)
      + image_id                   = "uimage-awndwi"
      + instance_type              = "n-basic-2"
      + ip_set                     = (known after apply)
      + isolation_group            = (known after apply)
      + memory                     = (known after apply)
      + name                       = "tf-example-instance"
      + private_ip                 = (known after apply)
      + remark                     = (known after apply)
      + root_password              = (sensitive value)
      + security_group             = "firewall-jofwjzmw"
      + status                     = (known after apply)
      + subnet_id                  = (known after apply)
      + tag                        = "tf-example"
      + user_data                  = <<~EOT
            #!/bin/bash
            yum install -y nginx
            service nginx start
        EOT
      + vpc_id                     = (known after apply)
    }

Plan: 3 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + eip = (known after apply)

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

When we run terraform apply, Terraform first recomputes the change plan, prints it just as the plan command did, and asks for manual confirmation. Type yes and press Enter:

ucloud_eip.web-eip: Creating...
ucloud_instance.web: Creating...
ucloud_eip.web-eip: Creation complete after 3s [id=eip-pyjwpcgd]
ucloud_instance.web: Still creating... [10s elapsed]
ucloud_instance.web: Still creating... [20s elapsed]
ucloud_instance.web: Still creating... [30s elapsed]
ucloud_instance.web: Creation complete after 39s [id=uhost-e4heibq3]
ucloud_eip_association.web-eip-association: Creating...
ucloud_eip_association.web-eip-association: Creation complete after 5s [id=eip-pyjwpcgd:uhost-e4heibq3]

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

Outputs:

eip = 106.75.32.183

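As you can see, Terraform created the resources we defined and printed the output we declared. Visiting the elastic IP address from the output in a browser shows an nginx page:

Figure 1.2/1 – nginx output: our virtual machine is working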

Terraform Cleanup

After finishing this walkthrough, don't forget to clean up the cloud resources. The destroy command makes this easy:

$ terraform destroy
data.ucloud_images.default: Refreshing state... [id=1609882940]
data.ucloud_security_groups.default: Refreshing state... [id=2820377529]
ucloud_eip.web-eip: Refreshing state... [id=eip-pyjwpcgd]
ucloud_instance.web: Refreshing state... [id=uhost-e4heibq3]
ucloud_eip_association.web-eip-association: Refreshing state... [id=eip-pyjwpcgd:uhost-e4heibq3]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  # ucloud_eip.web-eip will be destroyed
  - resource "ucloud_eip" "web-eip" {
      - bandwidth     = 1 -> null
      - charge_mode   = "bandwidth" -> null
      - charge_type   = "dynamic" -> null
      - create_time   = "2020-11-15T16:31:42+08:00" -> null
      - expire_time   = "2020-11-15T17:31:42+08:00" -> null
      - id            = "eip-pyjwpcgd" -> null
      - internet_type = "bgp" -> null
      - ip_set        = [
          - {
              - internet_type = "BGP"
              - ip            = "106.75.32.183"
            },
        ] -> null
      - name          = "web-eip" -> null
      - public_ip     = "106.75.32.183" -> null
      - resource      = {
          - "id"   = "uhost-e4heibq3"
          - "type" = "instance"
        } -> null
      - status        = "used" -> null
      - tag           = "Default" -> null
    }

  # ucloud_eip_association.web-eip-association will be destroyed
  - resource "ucloud_eip_association" "web-eip-association" {
      - eip_id        = "eip-pyjwpcgd" -> null
      - id            = "eip-pyjwpcgd:uhost-e4heibq3" -> null
      - resource_id   = "uhost-e4heibq3" -> null
      - resource_type = "instance" -> null
    }

  # ucloud_instance.web will be destroyed
  - resource "ucloud_instance" "web" {
      - auto_renew                 = true -> null
      - availability_zone          = "cn-bj2-04" -> null
      - boot_disk_size             = 20 -> null
      - boot_disk_type             = "cloud_ssd" -> null
      - charge_type                = "month" -> null
      - cpu                        = 2 -> null
      - cpu_platform               = "Intel/Broadwell" -> null
      - create_time                = "2020-11-15T16:31:46+08:00" -> null
      - delete_disks_with_instance = true -> null
      - disk_set                   = [
          - {
              - id      = "bsi-wnj4eh2x"
              - is_boot = true
              - size    = 20
              - type    = "cloud_ssd"
            },
        ] -> null
      - expire_time                = "2020-12-15T16:31:48+08:00" -> null
      - id                         = "uhost-e4heibq3" -> null
      - image_id                   = "uimage-awndwi" -> null
      - instance_type              = "n-basic-2" -> null
      - ip_set                     = [
          - {
              - internet_type = "Private"
              - ip            = "10.9.20.202"
            },
          - {
              - internet_type = "BGP"
              - ip            = "106.75.32.183"
            },
        ] -> null
      - memory                     = 4 -> null
      - name                       = "tf-example-instance" -> null
      - private_ip                 = "10.9.20.202" -> null
      - root_password              = (sensitive value)
      - security_group             = "firewall-jofwjzmw" -> null
      - status                     = "Running" -> null
      - subnet_id                  = "subnet-0azpshdq" -> null
      - tag                        = "tf-example" -> null
      - user_data                  = <<~EOT
            #!/bin/bash
            yum install -y nginx
            service nginx start
        EOT -> null
      - vpc_id                     = "uvnet-olpsy01g" -> null
    }

Plan: 0 to add, 0 to change, 3 to destroy.

Changes to Outputs:
  - eip = "106.75.32.183" -> null

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

Terraform lists the resources it is about to remove and asks us to confirm before it proceeds. Type yes and press Enter:

ucloud_eip_association.web-eip-association: Destroying... [id=eip-pyjwpcgd:uhost-e4heibq3]
ucloud_eip_association.web-eip-association: Destruction complete after 2s
ucloud_eip.web-eip: Destroying... [id=eip-pyjwpcgd]
ucloud_instance.web: Destroying... [id=uhost-e4heibq3]
ucloud_eip.web-eip: Destruction complete after 0s
ucloud_instance.web: Destruction complete after 6s

Destroy complete! Resources: 3 destroyed.

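Very quickly, all of the resources we just created are deleted.

A major difference between Terraform and earlier configuration management tools such as Ansible is that Terraform computes its change plan from the difference between the target state described by the code and the current state. Curious readers can run terraform apply and then immediately run terraform apply again to see what happens; that makes the difference clear.

In fact, if you apply this code and then apply it again right away, the resulting plan is to do nothing: the state of the cloud resources already matches the desired state described by the code, so Terraform makes no changes. That concludes our first look at Terraform.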
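
The same behaviour can be tried without paying for any cloud resources. The sketch below assumes the hashicorp/local provider, which this article does not otherwise use, and a made-up file name. It is meant for a separate empty directory.

```hcl
terraform {
  required_providers {
    local = {
      source = "hashicorp/local"
    }
  }
}

# Desired state: a file named hello.txt with exactly this content.
resource "local_file" "hello" {
  filename = "${path.module}/hello.txt"
  content  = "hello terraform\n"
}
```

The first terraform apply creates the file. A second apply right after it should report that there is nothing to change, because the current state already matches the code. Editing content and running terraform plan again should then show a change for this one resource only.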
