I get that not everyone has the same infrastructure needs, but what worries me is that the common scenario with:
isn't presented anywhere in a real example project.
I'm looking for just that, and in the meantime, I have researched and concluded that apart from those needs I also want:
My current structure, which does not use modules - only a root module:
infra/ --------------------- 'terraform init', 'terraform apply' run in here
  main.tf ------------------ Sets up the AWS provider, backend, backend bucket, DynamoDB table
  terraform.tfvars
  variables.tf ------------- Holds a few variables such as aws_region, project_name...
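Roughly, that main.tf amounts to something like this (simplified; bucket and table names are placeholders):

provider "aws" {
  region = var.aws_region
}

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "terraform-locks"
  }
}

# The bucket and lock table that the backend itself points at
resource "aws_s3_bucket" "state" {
  bucket = "my-terraform-state"
}

resource "aws_dynamodb_table" "locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}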
My desired folder structure (for a simple dev & staging simulation of a single bucket resource) is, I think, something like this:
infra/
  dev/
    s3/
      modules.tf ------ References the s3 module from a local/remote folder with dev inputs
  stage/
    s3/
      modules.tf ------ References the s3 module from a local/remote folder with stage inputs
But what about the files from my previous root module? I still want a remote backend in the same way as before, except now I want two state files (dev.tfstate and stage.tfstate) in the same backend bucket. What would the backend.tf files look like in each subdirectory, and where would they live: in the s3/ folder or the dev/ folder?
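To make it concrete, this is roughly what I imagine ending up with, though the placement is exactly what I'm unsure about (bucket/table names are placeholders):

# infra/dev/s3/backend.tf (?) -- or should it live in infra/dev/?
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"   # same bucket for both environments
    key            = "dev.tfstate"          # stage would use key = "stage.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "terraform-locks"
  }
}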
It's kind of confusing since I'm transitioning from a root-module 'terraform init' approach to running 'terraform init' in specific subdirectories, and it's not clear to me whether I should still have a root module, or perhaps a separate folder (for example global/) that I treat as a prerequisite: something I init once at the beginning of the project and basically leave alone from then on, since it created the buckets that dev/ and staging/ can reference.
One more question: if I have s3/, ec2/, and ecr/ subdirectories inside each environment, where do I execute the 'terraform plan' command? Does it traverse all subdirectories?
Once I have the answers and a clear picture of the above, it would be great to DRY it up, but for now I value a practical solution shown through an example over a purely theoretical DRY explanation. Thanks!
I have worked with Terraform for 5 years. I made a lot of mistakes in my career with modules and environments. The text below is just a share of my knowledge and experience; it may not be the best.
A real example project is hard to find because Terraform is rarely used for open-source projects. It's often unsafe to share Terraform files because you are exposing all the vulnerabilities of your infrastructure.
You should create modules that have a single purpose, but your modules should be generic. You can create a bastion host module, but a better idea is to create a module for a generic server. This module may have some logic dedicated to your business problem, like a CloudWatch Log group, some generic security group rules, etc.
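For example, a generic server module's skeleton might look something like this (a minimal sketch; the variable names and log group path are made up):

# modules/computing/server/main.tf (illustrative sketch)
variable "name" {
  type = string
}

variable "ami_id" {
  type = string
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}

variable "ingress_rules" {
  # generic security group rules the caller passes in
  type    = list(object({ port = number, cidr = string }))
  default = []
}

resource "aws_security_group" "this" {
  name = "${var.name}-sg"

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = [ingress.value.cidr]
    }
  }
}

resource "aws_instance" "this" {
  ami                    = var.ami_id
  instance_type          = var.instance_type
  vpc_security_group_ids = [aws_security_group.this.id]

  tags = {
    Name = var.name
  }
}

# business-specific extra: every server gets a CloudWatch log group
resource "aws_cloudwatch_log_group" "this" {
  name              = "/servers/${var.name}"
  retention_in_days = 30
}

A bastion host then becomes just one instantiation of this module with strict ingress rules.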
Sometimes it is worth creating a more specific module. Let's say you have an application that requires a Lambda, an ECS service, CloudWatch alarms, RDS, EBS, etc. All of those elements are strongly connected.
You have 2 options:
Everything depends on the details and circumstances.
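For illustration, such a specific module can simply wire generic modules together (a sketch; the module paths and inputs are hypothetical):

# modules/apps/ecommerce/main.tf (hypothetical composition)
variable "environment" {
  type = string
}

variable "ami_id" {
  type = string
}

module "server" {
  source = "../../computing/server"

  name   = "ecommerce-${var.environment}"
  ami_id = var.ami_id
}

module "db" {
  source = "../../data/rds"

  identifier = "ecommerce-${var.environment}"
  engine     = "postgres"
}

# the Lambda, ECS service, CloudWatch alarms, etc. would follow the same pattern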
But I will show you how I have used Terraform in production at different companies.
This is a project where the environments are directories. Each application, the networking, and each data resource has its own separate state. I keep mutable data in a separate directory (RDS, EBS, EFS, S3, etc.) so the apps, networking, etc. can be destroyed and recreated at will, because they are stateless. No one can destroy the stateful items, where data could be lost. This is what I have been doing for the last few years.
project/
├─ packer/
├─ ansible/
├─ terraform/
│ ├─ environments/
│ │ ├─ production/
│ │ │ ├─ apps/
│ │ │ │ ├─ blog/
│ │ │ │ ├─ ecommerce/
│ │ │ ├─ data/
│ │ │ │ ├─ efs-ecommerce/
│ │ │ │ ├─ rds-ecommerce/
│ │ │ │ ├─ s3-blog/
│ │ │ ├─ general/
│ │ │ │ ├─ main.tf
│ │ │ ├─ network/
│ │ │ │ ├─ main.tf
│ │ │ │ ├─ terraform.tfvars
│ │ │ │ ├─ variables.tf
│ │ ├─ staging/
│ │ │ ├─ apps/
│ │ │ │ ├─ ecommerce/
│ │ │ │ ├─ blog/
│ │ │ ├─ data/
│ │ │ │ ├─ efs-ecommerce/
│ │ │ │ ├─ rds-ecommerce/
│ │ │ │ ├─ s3-blog/
│ │ │ ├─ network/
│ │ ├─ test/
│ │ │ ├─ apps/
│ │ │ │ ├─ blog/
│ │ │ ├─ data/
│ │ │ │ ├─ s3-blog/
│ │ │ ├─ network/
│ ├─ modules/
│ │ ├─ apps/
│ │ │ ├─ blog/
│ │ │ ├─ ecommerce/
│ │ ├─ common/
│ │ │ ├─ acm/
│ │ │ ├─ user/
│ │ ├─ computing/
│ │ │ ├─ server/
│ │ ├─ data/
│ │ │ ├─ efs/
│ │ │ ├─ rds/
│ │ │ ├─ s3/
│ │ ├─ networking/
│ │ │ ├─ alb/
│ │ │ ├─ front-proxy/
│ │ │ ├─ vpc/
│ │ │ ├─ vpc-pairing/
├─ tools/
To apply a single application, you need to do:
cd ./project/terraform/environments/<ENVIRONMENT>/apps/blog; terraform apply;
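Because apps and data have separate states, an app that needs a data resource typically reads it through a remote state data source; a minimal sketch, assuming an S3 backend and made-up bucket/key names:

# environments/production/apps/ecommerce/data.tf (illustrative)
data "terraform_remote_state" "rds" {
  backend = "s3"

  config = {
    bucket = "my-terraform-state"
    key    = "production/data/rds-ecommerce.tfstate"
    region = "eu-west-1"
  }
}

# then use it, e.g.:
# db_address = data.terraform_remote_state.rds.outputs.db_address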
You can see there are a lot of directories across all the environments. As I see it, there are pros and cons to this approach.
Cons:
Pros:
Recently I started working with a new company. They keep the infrastructure definition in a few huge repositories (or folders), and when you run terraform apply, you create all applications at the same time.
project/
├─ modules/
│ ├─ acm/
│ ├─ app-blog/
│ ├─ app-ecommerce/
│ ├─ server/
│ ├─ vpc/
├─ vars/
│ ├─ user/
│ ├─ prod.tfvars
│ ├─ staging.tfvars
│ ├─ test.tfvars
├─ applications.tf
├─ providers.tf
├─ proxy.tf
├─ s3.tf
├─ users.tf
├─ variables.tf
├─ vpc.tf
Here you prepare different input values for each environment.
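For illustration, the same set of variables simply gets different values per file (the values here are invented):

# vars/prod.tfvars
environment    = "prod"
instance_type  = "m5.large"
instance_count = 3

# vars/staging.tfvars
environment    = "staging"
instance_type  = "t3.small"
instance_count = 1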
So, for example, if you want to apply changes to prod:
terraform apply -var-file=vars/prod.tfvars -lock-timeout=300s
Apply staging:
terraform apply -var-file=vars/staging.tfvars -lock-timeout=300s
Cons: every change means a terraform plan/apply over the entire infrastructure. If something goes wrong there, then you have a problem.

Pros:
As you can see, this is more of an architectural problem, and the only way to learn it is to get more experience or read posts from other people...
I am still trying to figure out the optimal way, and I would probably experiment with the first approach.
Do not take my advice as a sure thing. This post is just my experience, and maybe not the best.
I will post some references that helped me a lot:
I realized, as @MarkB suggested, that Terraform workspaces are actually a solution for multi-environment projects.
So my project structure looks something like this:
infra/
  dev/
    dev.tfvars
  stage/
    stage.tfvars
  provider.tf
  main.tf
  variables.tf
main.tf references the modules, provider.tf sets up the provider, backend.tf would set up the remote backend (yet to be added), etc.
The 'terraform plan' in this setup becomes 'terraform plan -var-file dev/dev.tfvars', where I point at the file holding that environment's specific configuration.
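For anyone following along, the typical workspace flow around this looks something like:

terraform workspace new dev        # once per environment
terraform workspace select dev
terraform plan -var-file=dev/dev.tfvars

terraform workspace select stage
terraform plan -var-file=stage/stage.tfvars

If I understand correctly, with an S3 backend each workspace's state is stored under its own env:/<workspace>/ prefix in the same bucket, which gives the dev/stage state separation I originally wanted.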
I can share what we ended up doing for our Indeni Cloudrail service. Hope it'll help.
We created a folder with all the modules. Then there's a module called "all" which basically calls the other modules (s3, acm, etc.) with the right parameters. The "all" module has variables.
Then, there are environments. Each of them calls the "all" module with specific values for these variables.
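So each environment directory is mostly a thin wrapper; roughly like this (the variable names here are invented for illustration, not our actual ones):

# environments/prod/main.tf (illustrative)
module "all" {
  source = "../../modules/all"

  environment   = "prod"
  vpc_cidr      = "10.0.0.0/16"
  ecs_task_size = "large"
}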
This is the output of a "find" command on the root of the Terraform code (sorry it isn't prettier). I removed many of the files as they weren't needed to get the point across:
./common.tfvars
./terragrunt.hcl
./environments
./environments/prod
./environments/prod/main.tf
./environments/prod/terragrunt.hcl
./environments/prod/lambda.layer.zip
./environments/prod/terraform.tfvars
./environments/prod/lambda.zip
./environments/prod/common.tf
./environments/dev-john
./environments/dev-john/main.tf
./environments/dev-john/terragrunt.hcl
./environments/dev-john/terraform.tfvars
./environments/dev-john/common.tf
./environments/mgmt-dr
./environments/mgmt-dr/data.tf
./environments/mgmt-dr/main.tf
./environments/mgmt-dr/terragrunt.hcl
./environments/mgmt-dr/network.tf
./environments/mgmt-dr/terraform.tfvars
./environments/mgmt-dr/jenkins.tf
./environments/mgmt-dr/keypair.tf
./environments/mgmt-dr/common.tf
./environments/mgmt-dr/openvpn-as.tf
./environments/mgmt-dr/tgw.tf
./environments/mgmt-dr/vars.tf
./environments/staging
./environments/staging/main.tf
./environments/staging/terragrunt.hcl
./environments/staging/terraform.tfvars
./environments/staging/common.tf
./environments/mgmt
./environments/mgmt/data.tf
./environments/mgmt/main.tf
./environments/mgmt/terragrunt.hcl
./environments/mgmt/network.tf
./environments/mgmt/terraform.tfvars
./environments/mgmt/route53.tf
./environments/mgmt/acm.tf
./environments/mgmt/jenkins.tf
./environments/mgmt/keypair.tf
./environments/mgmt/common.tf
./environments/mgmt/openvpn-as.tf
./environments/mgmt/tgw.tf
./environments/mgmt/alb.tf
./environments/mgmt/vars.tf
./environments/develop
./environments/develop/main.tf
./environments/develop/terragrunt.hcl
./environments/develop/terraform.tfvars
./environments/develop/common.tf
./environments/preproduction
./environments/preproduction/main.tf
./environments/preproduction/terragrunt.hcl
./environments/preproduction/terraform.tfvars
./environments/preproduction/common.tf
./environments/prod-dr
./environments/prod-dr/main.tf
./environments/prod-dr/terragrunt.hcl
./environments/prod-dr/terraform.tfvars
./environments/prod-dr/common.tf
./environments/preproduction-dr
./environments/preproduction-dr/main.tf
./environments/preproduction-dr/terragrunt.hcl
./environments/preproduction-dr/terraform.tfvars
./environments/preproduction-dr/common.tf
./README.rst
./modules
./modules/secrets-manager
./modules/secrets-manager/main.tf
./modules/s3
./modules/s3/main.tf
./modules/cognito
./modules/cognito/main.tf
./modules/cloudfront
./modules/cloudfront/main.tf
./modules/cloudfront/files
./modules/cloudfront/files/lambda.zip
./modules/cloudfront/main.py
./modules/all
./modules/all/ecs.tf
./modules/all/data.tf
./modules/all/db-migration.tf
./modules/all/s3.tf
./modules/all/kms.tf
./modules/all/rds-iam-auth.tf
./modules/all/network.tf
./modules/all/acm.tf
./modules/all/cloudfront.tf
./modules/all/templates
./modules/all/lambda.tf
./modules/all/tgw.tf
./modules/all/guardduty.tf
./modules/all/cognito.tf
./modules/all/step-functions.tf
./modules/all/secrets-manager.tf
./modules/all/api-gateway.tf
./modules/all/rds.tf
./modules/all/cloudtrail.tf
./modules/all/vars.tf
./modules/ecs
./modules/ecs/cluster
./modules/ecs/cluster/main.tf
./modules/ecs/task
./modules/ecs/task/main.tf
./modules/step-functions
./modules/step-functions/main.tf
./modules/api-gw
./modules/api-gw/resource
./modules/api-gw/resource/main.tf
./modules/api-gw/method
./modules/api-gw/method/main.tf
./modules/api-gw/rest-api
./modules/api-gw/rest-api/main.tf
./modules/cloudtrail
./modules/cloudtrail/main.tf
./modules/cloudtrail/README.rst
./modules/transit-gateway
./modules/transit-gateway/attachment
./modules/transit-gateway/attachment/main.tf
./modules/transit-gateway/README.rst
./modules/transit-gateway/gateway
./modules/transit-gateway/gateway/main.tf
./modules/openvpn-as
./modules/openvpn-as/main.tf
./modules/load-balancer
./modules/load-balancer/outputs.tf
./modules/load-balancer/main.tf
./modules/load-balancer/vars.tf
./modules/lambda
./modules/lambda/main.tf
./modules/vpc
./modules/vpc/3tier
./modules/vpc/3tier/main.tf
./modules/vpc/3tier/README.rst
./modules/vpc/peering
./modules/vpc/peering/main.tf
./modules/vpc/peering/README.rst
./modules/vpc/public
./modules/vpc/public/main.tf
./modules/vpc/public/README.rst
./modules/vpc/endpoint
./modules/vpc/endpoint/main.tf
./modules/vpc/README.rst
./modules/vpc/isolated
./modules/vpc/isolated/main.tf
./modules/vpc/isolated/README.rst
./modules/vpc/subnets
./modules/vpc/subnets/main.tf
./modules/vpc/subnets/README.rst
./modules/guardduty
./modules/guardduty/README.md
./modules/guardduty/region
./modules/guardduty/region/main.tf
./modules/guardduty/region/guardduty.tf
./modules/guardduty/region/sns-topic.tf
./modules/guardduty/region/vars.tf
./modules/guardduty/.gitignore
./modules/guardduty/base
./modules/guardduty/base/data.tf
./modules/guardduty/base/guardduty-sqs.tf
./modules/guardduty/base/guardduty-lambda.tf
./modules/guardduty/base/variables.tf
./modules/guardduty/base/guardduty-kms.tf
./modules/guardduty/base/bucket.tf
./modules/guardduty/base/guardduty-sns.tf
./modules/guardduty/base/src
./modules/guardduty/base/src/guardduty_findings_relay.py
./modules/guardduty/base/src/guardduty_findings_relay.zip
./modules/jenkins
./modules/jenkins/main.tf
./modules/rds
./modules/rds/main.tf
./modules/acm
./modules/acm/main.tf
Old article, but I thought I'd add my view as it's such a common question and there is no right or wrong approach (except to say that one massive deployment for ALL resources, one that takes 20 minutes to figure out a plan, is asking for trouble, as the blast radius would be huge). There's no hard rule for the size of a deployment, but I try to go with a rule of thumb of around 20-30 resources (max) and, of course, common sense. If it takes 10 minutes for TF to figure out the plan for adding a tag, then your deployment is probably too big.
After using Terraform for 4 or 5 years, I've tried all sorts: PowerShell wrappers, workspaces, Terragrunt, pipelines & Terraform Cloud. When using open source Terraform, I tend to go with an approach similar to @deltakroniker's, using a different backend.tf file per environment as well as different .tfvars. Run this from a pipeline to add approval gates etc., and it works reasonably well. Not perfect, but then what approach is?
It's similar to a workspace approach, except it allows you to specify a different storage account for each environment (when using the Azure blob backend).
environments/
  dev/
    backend.tf
    environment.tfvars
  stage/
    backend.tf
    environment.tfvars
tf-deploy/
  provider.tf
  main.tf
  variables.tf
A plan or apply against an environment is then run by pointing at that environment's files, e.g.:

terraform init -backend-config=../environments/dev/backend.tf
terraform plan -var-file=../environments/dev/environment.tfvars

(The -backend-config flag is consumed by terraform init; once initialised, plan and apply only need the -var-file.)
Authentication to the backend is via environment variables (not in the backend.tf file). If done via a Pipeline then all sensitive vars can be gathered from a vault of some kind as part of the pipeline initialisation.
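For reference, the per-environment backend.tf here is just the partial key/value backend configuration (placeholder values; real auth comes from the environment variables mentioned above, e.g. ARM_ACCESS_KEY):

# environments/dev/backend.tf (partial backend config, placeholder values)
resource_group_name  = "rg-terraform-state"
storage_account_name = "stterraformdev"
container_name       = "tfstate"
key                  = "dev.tfstate"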
It's not perfect; you still have the question of how you try out new module or provider versions without promoting them to higher environments (with this approach, what you get in dev, you ultimately get in prod). In this case, approval gates and management of these types of changes become key. Alternatively, incorporating some kind of branched deployment for these types of changes could be an option.