I’ve written previously about using ChatGPT to provide Artificial Intelligence (‘AI’) support for coding and tutoring in unfamiliar areas. What about using it on home turf, i.e. in an area with which I am familiar? In this case: a Terraform module for generic use.

(W)here’s the Code

It’s not necessary for following the article below, but the code is published on GitHub and in the community module registry.

Background

Recently I was tasked, in a consultancy capacity, with providing a Terraform module for a (generic) AWS IAM Role for private/team use. It was up to me whether I wrote one from scratch or used public code. There are close to 200 modules for this in the HashiCorp public community module registry, so I reviewed some of the top candidates there. This is a long-standing requirement, over which time recommended practice has changed, new features have been introduced and some features, whilst still supported, have been deprecated. Notably it’s also a space where there aren’t necessarily clear boundaries to the problem, and different publishers have provided solutions with different feature sets, e.g. some modules feature creating aws_iam_openid_connect_provider or aws_iam_instance_profile. In my view, whilst closely associated, these aren’t part of an IAM role. I didn’t have time to review every offering, and so I created a new one for our team. Due to pressures of time and restrictions on tooling there were some corner-case features that I did not get to develop fully before we moved on, although they were not needed for the use case to that point.

Recently I had time to consider what this module might have looked like if I had had the time and access to my preferred tooling.

Speccing a new module

I had a very clear idea of which features I wanted to include and which I didn’t, but my heart sank a little at the thought of having to redo all of the boilerplate resources, variable descriptions, ternaries, etc. Having played about a little with Wing recently, it all just seemed so unnecessary. Since I had a clear spec, I decided to see what sort of result I could get with ChatGPT. Starting in this way would have the additional advantage of removing any question of code re-use from a recent job.

Initial prompt

please write a terraform module to create an aws_iam_role with options to include:

zero or one inline policies

zero or more managed policy attachments

zero or more customer_managed_policy attachments

zero or one permission boundary

ChatGPT produced a module and had already elected to use aws_iam_role_policy rather than the inline_policy block, in accordance with advised practice.
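For flavour, the core of what it produced looked roughly like this (a paraphrased sketch, not the exact output, and the variable names here are my own):

```hcl
# Sketch of the generated module core (paraphrased; input names are illustrative)
resource "aws_iam_role" "this" {
  name               = var.name
  assume_role_policy = var.assume_role_policy
}

# Standalone resource rather than the inline_policy block on aws_iam_role
resource "aws_iam_role_policy" "this" {
  count  = var.role_policy == null ? 0 : 1
  role   = aws_iam_role.this.id
  policy = var.role_policy
}

# Attach any pre-existing managed policies by ARN
resource "aws_iam_role_policy_attachment" "managed" {
  for_each   = toset(var.managed_policy_arns)
  role       = aws_iam_role.this.name
  policy_arn = each.value
}
```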

Incrementing features

I wanted a feature to allow creating and attaching an arbitrary number of bespoke (customer-managed) policies within the module, to avoid dependency issues within a single plan. Whilst I had made a half-hearted effort using set(map(string)) and for_each, it was easier to have ChatGPT fix the issue:

I am trying to add a feature to create bespoke policies within the module. I currently have this but am getting errors

ChatGPT quickly suggested

A better approach is to use a map(object(...)) instead of set(map(string)). This way, you can loop through the policies using their names (or any unique identifier) as the map’s key.

Now, strictly speaking, this means creating and using an ‘additional’ key for each entry in the map simply to oblige Terraform, which is why I had attempted to avoid it. BUT, given the next iterative improvement, this is a tradeoff worth making. I wanted to be able to specify EITHER name OR name_prefix OR nothing for each bespoke policy. To avoid ‘duplication’ by referencing the map key directly as the name for each bespoke policy, I would have had to mandate an entry for that purpose. Given that only the policy field is mandatory for execution, ‘duplication’ seemed the more sensible option. ChatGPT helpfully suggested some variable validation logic, but needed some prodding to get to a correct solution covering all three options. The end result looks like:

resource "aws_iam_policy" "bespoke" {
  for_each    = var.bespoke_policy
  description = each.value.description

  # Use "name" if provided. If not, use "name_prefix" if provided. Else, let the provider assign one.
  name        = each.value.name
  name_prefix = each.value.name == null ? each.value.name_prefix : null

  path   = each.value.path
  policy = each.value.policy
}

variable "bespoke_policy" {
  type = map(object({
    description = optional(string)
    name        = optional(string)
    name_prefix = optional(string)
    path        = optional(string)
    policy      = string
  }))
  default = {}
  validation {
    condition = alltrue([
      for policy in values(var.bespoke_policy) : policy.name == null || policy.name_prefix == null
    ])
    error_message = "For each policy, provide at most one of name or name_prefix."
  }
  description = <<EOF
    Bespoke policies to create within module. Use policy names or name_prefix as keys:
    ```
      policy_name = {
        description = optional(string)
        name        = optional(string) # The name of the policy. If omitted, Terraform will assign a random, unique name
        name_prefix = optional(string) # Creates a unique name beginning with the specified prefix. Conflicts with 'name'
        path        = optional(string)
        policy      = string # **Required** JSON formatted string
      }
    ```
EOF
}

Not a typical Terraform variable (the HEREDOC in the description is my own addition) and not a style I would normally pursue or endorse. Here though it has a clear, established, well understood and useful purpose. Given that there are under 150 lines of code total for the module itself, including comments, variable descriptions, empty lines, etc, it is again a tradeoff that I think is worth making.
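To illustrate the three naming options from the calling side, a consumer might pass something like this (the map keys and the referenced policy documents are invented for illustration):

```hcl
# Hypothetical caller input; the policy documents are assumed to exist elsewhere
bespoke_policy = {
  reader = {
    name   = "s3-reader" # explicit name
    policy = data.aws_iam_policy_document.reader.json
  }
  writer = {
    name_prefix = "s3-writer-" # unique name generated from the prefix
    policy      = data.aws_iam_policy_document.writer.json
  }
  audit = {
    # neither name nor name_prefix: a random, unique name is assigned
    policy = data.aws_iam_policy_document.audit.json
  }
}
```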

Setting an Example

I think that good free-standing Terraform modules include examples to demonstrate usage, and this seemed a great use case: I wanted to test all the features in the code and also provide a concise set of examples for consumers. Let’s see how we can do this with ChatGPT. I don’t really care about the exact syntax or permissions of toy policies deployed temporarily to my private dev account, and I don’t want to spend too long designing everything myself.

Within the same conversation context with ChatGPT:

ok, with your help I now have the following code in the module. I would like you to create some examples using the module so that we can demonstrate/test all features with a small number of examples

I then pasted the code for the module (main.tf and vars.tf). ChatGPT gave me 5 examples, each with a title and code:

Example 1: Basic IAM Role

This will create a basic IAM role without any attached policies.

Example 2: IAM Role with Inline Policy

This will create an IAM role with an inline policy.

Example 3: IAM Role with Managed Policy ARNs

This will create an IAM role and attach existing managed policies to it.

Example 4: IAM Role with Bespoke Policies

This will create an IAM role and also create bespoke policies within the module and attach them to the role.

Example 5: IAM Role with All Features

Combines inline policies, existing policy ARNs, and bespoke policies.
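Example 1, for instance, reduces to little more than a bare module call, something along these lines (the source path and role name are placeholders, and the input names are assumed from the module):

```hcl
module "basic_role" {
  source = "../../" # placeholder: path to the module
  name   = "example-basic-role"

  # Minimal trust policy allowing EC2 to assume the role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}
```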

In a classic ChatGPT error I had to ask:

which of these examples tests permissions boundary?

ChatGPT then wanted to create an additional usage example to incorporate this, requiring a pre-existing policy for the permissions boundary. I suggested instead:

lets add it to the ‘all features’ example instead

This worked, but again the policy was presumed already available, and so I asked:

can we instead create a policy for this as part of the example- or will that not work

ChatGPT came back with

Yes, you can create a policy as part of the example and then use its ARN as a permissions boundary for the IAM role. Here’s how you can do that:

IAM Role with All Features (Updated Including a Permission Boundary Policy Creation)

[with sample code]

In this example, we first define a permissions boundary policy named example_boundary_policy. The ARN of this policy (aws_iam_policy.example_boundary_policy.arn) is then used as the permissions boundary for the IAM role created within the module.
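The shape of that suggestion, paraphrased rather than quoted verbatim (the boundary policy document here is invented):

```hcl
# Create the policy to be used as a permissions boundary
resource "aws_iam_policy" "example_boundary_policy" {
  name = "example-boundary-policy"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:Get*", "s3:List*"]
      Resource = "*"
    }]
  })
}

module "role_all_features" {
  source = "../../" # placeholder: path to the module
  # ...other "all features" arguments as before...

  permissions_boundary = aws_iam_policy.example_boundary_policy.arn
}
```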

Room for Improvement

Impressed but suspicious, I asked:

this is great. are there any features that the examples don’t test?

To get:

Your module has a number of features, and while the given examples touch on most of them, here’s a summary and whether they’ve been covered by the examples:

  1. IAM Role Creation - Covered in all examples.

    • name and name_prefix: Demonstrated.
    • assume_role_policy: Demonstrated.
    • permissions_boundary: Demonstrated in the updated “all features” example.
    • role_description: Not demonstrated.
    • force_detach_policies: Not demonstrated.
    • iam_role_path: Not demonstrated.
    • max_session_duration: Not demonstrated.
  2. Inline Policy for IAM Role - Demonstrated in the “all features” example.

  3. Attach Existing Managed Policies to IAM Role - Demonstrated in the “all features” example.

  4. Create Bespoke Policies & Attach to IAM Role - Demonstrated in the “all features” example.

    • Here, you demonstrate the ability to provide either a name, a name prefix, or neither (and let Terraform assign a name).

To fully test all the features of your module, you might consider adding examples or expanding existing ones to demonstrate the following:

  • Setting role_description for the IAM role.
  • Using the force_detach_policies flag.
  • Setting a custom iam_role_path.
  • Configuring the max_session_duration.

You can integrate these features into the “all features” example or create separate examples to keep them isolated for clarity. Remember to keep your examples as close to real-world use cases as possible; this makes it easier for users to understand the purpose and application of each feature.

I chose let's add these to 'all features' and took the updated code offered. The code is on GitHub, but here is the ChatGPT explanation:

In this example:

  • A role_description is set to describe the IAM role.
  • A custom IAM role path (iam_role_path) is defined as /example_path/.
  • The max_session_duration is set to 3600 seconds (1 hour). This is the default, but it’s explicitly set here for demonstration.
  • force_detach_policies is set to true, meaning Terraform will remove any attached policies before deleting the role.

With these additions, the “all features” example now touches upon all the available features of your module.
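Sketching those additions onto the module call (argument names as reported above; the description text is illustrative):

```hcl
module "role_all_features" {
  source = "../../" # placeholder: path to the module
  # ...existing "all features" arguments...

  role_description      = "Example role demonstrating all module features"
  iam_role_path         = "/example_path/"
  max_session_duration  = 3600 # 1 hour; the default, set explicitly for demonstration
  force_detach_policies = true # detach policies before destroying the role
}
```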

I added the code for the examples myself and renamed them.

Read me loud and clear

I advised ChatGPT of my renamings (without giving the originals) and asked it to write a README as a base. This was ‘ok’, but I edited it considerably from the initial output. For the examples I used ChatGPT’s short descriptions and terraform-docs.

Reiteration

At this point I realised that the naming options provided for bespoke policies were missing for the role policy, so I asked ChatGPT is it possible to give the same options for name/name_prefix/auto_create for the role_policy? Although the response was affirmative, I had to go through the same silly dance of ’either/or or neither’ as before, and again tell ChatGPT that we did not need to tell Terraform to deliberately create a unique name on its own since AWS does this natively. I also wanted it renamed from the misleading inline_policy to role_policy, as before. The initially-supplied variable validation code was faulty and took two more cycles to resolve.
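The eventual validation mirrors the bespoke-policy one: at most one of the two naming fields may be set. A sketch of where this settled (the variable name is from the module; the exact wording is my reconstruction):

```hcl
variable "role_policy" {
  type = object({
    name        = optional(string)
    name_prefix = optional(string)
    policy      = string
  })
  default     = null
  description = "Role policy to create and attach to the role. Provide at most one of name or name_prefix."
  validation {
    # try() also covers the case where the whole variable is null
    condition = try(
      var.role_policy.name == null || var.role_policy.name_prefix == null,
      true
    )
    error_message = "For the role policy, provide at most one of name or name_prefix."
  }
}
```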

Outputs

I had already hand-coded a basic output for the IAM role produced by the module, but now I asked i would like to add an output for the arns of the bespoke policies if created. This gave me the following, which, whilst I could have written it myself, would certainly have taken longer:

Sure, you can create an output for the ARNs of the bespoke policies using the aws_iam_policy.bespoke resource. Here’s how you can do it:

output "bespoke_policy_arns" {
  description = "ARNs of the bespoke IAM policies created."
  value       = [for p in aws_iam_policy.bespoke : p.arn]
}

When you apply your Terraform configuration, you’ll be able to see this output and retrieve the ARNs of the bespoke policies. If no bespoke policies are created, this will simply return an empty list.
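Since the bespoke policies are created with for_each, one could equally expose a map keyed by the same keys as the bespoke_policy input, which is easier to index from calling code (a sketch; not part of the quoted suggestion):

```hcl
output "bespoke_policy_arns_by_key" {
  description = "Map of bespoke IAM policy ARNs, keyed by the bespoke_policy map keys."
  value       = { for k, p in aws_iam_policy.bespoke : k => p.arn }
}
```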

From memory, for the equivalent that I had previously created manually, I had only provided for a single or no bespoke policy, and had an (uncollapsed) list output. This was a key thing I would have liked to fix, given the chance.

The Takeaway

Speed

In the end I was able to spec, write and test this module and five examples in one day. From recollection, it had taken me three days to hand-write the equivalent module and three examples previously (with extremely restricted tooling and conditions). In part this is due to the novelty of approach (see below), and in part to not having to write boilerplate manually.

Novelty of approach

ChatGPT could suggest approaches that I might not have immediately considered, together with an implementation of each iterative improvement. As I have seen before, this is a great accelerator for iteration.

Caution

As so many have said, ChatGPT and other LLMs can be confidently wrong. If I had not been fastidious in checking what was included, confirming that it was advised best practice, and ensuring I wasn’t using deprecated features, I can’t be confident that something unsuitable would not have made it through (see the notes at the top of the HashiCorp documentation). Further, if I had not manually checked the deployment of each example, I would not be confident that they worked correctly. On the other hand I could, to some extent, ask ChatGPT to reflect on its own output as described above to ensure that test cases were covered, and then to generate those tests.

Conclusion

As ever, I remain a big fan. This is an example of force-multiplying in an area where I certainly could have done it all myself (with a lot more work and time) and indeed previously have done. We are not yet at the point where the output may be fully trusted without verification, but it’s a lot quicker to verify and fix than to start over!