When working with HashiCorp Terraform, it’s important to understand how to interact with external systems and data. Terraform provides a mechanism to query and use information from external sources through data sources, declared with a data block. In this article, we’ll explore what data sources are, provide a use case to illustrate their importance, and clarify the difference between data sources and variables in Terraform.
What Are Data Sources in Terraform?
In Terraform, data sources serve as a means to retrieve information from external sources or systems and incorporate that data into your infrastructure configuration. Data sources are read-only and allow you to access information like Azure virtual networks, other cloud resources, or even data from an API. They are particularly useful for referencing existing resources or configurations that are managed outside of your Terraform project.
Here’s a simple analogy to understand data sources better: think of data sources as a way to query or look up information, while resources are used to create or manage infrastructure.
Data sources provide several advantages:
- Reusability: Data sources allow you to reuse data from other configurations, enhancing modularity and reducing duplication.
- External Data Integration: They enable you to incorporate external data into your Terraform configuration, such as cloud service information or API responses.
- Read-Only Access: Data sources are read-only, ensuring that you don’t inadvertently modify external systems or resources.
Data sources are defined in Terraform code using a data block instead of a resource block. This tells Terraform to read the properties of an existing resource rather than create or manage it. The use case below shows how to define and use a data block in your Terraform code.
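Before the full use case, here is a quick side-by-side sketch of that distinction. Both blocks refer to the same kind of object, an Azure resource group, but in the two different ways. The resource group names and the region are hypothetical placeholders chosen for illustration, not values used elsewhere in this article.

```hcl
# Managed by this configuration: Terraform creates, updates, and destroys it.
resource "azurerm_resource_group" "managed" {
  name     = "example-managed-rg" # hypothetical name
  location = "eastus"             # hypothetical region
}

# Looked up only: this resource group must already exist. Terraform reads
# its properties and never modifies it.
data "azurerm_resource_group" "existing" {
  name = "example-existing-rg" # hypothetical name
}
```

Note that the data block needs only enough arguments to identify the existing object, while the resource block has to describe everything Terraform should create.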
Example Use Case: Existing Azure Virtual Network
Let’s explore a practical use case for data sources in Terraform. Suppose you are managing resources in Azure, and you want to create a subnet, a network interface (NIC), and a virtual machine (VM) connected to an existing virtual network.
Here’s where data sources come into play.
Creating an Azure Virtual Network Data Source
data "azurerm_virtual_network" "example" {
  name                = "b59-vnet"
  resource_group_name = "b59-resource-group"
}
In this example, we define a data source of type azurerm_virtual_network. We specify only the name and resource_group_name of the existing virtual network we want to reference. We don’t need to define the full configuration, because this resource is not managed in this Terraform project. The project only references the existing resource.
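Once the data source is declared, its exported attributes can be read anywhere in the configuration using the data.<TYPE>.<NAME>.<ATTRIBUTE> syntax. As a minimal sketch, the output blocks below surface a few attributes of the looked-up virtual network. The output names are arbitrary choices for illustration, and the exact set of exported attributes should be checked against the azurerm provider documentation for the version you use.

```hcl
# Assumes the data "azurerm_virtual_network" "example" block defined above.

# Full Azure resource ID of the existing virtual network.
output "existing_vnet_id" {
  value = data.azurerm_virtual_network.example.id
}

# Azure region the virtual network lives in.
output "existing_vnet_location" {
  value = data.azurerm_virtual_network.example.location
}

# List of address ranges assigned to the virtual network.
output "existing_vnet_address_space" {
  value = data.azurerm_virtual_network.example.address_space
}
```

After terraform apply, these values are printed in the command output, which is a convenient way to confirm that the lookup found the network you expected.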
Using the Azure Virtual Network Data Source
Now that we have defined the data source, we can utilize it when creating an Azure Subnet, Azure Network Interface (NIC) and Azure Virtual Machine (VM) within the Azure Virtual Network referenced by the data source:
# Local values are declared in a "locals" block and referenced as local.<name>.
locals {
  vnet_location = data.azurerm_virtual_network.example.location
  vnet_rg_name  = data.azurerm_virtual_network.example.resource_group_name
}
resource "azurerm_subnet" "example" {
  name                 = "internal"
  resource_group_name  = local.vnet_rg_name
  virtual_network_name = data.azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.1.0/24"]
}
resource "azurerm_network_interface" "example" {
  name                = "b59-vm-nic"
  location            = local.vnet_location
  resource_group_name = local.vnet_rg_name
  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.example.id
    private_ip_address_allocation = "Dynamic"
  }
}
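resource "azurerm_virtual_machine" "example" {
  name                  = "b59-vm"
  location              = local.vnet_location
  resource_group_name   = local.vnet_rg_name
  network_interface_ids = [azurerm_network_interface.example.id]
  vm_size               = "Standard_DS2_v2"
  # Abbreviated for the article: a working VM definition also needs a
  # storage_os_disk block and, typically, image and OS profile blocks.
}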
In this configuration, we use the data.azurerm_virtual_network.example data source to retrieve the name, location, and resource group name of the existing Azure Virtual Network. We also assign two local values so these properties are easier to reference in the code. The subnet is created inside the existing virtual network, and the network interface and virtual machine are placed in the same location and resource group, which gives the virtual machine the correct network configuration.
Data Sources vs. Variables
Now that we’ve seen data sources in action, let’s clarify the difference between data sources and variables in Terraform.
- Data Sources: Data sources retrieve information from external systems, services, or existing resources. They are used to reference and query external data within your configuration. Data sources are read-only.
- Variables: Input variables define the parameters of your Terraform configuration and provide input to your resources and modules. Their values are supplied by whoever runs or calls the configuration (through a default, a .tfvars file, a -var flag, a TF_VAR_ environment variable, or a calling module) and stay fixed for the duration of a run. Terraform code can read a variable but cannot reassign it.
Data sources are used to fetch data from external sources, while variables are used to pass values into your Terraform configuration.
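The two are often combined: a variable supplies the lookup key, and the data source resolves it to a real object. The sketch below illustrates this under some assumptions. The variable names (vnet_name, vnet_resource_group_name) and the data source label (selected) are hypothetical and do not appear in the earlier example.

```hcl
# Input supplied by the caller: which existing network to look up.
variable "vnet_name" {
  type        = string
  description = "Name of the existing virtual network to reference."
}

variable "vnet_resource_group_name" {
  type        = string
  description = "Resource group that contains the existing virtual network."
}

# Read-only lookup driven by the variables above. Everything Azure knows
# about the network (location, address space, and so on) comes from here.
data "azurerm_virtual_network" "selected" {
  name                = var.vnet_name
  resource_group_name = var.vnet_resource_group_name
}
```

With this layout, the same configuration can target a different network per environment, for example by running terraform plan -var 'vnet_name=b59-vnet' -var 'vnet_resource_group_name=b59-resource-group'.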
Conclusion
Data sources in Terraform provide a powerful mechanism for incorporating external data into your infrastructure configurations, enhancing reusability and modularity. By understanding the use cases and differences between data sources and variables, you can effectively manage and deploy infrastructure that relies on external resources, such as cloud services and APIs.