Cables2Clouds

Ep 7 - Terraform for the Network Engineer

May 17, 2023 The Art of Network Engineering Episode 7

Welcome to the Cables2Clouds Podcast! In today's episode, hosts Chris Miles, Alex Perkins, and Tim McConnaughy are joined by special guest Will Collins, Principal Architect at Alkira and LinkedIn Learning Instructor. The topic of discussion is "Terraform for the Network Engineer."

The conversation begins with a brief overview of Terraform and its benefits, as well as its relevance to network engineers in the cloud. Will explains how Terraform allows network engineers to automate infrastructure deployment and management, saving time and reducing the risk of human error.

The hosts and guest discuss the fundamentals of Terraform, including its declarative language and infrastructure-as-code approach. They explore how Terraform compares to other automation tools and the role it plays in cloud infrastructure management.

Will shares his experience teaching Terraform to network engineers and how it can revolutionize their day-to-day operations. The group also discusses the importance of proper planning and testing when implementing Terraform and the potential pitfalls to avoid.

As the episode wraps up, the hosts and guest share their final thoughts and recommendations for network engineers interested in implementing Terraform.

Join us for this insightful discussion on Terraform for the Network Engineer with Will Collins on the Cables2Clouds Podcast. Don't forget to subscribe and leave a review!


How to connect with Will:

  1. Twitter: [https://twitter.com/WCollins502]
  2. Blog: [https://wcollins.io/]
  3. LinkedIn: [https://www.linkedin.com/in/william-collins/]

Show Links:

  1. Will's LinkedIn Learning Course "Terraform: Managing Network Infrastructure" : [https://www.linkedin.com/learning/terraform-managing-network-infrastructure]
  2. HashiCorp links:
    1. Terraform: [https://www.hashicorp.com/products/terraform]
    2. TF Cloud: [https://developer.hashicorp.com/terraform/cloud-docs]
  3. IaC, Patterns and Practices by Rosemary Wang: [https://www.oreilly.com/library/view/infrastructure-as-code/9781617298295/]

Check out the Fortnightly Cloud Networking News

Visit our website and subscribe: https://www.cables2clouds.com/
Follow us on Twitter: https://twitter.com/cables2clouds
Follow us on YouTube: https://www.youtube.com/@cables2clouds/
Follow us on TikTok: https://www.tiktok.com/@cables2clouds
Merch Store: https://store.cables2clouds.com/
Join the Discord Study group: https://artofneteng.com/iaatj
Art of Network Engineering (AONE): https://artofnetworkengineering.com

00:00
Welcome to the Cables to Clouds podcast.

00:15
Cloud adoption is on the rise and many network infrastructure professionals are being asked to adopt a hybrid approach. As individuals who have already started this journey, we would like to empower those professionals with the tools and the knowledge to bridge the gap. Hello and welcome back to the Cables to Clouds podcast. My name is Alex Perkins, @bumpsinthewire on Twitter, and I will be your virtual host for today's episode. Joining me are my two co-hosts, Tim McConnaughy, @juangolbez, and Chris Miles, @BGPmain.

00:45
What have you been up to lately? How was the trip to New Zealand? Oh, yeah. That was very nice. Yeah, I went to New Zealand for the first time ever just last week. I had a mate from the US going down and he invited me to join him. I will admit it was a very touristy trip. We went and saw Hobbiton and things like that, but it was very much worth it. I enjoyed it a little bit too much. But yeah, spent some time in Auckland. Now I'm back in Sydney and looking forward to the weekend.

01:15
Awesome. All right, Tim, how about you? What have you been up to since we last spoke? I also went to New Zealand. That is to say, I went on my, what is it, my treadmill, the iFit Nordic thing, NordicTrack thing. No joke, they actually have like a whole series done by some of the actors from Lord of the Rings and from The Hobbit actually, which is kind of cool. And so I got to walk around New Zealand. I did not see Chris, but I'm hopeful the next time I will. See, he wasn't really there.

01:44
Yeah, it sounds like it. Otherwise, nah, it's been busy, man. Believe it or not, I've actually worked this week quite a bit. I know. I know. For once. I know. This is pretty crazy, right? I actually did work this week. But yeah, man, I tell you what, I'm looking to work for the weekend. This weekend, we are going to a gaming convention. They have a yearly gaming convention here in Raleigh. Dungeons and Dragons and arcade games and all sorts of crazy stuff.

02:10
And then, actually I scored tickets to see the new D&D movie ahead of time on Sunday. So I'm taking the fam to go see that. That's awesome. That sounds like a real fun time. Yeah, what about you, man? Yeah, finally done all my two weeks of misery with a bunch of upgrades I was doing. So now I have like a month break until we upgrade another, like an ACI fabric that's got probably another 50 devices in it. So at least that's all at once. It's not like a...

02:38
a whole weekend thing where I have to spend my entire weekend in front of my computer. You keep good company though. There's lots of people to keep you company in front of that computer, right? On those bridges you join? On the next one, actually the customer was like, we all want to be on the bridge. I was like, no way. Yeah, it was great. Said no one ever. All right. Well, today we have a friend of the show and a very special guest. His name is William Collins and he is the principal cloud architect.

03:06
for a cloud networking startup called Alkira. You can find him on Twitter at WCollins502, as well as on his blog, which is WCollins.io, which of course we will add to the show notes. Will, how are you doing today? Never better, never better. Life is good. It's not raining outside, so we've had a lot of rain lately in Kentucky, as usual. So no rain the past few days, so I'm a happy camper. I saw the pictures of the flood. Everything is just gone now? That was crazy.

03:36
Yeah, well, we tend to get a lot of rain anyway. So whenever we get like a super overabundance and then, you know, we lost power for a long time. And I actually had my sump pumps connected to my UPS that was powering some of my other computer gear. And then I had another battery backup that I had, you know, rigged together. Just, you know, these pumps are running, you know, nonstop and luckily no flooding. So life continues. Wow, well, I'm glad to hear it.

04:02
All right, so before we jump in, Will, this is kind of a tradition. Can you give us a primer on how you kind of did your transition from a traditional on-prem network engineer to more cloud networking focused? I know you were at a large healthcare provider before Alkira, so various stops along the way. So what's your quick summary of your background? I'm really going to try to make this quick. So I started, I think the first.

04:29
quote unquote, like large enterprise I worked for. It was a multinational financial services company. We had two data centers per continent. My guys in the data center, transitioning like CSS load balancers to F5, Perl, Net-SNMP, Expect, coming from that world. And I remember like this one time we had this application team that came and they were telling me about the cloud. Like, this is so great, like blah, blah, blah, blah, blah. And I looked at them and I was like, there's no way in the world.

04:59
any company is going to put their intellectual property in someone else's data center. This is a disaster waiting to happen. And boy was I wrong. So like when cloud started picking up and really like the first time I remember the first time I set up a, you know, an AWS, um, site-to-site, you know, and I remember thinking like, this is really cool. And then I started getting into AWS and like learning, you know, this is a long, long time ago and then slowly at first I started to become a believer and then I got really bullish.

05:28
you know, probably about, I don't know how many years ago it was. So before I worked for a small company in Louisville, um, that was also going through this sort of like, you know, how do we adopt cloud? Like how do we operationalize it? It's still like very new at that point. And then just kind of figuring it out. So I really became a believer then because they had like real problems that they needed to solve that like the cloud was a good fit for. So like in my mind, I could see like, okay.

05:55
you know, this is solving a problem. And I really started putting a lot of extra time into learning things and really trying to put together like proper designs and stuff like that. So, and then from there, it just spiraled. Yeah. And oddly enough, I think that that's actually how Will and I met. Now that I think about it, I believe Will reached out to me years ago. He was like, hey, I'm building a cloud team. Like, would you be interested? Cause he knew I was in Louisville, um, go Cards, first of all, but, um,

06:24
But yeah, he reached out to me and was like, I'm building a cloud team. And I was like, I remember I had just already decided I was moving. I was moving to Tulsa. So I was like, yeah, sorry. I was like, I'm just now leaving. It would have been awesome. But I remember in the back of my mind, I was like, a cloud team? I was like, I don't really know how I'd feel about being on a cloud team. That doesn't sound that fun. I'm glad that, I'm eating crow on it now. But more so I'm glad that we've come full circle.

06:53
and we can actually link up now. So this is pretty nice. Small world, really small world. Yeah, I remember that. It was funny like seeing your profile and like, oh, he's really close to me. You know, this could be great. We could have been teammates, you know? I know. Could have been. Now look at us, we're rivals, baby. Ah. Throw down that gauntlet, yeah. Well, awesome. It's really cool that, you know, I think you had a pretty unique way of coming into the cloud stuff, in that you came at it from a view of

07:22
already seeing the types of problems it solves. That's a huge thing that we've talked about before is that mindset of understanding what it actually enables. So I think that's really cool that you got in right away and that's what got you excited about it. Yeah, I think that's a good point, but one of the, I think, even more important things is I had the experience of being in some very early adopter scenarios and seeing a lot of ways that you shouldn't do it and learning a lot of valuable lessons along the way. So.

07:52
I think you learn more from failing than you do from succeeding. Oh yeah. Every outage is gold. It pays dividends later on. You learn from things. Yeah, absolutely. All right. So today's episode is going to be all about Terraform for the network engineer. So, infrastructure as code. You know, you hear this everywhere right now. It's just all the rage. And I'm not sure there's a more popular tool than Terraform right now. There's lots of competitors here and there. You know, there's like Pulumi. There's...

08:21
all like the cloud native stuff, but Terraform is like a household name almost in the field at this point. So, you know, we're gonna start with something real easy. What is Terraform? Yeah, so on the surface, if you fire up the Googles and you take a look around, you'll learn that Terraform's just, it's an open source tool that you can use to provision and manage your infrastructure. And that infrastructure gets defined with Terraform using

08:49
human-readable declarative config files. So it's a product that essentially interfaces with platforms and services using APIs. Yeah. And what kind of language is that? You said it's human readable, but everybody, a lot of the people coming from networking know of YAML, right? From Ansible and all these different wrappers around Python. And how does Terraform do it? So they, HashiCorp has their own language, HCL.

09:17
But you can also, I mean, if you're really used to JSON, you can fire up Terraform with JSON as well. In fact, a lot of like automated testing and stuff, it's kind of easier to parse and do things that way. And then you can even transpose and do things with YAML. You can decode and encode with YAML if you really wanna input configuration as YAML, which I've actually set up a lot of stuff like that. Like I was working a long time ago on something where, you know, a team.

09:45
They were doing everything with YAML. They had Ansible. They were using, like, NetBox. You know, they had that, you know, sort of set up there and they really liked YAML. They didn't want to let go. So we did a lot of transposing with YAML for policy coming in, you know, taking it in and then running Terraform to, you know, deploy infrastructure. So pretty flexible. Actually, this is good because this is a good way to explain it. How would you say it's different than something that...

10:13
A lot of network engineers that have been moving into DevOps and automation for the last few years would describe it for Ansible. We know that Terraform is very different than Ansible, but if you were explaining to somebody, like, hey, I use Ansible today, why would I ever use Terraform? How would you throw that out there? That's a really good question. So when you think of, I think, the problems that both of these tools were aiming to solve. So in the early 2000s,

10:41
About the early 2000s, ESXi started really hitting in the data center and more things became software defined. And then you have these physical to virtual migrations. And then came the era of kind of infrastructure as code to where if you have a lot of VMs out there, you deploy these VMs and you want like certain security criteria set on them, certain packages for different types of VMs deployed like on a Linux image.

11:08
And then if you have a developer, like a rogue developer that's like, oh, I'm going to change this and I'm going to try to do some secret squirrel stuff over here, that infrastructure-as-code tool's going to reach out: hey, something's not right. And it's going to, you know, fix it. But this is predicated on physical infrastructure for the most part. So I'll try to keep this to networking since this is a cloud networking podcast, but say that you like deploy a physical router or a physical switch. You.

11:35
Go into the data center, you unbox it, you rack and stack it. And then there's like a base configuration that maybe you put on it, like TACACS, SNMP stuff, some generalized things. But then maybe you have a controller or something or a server where you have Ansible or one of these other tools running on it. It's gonna reach out and configure your router or switch maybe the way you want it. So this is what is referred to as runtime configuration. So I already have a running.

12:05
physical object that I'm connecting to and I'm configuring. Great. So Terraform is a little bit different. So instead of doing things like in the run, uh, the run space, it's going to shift that into the build pipeline. So when I said declarative earlier, it's like, literally you're declaring your configuration in these files. And then what you declare, it's going to go and build this immutable sort of infrastructure. So to kind of add on that, like instead of thinking, you know, I remember like I've

12:33
worked for places that kind of had some good mature network automation and some that were just kind of on the, you know, trying to get into it and think of like, how, how do I automate the software upgrades for these switches or routers across the U S or another continent? Like am I going to do this in the canary sort of fashion? Like do a few, take some feedback and then if it, everything's kosher, then do like a region or, you know, so I'm thinking about devices, a specific type of device over a large geography.

13:03
And with Terraform and this infrastructure-as-code paradigm and being declarative, I'm sort of shifting my thinking to environments. This is all the networking for this whole development environment for this product. So that's one way to look at it anyway. That's good, man, that's good. Yeah, that's a really great way to do it. And just to, I have to add this comment in here. So about, we were talking about YAML. Mitchell Hashimoto, one of the co-founders of HashiCorp, he hates YAML. He's famously known for...

13:32
how much he hates YAML. But yet they had to add it in there. I'm not a fan of YAML either, man. There's better ways that have been invented to do things. If a space can destroy your entire pipeline, it's awful. Yeah, where do you put the hyphens? When do you use them? When do you not use them? Yeah, there's a lot. The nesting, just like I said. And God forbid if you ever hit tab instead of space. Oh, man. Anyway. All right, cool. That was a really good.

14:01
overview though, but declarative versus imperative. So how about state management? Cause this kind of really leads perfectly into state management, right? Because one of the things you're talking about declarative, if you're saying, I want 10 servers provisioned and you change it, like, how does it know how to reconcile what you're declaring and what's actually there? So that's a great question. And I wanna preface this with no tool is perfect.

14:30
We're network folks, we know that. There's problems and solutions and you have certain tools that are really suited for a specific type of problem. But I would say that every tool out there has its Achilles heel. It has something that is the thing that really, if you don't handle right, can really cause chaos in your environment. And that, specifically with Terraform, has to do with state management most of the time. I've spent a lot of time.

14:58
manually doing things with state files and just trying to, it can just be a problem if you don't approach it with the right lens, if you don't understand it. So basically to, to kind of summarize, like if you think about all the changes that happen in cloud, all the infrastructure that is required to run, like a global application, it'll make your head spin. There's a lot of stuff out there, lots of different changes. And not only that, but what about the...

15:28
the relationships that exist between all of these things, all these resources. So this is where Terraform state comes in. It's got to know about stuff. So basically every time you run Terraform, it's going to, it's going to basically compare your desired state, like what you want with your operational state, what is out there and then, and then adjust things accordingly. And it's really just a flat text file at the end of the day. So it's JSON formatted mapping of these.

15:57
these resources. Yeah. I was going to say that comes back to Terraform being, I guess, what do they call, idempotent, and that you're defining ahead of time what you want your end state to look like. So it just compares the state file to what the current state is. And that's how it determines those mappings that Will was talking about of what needs to be deployed and how things relate to each other. And look, since there are people that have probably,

16:24
potentially, not used Terraform, I'd love to spend just one more second to explain the problem statement, right? Because it may not be extremely clear. So with Terraform, you know, you did exactly what Will said, right? You say, this is my end state, this is what I want, build this infrastructure for me, and then, you know, hopefully we'll keep it the way it was built, right? So that's all great, right? Perfect. Thumbs up, everybody's happy. And then, you know, the junior guy comes along and says, oh crap, I made a mistake on...

16:53
this or they say, I want to add a NIC to the server, you know, because we need it to attach to another load balancer or something like that. Right. And so they go into the AWS console or whatever, they add the NIC to the VM and, you know, they get it working and unknowingly they've completely destroyed, you know, the whole environment more or less, because, uh, the Terraform state is not informed of that change. And so, you know, you guys know when you

17:21
Next time you run anything, next time you do any kind of Terraform, you have this drift where there's now a disconnect. And so your choices at that point are to, like Will said, just get in there in the JSON and try to like fix it manually, like mess around with the Terraform state file. Or you know, that thing's going to get destroyed and kind of reset to the factory default, or you're going to have to start like unmanaging and then re-managing resources. It's really ugly. And so that's kind of the problem statement. And think about it as network engineers, you know how easy it is to

17:49
Like you have some developers that are testing for something and someone goes in and does like an, you know, any-any rule or they add some static routes to some device. Things get added over time. And then that configuration, it looks completely different, it doesn't match anything you've got in your environment. And so it's, you know, it's a snowflake. I want to avoid that. We'll definitely touch on this later a little bit when we talk about maybe bringing in CI/CD, but.
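
To make the state and drift discussion above concrete, here is a minimal Python sketch of the idea. It is not Terraform's actual implementation. It models three views of one hypothetical server (`web01`, with made-up attributes): what the configuration declares, what the state file last recorded, and what really exists after someone adds a NIC in the console. The function names `plan` and `detect_drift` are illustrative assumptions.

```python
# Simplified model of desired vs. recorded vs. actual state.
# All resource names and attributes are hypothetical.


def plan(desired, current):
    """Compare what is declared with what is known and list the needed actions."""
    actions = []
    for name in sorted(desired.keys() | current.keys()):
        if name not in current:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))
        elif desired[name] != current[name]:
            actions.append(("update", name))
    return actions


def detect_drift(recorded, actual):
    """Return resources whose real attributes no longer match the state file."""
    return sorted(
        name for name, attrs in recorded.items() if actual.get(name) != attrs
    )


desired = {"web01": {"nics": ["eth0"]}}          # what the config files declare
recorded = {"web01": {"nics": ["eth0"]}}         # what the state file remembers
actual = {"web01": {"nics": ["eth0", "eth1"]}}   # after a manual console change

drifted = detect_drift(recorded, actual)
stale_plan = plan(desired, recorded)    # state not refreshed: nothing to do
refreshed_plan = plan(desired, actual)  # state refreshed: manual change is undone

print("drifted:", drifted)
print("plan against stale state:", stale_plan)
print("plan against real infrastructure:", refreshed_plan)
```

In this model the manually added NIC shows up as drift, and the next plan against the real infrastructure proposes an update that would undo it unless the configuration is changed to declare the second NIC.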

18:15
I think it is easy for us as engineers by trade. It's just like whenever someone brings up a problem, we're naturally gonna be like, oh yeah, I can fix it. I know what the problem is. Boom, boom, boom. And you just, you know, in AWS, you can log into a GUI. On-prem, you could just log into the router CLI and, you know, change the ACL or, you know, add the NIC, whatever. And it's fixed, right? And that's- Troubleshooting, yeah. Yeah, it's a lot of cowboy stuff. It's what we've referred to it as for a long time. But it is, once you add

18:44
infrastructure as code, it does. Yeah, there's a lot more dependencies now that are in play that you need to be considerate of. Absolutely. And one last thing. So what I'm going to say right now, I'm actually, uh, so this podcast is really good timing as far as things going on in my life personally, cause I'm going to be releasing a new LinkedIn Learning course in May, which is around the time that this podcast is being dropped. So if you really want to deep dive, you want to get into the CLI and you want to do a lot of this stuff.

19:14
you know, make sure and stay tuned. Yeah, we'll have to make sure that when you've got all the links and stuff for that, Will, that we get those in the show notes for you as well. Yeah, absolutely. And, Will, you already have a LinkedIn course on there as well, right? What's the one you already have out? Hybrid multi-cloud networking for the real world. Yeah, check it out if you haven't. More like lessons learned and like real, just going into like architectures, not so much. I mean, how many VPN tunnels can you configure? It's kind of IPsec is IPsec, so.

19:44
looking at like real life stuff. Yeah. OK, awesome. So let's take a little detour. So let's talk about kind of how in the past, especially you have a lot of experience with this. So in the past, how people kind of did infrastructure as code with on-prem physical networking devices and compare that to how the modern day infrastructure as code stuff kind of plays out. Oh, boy. So yeah, that's a loaded question. And I'll try to keep it as quick as possible.

20:14
One of the things, I think, with physical networking, especially if you're in the enterprise, is you're working within a system that usually doesn't like a lot of change. In fact, like there's a lot of incentives sometimes to not make changes because you don't ever want to take anything down. And a lot of this ties into different verticals inside of an organization. So like change management, especially. So if you've ever.

20:43
If you've worked in networking for any given amount of time, you're maybe used to building some changes out, some routes or something for, you know, some configuration changes that need to happen. You put them in a notepad file, you attach them to a change order, and then you've got to go to CAB. You have the change advisory board meeting, and then mini CAB, and then, you know, CAB to the regional CAB, all the CABs. It's CAB to the point to where you're just like.

21:10
All I do is sit in CAB calls and it's just to make a change that literally has no impact. It's like a net new thing or something. And usually the folks vetting these changes, they don't understand them, which rightfully so. Like these are very complicated. You know, this is why we have, you know, the Tims of the world to really go through and do this complicated stuff. So, you know, change management, it sort of needs to evolve. So when thinking about the difference.

21:37
In terms of like on-prem, like on-prem is bound by a lot of this stuff. So any little change that's made, it's just so much scrutiny. And you're limited with how many of these changes that you can push out usually over a given amount of time. You have like very tiny change windows and this just becomes really problematic if you're really trying to transform. So usually like on-prem. No agility. Exactly. Zero. Yeah. You're going nowhere fast. Well, do you think some of that is because, um,

22:06
the kind of the crown jewels, if you will, tend to have sat on-prem for those many years and we haven't necessarily gotten there in the cloud yet. Or do you think form follows function on that? Or what do you mean? That's a good question. And I think the, so change management, when they usually they wrap their head around all the changes that are happening in cloud. And usually what does it is you're actually doing things in cloud. Now you have a production application there.

22:34
And something happens, some change was made or whatever, and this thing tanks. And it's important enough to where it gets that visibility. And then you have all the everybody crawling out to try to get their hands around it and slow everything down, prevent changes and all these things. Do you think that's gonna happen though, as more enterprises move their crown jewel kind of revenue generating apps to the cloud? I think it's inevitable, there's gonna be some level of that. It already has, to be honest. I mean, if you're a...

23:03
I've seen it happen quite a bit already. But I think the thing is, I think organizations are also realizing they have to move their change management along with it. So there's things that we can maybe talk about later, if we talk about CI/CD and stuff, I have some neat ideas around that. And I've seen some really neat things happen there to where it kind of helps change management evolve, essentially. Yeah, absolutely. I mean, how many...

23:30
I'm sure all of us have written, they're probably called something different everywhere, but everywhere I've been, it's always called an engineering implementation plan, an EIP. Like a MOP or something? Yeah, and it's like, you're going to change management people that don't even know what it says. Yeah, it's brutal. And that's the, you know, back to the change management, whole, you know, infrastructure as code on-premises. Like how do you work within these boundaries and do real infrastructure as code?

23:57
Even for low hanging fruit, say that you just want to update the SNMP string for a swath of devices. Right. Like, yeah, good luck. This touches this many applications and we have to get all the app owners involved and all of that. Right. Yeah. The traffic technically goes through this device. You're not touching it. You're not doing anything. So that's the thing. I've seen it for the low hanging fruit. You can.

24:26
I did the whole NetBox and Ansible thing for a while. Before that, like, way back before, you know, there was so much scrutiny, I was doing everything in like Perl and then Python for a long time, you know, which is really useful. But then you're an island. Nobody can help you. You leave that company and then you have people reaching out to you for months: you know, please help us. We went from Python 2.7 to 3, whatever, and everything's broke on this thing. We can't turn down any more VIPs. You know, it's.

24:55
How did you, quick side note, how did you get stuff like that through change management? Because this is a lot of what I'm going through right now at work is a lot of these places don't even want you to do automation with Python. They're afraid of it. Yeah, they're like, no, we still do everything manually. That's just how our process is. So how did you get past that all those years ago? Well, the farther back you go, before I had more smarts in the brain up here,

25:24
I was kind of like what you would refer to maybe as a cowboy on the network. So, um, I, I've been known to cowboy a few changes or two, you know, over time, but, you know, for the big things. And that's the thing, like, if you know, your network, like I never, like on the fit, you know, in those times, I know, I don't think I ever caused a single outage with automation because I knew the changes I were doing, you know, I was doing. And, you know, they weren't.

25:49
messing with core routing or removing, just doing things like that. They were really low priority stuff, but they were things that took time. They take time and you're just sitting there clicking boxes when you can just execute script. Do you ever have to use LMS? I remember that. I remember using LMS to do things like automating the.

26:12
username and password changes across the enterprise and stuff like that. Those are the things you could get automation, well, you really needed automation to help you with, so you're not logging manually into 400 devices or something, but yeah, good old LMS. The names have changed, but still the same things. Just to tie this back to Terraform, is like Will said, your sandbox environment,

26:39
when you're using Terraform, it can be the exact same environment, almost to a T, right? You could have a sandbox that is very, very much so representative of your production environment where your crown jewels are, and you can take that Terraform code. Obviously, things can be put in as variables so you're not impacting the production environment, but you can take the actual code that you're gonna put into production and run it in that sandbox, and that's your change management right there, right? So it's like,

27:09
When you're writing Python scripts on-prem and things like that, you can be like, yeah, well, I ran this script against this router at my desk and it worked and it didn't blow anything up. But you have a much better visual representation and can calm the C-levels of the world and things like that. Yeah, that's a really good point. And this kind of like, well, you were saying how you used, you were cowboying it, right? And you had these scripts for like kind of smaller things that you already knew.
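
One way to picture the "same code, different variables" point above is a per-environment variable file. The sketch below is a hedged Python illustration: the variable names, CIDR ranges, and environment names are all made up, and it only prints JSON. Terraform can read variable files in JSON form through `-var-file`, but how a given team wires that into its pipeline is an assumption here.

```python
import json

# Hypothetical variables that one shared Terraform configuration would declare.
# None marks a value that every environment must supply itself.
SHARED_DEFAULTS = {"vpc_cidr": None, "instance_count": 1, "enable_flow_logs": True}

# Hypothetical per-environment overrides; only the values differ, not the code.
ENVIRONMENTS = {
    "sandbox": {"vpc_cidr": "10.10.0.0/16"},
    "production": {"vpc_cidr": "10.20.0.0/16", "instance_count": 4},
}


def render_tfvars(env):
    """Merge the shared defaults with one environment's overrides."""
    overrides = ENVIRONMENTS[env]
    unknown = set(overrides) - set(SHARED_DEFAULTS)
    if unknown:
        raise ValueError(f"{env} sets undeclared variables: {sorted(unknown)}")
    values = {**SHARED_DEFAULTS, **overrides}
    missing = sorted(key for key, value in values.items() if value is None)
    if missing:
        raise ValueError(f"{env} is missing required variables: {missing}")
    return values


for env in ENVIRONMENTS:
    # Each document could be saved as, for example, sandbox.tfvars.json.
    print(env, json.dumps(render_tfvars(env), sort_keys=True))
```

Because both environments are rendered from the same set of variable names, the sandbox run exercises the same configuration that production will use, with only the values swapped.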

27:38
But the difference really is that on-prem, you're just changing one or two things here and there. And then what we're talking about with Terraform, it's the entire infrastructure as code that is being all at once. So it's not the same kind of changes either. You're not recabling stuff with Ansible. No. And there are some things, so when you roll out, if you run your Terraform, there's some resources that it's going to sync the state, and it will actually update some.

28:08
Um, like security groups, for instance, like they're, you know, technically stateful, and you can update a security group in place. It's not actually going to destroy the security group and bring up a new one. Whereas, and this is why you got to know your resources, some resources it's going to like wipe out and just completely rebuild. So it's really important. You know, you got to do a lot of reading. It turns out reading is good. That's always a fun thing when you run a Terraform plan and then you see it's gonna, you know, it's like.

28:37
48 items to destroy. Yeah, it's like, what is that? Yeah, to wipe out my server farm to build a NIC. Auto-approve everything. This will draw a crowd. Let's do it. Let's not, actually; don't follow that advice. That's a really good callout though. I just want to say that again about the security groups: there are some things it can't, and some things it will update in place. That's a really crucial thing to know when you're doing.
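The in-place versus destroy-and-recreate distinction can also be checked by a script before anyone approves a run. The sketch below is a minimal illustration, assuming the plan was exported with `terraform show -json` and that each entry under `resource_changes` carries a `change.actions` list. The sample data, the resource addresses, and the helper names `classify` and `risky_addresses` are made up for this example.

```python
import json

# Hand-written sample in the general shape of `terraform show -json <planfile>`.
# The resource addresses are hypothetical.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_security_group.web", "change": {"actions": ["update"]}},
    {"address": "aws_instance.app[0]", "change": {"actions": ["delete", "create"]}},
    {"address": "aws_network_interface.extra", "change": {"actions": ["create"]}},
    {"address": "aws_vpc.main", "change": {"actions": ["no-op"]}}
  ]
}
""")


def classify(actions):
    """Map a planned action list to a human-readable label."""
    kinds = set(actions)
    if kinds == {"create", "delete"}:
        return "replace"  # destroy and recreate
    if kinds == {"update"}:
        return "update in place"
    if kinds == {"create"}:
        return "create"
    if kinds == {"delete"}:
        return "destroy"
    return "no change or read"


def risky_addresses(plan):
    """Return addresses of resources the plan would destroy or replace."""
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if classify(rc["change"]["actions"]) in {"replace", "destroy"}
    ]


if __name__ == "__main__":
    for rc in SAMPLE_PLAN["resource_changes"]:
        print(rc["address"], "->", classify(rc["change"]["actions"]))
    print("needs a second look:", risky_addresses(SAMPLE_PLAN))
```

A check like this could sit in front of an apply step and stop the run when the risky list is not empty, which is a safer habit than auto-approving.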

29:05
Thank God it tells you, right? The Terraform plan will actually say we're going to update this in place versus we're going to destroy and recreate this. Yeah, and before we move on to the next little subtopic, maybe this is already common knowledge. Some people might be familiar with this, but Terraform uses a DAG, which is, I just found this out recently, okay? So this is really cool to me. A directed acyclic graph, and it's basically just a mathematical formula.

29:32
for how it links all of the different pieces of the infrastructure together. There's tools like add-ons, extensions, stuff in like VS Code that will actually show you how all your infrastructure is linked, like in a graph. It's super cool to watch. Yeah, I'll find out the name of some of them and add them to the show notes, because it's really cool to see. Yeah, so if you go to the HashiCorp Configuration Language, like if you go to the language white paper, it'll actually go through that DAG and how it actually works. It's pretty comprehensive. I'll send you a link afterwards and you can.

30:02
throw that in there too. Yep, and look, they said you never need to know math. It's everywhere. I've never heard the term DAG. I thought you guys were just quoting Brad Pitt from Snatch. Do you like DAGs? It's a rabbit hole, man. I went and looked this up because I didn't know what it was either. There are so many different kinds. A big rabbit hole to go down. All right, so we're going to switch gears a little bit. Let's talk about what are modules, and how are they used, why would you use them. All right, let's talk about what these modules are.
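To make the DAG idea from a moment ago concrete: a directed acyclic graph is a set of nodes joined by one-way edges that never loop back, and a dependency graph of that kind gives a safe order in which to build things. The sketch below is only an illustration using Python's standard `graphlib` module, not Terraform's own implementation, and the resource names in `DEPENDS_ON` are hypothetical. Terraform's `terraform graph` command can print the real graph for a configuration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical resources. Each key maps to the set of things it depends on.
DEPENDS_ON = {
    "vpc": set(),
    "subnet": {"vpc"},
    "security_group": {"vpc"},
    "instance": {"subnet", "security_group"},
}


def creation_order(depends_on):
    """Return one valid build order: dependencies always come first."""
    return list(TopologicalSorter(depends_on).static_order())


def has_cycle(depends_on):
    """A graph with a cycle has no valid order, so it is not a DAG."""
    try:
        creation_order(depends_on)
    except CycleError:
        return True
    return False


if __name__ == "__main__":
    print(creation_order(DEPENDS_ON))
    print(has_cycle({"a": {"b"}, "b": {"a"}}))
```

Reversing the same order gives a safe teardown sequence, since nothing is removed while something else still depends on it.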

30:31
Yeah, so like modules are, you know, when you really start, like you have a big enterprise and you have some team that went off and they were early adopters of cloud and they started building infrastructure, then you have another team that did so. And really the enterprise doesn't know what infrastructure as code is at that point, but then now you have different teams that are going out and they have their own infrastructure files and then you have this inconsistent sort of mess. And then how do you do compliance?

30:59
You have no way to do compliance and you have all these different overlapping IPs, all sorts of issues you can imagine. So I think the best way to understand a module is like looking at like how it solves a problem. So one way to sort of rope that back in like, okay, we gave, you know, cloud came in, cloud was hot, and we hired some full stack developers and just gave them the ability to do everything because they should have been experts at every.

31:24
vertical of technology, but it turns out nobody can be. So we start like roping that back in and you have like these sort of like a cloud platform or cloud center of excellence where you have, you know, basically a DevOps team or cloud team that is gonna build, you know, essentially modules like golden versions of what configuration should look like. And really what it is, if you think of like Python, you know, you have different code blocks that you might have and then you can build a function.

31:54
And then a class, maybe. Yeah, exactly. So classes and functions, like that's how you can look at a module. You have like a repeatable version of something to where you're limiting the dev- like, OK, these developers can only put in like three different values. The rest of everything else is taken care of. And we've published this in some, probably like a private registry with Terraform Enterprise or something. Right. And then we have, you know, maybe we have workspaces for each environment of this, like dev, QA, prod.
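The function analogy can be sketched directly. The example below is a hypothetical stand-in for a "golden" module, written in Python rather than Terraform's own language: callers choose three values and everything else is decided by the platform team. The function name `vpc_module`, its parameters, and the fixed settings are all assumptions made for illustration.

```python
import ipaddress

ALLOWED_ENVIRONMENTS = {"dev", "qa", "prod"}


def vpc_module(name: str, cidr_block: str, environment: str) -> dict:
    """Hypothetical golden module: three inputs in, a full config out."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    # Raises ValueError if the CIDR is malformed or has host bits set.
    ipaddress.ip_network(cidr_block)
    return {
        "name": f"{name}-{environment}",
        "cidr_block": cidr_block,
        # Everything below is fixed by the module author, not the caller.
        "enable_dns_support": True,
        "flow_logs": True,
        "tags": {"environment": environment, "managed_by": "terraform"},
    }


if __name__ == "__main__":
    print(vpc_module("web", "10.0.0.0/16", "dev"))
```

The narrow interface is the point: the fewer values a caller can set, the fewer ways two teams can end up with inconsistent infrastructure.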

32:23
And then each of those is also a separate state file. So, you know, one thing to consider is like networks and blast radius, you know, no more stretching layer two across some data centers. We want to avoid that like the plague. So limiting blast radius means limiting, you know, actually creating more state files. Cause a state file in essence is actually a blast radius. Yeah. You know, it's a fault, you know, a failure domain. So when you create an additional state file and you separate it, you know, you're

32:51
limiting the blast radius of like a failed change somewhere. So there is like a holy war around this topic that you just touched on, right? So I think we've talked about this as well, but what's the book called? It's Terraform... oh, Terraform: Up and Running? Or which one? No, there's one that actually talks about like design patterns. Infrastructure as code patterns. Yeah, I was gonna say. Do you have it on the bookshelf?

33:18
I think that, I think the one with the cow. Yeah, yeah, yeah. We're gonna add this to the links for sure. Yeah, I've seen that one. It's a good book. And the author, she works at HashiCorp, I'm pretty sure, right? She's like a dev rel person at HashiCorp. I don't know if she does anymore. She did at one point, I know for sure. Maybe she does now, I can't remember. Yeah, but there are just so many arguments in different areas of the internet around how you break out your modules and your state files. Oh yeah. There's a lot, yeah.

33:47
Like what's the right way to build your module pattern or like how much you should include in your module versus how much you should expose for your user. Like, you know, from a literally modular, how modular should your module be versus, yeah, there's a lot of that going on. Yeah, I guess it's the same with programming languages, right? I mean, it's almost the same argument. Yeah, with like methods, like libraries with methods and stuff. Yeah, absolutely. You gotta go back to the business though. Like what is your risk tolerance? Like you've gotta be able to meet DR

34:17
You know, HA requirements, what is your risk tolerance? Like if something goes down, you know, I think HashiCorp recommends like one state file per functional technology, or no, one state file per environment per functional technology. So that means for like, you know, I have product A, like a web app. That means I have a network state file for dev, a network state file for QA, a network state file for prod. So...
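The "one state file per environment per functional technology" layout multiplies quickly, which is easy to see by enumerating it. The sketch below is illustrative only: the product name, the list of technologies, and the key naming scheme are assumptions, not a HashiCorp convention.

```python
from itertools import product

ENVIRONMENTS = ["dev", "qa", "prod"]
# Hypothetical functional technologies for a single product.
TECHNOLOGIES = ["network", "compute", "database"]


def state_keys(product_name, environments, technologies):
    """Return one state file key per (environment, technology) pair."""
    return [
        f"{product_name}/{env}/{tech}/terraform.tfstate"
        for env, tech in product(environments, technologies)
    ]


if __name__ == "__main__":
    keys = state_keys("web-app", ENVIRONMENTS, TECHNOLOGIES)
    for key in keys:
        print(key)
    print(len(keys), "state files for one product")
```

Each key is its own failure domain, so a bad change to the dev network state cannot touch prod. The trade-off is the count, which grows as environments times technologies times products.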

34:44
That sounds, you know, at the end of the day, there's got to be balance though, because that's a lot of state files. And if you're using a man-, you know, some sort of tool for like workspaces, you've got to pay for this. Oh yeah. I'll tell you what, nothing is cheap. Um, nothing is free. So you've got to find that balance and it's going to be different for every company. Any argument that's arguing that right now, it's just, yeah.

35:07
It's like arguing like certs or no certs, like that argument that spins up on the Twitters every once in a while. And I'm just like. Should I get a degree or should I get a certification? Yeah, it's gotta depend on the business. So you gotta meet the requirements and you gotta work with the funding you have. There's constraints. Yeah, yeah. So you touched on something again here that I wanna call out. So Terraform Cloud versus Terraform Enterprise, right? Let's talk about that. Cause you mentioned there are tools you can use to kind of get this in order. And that's.

35:36
Terraform Cloud is one of them, right? So what do you think about that? First of all, coming from a place where when, you know, I started, I first started using Terraform using, like you can basically put a state file, I mean, it's a file, you've got to be able to read the file. You can put it anywhere. You know, usually it's recommended to not keep it on your laptop locally somewhere, you know, to keep it in like some shared storage, like S3 or something, but.

36:02
You can pay for these services like Terraform Cloud. There's a few of them out there where you don't have to worry about the state file management and availability. And to me, not having to worry about that is just, it's an awesome feeling. Great feeling. Um, a lot of weight lifted off of your shoulders. So those are really good platforms if you have the capital. Um, but if you don't, there's cheaper ways, you know, like I said, you know, S3 doesn't cost very much and.

36:30
you know, using the remote backend with Terraform, you can basically put these things anywhere, but it's up to you to, you know, handle that high availability. You know, if something happens to it, you're gonna have a problem. Well, and the security of your repo, right? Like the security, because that state file is like gold, right? From a security perspective, like it's everything that you've built. 100%. Yeah, yeah, for sure. Yeah, that's a tricky area, because that's when you get into...

36:56
workspace in Terraform Cloud is like the equivalent to just a state file really. And how many of these things do I need to create? How is this going to grow over time? Am I going to have like thousands of these things? So we released tags and we have tags to tag them now. Yeah, and it's kind of funny, right? Because then in Terraform Cloud, it almost hurts itself. Because then it starts showing you just how many different things you have set up. And it's a lot to manage.

37:24
Tagging schemes are good things to have in cloud. Get your tagging stuff in order. Yeah, 100%. That's just broad advice, right? For everything in the cloud, you should have a tagging schema. Yep, because costs, I mean, costs can just get out of control so quick and you need accountability. So you have thousands of EC2 instances out there. You have thousands of just tons and tons of resources. You can't, it's not humanly possible. You can't hire enough people to keep track of that and say, oh, you know, someone, you know,

37:53
deployed a massive AI/ML-sized instance and it costs like $20,000 over a weekend or something because they were doing all this. Did they need to? You know, so like having that tagging scheme, you know, being able to shut things down and all that goes a long way. Yeah, absolutely. And I think, you know, you're dealing with customers on a regular basis that are using Terraform, deploying Terraform, and likewise over here, Tim and I are. I'm curious to get

38:22
your feedback from the customers that you see, that it could be very large, very small, what have you. How do you see them arranging their Terraform workspaces by, like you said, by technology, by application, things like that? How are you seeing people typically structure it? And what do you think is good or bad about that? That's a really good question. So one thing that I'm happy to not be seeing as much

38:50
these days is like the big monolithic state file, like all of our VPCs, all of our networking completely in one state file. It's just danger zone, red flashing lights. Let's avoid that. So really it's a mix. So like I've seen a lot of customers that really kind of sort of what I said earlier about knowing your business and structuring, like doing right by the way your business is structured and the way your teams and everything are structured.

39:17
I see a few every now and then that have really taken that to heart. And that's kind of where I took that from seeing some really good examples of it. Like teams that really sat down, like they talked, you know, I was talking to a, an architect this like last year sometime, and he was talking about how they came to their decision for this, like they actually brought all their teams in. They flew all their teams in and they had like an in-person workshop and they thought about the pros and cons, the cost, they went through everything and.

39:46
you know, looked at their business or organizational structure and built this. Like they thought about it before they started deploying the infrastructure. And I thought, wow, that's something you don't see very often. That's amazing. So, but other than that, like just a lot of everything, uh, one thing that unfortunately, and I, I lived through this myself, like I was a product of having gone through this is.

40:11
you deploy things with Terraform and then HashiCorp releases some really cool new features. And this bit me really hard a few times because, like a long time ago, I was a heavy user of count, and count works very differently than for_each. for_each, yeah. Yeah. So they released for_each and it was like a mind bender. I was like, yes, it actually fixed so many problems. So they've released things over the years and it's really hard if you're in any...

40:39
any big enterprise to go back and refactor all that infrastructure as code when it's running your production. It's so true. You know, you've got to test it. And it's just, how many brownfield environments do large enterprises have that were deployed at different times and just stacked over the years? So it's no different with infrastructure as code. Just because it's code and it's not physical doesn't mean it can't cause an outage. So yeah, at the end of the day, I see just a lot of different things, you know, because you can only

41:07
deploy the best practices, you know, as they are best practices at that time, because something will come along and make your life easier in the future. This is really, actually, I know we, we don't want to go too far, but, um, this is a really good question. So has, has Terraform deprecated anything that you think is sitting in somebody's state file or not state file, but, uh, in their code repo somewhere? I'm trying to think, I don't think they have actually deprecated anything, but I'm trying to think like how

41:34
horrible that would be to have your, like, you know, production code all of a sudden get deprecated when you upgrade your Terraform. No, I don't think so. Syntax changes. So there have been syntax changes and that will break things. So the cool thing about Terraform, and one of the things I love, and again, there's no perfect tool, but this just makes life easier, is you can set the version that you want to use. That's right. And then with the provider ecosystem, you can set the version of that provider you want to use. So you have these very tight

42:03
You know, everything is version controlled very tightly. So it works together well, but then what you have is like some new feature will come along and you're just thinking, okay, what do I have to do to get where we're at right now, new enough to use that feature because it's awesome. And it'll make our lives so much better. And you know, that's difficult because then you have to go through and figure out all the syntax changes and things. If you wait too long and

42:26
Yeah, it's kind of like the Ansible days where you run Ansible and it'll say, Hey, this is being deprecated or this is this or that is that. So that's what I was thinking of specifically. Yeah, I wasn't going to say it. Yeah. What about the cost? So, so actually one more thing. Yeah, go ahead. Go ahead, Chris. Go ahead. I was going to say there hasn't, I haven't seen a ton of like resource tied things that they've deprecated over the years, but I, I definitely see like, like Will said, a lot of like the, maybe the CLI stuff has changed, like how you interact with it. Like, uh, another.

42:56
I don't know if it's on, they're phasing out the Terraform taint command to kind of mark a resource as no longer needed. You can now do that with an apply flag instead. But I'm sure there's people using probably very old versions of Terraform still to this day just to maintain what they have and deploy it. Right.
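One reason older code tends to stay frozen is that refactors change how resources are addressed in state. The count versus for_each difference raised earlier is a good case: count addresses resources by position, for_each by key. The sketch below is a simplified model of that addressing, not Terraform's own logic, and the `aws_subnet.this` resource name and subnet names are hypothetical.

```python
def count_addresses(names):
    """count style: each resource is addressed by its list position."""
    return {f"aws_subnet.this[{i}]": name for i, name in enumerate(names)}


def for_each_addresses(names):
    """for_each style: each resource is addressed by its own key."""
    return {f'aws_subnet.this["{name}"]': name for name in names}


def touched(before, after):
    """Addresses that appear, disappear, or now hold a different value."""
    return sorted(
        addr
        for addr in before.keys() | after.keys()
        if before.get(addr) != after.get(addr)
    )


if __name__ == "__main__":
    old = ["web", "app", "db"]
    new = ["web", "db"]  # "app" removed from the middle of the list

    print(touched(count_addresses(old), count_addresses(new)))
    print(touched(for_each_addresses(old), for_each_addresses(new)))
```

In this model, removing the middle item shifts every later index under count, so two addresses are affected, while under for_each only the removed key is. Moving existing production code from one style to the other renames every address, which is why it needs careful testing.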

43:23
I'm thinking of the- Well, the beauty is it's not hard to actually use a newer version for new stuff. If you have an old environment and you don't wanna touch it, you've got those versions set. And for your greenfield environment, you have the new version set. Well, and the real concern with the older stuff is the coupling between a provider and the Terraform version, which eventually is probably gonna drift if you have like really old Terraform in use, but you need to use the new provider and-

43:47
The new, you know, they upgrade. That's really the danger, more than HashiCorp actually getting rid of stuff, I think, at the end of the day. 100%. Yep. So before we drift off from the customer thing, I had one last question for you. So, and this is really just interest more than anything. So are you seeing platform teams a lot? And are the networking people integrated into these teams? Oh my goodness. That's such a funny, like the karma that's just whipping me right now. So one of the last,

44:17
You know, the last big company I worked for before I came to Alkira, they were, there was a lot of really, you know, I learned a lot from some of the folks there, there was some super smart folks there and they, they were early adopters with clouds, they learned a lot of lessons and they actually learned from those lessons, which a lot of times, if you're in a big enterprise, you know, sometimes it's hard, like, okay, I thought we learned our lesson here. You know, we're going to do it different next time and you just don't, but yeah, working there, um, we had a leader that coined.

44:45
the phrase platform team before this, all this stuff happened. This was a long time ago. So what we did was we had this idea of a platform team and we had this idea of embedded engineers. We'd have a cloud platform engineer, a network engineer and a security engineer, those three. And identity and access management didn't want to participate. They were kind of off on their own. They had everything locked down like crazy anyway. It was more like, you gotta come to us if you want anything, so, which is fine.

45:13
But yeah, we were embedded with these specific product teams. This was a long time ago and that actually worked. I mean, we weren't trying to be famous or, you know, whatever. We just had a need to solve a problem, and that solved the problem at the time, and it worked pretty well. But this whole, like, platform team versus DevOps jazz, I think it's just funny because, I think it's a pointless conversation because it's not one or the other.

45:41
DevOps is still alive and kicking and I don't think it's going to go away. It's a culture. I think it's just more so that each side of the coin doesn't want to admit that they're moving towards the other side. They don't want to meet in the middle. You know, they're too firm on their stance of, you know, what they own and what they know, you know? Yeah. And words carry things too, like new names, new job titles, getting funding for new things, um, starting the next big program. Like you're a new CTO and you come in, oh, this whole platform

46:10
thing is hot right now. Hey, we're gonna build a platform organization. Right. I'm sure stuff like that happens quite a bit. We're gonna build an event-driven architecture firm or something like this. All right, well, there's so much more we can talk about. We are definitely gonna have to split this up into an episode two and really dive into stuff like the CI/CD pipeline that we kind of mentioned a few times. There's so much more we can go into around this topic. But is there...

46:38
any question that you wish that we asked that we didn't? And it doesn't have to be related specifically to the topic, it can be just anything in general. Maybe like what product I use for my beard. Actually, I'm very interested. Every growing mind, these guys have beards as well. Yeah, it's a problem because there's this itch factor, like when you reach a certain length, it's just, ah. So, no, I'm just totally, totally kidding. All the questions were good, I think this is a great conversation. And more back on you all, props for starting this podcast.

47:08
Networking is, who would have thought, a severely underrepresented thought when it comes to anything, even in cloud. So we will build applications and things and then build so much complexity and weird stuff in the application to work around simple solutions in the network because nobody knows. And so having podcasts like this to bring visibility to certain things and have structured conversations, this is a home run. Super happy with what you all are doing. Yeah, I really appreciate that. Thank you. Thank you.

47:38
All right, Will, you want to tell us where people can find you? I know we kind of mentioned it and you have the LinkedIn course coming out. But anything else you want to mention? Yeah, LinkedIn, william-collins, Twitter, WCollins502. And then the blog will be in the show notes. That's about it. I like that you're not from Kentucky, but you've embraced the 502. I like that. That's why you're the only one that's... so I was on a Packet Pushers episode and they were like, what is this random number? It's almost like an area code

48:07
or something, what is it? Like, yeah, I'm from Kentucky. That's it. Yeah, that's it. All right, Chris, Tim, any parting thoughts? I was going to say, I don't know if you're watching the video version of this podcast, you may have seen Tim sporting a very beautiful Cables2Clouds mug. There it is. He's our very own Vanna White. But yeah, so you can check out store.cablestoclouds.com. We do have a small web store up just with a few items.

48:36
We've pretty much marked everything at cost. We're not looking to make a bunch of money here, but if you want any Cables2Clouds gear, that's where you can find it. And yeah, so check it out. Awesome. Thank you. All right. Well, this is gonna wrap it up. So thank you all very much for tuning into the Cables2Clouds podcast. We appreciate all the support we've been getting. If you liked the episode, please share it around to anyone you think might be interested. Give us a five-star rating on your favorite podcatcher and subscribe to our YouTube channel. Until next time.

49:06
Hi everyone, it's Chris, and this has been the Cables2Clouds podcast. Thanks for tuning in today. If you enjoyed our show, please subscribe to us in your favorite podcatcher, as well as subscribe and turn on notifications for our YouTube channel to be notified of all our new episodes. Follow us on socials at Cables2Clouds. You can also visit our website for all of the show notes at cablestoclouds.com. Thanks again for listening, and see you next time.