Cables2Clouds

Ep 14 - Navigating the Financial Labyrinth of Cloud Networking with Will Collins

The Art of Network Engineering Episode 14

Ever wondered how to design a cost-effective cloud network? Buckle up as the Cables2Clouds team, along with the ever-insightful William Collins, shed light on the labyrinth of cloud pricing. We'll dissect the costs associated with data, requests, and additional resources, providing you with tools to anticipate and contain expenses. More than just cost, we delve into the resources your applications are using and the implications of scaling. Our aim? To save your budget, while ensuring smooth sailing for production applications.

Keeping tabs on cloud costs, especially for larger businesses adopting the cloud, can often feel like acing a mind-boggling puzzle. Don't worry, we’ve got you covered. Alongside our guest, we tackle the challenges of being cost-conscious in network design, and explore the necessity of understanding how applications communicate to optimize costs. We discuss the pros and cons of various models and delve into the emerging automation tools and cost calculators that can help you become more cost aware. With over-engineering in cloud networking design being such an easy thing to do – is it really necessary, or just an expensive habit?

Venture with us as we explore how the different CSPs handle high availability and disaster recovery, and the associated cost implications. We’ll also take a look at the ever-changing landscape of compliance requirements in cloud networking. Spoiler alert: these can significantly bump up costs. Arm yourself with insights about understanding cloud documentation and communicating requirements and solutions to your business. Join us for an enlightening, no-nonsense discussion on cost optimization in cloud network design.

Links:
Infracost- https://www.infracost.io/
Nick Matthews Twitter thread about VPC Lattice- https://twitter.com/nickpowpow/status/1645493575189430272?s=20

Where to find William Collins:
Twitter: https://twitter.com/WCollins502
LinkedIn: https://www.linkedin.com/in/william-collins/
Blog: https://wcollins.io/
Previous episode with Will: https://www.cables2clouds.com/2129055/12609925-ep-7-terraform-for-the-network-engineer

Purchase Chris and Tim's book on AWS Cloud Networking: https://www.amazon.com/Certified-Advanced-Networking-Certification-certification/dp/1835080839/

Check out the Monthly Cloud Networking News
https://docs.google.com/document/d/1fkBWCGwXDUX9OfZ9_MvSVup8tJJzJeqrauaE6VPT2b0/

Visit our website and subscribe: https://www.cables2clouds.com/
Follow us on BlueSky: https://bsky.app/profile/cables2clouds.com
Follow us on YouTube: https://www.youtube.com/@cables2clouds/
Follow us on TikTok: https://www.tiktok.com/@cables2clouds
Merch Store: https://store.cables2clouds.com/
Join the Discord Study group: https://artofneteng.com/iaatj

Designing Cost-Efficient Cloud Networks

Speaker 1

Welcome to the Cables to Clouds podcast. Cloud adoption is on the rise and many network infrastructure professionals are being asked to adopt a hybrid approach. As individuals who have already started this journey, we would like to empower those professionals with the tools and the knowledge to bridge the gap.

Speaker 2

Hello and welcome back to the Cables to Clouds podcast. My name is Alex Perkins and I will be your host for today's episode. Joining me, as always, are my two illustrious co-hosts, Chris Miles and Tim McConaughey. Chris, how's your week been?

Speaker 1

It's good, man. Yeah, not much going on. The World Cup, the Women's World Cup, is going on here, which is nice, so I've been watching a few of those games. There's obviously been a lot of tourism, people coming to support and things like that, which is super cool. So, yeah, I guess, go team USA. I usually don't root for the USA, but in this scenario I will.

Speaker 3

We'll allow it. The Women's World Cup is funny because it has the double negative of being soccer in the US and also being a women's sport, so it's like double: nobody watches it, right? If you look at the women's US national team, they crush every World Cup, man. How many times have they won now? They kill it, dude, and people just don't watch. It's really kind of sad, actually.

Speaker 1

Yeah, it's crazy. They're world-class athletes that absolutely run the game and don't get any clout for it. It's crazy. Yeah, they're insanely talented.

Speaker 3

Yeah, compared to the men's team, right? You watch the men's team and I can't remember the last time we made it past the semifinals in the men's World Cup. Anyway.

Speaker 2

All right, Tim. What about you? I know your trip to Japan is coming up quick, right.

Speaker 3

Yeah, actually that's why we're stacking these, right? Because we want to make sure we have enough content to cover it while I'm out of the country. So Chris mentioned we're recording a few of these kind of back-to-back. So yeah, there's not going to be a lot new to talk about, but I mean, I'm like a week out now, so it's getting real. We're getting all the stuff, doing our last-minute Amazon shopping for stuff that we forgot we need to throw in the luggage and all that. Man, I'm excited.

Speaker 2

Nice. Well, I hope it's awesome and I'm sure you're going to have plenty to talk about when you get back and we record an episode. Yeah, all right. So we also have a returning guest today that will be joining in our roundtable discussion. His name is William Collins. If you want to know more about Will, he was a guest on episode 7, which was Terraform for the Network Engineer. So, Will, what have you been up to since we last spoke? Anything new?

Speaker 4

I think you might know, but it's summer break, the kids are out of school, and my in-laws are in Pennsylvania, so my wife flew out there and is spending some time there. So I've got the house all to myself. So I'm ripping out carpet, putting in vinyl plank, trimming, painting. It's been such a not-relaxing quiet house. So, yeah, I'm keeping busy.

Speaker 2

I was going to say. I keep seeing you posting pictures of, like, tearing everything out and your workbench all ready to go.

Speaker 4

Exhausting. I've been starting at like 8, 9 PM and just kind of working until midnight on the weekdays, and then this last weekend I just killed it Saturday and Sunday all day long. So I'm trying to get finished. I'm going to drive up and pick them up, probably spend a few days there. Get some sun, take at least some time to relax.

Speaker 2

Yeah, yeah, all right. So today's roundtable episode is about being cost aware when designing cloud networks. The cloud is not cheap, as we discussed in our recent episode around economies of scale, so the plan today is to talk about some of the things we should be looking out for and maybe some of the compromises that may have to be made in the name of saving costs. So, to kick it off: why is it important for network engineers to think about cost when designing cloud network solutions? And since you're the guest, Will, we'll go ahead and kick it off with you.

Speaker 4

All right. Cloud is not cheap at all. And with on-premises, we came from the world of building data centers and we had a good idea of, OK, we've got circuit costs, we have our hardware, we have our procurement, we kind of know what's going on, and we're planning until we have like a six-year refresh cycle on something. It's not a lot to keep track of. So you get into cloud, and it's different. VPC Lattice, a recent AWS release at last year's re:Invent, comes to mind. If you look at just understanding the resource costing, you have the charge for the data that you're moving, your per-gigabyte charge, but then you get charged per hour and per the number of requests as well. So you have three different charges that you're dealing with.
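
To make the three-part charge concrete, here is a minimal sketch of how an hourly charge, a per-gigabyte charge, and a per-request charge add up over a month. The `ThreePartRates` class, the `monthly_cost` function, and every rate in it are hypothetical placeholders chosen for illustration; they are not real AWS prices, so check the provider's pricing page before using numbers like these.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation for a billing month


@dataclass
class ThreePartRates:
    """Placeholder rates for illustration only; not real AWS prices."""
    per_hour: float              # flat charge while the service exists
    per_gb: float                # charge per gigabyte of data processed
    per_million_requests: float  # charge per million requests handled


def monthly_cost(rates: ThreePartRates, gb_processed: float, requests: int) -> dict:
    """Break a month's bill into the three charge dimensions."""
    hourly = rates.per_hour * HOURS_PER_MONTH
    data = rates.per_gb * gb_processed
    reqs = rates.per_million_requests * requests / 1_000_000
    return {"hourly": hourly, "data": data, "requests": reqs,
            "total": hourly + data + reqs}


if __name__ == "__main__":
    # Hypothetical rates and traffic volumes.
    rates = ThreePartRates(per_hour=0.03, per_gb=0.02, per_million_requests=0.10)
    bill = monthly_cost(rates, gb_processed=5_000, requests=200_000_000)
    for name, amount in bill.items():
        print(f"{name:>8}: ${amount:,.2f}")
```

Keeping the three dimensions separate in the result makes it easier to see which one grows as an application scales, which is the part that is hard to undo once production traffic depends on it.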

Speaker 4

So if you're a network engineer, you've got to understand that foundation. You've got to understand what you're building. There's going to be a lot of applications running on it. You're going to have to worry about the scale. You're going to have to worry about these costs inflating and blowing out the spend. And you know, once that's in place it's like concrete. Good luck ripping it out, because you're going to have production applications running. So, yeah, being aware of how the resources work and how the charging for the resources works, and understanding what you're going to use them for as well.

Speaker 4

So you know solving the right problem with the right solution.

Speaker 2

So, a quick question on that, right? You mentioned VPC Lattice and understanding the requests and stuff. Do you think that is going to be the responsibility of the network engineer? I know the answer is it depends, right, and every organization is going to be different. But in your mind, how do you bridge that gap between the network engineer doing that and then talking to the application owner that should be giving you that information?

Cloud Cost Optimization and Forecasting

Speaker 4

That's a good question. Yeah, it depends, but, you know, VPC Lattice was ultimately invented so devs or cloud platform teams didn't have to worry about the network. There was a thread ages ago, I'd have trouble trying to dig it out, but it was from Nick Matthews, a long thread about why VPC Lattice. I think he was on the product team that started teasing that out and building out the requirements for it. And yeah, I mean, you know, these network folks, these salty network folks, are slow.

Speaker 4

They always just say no. We want to get moving. So it's supposed to empower devs to really move on their own without having to worry about the network. But every time I've seen something like that released into the wild and devs take over, yeah, there's always some gotchas and some catches that someone else should have caught, that spilling mayonnaise on it. You know, it just happens.

Speaker 3

I'm glad you said it, because this obviously isn't a VPC Lattice episode. But when you asked that question, Alex, I was thinking the exact same thing. Actually, the first thing I was thinking was, well, the question really is, who's going to own VPC Lattice? Is it going to be the developers or is it going to be the network engineers? But that's a whole different episode, right? But yeah, so I like what you said about the concrete, the foundation and all of that, and it being just a pain to rip up once you've laid it.

Speaker 3

But to get to your original question, Alex, about network engineering and specifically being cost conscious in the cloud: it's like when we were in the data center, but it's so much different, right? Because now you can't see the cables. You're not filling racks of gear, right? There's not 32 rack units to fill up with gear. It's worse, right? Because it's like the ultimate microtransaction game. Now, you know, do I have to pay? And you start stacking the charges. Right, that's 0.005 cents per gigabyte per hour, oh, but if you go across availability zones, tack on an extra 0.001 per gigabyte per hour, and, you know, if you're going to egress... I mean, it really is just stack, stack, stack, stack, stack.
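
The "stack, stack, stack" effect can be sketched as a sum over every billable hop a flow crosses. This is only an illustration: the `stacked_cost` function, the hop names, and the per-gigabyte rates in `EXAMPLE_PATH` are hypothetical and loosely echo the off-the-cuff figures above rather than any provider's price list.

```python
def stacked_cost(gb: float, hops: list[tuple[str, float]]) -> tuple[list[tuple[str, float]], float]:
    """Return the per-hop charges and their total for `gb` gigabytes of traffic.

    `hops` is a list of (description, rate_per_gb) pairs, one per billable
    segment the traffic crosses.
    """
    lines = [(name, rate_per_gb * gb) for name, rate_per_gb in hops]
    return lines, sum(cost for _, cost in lines)


# Hypothetical path for one application flow; the rates are placeholders.
EXAMPLE_PATH = [
    ("service data processing", 0.005),
    ("cross-AZ transfer", 0.001),
    ("internet egress", 0.09),
]

if __name__ == "__main__":
    lines, total = stacked_cost(10_000, EXAMPLE_PATH)
    for name, cost in lines:
        print(f"{name:<26} ${cost:,.2f}")
    print(f"{'total':<26} ${total:,.2f}")
```

Each line item looks small on its own; listing them per hop is what makes the combined bill for a single flow visible.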

Speaker 2

It sounds small at first, right, but you start adding all these pieces together that nobody's thinking about.

Speaker 1

Yeah, no kidding. It's like this purely consumption model that you didn't have on-prem, right? Because, you know, while certain hardware vendors might be licensing you to use specific ports on the kit that you've already bought from them, they're not charging you egress on how much you push through that port, right? That's a foreign thing that we've all had to adapt to and invest in whenever we design these networks, right? We've got to consider cross-AZ traffic. We've got to consider cross-region traffic, egress to on-prem, right? And the number one question, I work with customers all the time, and one of our main questions we ask at the beginning is, you know, what are your expectations as far as throughput and egress and things like that? Like, what are you pushing today? And most of them are like, no idea. They have no clue.

Speaker 1

So true. Yeah, it sucks because it's something you obviously have to consider when you move to the cloud, but you don't have the data to project what it's going to cost in the first place, and I think that might be why people have that sticker shock.

Speaker 3

You don't know what you don't know.

Speaker 1

Exactly.

Speaker 2

Yeah, yeah, and so I'm just going to add a point here. That's kind of like an underlying theme is right, it's really important for network engineers especially to understand how all the other things interact, because, like we were talking about with lattice, nobody really knows where things get handed off right. It's kind of just, as you notice, we all have something to say about this, which means a lot of the time things just get thrown onto the network and, oh, they'll figure out how the applications talk to each other, and that means we're expected to know how to optimize cost as well, and sometimes that's that's much more difficult than it sounds.

Speaker 3

Yeah, for sure. I mean, not only do we have to try to figure out how the app works in order to optimize it, and in the cloud that might be a little bit easier because the whole loose coupling model should, in theory, break out the monolith a little bit more, but it also makes networking even more important than it ever was. You know, on-prem, for even just the basic connectivity, an app used to be a monolith: you put it all on one server and you only had to care about, oh, it's got to reach out to a database or sit behind a firewall or whatever. Now it's loosely coupled, so you have to know way more about the app than you ever did before and how the app communicates.

Speaker 4

I don't see how large businesses can do it. It's so hard. If you have multiple lines of business or business units and you're adopting cloud rapidly, how in the world can you do accurate forecasting? Yeah, the simple answer is you really can't. It's difficult.

Speaker 2

I feel like I haven't mentioned this word in many episodes, so I'm glad I finally get to bring it back. But service mesh: this always brings me back to it. There's got to be some kind of tie-in, especially, like, Tim, you were mentioning microservices. There's got to be some kind of way that someone will come up with to tie, you know, service registry type stuff into cost. It's going to happen at some point, so I will continue to stand on that.

Speaker 3

Yeah.

Speaker 2

All right, so what options do people have to gauge cloud spend? And, you know, feel free to touch on this: it can be single or multi-cloud, and specific products, non-cloud-native, we'll call them, versus a CSP-native solution. What are some of the things you guys see? Tim, you want to start us off on this one?

Speaker 3

Yeah, I mean, I forget what the acronym is. I think it's, what, CFM or something like that, cloud financial something or other. There are... what am I thinking of? Is it Apptio?

Speaker 3

I forget the name of some of the ones out there that basically, it's not Apptio, will go dig through your AWS CloudWatch logs and ingest them and spit out reports about how you're spending and stuff like that. I remember before too long, I kind of just forgot the name, but there's several out there where that's all they do, right? They take the telemetry that, say, AWS or Azure or whatever is capturing and sending to the cloud-native log ingestion engine, and they'll pull it out, or you export it, or you have a job export it to them, and then they produce some rich reporting about, you know, hey, these lines of business or these accounts are spending this much. I mean, it's a big industry. There's actually quite a lot of money in the money, if you will.

Speaker 2

Yeah, and you also got to pay for those services.

Speaker 3

Oh yeah.

Speaker 4

Yeah, I can come in here next. I've got a few thoughts. So the one good thing is, of course, these cloud providers are API-first with everything that they're doing, whereas on premises you have maybe different vendors, some are CLI-based, some might have an API, you have all these different pockets of stuff that operate so differently. But in cloud you have these APIs. So there's a lot of, you know, CSPM, cloud security posture management. I know this isn't necessarily a costing tool, but something valuable that I've learned, and I've seen a lot of enterprises come to this conclusion, is that it's good to know what's out there. A lot of times, because in cloud it's just so easy to deploy things, and things were getting deployed way back, like 10 years ago, you've got all these AWS accounts and you've got resources everywhere.

Speaker 4

Okay, like, how many internet gateways do we have out there? Or how many security groups do we have out there? How many NAT gateways? And a lot of times you might have a ballpark guesstimate. But when you take that a step further and you're like, okay, maybe I'm multi-cloud, how is this working across multiple clouds, how many gateway types are everywhere, and those things have a tendency to.

Speaker 4

And then you see these tools come in and they get all this data, they bring it back and they show you. You can build some queries and, okay, I have this many internet gateways. And you can say, okay, I don't need this many internet gateways. That's actually a problem. You know, security's sweating, they're not going to sleep well that night. So you want to get that under control. It's always important to know what's out there, and then, you know, the cost. And another thing too: I'm not a big advocate of the approach of saying, hey, we'll just buy something off the shelf to do that after we've got millions of dollars of cloud stuff running. The real focus and the real strategy has got to be, you know, whatever they call it, shift left.
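
The kind of inventory query described here, "how many internet gateways do we have out there?", can be sketched as a simple tally. This assumes the inventory has already been exported as a list of records by a CSPM tool or the providers' own APIs; the record fields (`cloud`, `account`, `type`), the helper names, and the sample data are all hypothetical.

```python
from collections import Counter


def count_by_type(inventory: list[dict]) -> Counter:
    """Tally resources per (cloud, resource type) across all accounts."""
    return Counter((r["cloud"], r["type"]) for r in inventory)


def over_expected(counts: Counter, expected: dict) -> dict:
    """Return the resource kinds whose count exceeds what the design calls for."""
    return {kind: n for kind, n in counts.items() if n > expected.get(kind, 0)}


if __name__ == "__main__":
    # Hypothetical export; in practice this would come from a tool or API.
    inventory = [
        {"cloud": "aws", "account": "prod", "type": "internet_gateway"},
        {"cloud": "aws", "account": "dev", "type": "internet_gateway"},
        {"cloud": "aws", "account": "sandbox", "type": "internet_gateway"},
        {"cloud": "aws", "account": "prod", "type": "nat_gateway"},
        {"cloud": "azure", "account": "corp", "type": "nat_gateway"},
    ]
    expected = {("aws", "internet_gateway"): 1,
                ("aws", "nat_gateway"): 1,
                ("azure", "nat_gateway"): 1}
    print(over_expected(count_by_type(inventory), expected))
```

Comparing the tally against what the reference design expects turns a ballpark guesstimate into a short list of things to investigate, for cost and for security alike.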

Speaker 4

You want that to be integrated before things are getting deployed. So you want to get that under control. That should be the number one focus, unless there's some security events you're dealing with. So get it under control before it gets deployed because, like we sort of teased out earlier, once it's deployed, good luck.

Speaker 3

Well, how do you think that impacts the whole, you know, value prop of agility, though? Because I agree with you, by the way. Especially as a network engineer, I agree with the idea that it should be an iterative process. You know, we dip our toes in, we get a good idea, especially if we don't already know what our apps are doing, for example. We need to find that out, and we need to do it before we're committed to a certain architecture and before we've spent $10 million. But it seems like that would necessarily slow down the whole agility and the value prop for the cloud there.

Speaker 4

Yeah, it does, and that's the thing. I was having a good spirited debate, I would say, with someone about automation in the cloud, and my take on it was, yeah, automation can make you go fast. But I see the real value proposition of automation as building standards into that code and locking certain things down, so you're deploying the same thing, and you know what the security posture is, again and again and again. So you have that reference architecture. I'm not doing it to go really fast, I'm doing it to have really good standards that marry back to some blueprint or design, like a design pattern that I have. So going that direction, yeah, you're not doing it with an agility-first approach. And just like with cloud, you've got the benefits of scale.

Speaker 4

You've got the benefits of, OK, you can make DR happen much quicker. If we were doing DR like 15 years ago, it's like, OK, we're ordering circuits, we've got to spin up another data center. Maybe we have another data center and it'd be active/standby, maybe we want to go active/active. We've got a lot of different things happening there. You know, Zerto is doing circuit grooming. We've got some outages, we've got all this stuff going on. It just takes forever. So when you look back and put it in perspective, if you look at where we were versus where we are now, we're still really agile. But we don't want to go so fast that we create a mess that impacts the business and the bottom line, because you still have to sell, or whatever your market is, and you've got to be efficient, and you have competitors in your market. So you've got to stay in that competition, stay in the hunt.

Speaker 2

Yeah, I mean, these are all awesome points. So, Chris, do you have any specific products that come to mind, or anything you know of customers using? And again, that's single or multi-cloud, and they can be from non-CSP vendors or CSP-native tools.

Speaker 1

Not necessarily, nothing that hasn't already been called out. I think that is a space that, I mean, it's not lacking. There's definitely plenty of companies that are going to tell you how much you're spending, but that forecasting and that design piece is hard to incorporate. So, no, not much to add there.

Speaker 3

If you Google, like, cloud financial management, I think, which is the name I could remember the most, there's so many companies you've never heard of, but they all say, you know, we'll look at your cloud, we'll ingest the details and we'll tell you where you're wasting money. We'll tell you how to save money. I think of the CSP-native ones, like, what is it, is it Inspector or Advisor? It's Advisor, I think, that'll look at your running instances and make recommendations on sizing and stuff like that too.

Speaker 2

Yeah, yeah, and the one that comes to mind for me, where you were talking about "we'll look through all your contracts and stuff", is the Duckbill Group, right?

Speaker 3

Oh yeah.

Speaker 2

You always see Corey Quinn everywhere just posting things, and they had some tweet recently, I think the CEO posted it, that they saved like some hundred millions for customers over like a quarter or something. That's how insanely costs balloon.

Speaker 3

More than the GDP of a nation in the cloud savings.

Speaker 2

And then there's only one more thing I wanted to call out, and Will, I think you might have told me about this a long time ago. There's a tool called Infracost. So is this like a CLI, or not a CLI, like an automation tool, right, that you add into some of your automation pipelines? Am I understanding that right?

Speaker 4

Yeah, so the my, my idea of what the value prop is. So look, developers, you know honestly they're not going to care about cost, it's not the first thing that they're thinking about, or even your cloud platform teams, like they're.

Speaker 4

They're building infrastructure. They're trying to support the business, trying to support product teams or something. And you know another thing that I know about developers is they don't like spreadsheets. They're not going to be. You know tallying cost and you know looking at price sheets and stuff. They just you know it's not their world, they're not accountants. You, you hired these Rockstar developers to write Rockstar code that is going to transform your business.

Speaker 4

So, you know, I got really obsessed a long time ago because I had this idea, again back to my point of trying to catch things before they get in there. But how can you inform, how can you enable? Instead of wringing their necks, like, look, you spent this much, you left this on, you did these things, how do you empower your technical staff with the tools and the know-how to see how much something is going to cost? So that's what Infracost does, and you can stick it in your pipeline, like if you use GitHub Actions or Azure DevOps. That's not DevOps. I don't remember, like...

Speaker 1

Azure Entra DevOps, maybe? Yeah, like the 5,003rd name. But yeah, you can just build another stage in that pipeline.

Speaker 4

So the infrastructure you're going to deploy, maybe you accidentally you choose the biggest instance possible and then it'll calculate how much that's going to be over a given period of time and pop up for the person that has to do, like usually if you're doing some sort of code review you're not going to do your own.

Speaker 4

You know, pull request. You're going to have two or three other reviewers, so maybe either you see that and you're like, oh, that's so expensive, I don't want to do that, or the reviewer does. AKA, you catch it before it gets pushed into an elevated branch and makes it out in the wild.
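
The pipeline stage described here boils down to comparing an estimated monthly cost before and after a change and flagging it for reviewers. The sketch below shows only that comparison logic; it is not Infracost's API, and `review_cost_change`, its parameters, and the threshold are hypothetical. In a real pipeline the two estimates would come from whatever cost-estimation tool the stage runs.

```python
def review_cost_change(current_monthly: float, planned_monthly: float,
                       max_increase: float) -> tuple[bool, str]:
    """Decide whether a planned change can merge without extra cost approval.

    Returns (ok, message). `max_increase` is the largest monthly increase,
    in dollars, that reviewers are willing to wave through.
    """
    delta = planned_monthly - current_monthly
    if delta > max_increase:
        return False, (f"Estimated monthly cost rises by ${delta:,.2f}, above the "
                       f"${max_increase:,.2f} limit; needs explicit approval.")
    return True, f"Estimated monthly cost changes by ${delta:,.2f}; within the limit."


if __name__ == "__main__":
    # Hypothetical estimates: someone picked a much larger instance by accident.
    ok, message = review_cost_change(current_monthly=1_200.00,
                                     planned_monthly=9_800.00,
                                     max_increase=500.00)
    print("PASS" if ok else "FLAG", "-", message)
```

Posting the message as a pull request comment puts the number in front of the author and the reviewers before anything reaches an elevated branch.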

Speaker 2

That's a really cool idea. That's what it does.

Speaker 4

It just sees what you know the costs are for the infrastructure out there and what you're going to do.

Speaker 2

Yeah, maybe you have like a finance person doing one of the approvals or something you know.

Speaker 3

If it hasn't happened yet, it's coming.

Speaker 1

Recently, and I don't know how long they've had this feature, there's also an estimated cost if you're using a service like Terraform Cloud. Whenever you run your speculative plans in Terraform Cloud, there's an estimated cost for the infrastructure you're deploying. Don't know how accurate it is, and don't know where that data is actually pulled from, et cetera.

Speaker 2

Yeah, estimated.

Speaker 1

Yeah, right, but yeah, so that's another cool thing where it's incorporated into the deployment pipeline right, where you see it ahead of time, which is, you know, that's very useful to have, assuming that it's correct.

Speaker 4

Yeah, and so much is using OpenAI. There's a lot of tools that are piggybacking on the AI revolution, and it's getting better and better. I've tested a ton of stuff. Some of it started out pretty rough, but it's getting better, and I think you're going to see the accuracy just continue to climb, for sure.

Speaker 2

Yeah, those tools are only going to get better. Okay, Chris, we'll give you this next one. So how do we design a cloud network with minimizing costs in mind? Like what are some of the strategies there?

Speaker 1

Yeah, I think the easiest thing to call out is that you do need to be aware of that consumption model that we've touched on already whenever you're developing these designs. Right, because I've seen common deployments where customers just think, oh, we're deploying this many different VPCs across this many regions, et cetera. Each one of these VPCs we're deploying needs access to S3, so we're going to throw an S3 endpoint in there. They need access to the internet for patches, so we're going to throw a NAT gateway in there. Right, but if you're doing all these things on a per-VPC basis, you pay hourly for all these services, for all these endpoint services, for

Considerations for Designing Cloud Infrastructure

Speaker 1

NAT gateways, things like that. So obviously the price of that can scale up very quickly, right? So I think people need to maybe consider this kind of more centralized model, where you consider what goes into the shared services piece of your network versus what's scattered across your entire landscape, right, just to bring down those hourly costs. I feel like that's the easy grab bag, so I'll throw that one out there, but yeah. Yeah, yeah, that's interesting.

Speaker 3

Specifically, you talked about the NAT gateway model for egress and then a centralized model, because I was actually thinking of something very similar, getting into the cost-aware piece of it, or designing for costs. There's just so many different ways that the cost is figured into it that it makes it very difficult, even when you know what the flow is going to be, to actually accurately predict. Like, for example, I've got 10 VPCs, I've got workloads in every VPC. Is the cheapest thing to do to put a NAT gateway in every VPC and have those workloads egress directly to the internet?

Speaker 3

Let's set security aside for the moment, because obviously that is a whole other thing, right? Or, you know, should I centralize it? And if I centralize it, should I centralize it in such a way that I'm trying to minimize VPC peering, or am I using VPC peering? Is there a transit gateway that needs to be involved? Am I worried about cross-AZ and egress charges? It's actually extremely challenging, even when you know the end-to-end flow.
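
A rough way to compare the two egress designs is to model both bills with the same inputs. The sketch below makes several simplifying assumptions that are not from the discussion: one NAT gateway per VPC in the distributed design, a single shared NAT gateway in a dedicated egress VPC in the centralized design, a transit hub that bills hourly per attachment and per gigabyte, and internet egress charges that are identical in both cases and therefore left out. The function name and all rates are hypothetical placeholders.

```python
HOURS_PER_MONTH = 730  # common approximation for a billing month


def egress_costs(n_vpcs: int, gb_per_month: float, nat_hourly: float,
                 nat_per_gb: float, attach_hourly: float,
                 transit_per_gb: float) -> dict:
    """Estimate monthly cost of per-VPC NAT versus centralized NAT behind a transit hub."""
    # Distributed: every VPC pays for its own NAT gateway around the clock.
    distributed = n_vpcs * nat_hourly * HOURS_PER_MONTH + gb_per_month * nat_per_gb

    # Centralized: one NAT gateway, but every spoke VPC plus the egress VPC
    # pays an hourly attachment fee, and traffic is processed twice.
    centralized = (nat_hourly * HOURS_PER_MONTH
                   + (n_vpcs + 1) * attach_hourly * HOURS_PER_MONTH
                   + gb_per_month * (nat_per_gb + transit_per_gb))
    return {"distributed": distributed, "centralized": centralized}


if __name__ == "__main__":
    # Placeholder rates, not real prices.
    result = egress_costs(n_vpcs=10, gb_per_month=1_000, nat_hourly=0.05,
                          nat_per_gb=0.05, attach_hourly=0.05, transit_per_gb=0.02)
    for design, cost in result.items():
        print(f"{design:<12} ${cost:,.2f}")
```

Which design comes out cheaper flips depending on the attachment fee, the number of NAT gateways each VPC would otherwise need, and the traffic volume, which is why the answer is hard to predict even when the flow is fully known.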

Speaker 2

Sorry, the transit gateway thing is a good callout, because maybe people don't know this, but there are actually regions that don't even have Transit Gateway yet. Really, my point is, you also have to pay attention and keep up with new products and how things change and future announcements that come out, because in whatever region doesn't have a transit gateway, they're going to be doing things differently, and then one day it's going to become available. You know, somebody's got to remember to change the architecture after.

Speaker 3

Oh yeah, that's huge.

Speaker 2

Right, like that's a big one, that, and there's no way to standardize that, so that's huge. Going back to it, yeah.

Speaker 4

I think, too, most companies I talk to have some sort of Direct Connect, Interconnect, ExpressRoute, FastConnect thing, where you're led to, like, hey, we're not going to use internet for everything, we want to have this private network and we want to bring that back. Most of the time, I see it as: they have a lot of data on premises, they're going to do migrations, they need a big pipe to pipe everything up to the cloud. But a lot of times it's like, oh, we're going to start with 10 gig. It's just like in the days of the VMs. I mean, I'm guilty of this. When I would request a VM in the data center, it's like, oh, I'm going to dial up the specs on this and make sure that I'm taken care of, because I don't know how long it's going to take for them to update it if I need it upped. You know, change management.

Speaker 2

Overbuild it. Yeah, we've all done it. Yeah, yeah.

Speaker 4

You'll abuse that power a little bit. But with ExpressRoutes and Direct Connects, you've got to think, if you're colo-ing it out to connect to cloud, you want to be mindful of all the hardware that you're putting in that colo, because you're going to have to refresh it. Um, and it's just, you know, like, look, if you need to drive to the grocery store, you don't need to go buy a Ferrari. You know, you can ride a bike or get a Toyota Camry.

Speaker 4

They're great cars, you know. I had one a long time ago. It lasted forever. Reliable as, you know, anything else.

Speaker 2

That's a good point. So, uh, this is like a sub point, and anyone can chime in on this. Do you think there's a tendency to over-engineer solutions because it's so easy to do in the cloud? You literally just click a box, like we've talked about this before. You click a box and you have a global CDN, right?

Speaker 3

It's so easy to to overdo it on purpose. Yeah, I mean, I don't think there's any question, right?

Speaker 2

No, definitely no question they. They for sure make it very easy and inviting for you.

Speaker 4

Yeah, and it depends on who's clicking the button too, or who's doing what, and what level of, you know, control they have to do it without, like, change management and a proper set of eyes on things. You know, the more that gets deployed out there. And here's the thing. Like, whenever you have stuff going out in the cloud, networking oftentimes is, like, the last team to know about some stuff going down. It's like, oh, we're the last people invited to the party. You know, stay away. That's true.

Speaker 4

So then, when we get invited to that party, there's a lot of stuff that's already going on and they don't want to have to go back and do rework. So then, just like in the data center days, we're expected to come in and we have to do what we have to do to make it work, and part of that is, a lot of times, overengineering something that should have been simple from the beginning, but hands are tied, can't do anything about it.

Speaker 3

I worked for a company that had a homegrown. It was a homegrown app that was a full CRM like did everything in it, and that app was written so poorly that we had to build our infrastructure to be the whole network infrastructure, pretty much, and all the server infrastructure as well pretty much had to be like one and a half times what we actually needed, just to just to like it was a hog, right. So like you had to overbuild it and this was in the data center with real, with real gear, right, so that it's it's. You know. So let's, let's not forget that that happens in the cloud as well. You know, like, if you're in a cloud, you know you're in a cloud, like especially with the lift and shift, where you're going to take an app, you haven't refactored it. You're going to pick up that monolith and that hog and drop it in the cloud. You're going to have to overbuild quite a bit, actually, even at the network layer potentially, you know, to deal with that. That's it.

Speaker 2

God, the dumpster fire lifting into the cloud man. People don't realize that a dumpster fire on-prem is one thing, because you at least can kind of finagle things, but yeah, you lift it into the cloud and it's like some things just might not even work, some things you really have to get super creative to find a solution to make it have that same behavior. That's, that's a good one. I like that.

Speaker 3

And it's going to cost like 10 times, right? Because it's all consumption-model based, and you've got something like, for an example, you have this app, say, like, I don't know, it's a CRM or whatever, and say to retrieve customer data, instead of being efficient about it, it's got to go talk to the database, like 30 calls to bring up one page of customer data. Like, you know, that's all egress, or not egress, sorry, potential cross-AZ charges. You're going potentially through load balancers. There's like a thousand little microtransactions happening every time you make a database call.
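To put rough numbers on the "30 calls for one page" point, here is a minimal Python sketch of how call count multiplies cross-AZ data transfer. Everything in it is an assumption for illustration: the function name, the traffic volume, the payload size, and especially the per-GiB rate, which is a placeholder and not a quoted price from any provider.

```python
GIB = 1024 ** 3  # bytes in one GiB


def monthly_cross_az_cost(page_views, calls_per_page, bytes_per_call, usd_per_gib):
    """Estimate monthly cross-AZ transfer cost for an app/database pair.

    Hypothetical model: every database call crosses an AZ boundary and
    moves `bytes_per_call` bytes that are billed at `usd_per_gib`.
    """
    total_bytes = page_views * calls_per_page * bytes_per_call
    return total_bytes / GIB * usd_per_gib


# Assumed workload: 5M page views a month, 20 kB per call, placeholder rate.
chatty = monthly_cross_az_cost(5_000_000, 30, 20_000, usd_per_gib=0.02)
lean = monthly_cross_az_cost(5_000_000, 1, 20_000, usd_per_gib=0.02)

print(f"30 calls per page: ${chatty:,.2f}/month")
print(f" 1 call per page:  ${lean:,.2f}/month")
```

The sketch keeps the payload per call constant, which is itself a simplification; a single well-designed query would likely return more data per call. It also ignores load balancer processing charges, which the speaker mentions as a separate line item.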

Speaker 4

Yeah, throw DR in there and you're really up. Oh yeah, because, like, okay, are the resources going to be deployed at half-mast? They're already going to be running, or are they not running and we're just going to turn stuff on if something happens? You know, because if you're running a whole bunch of stuff that isn't getting used in, like, an active-standby scenario, you're eating that cost, you know. Or is it active-active? You know, it just depends. It's expensive, though.
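The trade-off described here (standby resources already running versus turned on only after a failure) can be sketched as a single multiplier on the primary bill. This is a deliberately crude Python sketch: the function name and the dollar figure are hypothetical, and it ignores replication, storage, and data transfer costs that a real DR design would carry.

```python
def monthly_cost_with_dr(primary_usd, standby_running_fraction):
    """Primary cost plus the share of a second copy that is kept running.

    standby_running_fraction: 0.0 = nothing running until failover,
    1.0 = a full second copy running all the time (warm standby or
    active-active). Values in between model a scaled-down standby.
    """
    if not 0.0 <= standby_running_fraction <= 1.0:
        raise ValueError("standby_running_fraction must be between 0 and 1")
    return primary_usd * (1.0 + standby_running_fraction)


primary = 10_000.0  # assumed monthly cost of the primary environment
for label, fraction in [("cold standby", 0.0), ("half-size standby", 0.5), ("full second copy", 1.0)]:
    print(f"{label:>18}: ${monthly_cost_with_dr(primary, fraction):,.0f}/month")
```

The point of the sketch is only that the standby fraction is a business decision with a visible price, not that any real bill is this linear.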

Speaker 2

In this same vein, I'll ask the next question. So it's about how do we weigh availability against cost, as we would on prem.

Speaker 4

I'll let you all go first. I got a lot to say on this one.

Speaker 2

Yeah, as I say, you're talking about it a lot. I guess, you know, like I said, Will, you've made a lot of good points on this. You know, there's a lot of hardware lifecycle. There's just a lot of things that are already established that you have to think about. There's, like, a cadence to how things work on prem, right? And I guess, in the cloud, it's just so different. You don't have to worry about the refresh cycles, but you have to worry about things that are going to change in the future. There's just so many different things to think about. Cross-AZ traffic, you know, availability zones going down, regions going down, so it's really just a lot different, yeah.

High Availability in the Cloud Considerations

Speaker 4

And a lot of times I see a lot of designs that aren't using like availability zones the right way within a single region. You know, because you think about it, high availability is like in the cloud world. High availability is like what would be on premises with two switches sitting next to each other. You know active standby or firewalls, active standby, active active, whatever they are. You know you have one, it goes down. Well, then you have another one and that's cloud with AZs. It's, you know, high availability. It's not DR, even though they're separate data centers.

Speaker 4

It's, it's a there's a big distinction to make there, but then when you want to go DR, you have to go to another region, and even the way the cloud providers do this is different, because AWS is like building with Legos.

Speaker 4

I remember that one blog from I think it was Netflix back in the day, but it really is. But if you go to like Azure, okay, like we're dealing with paired regions now that go back to the cadence of them upgrading and doing things within their environment, so there's not shared fate there, so there's a lot of different things that you have to consider. And then doing this again like there's a big distinction also with hey am I look, I'm designing the next Twitter. I need this sucker to be available everywhere all the time with, you know, capacity, you know imagine that that the Twitter app that wasn't flaking out all the time.

Speaker 1

I was going to say if somebody needs to design it, if you're going to do it, you better get to it quick. We need it yeah.

Speaker 4

But I mean, on the enterprise side, it's much more difficult, because you're not developing the next Twitter. A lot of times you might have the bread and butter of your business that needs to be available, like, across the U.S. or maybe the East region, but then you have a lot of low-hanging fruit, and really it has to be a thing of, you know, what do you want to pay for? What is the availability that you want to pay for? What are your DR requirements? There's a lot of metrics there. We need to sit down and understand them. Usually, the answer I would always get is, we need it to be up a hundred percent of the time.

Speaker 3

Right, we can't pay for five nines, but we need a hundred percent uptime. I mean, that's a tale as old as time, right? That just followed us straight into the cloud. That has always been the infrastructure builder's, like, dilemma, right? The business wants five nines of uptime, and they give us the budget for, like, three and a half.
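For a sense of how far apart those targets are, here is a small Python sketch that converts an availability percentage into allowed downtime per year. It assumes a 365-day year and reads "three and a half nines" as 99.95%, which is a common convention but an interpretation on our part, not something stated in the conversation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Minutes of downtime per year permitted by an availability target."""
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_YEAR


for label, pct in [
    ("100%", 100.0),
    ("five nines (99.999%)", 99.999),
    ("three and a half nines (99.95%)", 99.95),
]:
    print(f"{label:>32}: {allowed_downtime_minutes(pct):8.2f} minutes/year")
```

Under these assumptions five nines allows a little over five minutes a year, while 99.95% allows a bit more than four hours, which is the gap between what the business asks for and what the budget covers.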

Speaker 4

Yeah, and you can say that you can show them like hey, well, actually just take this and double or triple your cost.

Speaker 2

There you go.

Speaker 4

That's your cost, that's how much it costs, and then usually the response is well, actually, let's think about this.

Speaker 2

This isn't in the budget, so yeah, and then the whole region goes down, and then everyone blames you anyway, right.

Speaker 3

When I worked for that same company actually there was we did this thing with hybrid cloud because they didn't want to pay to put you know like a direct connect in or whatever. So it was all like site to site VPN stuff, right. So like the web web front end was in the cloud but all the databases were on prem. And I remember there was a critical P one incident where you know, like it broke, essentially like the VPN broke, it was it wasn't on our side, it was on, I think it was AWS.

Speaker 3

Anyway, long story short, I'm doing the, the after action report, like the, the, whatever you want to call it, to see process improvement. And I went in there and I was like you know this, this is what, this is the level of service you paid for. So you know, there's never really I have, no, I have no process improvement. We know what we, we, we told you what it would take to get better redundancy for this or high availability, whatever you want to call it. You know, essentially, the business signed off on the risk by not investing it and you know, made the financial decision, but it's the cloud.

Speaker 3

I think I actually said I think I actually said welcome to the cloud when I was done. Thank, you.

Speaker 4

You know what, to that point, I got one more point on this question. So one of the things I really worked on, I spent a lot of time doing this. This was several years back and it kind of went nowhere, which was unfortunate.

Speaker 4

But in cloud, every individual thing has an SLA, but then when you put those SLAs together, you get a bigger SLA for something.

Speaker 4

So when you think in terms of what a business cares about, like, they don't care about the SLA for a Direct Connect, they don't care about the SLA for a circuit or, you know, some component in the cloud. Most applications out there, if you're in the enterprise, are hybrid. You have something that it's talking to back in your data center. You've got something it might be talking to in another cloud or a service somewhere. So, you know, trying to build, like, what, you know, we referred to as a compound SLA (there's probably a correct term for that), trying to do the best that we could to actually measure and combine all these things and put together a view of what it would look like for the business with this application, like where all the different components are, because that's what they, you know, care about. Is the app down? Yes? That's the problem, and that's what'll get scrutinized.
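One common way to approximate the "compound SLA" idea is to multiply the availabilities of every component the application needs in series. The Python sketch below does that under two stated assumptions: failures are independent, and the app is down if any one component is down. The component names and percentages are hypothetical placeholders, not published SLAs, and this is not necessarily the method the speaker's team used.

```python
import math


def compound_availability(component_availabilities):
    """Availability of a chain where every component must be up.

    Assumes independent failures; inputs are fractions such as 0.9999.
    """
    return math.prod(component_availabilities)


# Hypothetical hybrid application path with made-up availability figures.
components = {
    "private circuit to cloud": 0.9999,
    "cloud load balancer": 0.9999,
    "application tier": 0.9995,
    "on-prem database": 0.999,
}

end_to_end = compound_availability(components.values())
print(f"End-to-end availability: {end_to_end:.4%}")
print(f"Weakest single component: {min(components.values()):.4%}")
```

The end-to-end figure always comes out lower than the weakest component, which is why quoting any single component's SLA to the business overstates what the application can deliver.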

Speaker 1

There's been points made about the bad design where things are just broken, um, in that capacity, but also, like, related to that is, I think people have difficulties mapping dependencies correctly. Um, in that, you know, let's say you're deploying resource A and resource B, and resource B depends on resource A, and when you're deploying B, you're like, oh, we need to build this, you know, five nines, you know, to the gills, right? You know, it needs redundancy, needs all this built in.

Speaker 1

But if it relies on something that doesn't have redundancy built in, or can't have redundancy just by the way the system operates, then, you know, either you need to fix both of them or fix none of them, right? You know, I mean, you can only have what the system is capable of in the first place, right? So it's important to make sure that you, um, you know, don't design the last piece of the puzzle with the most redundancy that you can when, you know, the beginning can't, uh, you know, account for that anyway.
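The resource A and resource B example can be made concrete with the standard parallel-redundancy formula. This Python sketch assumes independent failures and uses made-up availability numbers; the function names are ours, chosen for illustration.

```python
def redundant(availability, copies):
    """Availability of `copies` independent replicas where one is enough."""
    return 1.0 - (1.0 - availability) ** copies


def in_series(*availabilities):
    """Availability of a chain where every stage must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


a_single = 0.99                   # resource A: no redundancy possible (assumed figure)
b_single = 0.99                   # resource B: one copy
b_tripled = redundant(0.99, 3)    # resource B: three redundant copies

print(f"A + single B:  {in_series(a_single, b_single):.6f}")
print(f"A + tripled B: {in_series(a_single, b_tripled):.6f}")
print(f"Ceiling set by A alone: {a_single:.6f}")
```

However many copies of B are added, the end-to-end figure never exceeds A's own availability, so spending on B past that point buys very little.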

Speaker 4

It's got to be so hard to be on a DR team, because a lot of stuff is out of your control and you're just trying to pick up the pieces and get teams to change things.

Speaker 2

Yeah, that's a really good point, Chris. Just use microservices, man, and then everything, just, you know, will work, and don't worry about the cost, because, yeah, Kubernetes is just so easy.

Speaker 4

And serverless, yeah. Very, yeah, server lock-in with serverless, don't even need a server.

Speaker 2

I got another interesting one. So, compliance requirements, um, and I don't know, Will, you might want to take this one. I know you've worked in, uh, some healthcare organizations, so are there any specific compliance requirements, uh, that you've seen that kind of force cloud costs to go higher?

Speaker 4

Uh, I could answer this with one word, and it would be firewall. Yes, so good. Of course, uh, there's a lot of, uh, yeah, I mean, so working on, like, government a little bit, a government contract side and the healthcare space, versus also working on the business, the private side, like, you know, even on the, you know, commercial side.

Speaker 4

Really, there's so many different... Like, one of the things that I battled, that I never could really get a good handle on, is a lot of times security comes to you with requirements. They say, hey, this, you know, X, Y and Z has to be done this way. And you're like, wow, that maybe doesn't make any sense. Why is it this way? And then it's like, it's a requirement. And, you know, I got to the point where I'd start asking, like, okay, where's that written? Like, where is that at?

Speaker 4

Right. What, where is that? And start chasing it down. And what I found on the, you know, the private side essentially is, a lot of times it wasn't actually a requirement. It was like, oh, we did this thing for a while and we're just gonna keep doing that thing that way. Um, but when you do anything with government, that's a different story altogether. There's many things in writing, and, yeah, a lot of things may not make sense to you, but you've got to follow them anyway. You've got to do your best to make sure they're followed, because if, you know, something slips through the cracks, that means some big hits, you know, financially for that company, potentially. I have some, a couple of things to add to this, but I'll let, uh, Chris, if you want to jump in, or Tim, go ahead.

Speaker 3

Well, I mean, I think the big thing is that, uh, like you said, Will, there's a lot of compliance stuff, and then what you find a lot of times is that, uh, if you actually chase that rabbit down, depending on who you're talking to... With government, it doesn't matter, there's nobody to talk to, it's the government. You might as well just talk to a brick wall about why that requirement exists, because somebody passed a law, and then 30 years later, we have this requirement.

Speaker 3

They have nothing to do with each other, but that's why. Um, but in a private organization, you know, a lot of times, security teams will, you know, quote certain, you know, PCI or SOC 2, you know, whatever that is, right? Um, but I think, just real quick, touching on the cost thing as well, um, what I find a lot is, in the cloud, security patterns or requirements tend to, you know, just like on prem, uh, affect a lot about how the traffic patterns go.

Speaker 3

And then we've already talked about the, you know, the security plus traffic, plus weird traffic patterns that are required to make security work can equal just a shit ton of money, right? So, uh, we need to be able to justify that beyond just saying, oh well, security said it's a requirement, right? I think we have to be the good stewards, and a lot of times, of course, InfoSec may not like to be questioned about it, but you have to be able to say, like, yeah, but if I'm using AWS firewall or Azure Firewall, plus all these traffic patterns, and I'm using, you know, load balancers and all this other stuff, I mean, there's a serious amount of money involved in running that through this, uh, stack, you know. And the right way to do it is to ask, what is the true requirement here for the business?

Speaker 4

I got to jump in here. So do you think, just out of curiosity, that security and network teams are going to consolidate, at least network security teams and networking, consolidate into a single team in the future?

Speaker 3

I think we should. I think we should. I do think it's a little question. I do think we should because I don't think you can have security. I mean, you can have physical security, obviously, but if we're talking about IT security or whatever you want to call it, data security, every bit of it has to include the network. Unless we're talking about, you know, burying a server and disconnecting it from the network or something, like, the network has to be involved. So I hope that we will see more blending of the security and network teams, personally.

Speaker 2

Well, I completely agree, yeah. And, you know, to kind of give them a little bit of credit, you know, the CSPs, things like, you know, AWS has, like, Gateway Load Balancer, right? It actually helps a lot. I mean, it makes it very complicated, so that's a different conversation. But at least you don't have to worry about adding a firewall to every single VPC or, you know, just weird things that are going to seriously balloon the cost. So I think those are good things on the whole.

Speaker 4

For sure. I mean, there's major like that.

Speaker 1

I remember one point to talk about that. Like, you know, obviously Gateway Load Balancer is a concept that was recently integrated, so, this isn't really, you know, pertinent to the question, but just kind of the thought that I had is, it makes me wonder about the constant churn that customers have to be experiencing, with new products coming out that, you know, fill a use case for them, but, you know, they moved to the cloud when that use case wasn't there, right? So are they just in this constant cycle of migrating to new services? And obviously your security posture, or how you're implementing security, can have a big influence on how you implement, you know, your networking across the cloud, right? And then, if you integrate services like this, then, you know, it's obviously hard to shoehorn that thing in afterward. But I mean, cloud is definitely more fluid than it is on prem, so it's probably easier said than done, but I don't know. What do you guys think about that?

Network Architecture and Compliance Requirements

Speaker 2

That's, that's great. I just lost my thought. I had a point I was going to add on, so feel free if anyone else has something.

Speaker 4

Yeah, I went through the trans. I mean just thinking of all the different network and the gateway designs, like going from just VPNs to transit VPC to transit gateway, all in like six years, like like just like PTSD. It was terrible and it was very expensive as well.

Speaker 2

Yeah, that was the point. So yeah, Tim, or sorry, Chris, to your point about the customer having to handle these things. It's not really the customer, it's the network and security teams, right? Like, the devs don't care if transit gateway came out. Okay, who cares, right? So that's another thing.

Speaker 1

Exactly.

Speaker 2

And then super quick, and then we're going to wrap up the government stuff. So, like, TIC requirements, man. TIC 3.0 is all about kind of having, like, distributed points of presence. With adding the cloud into this, that basically means that you do have a firewall everywhere, and it's just like, the government's always behind on these things.

Speaker 2

But yeah, I mean, it's part of the mandate, it's just how they designed it, and it makes sense why they did it. But man, does it require just some serious ballooning costs and thinking about the architecture in weird ways, and it's just a mess. Your tax dollars at work.

Speaker 2

Yeah, and the banking industry, right? There's so many other industries that have weird compliance requirements, like banking too. Yeah, okay, so we're going to go ahead and wrap it up here. I know we could talk longer, but we've got to cut it off at some point. So, real quick, we'll go around and get any last points. Whoever wants to kick it off. Will or Chris, do you guys want to chime in?

Speaker 4

Sure, yeah, just kind of like with getting certifications. Don't ask me how that made its way into this conversation. But you think of, you just want to go so far, so fast. You're like, okay, I'm studying for CCNA. Oh, you know, I'm going to get my CCIE in like two years or something. It's kind of like with cloud, like, you've got to start basic. You really need to learn the foundations, and the documentation is gold. The cloud documentation is actually very good. I have so many things bookmarked that I refer back to. They're always updated, well, usually up to date, you know. So learn those foundational things and learn TCP/IP while you're at it. Like, really learn it, and that'll take you far.

Speaker 2

Awesome. Chris, do you got any last points to add?

Speaker 1

Not much to add, I will say. Typically, when doing designs and talking about solutions and things, cost is my, my the bane of my existence is the thing I want to talk about the absolute least, but thankfully you guys have made it fun today, so I appreciate that.

Speaker 2

Awesome Tim.

Speaker 3

I think the main thing, the main takeaway, if you will, is that, you know, when we left on prem, we didn't leave behind the same fight that we've had to have, which is, you know, the business still has to get an outcome and they have, you know, budgeted a certain amount for it, and we have to be really good about making sure that they know essentially what they're buying, like what level of DR/HA, you know, makes sense. And at the end of the day, it's not really our call, and we need to be okay with that, and we need to just make the business aware of that.

Speaker 2

Tim, yeah, absolutely. You definitely need to know how to articulate it, which brings us back to the point we made a couple of times, where you really just need to keep up with this stuff. It's changing so rapidly, and there's always new features and products coming out and new designs and new security requirements and new apps, and it just goes on and on. So I think that's good. All right. So thank you all very much for tuning in to the Cables2Clouds podcast. If you liked this episode, please share it around to anyone you think might be interested. Give us a five-star rating on your favorite podcatcher and hit those like and subscribe buttons on our YouTube channel. Till next time.

Speaker 2

Hi everyone, it's Alex, and this has been the Cables2Clouds podcast. Thanks for tuning in today. If you enjoyed our show, please subscribe to us in your favorite podcatcher, as well as subscribe and turn on notifications for our YouTube channel to be notified of all of our new episodes. Follow us on socials at Cables2Clouds. You can also visit our website for all of the show notes at cables2clouds.com. Thanks again for listening and see you next time.