Season 1, Episode 4: "How do I measure an automation strategy?" with Juha Holkkola, CEO @ FusionLayer
Juha Holkkola: What you will find is, there's loads of manual work and lots of things that haven't really changed since the late '90s.
Speaker 2: How have you enabled your infrastructure fundamental change over the last five years? And partnering with a business is critical. The tools exist on the cloud change at the rate necessary, secure by design. Network Disrupted.
Andrew: Hey. It's Andrew. And welcome to Network Disrupted, where IT leaders talk about navigating the disruption in our industry. In this episode, we unpack the right and wrong way to approach and measure an automation strategy. My guest today isn't an in-house IT leader but, like me, he works for a lot of people in that role. Juha Holkkola is the founder and CEO of FusionLayer, a Finland-based DDI company that helps service providers automate their networks. He has a bird's-eye view on how a lot of large organizations are moving their network operations forward. So I thought you'd enjoy what he has to say. Let me know what you thought of this episode. You can tweet me @netwkdisrupted, leave a review on Spotify or Apple Podcasts, or email me at andrew@networkdisrupted.com. Let's get into it. Juha, thank you so much for joining me.
Juha Holkkola: Yeah.
Andrew: From your perspective, the addition of software, the drive to networking via software and using software more and more as the control plane... from your perspective, what is the driver of that?
Juha Holkkola: If you consider networking in general, not that many things have changed, really. I mean, what engineers still do today is, they use the command line. They configure everything manually. There's not much automation that they would be using. And personally, as far as the software-driven nature of networking is concerned, I think there's a whole number of different use cases, different business cases for autonomous networking. Like 20 years ago, if someone had asked me how mobility would look today, in 20 years, after you have automated, I couldn't have ventured a guess that having [inaudible 00:02:14]. And in the same way, given the manual processing in networking, I think there's big change happening. And that comes from the ability to dynamically [inaudible] works. And I think that will introduce a whole set of new business cases and use cases that today I couldn't even dream of.
Andrew: It's abstract but true. And a lot of it's just about the acceleration of change too. We need to change these things way faster than would be possible if we weren't driving it via software. And the way we used to manage the networks, assuming some static nature of the network, and then we can use things like a set of firewall rules to control what can get in and out. The networks aren't static anymore either. And so there must be a more dynamic approach to do that as well.
Juha Holkkola: Yes. That's right. Because, well, actually, if you think about IPAM or DHCP or DNS, it has been mostly about subnets. You have subnets. You manage them. You have maybe some DHCP ranges you integrate with those subnets. You manage the IPs. You manage the names. That's essentially DDI. But what we've actually seen is there are other facets to networking that are still being managed manually. If you think about, let's say, a larger enterprise or a service provider, what you have is multiple different data centers, and each one of those data centers has address spaces that can be overlapping. And also if you look at, let's say, MPLS for example... So if you're a large enterprise or you're a service provider operating those MPLS networks, then again you have the VRFs. Those VRFs have connections to logical networks. But most of that continues to be managed, again, in spreadsheets.
Andrew: I forget what the saying is, and I'm sure I'm somewhere in the ballpark, but 80-90% of business is still managed on spreadsheets and 80-90% of spreadsheets have errors in them. And so it's a scary thing at the end of the day. And when we talk about automation, I think that a lot of approaches to automation are simply, "Let's use APIs." And there's a core problem with that. And I think the spreadsheet is a good example of that, which from my perspective is that that's more of the view of, "Okay. We were manually entering this. Now let's use an API because we can't do it fast enough or we want to make sure we don't have errors." But I think the right way to look at it is, the person who has the spreadsheet, or the team of people who are doing it manually today, spreadsheet or otherwise, oftentimes they're getting requests. And then they're translating that request into some action they do on the spreadsheet. And that translation is occurring based on their knowledge, their history, the stuff they know: "Oh. That request is coming from that division. So we're going to pull a subnet from over here. That's probably in that data center. We'll pull..." And that knowledge that that person has in order to do something manually needs to be encoded. I gave a couple of pretty simple use cases, but a lot of the time the translation from business request to actually doing something requires a good deal of knowledge. And you can't teach the requester that. You don't want to. They shouldn't know. They should be able to give a lightweight request and get an answer. And therefore automation doesn't just mean calling an API. It means encoding that business logic, which might be very different from one business to the next, into that API.
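As a rough illustration of the point about encoding business logic (not something discussed in the episode), a minimal Python sketch might look like the following. The division names, data center names, address pools, and the `allocate_subnet` helper are all hypothetical; the idea is only that the requester supplies a lightweight request (a division) and the knowledge that used to live in an engineer's head is written down as explicit rules.

```python
import ipaddress

# Hypothetical business rules: which data center and address pool serve each division.
# In practice this is the knowledge an engineer applies when reading the spreadsheet.
DIVISION_RULES = {
    "retail": {"data_center": "dc-east", "pool": "10.10.0.0/16"},
    "finance": {"data_center": "dc-west", "pool": "10.20.0.0/16"},
}


def allocate_subnet(division, allocated, prefix_len=24):
    """Translate a lightweight request (just a division) into a concrete subnet."""
    rule = DIVISION_RULES.get(division)
    if rule is None:
        raise ValueError(f"no allocation rule for division {division!r}")
    pool = ipaddress.ip_network(rule["pool"])
    # Walk the pool and take the first subnet that does not overlap an existing one.
    for candidate in pool.subnets(new_prefix=prefix_len):
        if not any(candidate.overlaps(used) for used in allocated):
            allocated.add(candidate)
            return {"data_center": rule["data_center"], "subnet": str(candidate)}
    raise RuntimeError(f"pool {pool} is exhausted")


# One /24 is already in use, so the request should receive the next free one.
allocated = {ipaddress.ip_network("10.10.0.0/24")}
result = allocate_subnet("retail", allocated)
print(result)
```

The requester never sees pools or data centers; changing the rules table changes the behavior without retraining anyone.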
Juha Holkkola: I mean, one of the key goals of knowledge management processes is to actually make implicit information explicit. And some of the best IPAM tools I've seen, they are network engineers. They are walking IPAM tools. They have everything in their head. But the problem is, if I'm going to automate something, it doesn't work anymore. It used to work because these guys would be... they are command line wizards. They would just log in. They would do the magic. Things happen. That's the way we've been working for the last 30 years. But with the coming change, I don't think that'll work anymore.
Andrew: Totally agreed. And you touched on different things. So there's two sides to it. One is, what knowledge do I need to be the command line wizard or to be able to do this manually? But in the process of making the change manually, for sure that network engineer's also going to test to make sure that the change is working. And if you simply call an API, that's going to make the change, but if you have no mechanism to ensure that the change was successful, then you'll measure automation the wrong way. And a lot of this comes down to metrics. And when I talk to customers who are... they're thinking of automation as a... the proper metric is like man-hours saved, or we've automated 300,000 man-hours of stuff, and that now takes an hour or something. But they're not looking at the success of the automation. It's real easy to call an API, but did it achieve what we were intending it to achieve? Did it have knock-on effects? Was it done the right way? And I think that's the piece of this transformation to servitize these sorts of functions that companies miss.
Juha Holkkola: It's a new paradigm. Things will be done differently. And that, of course, means also the processes, the organizations. There are lots of things besides technology that need to change. And from that point of view, I think one way to look at automation is, if I start automating things, then clearly I have made an investment. But if I still have, let's say, old-fashioned processes in place, it could actually be that the return on my automation investments won't be that high because there will be use cases that I just won't be able to implement. And so it's not just the effects, it's also the return on the entire automation investment. And that can't be that high in case you still have those old approaches in place.
Andrew: If you were to start a brand new company and start building data centers at scale or adopting cloud at scale, then you can do it from day one with automation in mind. But no companies are in that position, at least not the customers we work with. They still have data centers that were engineered over the last 20 or 25 years. And is that a solvable problem or not? Or in other words, where do you concentrate for these areas? Where do you use bridging solutions versus net new solutions and not create a technology soup of just different things purchased over time for different points of view?
Juha Holkkola: Yeah. For sure. And, in general, think about enterprises, for example. I mean, there's many of them that still continue to run Microsoft-based infrastructure. They have Windows servers in place for DNS, for DHCP, things like that. And to me it does its job. It does what it's supposed to do quite well, in fact. So what I often tell customers is, well, you don't necessarily have to rework everything. There are elements within your existing infrastructure that are fine for the things they've been designed for. But at the same time you will get public clouds or hybrid multicloud or cloud edge or IoT, or whatever it is. Those things haven't been really designed for that. And so I see it more as an evolution. There might be certain layers in your technology stack that you might want to hold on to. But equally, as the industry progresses, you need these new tools and solutions in place to bridge the old with the new.
Andrew: Yeah. For sure. I was speaking to the CTO of cloud of a large financial institution yesterday. And he said something that I fundamentally agree with, and it's important, which is they have their five-year vision and strategy of where they're going to go. He's also certain they're never going to get there because technology changes too fast, business strategy changes too fast. So certainly part of that strategy and vision is ensuring they're investing in things that they can continue to change as things... But my broader point is, he's not looking at what they need just this year. He's assuming they've got the North Star and they're trying to drive towards that. And he's looking for pragmatic and practical ways to do that without breaking what exists today. And I think that's a critical point that people forget, especially if something has pachinkoed into an RFx process, where now you're buying something based on a legal document of what it can do today that you can verify in the product, as opposed to asking, are we buying the thing that's going to help us over the next three or four years get to a set of requirements that we might not know right now?
Juha Holkkola: Yeah. Definitely. There's also consumerization of IT. So people are used to accessing portals. They are used to self-service. But they also are used to getting what they want almost immediately. And if you think about most enterprise IT departments, not all of them are able to deliver on that promise. It might take forever to actually get something from them. And as a result, what we end up seeing is things like shadow IT. Rather than going to my IT department, I'll actually go to AWS, create an account, and get what I need pretty quickly. And so when we talk about the OPEX and how to, let's say, assess the roadmap and the goals down the road, one of these things is you also want to have that UX in enterprise IT as well.
Andrew: I absolutely agree. I mean, IT has got to move from an explicit project-based method of working, where SLAs were written way too long ago in terms of how long any sort of change takes versus... We were in the middle of a go-live weekend. This was like four years ago or something. And they realized that in this specific case this physical appliance was plugged into the wrong switch. There's two switches in the rack. Somebody just plugged it into the wrong switch. That data center was managed by some outsource provider. And the SLA to pull out the patch cable and plug it into the right switch was seven days. And it was seven days because they had to assess why it was plugged into the wrong one in the first place. Do they have the wrong requirements? And now they've tested plugging it into the first switch. What happens when they plug it into the second switch? Are there going to be problems? So from their perspective, that was a meaningful seven days. But come on. That's just completely unacceptable. Things need to change way faster. And certainly as the business is building out more and more technology, in any company, I don't care what you're doing, you end up being a technology company as the business builds more technology, where historically maybe IT would build that technology. You've got to transform from projects and explicit project-based SLAs to servitizing the business. And yeah. Give them the cloud. Give them the shadow IT experience. Give them that UX. I like that you called it a user experience. I think that's dead on. Give them a similar user experience, where they can get what they need immediately. But govern it.
Juha Holkkola: Yeah. Definitely. But actually, seven days is pretty packed.
Andrew: Yeah. I also look at it a bit like I look at software, quite frankly. One of the software books out there, I don't know how many years old it is now, is Martin Fowler's Refactoring. And I think it's either the introduction or chapter one that basically starts with, "If you can't test, put this book back on the shelf because you can't change things if you can't test them." Let me give you an example in networking. If I'm going to change things rapidly and I'm still relying on an SNMP trap if something has an error state, then I'm orchestrating a broad workflow but I'm looking for indicators of failure of components underneath that may or may not affect the success of the workflow. So how do I get the data? How do I get the telemetry? How do I understand what's happening to ensure that the higher-level automation was successful? Even if some switch or DNS server or whatever the case is [inaudible] is... That's fine. There's an error state there. But there's a high availability pair or another DNS server over there or this other switch is enough workload. How do I know if what I'm automating is actually successful? And I think from my perspective, it also comes with a strategy around data so that I can measure the quality of what I've deployed. And that requires, I think, a very different view on how I understand health.
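One way to picture the distinction between a component error state and the success of the overall workflow is the small Python sketch below. It is only an illustration under stated assumptions: the `ComponentStatus` type, the `workflow_outcome` function, and the idea of a separate end-to-end check (for example, "did the name actually resolve?") are hypothetical and not taken from the episode.

```python
from dataclasses import dataclass


@dataclass
class ComponentStatus:
    name: str
    healthy: bool  # assumed to come from a trap, health check, or telemetry feed


def workflow_outcome(components, end_to_end_ok, min_healthy=1):
    """Judge the workflow by its goal, and report component failures separately."""
    failed = [c.name for c in components if not c.healthy]
    healthy_count = len(components) - len(failed)
    # Success means the end-to-end check passed and enough redundancy remains.
    success = end_to_end_ok and healthy_count >= min_healthy
    return {
        "success": success,
        "degraded": success and bool(failed),
        "failed_components": failed,
    }


# One member of a high availability pair is down, but the service still works.
dns_pair = [ComponentStatus("dns-a", healthy=False), ComponentStatus("dns-b", healthy=True)]
outcome = workflow_outcome(dns_pair, end_to_end_ok=True)
print(outcome)
```

Here a single failed DNS server marks the result as degraded rather than failed, while a failed end-to-end check fails the workflow even if every component reports healthy.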
Juha Holkkola: Yeah. For sure. That's the problem if you start the automation from just one corner: the rest of the environment isn't really ready for what it is that you are going to be doing. And sometimes if you look at SD-WAN, which is software-defined wide area networking, it has been getting some good traction within the enterprise. But what we see sometimes is, of course, we want to create networks that span from the data center all the way to the branch sites, and we have the SD-WAN controllers that then go and configure everything and spin up the networks and all that. But, of course, those orchestrators, they need a back end for the configurations, for suitable network segments and all that stuff. And so what we've actually seen some organizations do, even Cisco sometimes, is they create parsers that actually go and read. I've actually seen people having parsers that read CSV files. I've seen people having parsers that read text files. And so the way it actually works is people go and edit those text files or CSVs manually. They just go and get that information from somewhere. They go and write it down into these flat files with no validations. No nothing. And then they create a parser that isn't very intelligent and they just pull the data from those files and push it to production automatically. And that's a recipe for disaster. You might be able to pull it off in the lab a few times when you're just doing the first, let's say, experiments with a new technology. But if you're going to take this and bring it to the operations team and say, "Well. Here it is. Start operating." That's a recipe for disaster.
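To make the "no validations" risk concrete, here is a minimal Python sketch of the kind of check such a parser could run before anything reaches production. The column names (`site`, `subnet`), the sample data, and the `validate_rows` helper are assumptions made for illustration; the episode does not describe a specific file format.

```python
import csv
import io
import ipaddress


def validate_rows(csv_text):
    """Return (valid_rows, errors) for a hand-edited site/subnet CSV."""
    valid, errors, seen = [], [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            # Strict parsing rejects typos such as host bits set in a network.
            net = ipaddress.ip_network((row.get("subnet") or "").strip())
        except ValueError as exc:
            errors.append(f"line {line_no}: {exc}")
            continue
        site = (row.get("site") or "").strip()
        if not site:
            errors.append(f"line {line_no}: missing site")
            continue
        clash = next((s for s in seen if s.version == net.version and s.overlaps(net)), None)
        if clash is not None:
            errors.append(f"line {line_no}: {net} overlaps {clash}")
            continue
        seen.append(net)
        valid.append((site, str(net)))
    return valid, errors


SAMPLE = (
    "site,subnet\n"
    "branch-1,10.0.1.0/24\n"
    "branch-2,10.0.1.128/25\n"  # overlaps branch-1
    "branch-3,10.0.2.5/24\n"    # host bits set
    ",10.0.3.0/24\n"            # missing site
)
valid, errors = validate_rows(SAMPLE)
print(valid)
print(errors)
```

A pipeline built on this would refuse to push anything while `errors` is non-empty, which is the opposite of pulling from flat files and pushing straight to production.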
Andrew: Right. Sure. And some overlying metrics of success that can't be measured... Speaking of SD-WAN, we've seen some SD-WAN projects go... well, the technology was implemented. But it didn't actually meet their actual business goal because they forgot about the other things that need to change too. And in this case that might be DNS, for instance. So SD-WAN is deployed because, "We're going to do direct internet access because we're going to consume Office 365," but the DNS path was still back through the data center on a different continent. And so at the end of the day, they were doing direct internet access but they were accessing an application 2,500 kilometers away from the closest one they should have been accessing. But from their perspective, it was deployed successfully because the SD-WAN routers weren't complaining about anything. So I think it's just an example of being able to measure what the goals are as opposed to looking at the components of the solution.
Juha Holkkola: But one thing is to look at your processes. See how much manual work is being done. Because particularly if you go to the networking realm, most of the things that are being done, they're still being done manually today. Oftentimes the best IPAM tool you have in the house is a network engineer. And you have these telltale signals about how you can assess how traditional I am. And more often than not things haven't really evolved that much since the late '90s. And if that's the present state of things, then surely there's something I need to do in order to get my house in order before I will be able to jump on the automation ship and then do the things of the 21st century.
Andrew: Yeah. No. Again, it reminds me, before BlueCat I spent many years in engineering management software and managing the life cycle of engineering. In fact, your former employer, Nokia, was a big customer of mine for many, many years. But these were changes that they were going through in the '90s. And so it's interesting that we're going down the same path here as well. And I think a lot of the indicators are the same. And the number of sticky-pad, work-stream mappings we would do back in the day to understand where is the cost in this process and what's still being done manually, and why is it still being done manually, and what data do we need in order to get this stuff done manually, and what are the goals here... And so I think, for sure, there's a bunch of that that's required without... At the same time, going back to something you said before, locally optimizing something that on its own doesn't have any business value.
Juha Holkkola: Yeah. Definitely. I mean, as far as networking is concerned, oftentimes people have to be more focused on applications. And so then when you go and ask people, "Well. How do you get IPs?" They say, "Well. I get those from my network team. Or I get those from my ISP or whoever it is." But people don't have the faintest. They don't know actually what is going on, on the networking side of things because it's pretty deep down there. And if you're a C-level guy, it might be that you don't really get that much visibility into the day-to-day activities of your networking teams. But one suggestion is to go there. Speak with the guys. See what they are doing. Because what you will find is there's loads of manual work and lots of things that haven't really changed since the late '90s. It's been the same for the last 20 years. And so if the networks are the foundation of your operation, then it logically follows that in order to build on that foundation, you need to make sure that whatever you have running as your operational backbone, that the processes are being done in a standardized, systematic way.
Andrew: Yeah. For sure. Well. Fantastic. It was super to talk to you. I really enjoyed it.
Juha Holkkola: Yeah. Thanks, Andrew.
Speaker 2: Thank you for listening. I'd love to know what you thought of this episode. And I'm all ears if you have a guest recommendation. You can tweet @netwkdisrupted, leave a review on Spotify or Apple Podcasts, or email me at andrew@networkdisrupted.com.
In this episode, I talk with Juha Holkkola, CEO at FusionLayer. We discuss the reality of networking at many large organizations, commonly observed automation strategy mistakes, the best IP address management (IPAM) tools, and his advice for effective automation.
Juha is the co-founder and CEO of FusionLayer, a DDI company focused on helping large customers automate their infrastructure. While a bit different than my typical guest, Juha – like me – talks to a large number of IT leaders. He therefore understands where widespread holes (and, conversely, opportunities for improvement) exist.
P.S. Know who I should speak to next? Drop me a line at andrew [at] networkdisrupted [dot] com.
P.P.S. Short on time? I send episode summaries to my email subscribers. Add yourself to the list if you’d like.