
Hybrid Cloud Integration Requires Unified VM and Kubernetes Management

Original Title: "Containers are easy--moving your legacy system off your VM is not"

TL;DR

  • Moving legacy systems to containers front-loads complexity, requiring significant architectural changes that may not yield business value for applications not needing rapid innovation.
  • Virtual machines remain relevant for enterprise applications because millions of existing systems will never be rewritten, necessitating coexistence with newer containerized workloads.
  • Siloed infrastructure teams managing separate VM and Kubernetes environments create networking and identity management complexities, hindering seamless communication between legacy and modern applications.
  • Running Kubernetes within VMs simplifies hybrid environments by leveraging existing VM networking policies and addressing hardware utilization issues, unifying infrastructure management.
  • Enterprises are increasingly moving workloads back on-premises, realizing that for consistent, high-utilization workloads, owning infrastructure can be more economical than renting cloud resources.
  • AI, particularly LLMs, shows promise in modernizing legacy systems by translating older codebases like COBOL or Pascal into modern equivalents, facilitating mainframe decommissioning.
  • On-premises Kubernetes adoption lags cloud adoption due to the significant operational overhead of managing clusters; simplifying this on-prem experience is a key hurdle.

Deep Dive

Moving legacy applications into containers front-loads complexity: the re-architecture work lands up front, and for systems that do not need rapid iteration it may never pay back in business value. The core argument is that while containers enable faster software shipping and innovation, the persistence of legacy applications running on virtual machines (VMs) necessitates sophisticated integration strategies.

The persistence of VMs is driven by the sheer volume of existing applications, many of which will never be rewritten. These legacy systems, often developed over decades, must coexist and communicate with newer, containerized applications. Poorly managed, this integration leads to siloed infrastructure teams and disparate networking and identity management systems, making cross-environment communication difficult. When implemented effectively, however, a common substrate, such as running Kubernetes within VMs, can unify networking and security policies, simplifying interactions. This approach also addresses hardware utilization issues, allowing organizations to leverage existing VM capacity for container workloads instead of requiring dedicated, potentially underutilized, hardware for Kubernetes clusters.

The challenge of integrating legacy systems with modern cloud-native stacks is compounded by the ephemeral nature of IP addresses in Kubernetes environments, where new containers are spun up and old ones torn down, potentially changing IP assignments. This contrasts with the more stable IP assignments of VMs. While direct IP-to-IP communication is possible, it often necessitates complex network policies and whitelisting, creating management overhead. The use of AI, particularly large language models (LLMs), offers a promising avenue for modernizing legacy codebases, potentially leading to the decommissioning of mainframes by generating test cases and modern code equivalents.
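One practical way to de-risk LLM-assisted translation is to run the legacy and translated implementations against the same generated test cases and compare outputs. A minimal sketch in Python, where `legacy_interest` and `modern_interest` are hypothetical stand-ins for an original COBOL routine and its candidate translation:

```python
# Hypothetical stand-ins: in practice the "legacy" side would call the original
# COBOL/Pascal routine (e.g. via a mainframe test harness) and the "modern"
# side would call the LLM-translated code.
def legacy_interest(principal: float, rate: float, years: int) -> float:
    # Simple-interest rule as the legacy system computed it.
    return round(principal * rate * years, 2)

def modern_interest(principal: float, rate: float, years: int) -> float:
    # Candidate translation whose behavior we want to verify.
    return round(principal * rate * years, 2)

def check_equivalence(cases):
    """Run both implementations on the same cases; return any mismatches."""
    mismatches = []
    for principal, rate, years in cases:
        old = legacy_interest(principal, rate, years)
        new = modern_interest(principal, rate, years)
        if old != new:
            mismatches.append((principal, rate, years, old, new))
    return mismatches

# Test cases here would come from the LLM plus recorded production inputs.
cases = [(1000.0, 0.05, 1), (2500.0, 0.035, 10), (0.0, 0.07, 5)]
print(check_equivalence(cases))  # an empty list means behavior matches
```

The translated code only replaces the legacy routine once this harness reports no mismatches over a representative input set.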

The pathway to cloud-native adoption for legacy organizations hinges on simplifying Kubernetes deployment and management on-premises, mirroring the ease of cloud offerings. This includes providing essential developer tools like container registries and CI/CD pipelines. Furthermore, a significant trend is the repatriation of workloads from the cloud back to on-premises environments, driven by economic considerations where owning infrastructure proves more cost-effective for consistent, baseline traffic than renting it. The ideal scenario is to match workloads with the right infrastructure, leveraging the cloud for its flexibility, on-demand scaling, and access to specialized services, while utilizing on-premises for predictable, high-utilization workloads. Nutanix differentiates itself by offering a unified platform that integrates distributed storage, enterprise hypervisors, and container management, providing a comprehensive solution for both legacy and modern applications.
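The repatriation argument reduces to break-even arithmetic: rented cloud capacity bills around the clock for a steady baseline workload, while owned hardware amortizes its purchase price. A sketch with purely illustrative numbers (real comparisons must include power, cooling, staffing, and data-egress fees):

```python
# Illustrative numbers only: the point is the break-even structure,
# not the specific figures.
def monthly_cloud_cost(vcpus: int, hourly_rate_per_vcpu: float) -> float:
    # Cloud capacity is rented whether or not it is used, so a consistent
    # baseline workload pays for its full allocation around the clock.
    hours_per_month = 730
    return vcpus * hourly_rate_per_vcpu * hours_per_month

def monthly_onprem_cost(vcpus: int, capex_per_vcpu: float,
                        amortization_months: int,
                        opex_per_vcpu_month: float) -> float:
    # Owned hardware amortizes its purchase price over its service life,
    # plus ongoing operating cost.
    return vcpus * (capex_per_vcpu / amortization_months + opex_per_vcpu_month)

cloud = monthly_cloud_cost(vcpus=64, hourly_rate_per_vcpu=0.04)
onprem = monthly_onprem_cost(vcpus=64, capex_per_vcpu=600.0,
                             amortization_months=36, opex_per_vcpu_month=5.0)
print(f"cloud: ${cloud:.2f}/mo, on-prem: ${onprem:.2f}/mo")
```

With these assumed figures the owned capacity comes out cheaper for a steady 64-vCPU baseline; the cloud side wins back its premium on bursty or short-lived workloads, which is the "match workloads to infrastructure" point above.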

Action Items

  • Audit VM networking: For 3-5 core applications, document current VM-to-Kubernetes communication paths and identify potential network policy silos.
  • Create runbook template: Define the required sections (e.g., setup, common failures, rollback, monitoring) to standardize documentation for hybrid VM/Kubernetes environments.
  • Measure VM resource utilization: Track resource allocation and actual usage for 10-15 critical VMs to identify underutilized hardware that could support Kubernetes.
  • Evaluate AI modernization strategy: For 2-3 legacy applications (e.g., COBOL, Pascal), assess the feasibility of using LLMs for code translation and test case generation.
  • Design unified identity management: Propose a single identity and authentication system to bridge VM and Kubernetes environments, reducing communication complexity.
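The utilization audit above can start as a simple script over whatever metrics export is already available. A sketch assuming a list of (vm, allocated_vcpus, avg_used_vcpus) samples; the 30% threshold and the sample data are arbitrary illustrations:

```python
def underutilized(vms, threshold=0.30):
    """Flag VMs whose average CPU use falls below `threshold` of allocation.

    `vms` is a list of (name, allocated_vcpus, avg_used_vcpus) tuples,
    e.g. exported from your monitoring stack.
    """
    flagged = []
    for name, allocated, used in vms:
        ratio = used / allocated if allocated else 0.0
        if ratio < threshold:
            flagged.append((name, round(ratio, 2)))
    return flagged

samples = [
    ("db-primary", 16, 11.2),   # busy: keep as-is
    ("legacy-erp", 8, 1.2),     # mostly idle: candidate capacity
    ("batch-night", 4, 0.6),    # idle by day: candidate capacity
]
print(underutilized(samples))  # → [('legacy-erp', 0.15), ('batch-night', 0.15)]
```

VMs flagged this way are the hardware-utilization opportunity the Deep Dive describes: spare capacity that could host container workloads instead of dedicated Kubernetes hardware.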

Key Quotes

"Fundamentally the reason that you want to learn to adopt containers is because developers who are pushing code to containers end up being able to push more frequently and the example I like to give people for that is think back to the early 2000s for those of you who can remember the early 2000s when the state of the art for search was Altavista and the state of the art for maps was MapQuest and the state of the art for mail email was Hotmail right with with a 10 megabyte limit in your inbox and then this company comes out of the blue and says you know what we're going to revolutionize all these different things at once right we're going to revolutionize all of a sudden Google search comes out and blows everything away Google Maps so much better than MapQuest or Yahoo Maps and you know look at Gmail unlimited storage and how do they do that the reason they were able to do it was because they had they were moving they were innovating so much faster and one of the big reasons was that they had adopted containers they were able to innovate quickly."

Dan Ciruli explains that the primary driver for adopting containers is the ability to increase the frequency of software deployments. He illustrates this by contrasting early 2000s internet services with Google's rapid innovation, attributing much of Google's speed to their early adoption of container technology. Ciruli argues that this faster innovation cycle is a fundamental benefit for any organization, regardless of scale.


"The reason you have VMs is because you have VMs right and and the fact is that moving to containers writing code for containers isn't fundamentally very difficult but when you have an application that's already running changing it so it can be adopted to run in containers that can be quite difficult and this is the thing that I think we even we in the cloud native community were blind to for a while because we kept saying well just re architect your application and the fact is that when you re architect an application it can take a long time it can take a lot of work and it fundamentally is the same application it doesn't give you any business value you can operate it differently and maybe you could move more quickly but maybe this is an old application that doesn't need to move quickly."

Dan Ciruli addresses the continued relevance of Virtual Machines (VMs) by highlighting the difficulty of re-architecting existing applications for containers. He points out that while writing new code for containers is manageable, modifying legacy applications can be a significant undertaking that may not yield immediate business value. Ciruli suggests that some older applications may not require the rapid iteration that containers enable.


"When things are done well you don't have silos like that between the infrastructure on which things are deployed and the simple example is you've got an app that's running in a container that is needs to communicate well with one of your existing applications in a VM ideally you can run those in such a way that you've got the same networking between them and you can write a single network policy in one system that says yeah I want that thing to be able to talk to this thing and vice versa protect them both you know don't open up everything protect them but but let them do that."

Dan Ciruli describes an ideal scenario for integrating containerized and VM-based applications, emphasizing the elimination of infrastructure silos. He explains that when done correctly, both types of applications can reside within a unified networking environment. Ciruli suggests that this allows for a single, consistent network policy to manage communication and security between containers and VMs, ensuring controlled access.


"The problem with IPs in Kubernetes land is that IPs in Kubernetes land tend to be ephemeral right right things can things can move in you know a pod with a VM you when there's when there's a new piece of software when that that VM gets upgraded or patched or something you you tend to do that on that running piece of software your IP stays constant in in Kubernetes land when when there's a new container you you don't update your container you spin up a new one you tear down the old one right that by definition that could land somewhere else could have a different IP and then you can get into things like well okay well where everything's going to go through an egress and we're just going to whitelist the egress but then you're having the right again different network policy so now you've got some network policy that's that's doing your maybe your l2 or l3 but then you've got something else internal to the cluster that's doing your l7 and saying oh is this particular thing allowed to talk here and and it just gets complex."

Dan Ciruli explains the complexity of IP address management when communicating between Kubernetes and VMs. He notes that Kubernetes pods, unlike traditional VMs, are ephemeral, meaning their IP addresses can change frequently as new containers are spun up and old ones are torn down. Ciruli highlights that this dynamic nature necessitates intricate network policies, often involving multiple layers of management (L2, L3, and L7), which can become difficult to maintain.
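The fragility of IP-pinned rules under pod churn can be shown with a toy simulation (this models the behavior for illustration; it is not the Kubernetes API). Each redeploy replaces the pod with one that may receive a fresh IP, so a rule pinned to an IP breaks, while a rule pinned to labels, as Kubernetes network policies are, keeps matching:

```python
import itertools

# Toy model of pod churn: each redeploy tears down the pod and schedules a
# replacement that may land elsewhere with a different IP.
ip_pool = itertools.count(2)

def redeploy(pod):
    """Replace a pod; the new one gets a fresh IP but keeps its labels."""
    return {"labels": pod["labels"], "ip": f"10.244.0.{next(ip_pool)}"}

pod = {"labels": {"app": "billing"}, "ip": "10.244.0.1"}
whitelist = {pod["ip"]}          # L3-style rule: pinned to a specific IP
selector = {"app": "billing"}    # policy-style rule: pinned to labels

for _ in range(3):
    pod = redeploy(pod)

print(pod["ip"] in whitelist)    # IP-based rule no longer matches
print(all(pod["labels"].get(k) == v for k, v in selector.items()))  # labels still do
```

This is why a unified substrate that lets one selector-based policy span both VMs and containers removes the L3-whitelist-plus-L7-policy layering Ciruli describes.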


"One thing that that we're trying to solve is you know I said before you know a huge percentage of the of the kubernetes that runs in the world runs on hyper scalers and and the reason that is is that the hyper scalers made it really easy to get a kubernetes cluster effectively at a button click or an api call if you're automating right and to get it and have it be secure have you know upgrades that to be something you don't have to worry about you don't even think about it it just it just happens when you're in the cloud the reason why kubernetes hasn't taken off as much on prem is that that wasn't true right on prem you had to do a lot of care and feeding you were in charge of that kubernetes cluster."

Dan Ciruli identifies a key challenge in on-premises Kubernetes adoption: the difficulty of provisioning and managing clusters compared to cloud-based solutions. He explains that hyper-scalers offer a simplified, automated experience for obtaining secure and up-to-date Kubernetes clusters. Ciruli contrasts this with on-premises environments, where organizations have historically faced significant "care and feeding" responsibilities for their Kubernetes infrastructure.


"The thing that that really differentiates Nutanix is that for most enterprises they think about one vendor for virtualization one vendor for storage and a different vendor for for container management and Nutanix while we really play nice we can interoperate on any of those tiers we can use anybody's storage anybody's hypervisor anybody's container management we do offer them all in one package and so we do have customers who say yeah I'll take that because it's got everything I need and and there's no other company that can say yeah we we really do all those things well."

Dan Ciruli highlights Nutanix's unique value proposition by contrasting it with the typical enterprise approach to infrastructure. He explains that most companies manage virtualization, storage, and container management through separate vendors. Ciruli states that Nutanix offers a unified package that integrates these components, providing a comprehensive solution that other companies cannot match.

Resources

External Resources

Books

  • "The Pragmatic Programmer" by Andrew Hunt and David Thomas - Mentioned as an example of a book that advocates for continuous learning and improvement in software development.

Articles & Papers

  • "Containers are easy--moving your legacy system off your VM is not" (Stack Overflow Blog) - Mentioned as the title of the podcast episode.

People

  • Dan Ciruli - VP and General Manager of Cloud Native at Nutanix, guest on the podcast.
  • Ryan Donovan - Host of The Stack Overflow Podcast.
  • David Ferenczy Rogožan - Winner of the Necromancer badge on Stack Overflow.
  • Andrew Hunt - Co-author of "The Pragmatic Programmer."
  • David Thomas - Co-author of "The Pragmatic Programmer."
  • Louis Ryan - Mentioned as one of the inventors of gRPC.
  • Kostadis - Colleague of Dan Ciruli who joined from VMware and published posts about Nutanix.

Organizations & Institutions

  • Nutanix - Company where Dan Ciruli is VP and General Manager of Cloud Native.
  • Google - Mentioned for its early adoption of containers and development of technologies like gRPC.
  • Cloud Native Computing Foundation (CNCF) - Dan Ciruli was on the steering committee for Istio, a project within CNCF.
  • Stack Overflow - Host of the podcast and source of the episode title and badge information.
  • AWS - Mentioned for a project that saved developer work on a Java upgrade.
  • VMware - Mentioned in relation to Kostadis joining Nutanix from VMware.

Websites & Online Resources

  • Nutanix (nutanix.com) - Company website for product information.
  • LinkedIn (linkedin.com) - Platform for connecting with Dan Ciruli.
  • Bluesky (bsky.app) - Platform for connecting with Dan Ciruli.
  • Stack Overflow (stackoverflow.com) - Platform where the Necromancer badge was awarded and the question was answered.

Other Resources

  • Kubernetes - Infrastructure orchestrator discussed in relation to cloud-native environments.
  • Virtual Machines (VMs) - Traditional infrastructure discussed in contrast to containers.
  • Open API Initiative - Dan Ciruli was a founding member.
  • Istio service mesh - Dan Ciruli was on the steering committee.
  • gRPC - Protocol discussed as a modern solution for server-to-server communications.
  • Mainframes - Legacy systems discussed in the context of modernization and decommissioning.
  • AI (Artificial Intelligence) - Discussed as a tool for modernizing legacy systems and decommissioning mainframes.
  • LLMs (Large Language Models) - Discussed as a tool for generating modern versions of legacy code.
  • Conway's Law - Mentioned in relation to organizational structure influencing data center design.
  • AOS - Nutanix's distributed storage utility.
  • Hypervisor - Technology discussed in relation to running workloads and its role in cloud computing.
  • HTTP/2 - Protocol on which gRPC is based.
  • REST - Standard HTTP approach contrasted with gRPC.
  • Necromancer badge - Awarded on Stack Overflow for answering an old question.
  • BigQuery - Google service mentioned as an example of a service difficult to duplicate on-premises.
  • GKE (Google Kubernetes Engine) - Mentioned in relation to the scalability of Pokémon Go.
  • Distributed Storage - Core technology of Nutanix, discussed in relation to its complexity.
  • Three-tier architecture - Traditional IT architecture contrasted with distributed storage.

---
Handpicked links, AI-assisted summaries. Human judgment, machine efficiency.
This content is a personally curated review and synopsis derived from the original podcast episode.