Internal Developer Platforms Simplify Complexity and Reduce Cloud Waste

Original Title: SE Radio 699: Benjamin Brial on Internal Dev Platforms

TL;DR

  • Internal Developer Platforms (IDPs) address DevOps scalability issues, multi-cloud complexity, and significant cloud waste (35-45% on average) by bridging DevOps teams and developers with tools and automation.
  • Organizations should prioritize solving specific pain points over market trends, starting with simple use cases like landing zones before building complex IDP solutions.
  • A GitOps-first approach is foundational for any IDP implementation, ensuring automation is version-controlled and manageable, preventing the creation of legacy automation stacks.
  • Successful IDP adoption requires embedding security, solution architects, and other stakeholders early to ensure processes are smooth and address diverse organizational needs.
  • IDP architecture should be cloud-agnostic and leverage open-source automation to avoid vendor lock-in and maintain flexibility across evolving infrastructure landscapes.
  • Key IDP components include service catalogs, versioning engines, platform orchestration, asset inventory, and FinOps/GreenOps modules to manage costs and carbon impact.
  • AI should be treated as an assistant tool within IDPs, enhancing developer productivity and testing capabilities rather than replacing human oversight or core UX.

Deep Dive

Internal Developer Platforms (IDPs) are becoming essential for organizations to manage the escalating complexity of modern software development, addressing challenges in DevOps scalability, multi-cloud environments, and significant cloud waste. IDPs act as a crucial intermediary, bridging the gap between DevOps teams and developers by providing controlled access to tools, cloud resources, and automation for those who are not cloud or DevOps experts. However, successful adoption hinges less on technical sophistication and more on understanding an organization's specific pain points and fostering cross-functional collaboration.

The core problems that IDPs address become acute as companies grow: DevOps teams struggle to scale, hybrid cloud management is complex, and cloud waste averages 35-45% according to the analysts Brial cites. When startups scale beyond a few developers, ad hoc infrastructure and tooling practices lead to inefficiencies, increased maintenance overhead, and a loss of velocity. This lack of standardization and centralized governance calls for a more structured approach.

Adopting an IDP is not merely a technical undertaking; it requires navigating organizational politics, convincing stakeholders across departments (including security, networking, and architecture), and overcoming the natural resistance to change. A common anti-pattern is a highly technical but insular platform that fails to address user needs or integrate with existing workflows, which leads to low adoption. The rapid evolution of AI presents both opportunities and risks: AI can help automate tasks and improve developer experience, but organizations must keep human oversight to ensure reliability, security, and ROI, and avoid blind reliance on AI-generated code or interfaces.

Ultimately, the success of an IDP implementation is measured by its ability to simplify complexity, reduce waste, and empower developers, rather than by the technical prowess of the platform itself. Practical advice for organizations includes starting with simple, well-defined use cases like landing zones before attempting complex solutions, adopting a GitOps-first approach as a foundational element, and prioritizing agnostic designs that can adapt to evolving cloud infrastructures and vendor landscapes. The key takeaway is that IDPs should be viewed as a strategic enabler for developer productivity and operational efficiency, requiring a holistic approach that blends technology with a deep understanding of organizational needs and human collaboration.

Action Items

  • Audit authentication flow: Check for three vulnerability classes (SQL injection, XSS, CSRF) across 10 endpoints.
  • Create runbook template: Define the required sections (such as setup, common failures, rollback, monitoring) to prevent knowledge silos.
  • Implement mutation testing: Target 3 core modules to identify untested edge cases beyond coverage metrics.
  • Profile build pipeline: Identify 5 slowest steps and establish 10-minute CI target to maintain fast feedback.

Key Quotes

"The problem are quite simple right devops is struggling to scale right we are living definitely in a multi cloud environment so managing on premises public cloud private cloud is always a challenge right and there is also a lot of cloud wasted right an average of 35 to 45 according to major analysts right so based on these problems the goal is to build a portal and a platform between the devops on one side and the developer on the other side so the goal is to give access to tools cloud automation to end users that are not devops and cloud experts right"

Benjamin Brial explains that Internal Developer Platforms (IDPs) aim to solve the scalability issues of DevOps, the complexity of multi-cloud environments, and significant cloud waste. The core purpose of an IDP is to act as a bridge, providing developers who are not cloud experts with access to necessary tools and automation.
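
To make the 35 to 45 percent figure from the quote concrete, the following is a minimal sketch that turns a monthly cloud bill into an estimated waste range. The function name `waste_range` and the example bill of 100,000 are assumptions for illustration only; the percentages are the analyst averages Brial cites, not measurements of any particular organization.

```python
def waste_range(monthly_spend: float,
                low: float = 0.35,
                high: float = 0.45) -> tuple[float, float]:
    """Return the (low, high) estimate of wasted spend for one monthly bill.

    The default bounds are the 35-45% average quoted in the episode;
    a real FinOps module would derive them from actual usage data.
    """
    if monthly_spend < 0:
        raise ValueError("monthly_spend must be non-negative")
    return (monthly_spend * low, monthly_spend * high)


# Hypothetical bill, used only to illustrate the arithmetic.
low_estimate, high_estimate = waste_range(100_000.0)
print(f"Estimated monthly waste: {low_estimate:,.0f} to {high_estimate:,.0f}")
```

At these averages, more than a third of a bill is potentially recoverable, which is why cost visibility tends to be one of the first modules teams ask for.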


"we see that there is some problem of scalability when it comes to devops and there is also some problematic of how do you make sure that your devops automation is scaled right is adopted by end users that are not devops and cloud expert so in term of i would say adoption there is this need of at which stage you are it could be it transformation then it's important to set the stage right to have the right automation but it could be also you have developed your own portal and you are thinking okay i need to focus on the automation and listening my developers to how do i bring them some value and less focusing on the portal itself rather than the automation and the tool that you are embedding natively inside it right"

Benjamin Brial highlights that a key challenge with DevOps automation is its scalability and adoption by non-expert users. He suggests that the stage of IT transformation influences the approach, emphasizing that focusing on automation and developer value is more critical than the portal itself.


"we see that we created some blueprint and then there is comes the question about how do we make the evolution of terraform and to maintain the day to day operation of terraform right and the year we will go later you know as we develop we'll figure out right and then we are waiting for the breaking change so you spend a lot of time on it and then you can't upgrade all the project in the same thing so one example is just terraform or what we have lived also with customers is where do you set the border between what does terraform what does ansible what does your kubernetes platform and where do you want to invest more to make sure that at the end you don't create legacy on automation right"

Benjamin Brial uses Terraform as an example of how initial automation efforts can lead to maintenance challenges and difficulties in upgrading projects. He points out the complexity of defining boundaries between different tools like Terraform, Ansible, and Kubernetes platforms, and the risk of creating automation legacy.
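
One way to avoid being surprised by a breaking change is to keep an inventory of which projects pin which tool version. The sketch below is an assumption-laden illustration, not something described in the episode: the `Project` type, the project names, and the version numbers are all hypothetical, and in practice the pinned versions would be read from each repository rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    terraform_version: tuple[int, int, int]  # pinned version, e.g. (1, 5, 0)


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse a plain 'major.minor.patch' string into a comparable tuple."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def projects_behind(projects: list[Project],
                    target: tuple[int, int, int]) -> list[str]:
    """Return the names of projects pinned below the target version."""
    return sorted(p.name for p in projects if p.terraform_version < target)


# Hypothetical inventory; real data would come from version-controlled repos.
inventory = [
    Project("billing", parse_version("1.3.9")),
    Project("landing-zone", parse_version("1.6.0")),
    Project("data-lake", parse_version("0.14.11")),
]
print(projects_behind(inventory, parse_version("1.5.0")))
```

Running a report like this on a schedule makes upgrade debt visible before it turns into the "legacy on automation" Brial warns about.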


"if you are a regulated company if you are a public institution company there is multiple company that have been forced to act in a certain way right so this is pretty important to make sure that the solution architect even the security are embedded in the reflection about which processes that i want to take care about all the parties that are integrated especially when it come also to multi cloud for example major organization we know that everybody have a word to say the network the security the architecture team the software team and the devops team right and there is this needs to make sure that you have some people that don't matter if you go through gitops approach or if you go through a portal that are working to make them glue between all those people right"

Benjamin Brial emphasizes the importance of involving solution architects and security teams in defining processes for multi-cloud environments. He notes that in large organizations, various teams (network, security, architecture, software, DevOps) have input, and there is a need for individuals to bridge these different perspectives, regardless of the chosen approach (GitOps or portal).


"i mean our conviction is that you should respect the borders and the competencies of each business unit and your platform should be capable to include the complexity of this big organization and this is one of the key adoption right for example on our side we have some forms those forms are presented for the end user with 10 different teams behind the one that are handling the cicd the one that are handling the network the one that are handling the security aspect and all this stuff and again you can't request everything from everyone becoming the expert of everyone it's not possible when you have 5000 or 10000 it team right you can't expect everybody expert from everything it doesn't scale it's not the case and it's obviously not possible so your software should be capable to handle this capability of you present your work but you are working in an ecosystem you are not alone that is why again gitops by design and your idp should be capable to interact with your automation below and that is why we are a strong believer about the open source automation which we build link to it"

Benjamin Brial states that successful adoption of platforms requires respecting the distinct competencies of each business unit and designing the platform to accommodate organizational complexity. He illustrates this by explaining that their forms involve multiple teams (CI/CD, network, security), acknowledging that expecting everyone to be an expert in all areas is impractical for large IT departments.
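
The idea of one form backed by several teams can be sketched as a mapping from form fields to owning teams. Everything named here (`FIELD_OWNERS`, `route_request`, the field names, and the team names) is hypothetical and only illustrates the pattern; it is not Cycloid's implementation.

```python
from collections import defaultdict

# Hypothetical mapping from each form field to the team that owns it.
FIELD_OWNERS = {
    "pipeline_template": "cicd",
    "vpc_cidr": "network",
    "ingress_ports": "network",
    "encryption_at_rest": "security",
}


def route_request(form: dict[str, object]) -> dict[str, dict[str, object]]:
    """Group the submitted form fields by the team responsible for them."""
    by_team: dict[str, dict[str, object]] = defaultdict(dict)
    for field, value in form.items():
        owner = FIELD_OWNERS.get(field)
        if owner is None:
            # Reject fields nobody owns instead of silently dropping them.
            raise KeyError(f"no owning team registered for field {field!r}")
        by_team[owner][field] = value
    return dict(by_team)


# The end user fills in a single form; each team only sees its own fields.
request = {"vpc_cidr": "10.0.0.0/16", "encryption_at_rest": True}
print(route_request(request))
```

The end user never needs to know which team validates which field, which matches the point that nobody can be an expert in everything at the scale of thousands of IT staff.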


"i mean you should be completely agnostic from your cloud infrastructures so it doesn't matter if it's on prem public cloud kubernetes or whatever but what you're designing should be completely agnostic for it right you should be thinking it's only this or that right because what could be true today can be wrong tomorrow right let's say you are a startup you're only on a cloud provider and then you acquire another company he's on another cloud provider then okay or you are a major organization you say yeah today we only do this cloud plus the on prem and then you have a new boss and they say you know now it's not this one right it's another cloud provider right so first agnosticity of the cloud second open source automation right you don't want to have your automation which is linked to a vendor or a constructor of hardware because by definition if they are investing on a software which is linked to your you know low layer you can make sure that they won't work to be as agnostic as possible they will invest on their architecture"

Benjamin Brial recommends designing Internal Developer Platforms to be completely agnostic of cloud infrastructures, whether on-premises, public cloud, or Kubernetes. He argues that this flexibility is crucial because technology landscapes change rapidly, and vendor lock-in with automation tools should be avoided to maintain adaptability.
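
In code, this kind of agnosticism usually means the platform depends on an interface rather than on a specific provider. The following is a minimal sketch under that assumption; the `Provisioner` protocol, both implementations, and the returned identifiers are invented for illustration and do not call any real cloud API.

```python
from typing import Protocol


class Provisioner(Protocol):
    """What the platform needs from any infrastructure backend."""

    def create_environment(self, name: str) -> str: ...


class OnPremProvisioner:
    def create_environment(self, name: str) -> str:
        # Placeholder: a real backend would trigger on-prem automation here.
        return f"onprem://{name}"


class PublicCloudProvisioner:
    def create_environment(self, name: str) -> str:
        # Placeholder: a real backend would trigger cloud automation here.
        return f"cloud://{name}"


def provision(provisioner: Provisioner, name: str) -> str:
    """Platform-level entry point; it never names a specific provider."""
    return provisioner.create_environment(name)


print(provision(OnPremProvisioner(), "staging"))
print(provision(PublicCloudProvisioner(), "staging"))
```

If an acquisition or a change in leadership brings a new provider, only a new `Provisioner` implementation is needed; the callers of `provision` stay unchanged.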

Resources

External Resources

Books

  • "The Phoenix Project" by Gene Kim, Kevin Behr, and George Spafford - Mentioned as a foundational text for understanding DevOps principles.

Articles & Papers

  • "The State of DevOps Report" (Source not explicitly stated) - Referenced for statistics on cloud waste.

People

  • Benjamin Brial - CEO and co-founder of Cycloid, guest speaker discussing internal developer platforms.
  • Sriram Panyam - Host of Software Engineering Radio.

Organizations & Institutions

  • Cycloid - Company focused on simplifying and accelerating cloud adoption, represented by its CEO Benjamin Brial.
  • IEEE Computer Society - Sponsor of Software Engineering Radio.
  • IEEE Software magazine - Sponsor of Software Engineering Radio.
  • Red Hat - Company where Benjamin Brial previously worked, focusing on emerging cloud products.
  • eNovance - Company where Benjamin Brial worked before its acquisition by Red Hat.

Websites & Online Resources

  • se-radio.net - Website for Software Engineering Radio.
  • computer.org - Website for IEEE Computer Society and IEEE Software magazine.

Other Resources

  • Internal Developer Platforms (IDPs) - A core concept discussed as a solution to DevOps scalability, multi-cloud complexity, and cloud waste.
  • Internal Developer Portals - Discussed in conjunction with IDPs, providing access to tools and automation for developers.
  • DevOps - A methodology discussed in relation to scalability challenges.
  • Platform Engineering - A field emerging from DevOps, focused on building internal platforms.
  • GitOps - A foundational approach for IDP implementation.
  • Infrastructure as Code (IaC) - A method for managing infrastructure, with Terraform mentioned as an example.
  • Configuration Management - A practice discussed in relation to managing infrastructure.
  • CI/CD (Continuous Integration/Continuous Delivery) - A set of practices for software development, with Jenkins and Argo CD mentioned.
  • Service Catalogs - A key component of IDPs, providing building blocks for development.
  • FinOps - A module within IDPs focused on cloud cost management.
  • GreenOps - A module within IDPs focused on the environmental impact of cloud usage.
  • Landing Zones - A simple use case for IDP implementation.
  • Security as Code - An approach to embedding security practices into code.
  • Cloud Management Platform (CMP) - A category of tools discussed in relation to IDPs.
  • Artificial Intelligence (AI) - Discussed as a tool to assist developers and DevOps, not as a replacement for human interaction or UX/UI.
  • Large Language Models (LLMs) - A type of AI discussed in the context of potential economic impact and infrastructure resource consumption.
  • Terraform - An Infrastructure as Code tool mentioned as an example of automation that can become complex to maintain.
  • Ansible - A configuration management tool mentioned in the context of defining boundaries with other tools.
  • Jenkins - A CI/CD tool that organizations are looking to move away from.
  • Argo CD - A GitOps continuous delivery tool.
  • OpenStack - An emerging cloud product mentioned in Benjamin Brial's past work.
  • OpenShift - An emerging cloud product mentioned in Benjamin Brial's past work.
  • Kubernetes - A container orchestration platform discussed in relation to adoption challenges and its market evolution.
  • Mainframe - A type of traditional infrastructure that poses integration challenges for IDPs.
  • COBOL - A programming language still in use in banking environments, posing integration challenges.
  • gRPC - A remote procedure call framework mentioned as an example of an API interface.
  • REST - An architectural style for web APIs mentioned as an example of an API interface.
  • OpenAPI - A specification for describing RESTful APIs.
  • VDI (Virtual Desktop Infrastructure) - A use case for developers.
  • CLI (Command Line Interface) - Presented as a superior UI experience compared to some AI interfaces.
  • TerraCognita - An open-source project that can generate automation from existing cloud infrastructure.
  • GitHub - A platform mentioned in the context of GitOps and developer workflows.
  • Firefox - Mentioned as an example of a portal (UI).
  • Slack - An instant messaging platform.
  • Gartner Conference - An upcoming event where Cycloid will be present.
  • Gartner Developer Environment Ecosystem - An upcoming event where Cycloid will be present.
  • DevOps REX - An upcoming event where Cycloid will be present.

---
This content is a personally curated review and synopsis derived from the original podcast episode.