
The Critical Lowdown Podcast Episode 35

Scale. Innovate. Profit: Overcoming Data Center Challenges through Open Networking

Continue Listening: Part 2

Data Centers continue to be the backbone of our digital world. The increase in demand for faster and more efficient interconnectivity has led to the ongoing challenge of scaling while managing costs.
 
Our Head of Technical for EMEA, Barry McGinley, along with guests Prasanna Kumara from IP Infusion and Kei Lee from UfiSpace, delves into how Open Networking can meet and surpass these twin challenges.
 
Discover how this focus on affordability is achieving the ultimate goal: driving down prices for consumers while maintaining high standards of innovation and scalability.

Subscribe to The Critical Lowdown from EPS Global wherever you get your podcasts:

Apple
Spotify
Ximalaya
 

Barry McGinley
Head of Technical for EMEA, EPS Global


Prasanna Kumara S
Technical Marketing Engineer, IP Infusion


Kei Lee
AVP Technical Sales, UfiSpace


If you have any questions, or need advice or tech support for your upcoming project, don’t hesitate to get in touch.


Transcript of Podcast

Barry: Thanks for joining us for another installment of The Critical Lowdown from EPS Global. I'm Barry McGinley, Head of Technical for EMEA, and I've been with EPS for nearly 7 years, working on POCs in the Data Center and Service Provider markets, including CSPs, ISPs, and MSPs.

Today's podcast is titled "Scale, Innovate, Profit - Open Networking for Cloud and Managed Service Providers." We'll explore scale, innovation, and profit within the Open Networking realm for cloud and managed service providers. I'm joined by two great guests: Prasanna Kumara from IP Infusion and Kei Lee from UfiSpace. With expertise in both hardware and software, we have most of the bases covered for today's discussion.

Kei, welcome, and thanks for joining us. Can you start by introducing yourself, your role at UfiSpace, and a little bit about what UfiSpace does?

Kei: Absolutely, Barry. Thank you. My name is Kei Lee; I'm responsible for Technical Sales in Northern California, and I've been with UfiSpace for a few years. UfiSpace has been around for over a decade, focusing on two main lines of business: Data Centers and Telecommunications for carriers. We specialize in hardware and firmware design using open disaggregated solutions, leveraging off-the-shelf merchant silicon for both the data plane and control plane, as well as other supporting technologies around the white box.

Later today, I'll discuss why open disaggregated solutions make sense in today's environment.

Barry: Thanks Kei. Prasanna, the same question to you. Can you introduce yourself, your role at IP Infusion, and provide a bit of background on the company? I know we've had IP Infusion on the podcast before, but it would be great if you could go into a bit more detail.

Prasanna: Thanks, Barry. I'm Prasanna, working as a Technical Marketing Engineer for IP Infusion. Before this role, I spent nearly two decades in development, focusing on EVPN and VXLAN. I've been with IP Infusion for around 12 years. IP Infusion has been in the network operating system space for over 20 years, initially providing software for ODMs. For the past 8-9 years, we've focused on disaggregation, offering solutions from access to core networks. It's great to be here with UfiSpace and EPS. Let's dive into our discussion and explore what we can achieve together.

Barry: Great. I'm always a little bit surprised when I hear how long IP Infusion has been around, as it's older than the concept of Open Networking itself. Let's dive into understanding Open Networking.

What is Open Networking, and how does it differ from traditional networking?

Prasanna: Open Networking is quite amazing because it integrates various open standards and components. At the northbound management layer, we have OpenConfig, and then we have an open Network Operating System (NOS), such as the Open Compute Network Operating System (OcNOS), which communicates with the management layer. The NOS then interacts with Open Compute hardware, which in turn connects to open optics. Essentially, it's open at every level, with all of these standards working together.

This approach enables disaggregation, providing greater flexibility by allowing multiple vendors to contribute their own solutions and excel in their specific areas. Combined, these contributions create a comprehensive, cost-effective solution that fosters innovation. Each vendor focuses on their niche, gaining deep expertise and delivering significant value to users. This collaborative and specialized approach makes Open Networking very beneficial.
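
To make that layering concrete, here's a minimal sketch, assuming a gNMI-enabled open NOS and the open-source pygnmi library, of how a management tool might read interface state through a standard OpenConfig path. The target address and credentials are hypothetical placeholders.

```python
# Minimal sketch: querying an open NOS over gNMI using an OpenConfig path.
# Assumes the third-party pygnmi library; the target and credentials below
# are hypothetical placeholders, not any specific product's defaults.
from pygnmi.client import gNMIclient

TARGET = ("192.0.2.10", 6030)  # hypothetical white-box switch

with gNMIclient(target=TARGET, username="admin",
                password="admin", insecure=True) as gc:
    # The same OpenConfig path works regardless of which vendor built
    # the hardware or the NOS underneath it.
    state = gc.get(path=["openconfig-interfaces:interfaces/interface/state/counters"])
    print(state)
```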

Barry: Something you mentioned is the integration of multiple components—hardware, open optics, and software. While it's great to have these open and interoperable elements, having a single point of contact for support is crucial. IP Infusion has done well in the market by offering bundled solutions. This means whether it's a hardware or software issue, you have one phone number to call for support. This approach simplifies the process and mitigates some of the hassles associated with Open Networking, providing the benefits of multiple vendors while ensuring streamlined support through a single vendor. This is a significant advantage that you guys offer.

Kei, we hear a lot about Open Networking, but there are so many different names for it. We've always called it Open Networking, but when telecoms got involved, organizations like TIP and ONF started calling it "Disaggregation", "Disaggregated Networking", or "Bare Metal Networking".

Can you explain the concept of disaggregation from the hardware perspective?

Kei: Traditionally, OEMs dominated the market before Open Networking gained the momentum it has today. To answer your question, Open Networking and Disaggregation essentially revolve around one major concept: decoupling hardware and software. This decoupling benefits the end user by providing the choice to pick and choose the type of hardware and software they need.

I used to work in OEM technical sales before joining UfiSpace. In that environment, vendors typically bundled hardware and software together, regardless of whether you needed all the software features. You had to pay for the entire package, and the software image was often quite large compared to something like IPI's Atlas. With the decoupling of hardware and software, customers can now select specific hardware and merchant silicon based on performance needs and choose only the necessary software features.

For example, if a customer only needs to run BGP and MPLS on their network, they can pick an open disaggregated software solution like IPI. This results in a smaller, more efficient software image loaded onto the hardware, unlike traditional bundled solutions where you might end up with a large software image containing many features you don't need but still have to pay for. This flexibility and efficiency are the key advantages of open disaggregated networking.

Barry: Choice is a key aspect we've always emphasized to our customers as well. Some software solutions are more mature than others, like IPI, which has been around longer than Open Networking itself. Thanks for that explanation, Kei.

Let's move on to our main topic: Scale, Innovate, and Profit.

We'll start with scale, focusing on Data Centers and Service Providers, including MSPs (Managed Service Providers) and CSPs (Cloud Service Providers). Prasanna, scaling any network comes with challenges, but what are the key challenges for Cloud and Managed Service Providers when they're trying to scale their networks?

Prasanna: Thanks for the question, Barry. Scaling networks, especially for Cloud and Managed Service Providers, comes with several challenges. One major issue is growing data volume, which necessitates capacity expansion. Existing legacy L2 networks often have limited switch capacity and are challenging to replace, and L2 designs bring further problems of their own, such as loops and blocked links.

Resource planning is another significant hurdle. This includes dealing with vendor lock-in from traditional vendors, high network infrastructure costs, long lead times, and management complexity. Deciding where to buy devices and what to purchase adds to the complexity. All these factors combined make scaling a network a challenging task.

Barry: We've always told our customers that with bare metal hardware, you can keep spares on a rack. You don't need to have the software or the latest image on them initially. If something breaks or you need to scale, you just take one off the shelf, load the NOS, apply your licensing, and you're good to go.
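
As a rough illustration of that workflow, the sketch below pushes install commands to a cold spare over SSH with paramiko. The host, credentials, image URL, and license command are all hypothetical, and the exact installer commands (ONIE-based or otherwise) vary by NOS.

```python
# Illustrative only: commissioning a cold spare over SSH.
# Host, credentials, URLs, and the license command are hypothetical;
# real install steps depend on the NOS and its installer.
import paramiko

SPARE_HOST = "198.51.100.7"  # hypothetical spare switch on the bench
COMMANDS = [
    "onie-nos-install http://images.example.net/nos-latest.bin",  # hypothetical image URL
    "license install http://images.example.net/site.lic",         # hypothetical command
]

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(SPARE_HOST, username="admin", password="admin")

for cmd in COMMANDS:
    _, stdout, stderr = client.exec_command(cmd)
    print(stdout.read().decode(), stderr.read().decode())

client.close()
```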

Kei, how does Open Networking enable service providers to scale their infrastructure more efficiently on the hardware side?

Kei: Open hardware is based on off-the-shelf merchant silicon, which is very different from traditional OEMs that use vendor-specific ASICs (Application-Specific Integrated Circuits). The reason we can leverage this today, and not maybe 10 years ago, is that merchant silicon has advanced significantly in terms of features, capacity, and power efficiency. This allows us to build hardware with high performance at an economical cost, which we can then pass on to the customer.

To add to what I mentioned earlier about decoupling hardware and software, this approach greatly benefits managed service and cloud service platforms. Managed services tend to be more personalized, allowing customers to specify their needs, whereas cloud services often follow a more templated approach, offering infrastructure as a service, platform as a service, or software as a service.

When it comes to scaling, three factors are always crucial:

  1. Power

  2. Cooling

  3. Space

Merchant silicon designs in a disaggregated model allow us to build devices in compact 2RU or even 1RU boxes, making scalability, power, and cooling much more efficient. Additionally, the decoupling of hardware and software makes the infrastructure highly scalable and flexible.

We employ a "Pay-As-You-Go" model, which is particularly advantageous. For example, if you start with a managed service or cloud service for 100 or 1,000 users, this model allows you to scale up to several thousand or even tens of thousands of users without changing your infrastructure. These are the key points I wanted to highlight and share with the audience.
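
Some back-of-the-envelope port math shows what that pay-as-you-go growth looks like in a leaf-spine fabric. The port counts below are illustrative, not tied to any specific UfiSpace model; the point is that capacity grows by adding leaves while the spine layer stays untouched.

```python
# Illustrative leaf-spine capacity math (hypothetical port counts).
LEAF_DOWNLINKS = 48  # e.g. 48 x 25G server-facing ports per 1RU leaf
LEAF_UPLINKS = 8     # e.g. 8 x 100G uplinks, one to each spine
SPINE_PORTS = 32     # e.g. 32 x 100G ports per 1RU spine

NUM_SPINES = LEAF_UPLINKS  # every leaf connects once to every spine
MAX_LEAVES = SPINE_PORTS   # each spine port terminates one leaf

# Start small and grow by adding leaves; the spine layer is unchanged.
for leaves in (2, 8, MAX_LEAVES):
    print(f"{leaves} leaves -> {leaves * LEAF_DOWNLINKS} server ports "
          f"({NUM_SPINES} spines, unchanged)")
```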

Barry: Thanks, Kei. Off-the-shelf merchant silicon is a big, long name, but we can just say Broadcom. Marvell is also in the mix, but Broadcom has been brilliant on the ASIC side, especially with their Tomahawk series. The power and performance of these ASICs are incredible, even outpacing Moore's Law. So, thanks for that.

Now, let's move on to innovation. Everyone is always looking for an edge, especially in this market.

Prasanna, how does Open Networking help data centers innovate compared to the traditional approach?

Prasanna: That's an interesting question, Barry. Open Networking addresses several key aspects that drive innovation in data centers. Firstly, it helps with scaling by adhering to industry standards, which optimizes performance and reduces costs. It also offers better lead times compared to traditional approaches. One of the significant advantages of Open Networking is the faster release cycles for both software and hardware. This rapid development brings a lot of innovations to meet the demanding network requirements, whether it's handling more data or providing faster responses. Innovations are happening on both ends—software and hardware—allowing the end customer to choose what they need now and what they might need in the future. This flexibility and continuous improvement are what Open Networking offers to data centers.

Barry: OK, I'll continue with you, Prasanna, since you're more on the technical side. EVPN over VXLAN is one of the buzzwords for Data Centers, along with technologies like RDMA over Converged Ethernet (RoCE) and AI, which we're seeing everywhere.

Can you explain EVPN over VXLAN and how it simplifies network operations compared to traditional approaches?

Prasanna: To understand EVPN over VXLAN, we need to look back a bit. VLANs were introduced as a form of network virtualization and were widely accepted in the networking world. More than 90% of medium and smaller MSPs and data centers still use L2 networks. VLANs are effective, similar to how multi-threading improved computing. However, as we've seen with the rise of hypervisors and Docker in computing, the network also needs to evolve to meet growing data demands.

Network virtualization overlays, like EVPN over VXLAN, bring more advanced virtualization to networking. They help connect different L2 and L3 networks by building an overlay on top of L3. The advantage of L3 is that it is loop-free and guarantees shortest-path forwarding, eliminating issues like VLAN loops and blocked links.

EVPN provides a range of services over this overlay, whether you need L2 VPN, L3 VPN, or a combination like IRB (Integrated Routing and Bridging). It also supports multicast and other services through various RFCs. EVPN can manage services end-to-end, from Cell Site Routers to Service Providers, through aggregation and into the cloud, all with a single, unified control plane.

In data centers, using a leaf-spine architecture with EVPN over VXLAN and unnumbered interfaces offers a simplified and scalable way to manage network demands. As Kei mentioned, you can grow your network as needed by expanding your leaf and spine layers. This flexibility and scalability are the key benefits of using EVPN over VXLAN in a data center environment.
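
To give a flavor of how uniform that control plane can make provisioning, here is a small sketch that generates per-leaf EVPN/VXLAN stanzas. The CLI below is generic, industry-standard-style pseudo-config rather than exact OcNOS syntax, and the hostnames, ASN, and VNIs are hypothetical.

```python
# Sketch: generating per-leaf EVPN/VXLAN configuration from one template.
# The stanzas are generic industry-standard-style CLI, not exact OcNOS
# syntax; all names, addresses, ASNs, and VNIs are hypothetical.
TENANTS = {"blue": 10010, "red": 10020}  # tenant name -> VXLAN VNI

def leaf_config(hostname: str, loopback: str, asn: int) -> str:
    lines = [
        f"hostname {hostname}",
        f"interface lo0\n ip address {loopback}/32",
        f"router bgp {asn}",
        " address-family l2vpn evpn",        # EVPN as the overlay control plane
        "  neighbor SPINES activate",
        "interface vxlan1\n source-interface lo0",  # VTEP anchored to the loopback
    ]
    for name, vni in TENANTS.items():
        lines.append(f"vni {vni}\n name {name}")
    return "\n".join(lines)

# Every leaf gets the same shape of config; only identity values change.
print(leaf_config("leaf-01", "10.255.0.1", 65001))
```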

Barry: That was a brilliant description as it's not the easiest thing in the world to explain!

Kei, I've got a tough one for you now. Sustainability is a hot topic in every boardroom around the world. How does UfiSpace's hardware enhance the sustainability of data centers, especially when considering green initiatives within a data center?

Kei: We already touched on hardware flexibility a moment ago, and that ties directly into sustainability. The scalability and quick time-to-market of our hardware contribute significantly to sustainability efforts. Again, there are 3 key aspects to consider: Cooling, Power, and Space.

When we design our hardware, we focus heavily on thermal simulation, airflow, and power management. For instance, in the design of a switch, you have the Data Plane and the Control Plane. Traditionally, the Control Plane uses an Intel processor to run software like OcNOS, while the data plane uses merchant silicon to execute instructions from the control plane, pushing packets back and forth.

To address sustainability in terms of power management and cooling, we use a separate processor called the BMC (Baseboard Management Controller). This distributed architecture means we don't burden the data plane or control plane with housekeeping tasks. Instead, the BMC handles these tasks, allowing for more efficient power and thermal management.

We conduct extensive thermal simulations to control airflow, whether it's back-to-front or front-to-back, and we use micro venting hole technology to optimize this airflow. Another critical aspect is the fan design. We use multiple fans, typically in an N+1 configuration. The BMC, equipped with temperature sensors, dynamically manages these fans. If the temperature inside the box rises to a certain level, the fans kick in at a higher RPM. Once the temperature drops to a specific threshold, the fans slow down, saving energy and reducing noise.
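
The fan behavior Kei describes is essentially a hysteresis loop: spin up above a high threshold, and spin back down only once the temperature drops below a lower one. A toy model is sketched below; the thresholds and RPM values are illustrative, not UfiSpace specifications.

```python
# Toy model of BMC-style fan hysteresis. Thresholds and RPM values are
# illustrative examples, not real UfiSpace firmware parameters.
HIGH_C, LOW_C = 55.0, 45.0
FAST_RPM, SLOW_RPM = 12000, 6000

def next_fan_rpm(temp_c: float, current_rpm: int) -> int:
    if temp_c >= HIGH_C:
        return FAST_RPM   # hot: spin the fans up
    if temp_c <= LOW_C:
        return SLOW_RPM   # cooled down: save power and reduce noise
    return current_rpm    # between thresholds: hold state (hysteresis)

rpm = SLOW_RPM
for reading in (40.0, 50.0, 57.0, 50.0, 44.0):
    rpm = next_fan_rpm(reading, rpm)
    print(f"{reading:.1f} C -> {rpm} rpm")
```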

This intelligent and dynamic management adjusts based on the environment inside the data center or central office, enhancing overall sustainability. These are just a few examples of how our hardware design incorporates intelligent features to promote sustainability.

Barry: That's really interesting about the BMC chip. I remember looking at 1G switches about 15 years ago and the size of the fans, and now I'm looking at 400G Routers from UfiSpace and the fans are much bigger. The heat from these optics is significant, so it's smart to have that intelligence inside the box, not just relying on the temperature sensors in the data center.

So, the final area, and this is the important one, is profit. It's all about money in the end for these guys.

Prasanna, can I ask you, what are the key benefits of adopting Open Networking for Data Centers, whether for Cloud or Managed Services, in terms of profitability?

Prasanna: The key benefits of adopting Open Networking for data centers, whether for Cloud or Managed Services, primarily revolve around cost savings and flexibility.

First, there's a significant reduction in costs. With Whitebox Networking, you get open hardware, open software, open optical systems, and open configurations. You also have access to open controllers. This openness allows you to choose components from different vendors that best meet your requirements and data demands, often at a lower cost compared to traditional, single-vendor solutions.

In terms of CAPEX (capital expenditures), having multiple vendors to choose from fosters innovation, leading to higher capacity in smaller form factors at competitive prices. For OPEX (operational expenditures), the adoption of EVPN / VXLAN simplifies operations. Once an operator is knowledgeable about EVPN, they can manage everything from service provider aggregation to the data center core, leveraging their experience across the entire network.

Operational efficiencies are also improved due to reduced delivery times. Our software, for instance, features an industry-standard CLI, which means a shorter learning curve for operators. This reduces OPEX as operators can quickly learn and adopt the system. Additionally, faster deployments help meet growing demands more efficiently.

Overall, Open Networking provides end users with cost savings, flexibility, and operational efficiencies, making it a profitable choice for data centers.

Barry: That's a really good answer and something we've found with our customers as well. Not having some of the licensing models that bigger vendors impose is a huge benefit. With Open Networking, you make one upfront payment for your switches and the license, and then there's a small support contract depending on your 1-, 3-, or 5-year needs. There's no recurring annual fee or additional charges for new features. Everything is upfront and transparent.

So, Kei, something we hear about all the time is vendor lock-in and how Open Networking avoids it. Can you explain the financial benefits of avoiding vendor lock-in?

Kei: I'll summarize this into two key points.

The first key point is the Pay-As-You-Go model. When you look at Data Centers or Central Offices, power, cooling, and space are crucial considerations beyond the technical design of the hardware itself. With the Pay-As-You-Go model, especially in open disaggregated systems, devices are typically one RU or two RU, not large chassis. This allows you to build infrastructure incrementally. For example, you can start with a manageable scale and easily expand as your user base grows from hundreds to thousands without altering your existing infrastructure. This scalability is a significant financial benefit.

The second key point is the distinction between CapEx and OpEx. For CapEx, using off-the-shelf components rather than proprietary ASICs reduces costs while maintaining performance. The economies of scale from leveraging merchant silicon allow us to build devices at a lower cost, which is why the open disaggregated model is gaining momentum, even among large industry players. Merchant silicon has advanced significantly over the past decade, matching many of the features of proprietary ASIC designs.

For OpEx, in a closed, vendor-locked model, you typically have to buy hardware, bundled software, and a service support contract from the same vendor. This lock-in means you can only purchase service contracts from that vendor, often at a higher cost. In an open disaggregated model, hardware and software are decoupled, reducing service contract costs. As you mentioned, Barry, this allows companies like IPI to act as system integrators, providing a one-stop shop for end customers, which significantly lowers maintenance and operational costs.

Lastly, from my experience, financial decisions often drive technical decisions. If a solution makes financial sense, it will eventually be adopted. I've noticed that CTOs often report to CFOs, underscoring the importance of financial considerations in technical decisions.

Barry: I completely agree with you there. While I don't like it when customers focus solely on cost savings, in the end, that's often what it comes down to. Off-the-shelf silicon and initiatives like the Open Compute Project (OCP) over the last 14 years have allowed companies to build hardware more affordably. The goal is for everyone to use the same components, driving prices down for consumers, and it is working.

Glossary of Terms

  • ASIC (Application-Specific Integrated Circuit): A type of integrated circuit customized for a particular use, rather than intended for general-purpose use.
  • BMC (Baseboard Management Controller): A specialized microcontroller embedded on the motherboard of many computers, especially servers, to manage the interface between system management software and hardware.
  • Data Plane: The part of a network that carries user traffic, as opposed to the control plane, which carries signaling traffic.
  • EVPN (Ethernet VPN): A network technology that provides Layer 2 and Layer 3 VPN services over an IP/MPLS network.
  • Merchant Silicon: Standardized silicon chips used in networking hardware, as opposed to proprietary ASICs.
  • RDMA (Remote Direct Memory Access): A technology that allows data to be transferred directly from the memory of one computer to another without involving the CPU, cache, or operating system of either computer.
  • RoCE (RDMA over Converged Ethernet): A network protocol that allows RDMA over an Ethernet network.
  • VXLAN (Virtual Extensible LAN): A network virtualization technology that addresses the scalability problems associated with large cloud computing deployments.
  • IRB (Integrated Routing and Bridging): A network configuration that allows for both routing and bridging within the same device, providing flexibility in network design.
  • Leaf-Spine Architecture: A network topology commonly used in data centers where leaf switches connect to spine switches, providing a scalable and efficient network design.

Need Help?

We have local language and currency support in each of our 28 locations, ensuring you always have access to friendly customer support for your hardware solutions, regardless of where you are.