Deep Dive into ZStack Cloud v5.4.0 LTS: Three Core Infrastructure Breakthroughs

2025-09-27 17:27

Following our previous article “ZStack Cloud v5.4.0 LTS Preview: Opening a New Chapter in Intelligent Cloud,” this piece takes a deeper look at the infrastructure-level innovations in ZStack Cloud v5.4.0 LTS. Through significant advancements across virtual machines, networking, and storage, it helps enterprises build a smarter, more efficient, and more reliable digital foundation.

 

These optimizations stem from real-world practices across thousands of ZStack customer production environments, deeply integrated with cutting-edge technologies. Each improvement directly targets core pain points in today’s enterprise IT infrastructure and intelligent transformation.

 

1. Intelligent Virtual Machines: A Leap from “Passive Response” to “Proactive Prediction”

 

1.1 Intelligent High Availability: Precise Protection That Ends “Collateral” Reboots

 

Traditional HA treats every VM stop event the same way: whether the VM was stopped for planned maintenance or by a genuine fault, HA automatically restarts it. This looks “safe,” but it gets in the way of normal operations, because a VM that was stopped on purpose is restarted anyway.

 

In 5.4.0, we redesigned the HA trigger logic so that it first determines why a VM stopped. The system checks whether the stop was a planned operation and initiates HA only when it confirms an unexpected failure.

 

In addition, the management node directs at least three physical hosts in the cluster to probe the target host’s status and connectivity at the same time. This “multi-point verification” prevents a misjudgment based on a single vantage point and greatly improves accuracy.
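
The decision flow described above can be sketched in a few lines of Python. This is a minimal illustration, not ZStack's implementation: the names `StopReason`, `ProbeResult`, and `should_trigger_ha` are hypothetical, and the rule that every prober must report the host as unreachable is an assumption (the article only states that at least three hosts probe simultaneously).

```python
from dataclasses import dataclass
from enum import Enum


class StopReason(Enum):
    PLANNED = "planned"        # e.g. an operator-initiated stop or maintenance
    UNEXPECTED = "unexpected"  # no planned operation accounts for the stop


@dataclass
class ProbeResult:
    prober: str             # hypothetical: name of the host that ran the probe
    target_reachable: bool  # did this prober reach the target host?


MIN_PROBERS = 3  # the article says "at least three physical hosts"


def host_confirmed_down(probes: list[ProbeResult]) -> bool:
    """True only if enough distinct probers all fail to reach the target."""
    by_prober = {p.prober: p.target_reachable for p in probes}
    if len(by_prober) < MIN_PROBERS:
        return False  # too few viewpoints: do not act on partial evidence
    # Assumption: unanimity is required; a real system might use a quorum.
    return not any(by_prober.values())


def should_trigger_ha(reason: StopReason, probes: list[ProbeResult]) -> bool:
    """Skip planned stops; otherwise require multi-point confirmation."""
    if reason is StopReason.PLANNED:
        return False
    return host_confirmed_down(probes)
```

Checking the stop reason first keeps planned maintenance out of the HA path entirely, and the probe count guard means a single host with a bad link cannot declare a healthy peer dead.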

 

1.2 SR-IOV Live Migration: Ending the “You Can’t Have Both” Tradeoff Between Performance and Availability

 

ZStack has long supported SR-IOV, delivering near bare-metal network performance—essential for latency-sensitive workloads. The tradeoff, however, was the inability to live-migrate VMs, meaning maintenance or host failures would inevitably impact services. Many customers were forced to choose between high performance and high availability.

 

After extensive R&D, version 5.4.0 introduces live migration for SR-IOV VF NICs. In short, during migration the system automatically preserves all NIC state—network connections, configuration parameters, etc.—and fully restores them on the destination VM. The entire process is transparent to the workload.

 

1.3 Large-Scale Operations: Goodbye to “One-by-One Manual Tasks”

 

As ZStack adoption has grown, customer VM fleets now range from dozens to thousands. At scale, previously simple tasks—like patch distribution or pushing security scripts—become time-consuming grinds. To address this, we’ve added several practical capabilities:

 

Bulk file distribution: Push files to hundreds of VMs at once. The system handles concurrency and retries to ensure reliable delivery.

 

Bulk command execution: From a unified web UI, issue commands to multiple VMs and view per-VM results in real time, with no more juggling of countless SSH sessions.

 

Script library management: Store commonly used Ops scripts centrally and invoke them on demand. Scripts are encoded for secure transmission.

 

Extensibility: With an XML Hook mechanism, automatically run custom scripts at key VM lifecycle events (create, start, stop), enabling automated Ops workflows.
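
To make “handles concurrency and retries” concrete, here is a small Python sketch of bulk distribution. It is an illustration under stated assumptions, not the platform's code: `push_with_retries` and `distribute` are hypothetical names, the `send` callable stands in for whatever transport actually delivers the file, and the defaults (8 workers, 3 attempts) are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def push_with_retries(
    send: Callable[[str], None], vm: str, max_attempts: int = 3
) -> tuple[bool, int]:
    """Deliver to one VM; return (succeeded, attempts_used)."""
    for attempt in range(1, max_attempts + 1):
        try:
            send(vm)  # hypothetical transport call; raises on failure
            return True, attempt
        except Exception:
            # Any failure counts as retryable in this sketch. A real
            # system would likely back off and classify errors.
            continue
    return False, max_attempts


def distribute(
    send: Callable[[str], None],
    vms: list[str],
    concurrency: int = 8,
    max_attempts: int = 3,
) -> dict[str, tuple[bool, int]]:
    """Push to all VMs with bounded parallelism; collect per-VM results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = pool.map(
            lambda vm: push_with_retries(send, vm, max_attempts), vms
        )
        return dict(zip(vms, results))
```

Returning a per-VM result map rather than a single success flag mirrors the per-target view the bulk features describe: one failing VM does not hide the outcome of the others.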

 

2. Cloud Networking: Upgrading from “Best Effort” to “Deterministic Performance”

 

2.1 OVS-DPDK: A “Turbocharger” for Network Performance

 

Traditional network virtualization has a performance ceiling—especially for NFV, real-time processing, and similar scenarios where vanilla OVS often falls short. OVS-DPDK acts like a turbocharger by bypassing the kernel networking path and processing packets in user space.

 

2.2 Feature Completeness: Performance Without Sacrificing Capabilities

 

A common concern is that the pursuit of extreme performance can come at the cost of features—an issue seen in many historical high-performance solutions.

 

In version 5.4.0, our product and engineering teams invested heavily to ensure feature completeness with OVS-DPDK. It now fully supports core cloud networking services:

 

QoS traffic control: Apply bandwidth limits and priorities by traffic class to assure resources for critical workloads.

 

DHCP auto-configuration: VMs automatically obtain IP addresses and network settings.

 

Security groups: Fine-grained access control to precisely permit or block traffic.

 

NIC bonding optimizations: Support for Active-Standby, Load Balancing (SLB), and Load Balancing (TCP) bond modes for HA and better performance.

 

2.3 Dual Optimization of Cost and Security: Flexible Resource Configuration

 

Network costs, especially public bandwidth, are a significant part of IT budgets, while security isolation is non-negotiable. How can you control costs without compromising security? Version 5.4.0 offers two answers.

 

Shared bandwidth: Reducing the sting of public egress fees

Traditionally, each public-facing service purchases bandwidth independently—expensive and inefficient. The new shared bandwidth service lets multiple public IPs share a single bandwidth pool, like ridesharing for traffic costs. The system intelligently allocates within the pool to ensure fair guarantees, especially effective when services have complementary demand peaks.
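
One common way to divide a shared pool fairly is max-min fairness, sketched below in Python. The article does not say which algorithm ZStack uses, so treat this purely as an illustration of the idea; the function name `max_min_fair_share` and the sample figures are hypothetical.

```python
def max_min_fair_share(
    pool_mbps: float, demands: dict[str, float]
) -> dict[str, float]:
    """Split a bandwidth pool so that small demands are met in full and
    the remainder is divided equally among the still-unsatisfied IPs."""
    alloc = {ip: 0.0 for ip in demands}
    remaining = pool_mbps
    active = {ip for ip, demand in demands.items() if demand > 0}

    while active and remaining > 1e-9:
        share = remaining / len(active)
        # IPs whose outstanding demand fits within an equal share.
        satisfied = {ip for ip in active if demands[ip] - alloc[ip] <= share}
        if not satisfied:
            # Everyone wants more than an equal share: split evenly, stop.
            for ip in active:
                alloc[ip] += share
            break
        for ip in satisfied:
            remaining -= demands[ip] - alloc[ip]
            alloc[ip] = demands[ip]
        active -= satisfied

    return alloc


# Three public IPs sharing a 100 Mbps pool (illustrative numbers).
shares = max_min_fair_share(100, {"ip-a": 20, "ip-b": 50, "ip-c": 80})
```

In this example `ip-a` gets its full 20 Mbps and the other two split the remaining 80 Mbps evenly. When peaks are complementary, each service can temporarily use capacity the others leave idle, which is where pooling saves money.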

 

PVLAN: Fine-grained isolation under one roof

PVLAN addresses a practical challenge: saving on network equipment while achieving granular isolation within the same environment. In essence, PVLAN subdivides a single VLAN into isolated subdomains that cannot talk to each other but can all reach the upstream gateway, like residents in separate rooms of the same building who all reach the lobby by the same elevator. PVLAN enables full tenant isolation with fewer devices, and even within a single tenant it cleanly separates dev, test, and prod.
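
The reachability rules of a private VLAN can be expressed as a short predicate. The sketch below follows the standard PVLAN port model (promiscuous, isolated, community); whether ZStack exposes all three port types is not stated in the article, so the types and the names `Port` and `can_communicate` are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PortType(Enum):
    PROMISCUOUS = "promiscuous"  # e.g. the upstream gateway
    ISOLATED = "isolated"        # may talk only to promiscuous ports
    COMMUNITY = "community"      # may talk within its own community


@dataclass(frozen=True)
class Port:
    name: str
    kind: PortType
    community: Optional[str] = None  # set only for community ports


def can_communicate(a: Port, b: Port) -> bool:
    """Layer-2 reachability between two ports in the same primary VLAN."""
    if PortType.PROMISCUOUS in (a.kind, b.kind):
        return True  # everyone can reach the gateway, and vice versa
    if a.kind is PortType.COMMUNITY and b.kind is PortType.COMMUNITY:
        return a.community == b.community
    return False  # isolated ports never reach other non-promiscuous ports


gateway = Port("gw", PortType.PROMISCUOUS)
dev_1 = Port("dev-1", PortType.COMMUNITY, "dev")
dev_2 = Port("dev-2", PortType.COMMUNITY, "dev")
prod_1 = Port("prod-1", PortType.COMMUNITY, "prod")
tenant_x = Port("tenant-x", PortType.ISOLATED)
tenant_y = Port("tenant-y", PortType.ISOLATED)
```

With these sample ports, `dev_1` and `dev_2` can reach each other and the gateway, `dev_1` cannot reach `prod_1`, and the two isolated tenant ports can reach only the gateway.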

 

2.4 Visualized Operations: Making Load Balancing No Longer a “Black Box”

 

Traditionally, operating load balancers (LBs) felt like the blind men feeling an elephant: you know the LB is working, but where the bottlenecks are often becomes clear only after problems arise.

 

Version 5.4.0 provides x-ray-like visibility for load balancers. Intuitive line charts show real-time ingress and egress traffic trends. Need deeper insights for a time window or service? Multi-dimensional filters let you slice by time, service, backend server, and more.

 

More importantly, detailed session metrics—connection counts, concurrent sessions, response time—offer actionable clues to pinpoint problems fast. With these visual tools, issues that previously required multiple monitoring systems can now be diagnosed in one interface. For example, if a service slows, you can quickly check whether it’s uneven LB distribution or a bottleneck on a specific backend.
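
The triage in that last example, uneven distribution versus a slow backend, can be approximated from per-backend session metrics. The Python sketch below is only an illustration: the metric keys `connections` and `response_ms`, the function `diagnose`, and both thresholds are hypothetical and do not correspond to a documented ZStack API.

```python
from statistics import median


def diagnose(
    backends: dict[str, dict[str, float]],
    skew_ratio: float = 1.5,
    slow_ratio: float = 2.0,
) -> list[str]:
    """Flag backends whose load or latency is far above the median."""
    med_conns = median(b["connections"] for b in backends.values())
    med_rt = median(b["response_ms"] for b in backends.values())

    findings = []
    for name, b in backends.items():
        if med_conns > 0 and b["connections"] > skew_ratio * med_conns:
            # Far more connections than its peers: a distribution problem.
            # Checked first, since an overloaded backend is usually slow too.
            findings.append(f"{name}: uneven distribution")
        elif med_rt > 0 and b["response_ms"] > slow_ratio * med_rt:
            # Normal load but slow responses: the backend is the bottleneck.
            findings.append(f"{name}: slow backend")
    return findings


sample = {
    "web-1": {"connections": 100, "response_ms": 20},
    "web-2": {"connections": 100, "response_ms": 25},
    "web-3": {"connections": 100, "response_ms": 200},
}
report = diagnose(sample)
```

The median is used as the baseline so that a single outlier does not drag the reference point toward itself, which would happen with a plain average.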

 

3. Cloud Storage: Beyond “One-Size-Fits-All” to Tailored Options

 

3.1 New High-Performance Choices: Vhost and ZBS Step In

 

Vhost is our new user-space storage type. Its hallmark is bypassing the traditional kernel storage stack to process I/O in user space—like switching from a detour to a straight express lane—significantly reducing latency.

 

Vhost already integrates with leading high-performance distributed storage such as XEBS-XINFINI, providing new options for workloads with extreme storage performance requirements.

 

ZBS is an all-flash distributed storage product developed in-house over four years. With an architecture purpose-built for all-flash performance, ZBS targets high-performance HCI and disaggregated deployments, delivering extreme performance for I/O-intensive apps like databases and real-time analytics.

 

Compared to traditional solutions, ZBS performance is striking: on equal hardware, ZBS achieves about 2x the performance in FIO tests and up to 3x in dd tests. Crucially, ZBS natively supports RDMA, fully leveraging 25Gb/100Gb networks.

 

3.2 CDP on High-Performance Storage: Taking the “Undo” Button to the Extreme

 

Continuous Data Protection aims for seamless protection. Rather than periodic backups, it continuously records data changes, enabling recovery to any second-level point in time—like a “time machine” for your data.
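
Conceptually, recovering to an arbitrary second amounts to replaying recorded changes onto a base image up to the chosen timestamp. The Python sketch below shows that idea only; it is not how ZStack's CDP is implemented, and the `ChangeJournal` class, its block-level granularity, and the integer timestamps are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeJournal:
    """Append-only log of (timestamp_s, block_no, data) write records."""
    entries: list[tuple[int, int, bytes]] = field(default_factory=list)

    def record(self, timestamp_s: int, block_no: int, data: bytes) -> None:
        self.entries.append((timestamp_s, block_no, data))

    def restore(self, base: dict[int, bytes], as_of: int) -> dict[int, bytes]:
        """Rebuild the disk image as it looked at second `as_of`."""
        image = dict(base)  # never mutate the base image
        # sorted() is stable, so writes within the same second keep
        # the order in which they were recorded.
        for timestamp_s, block_no, data in sorted(
            self.entries, key=lambda e: e[0]
        ):
            if timestamp_s > as_of:
                break
            image[block_no] = data
        return image


journal = ChangeJournal()
journal.record(100, 0, b"v1")
journal.record(105, 1, b"x")
journal.record(110, 0, b"v2")
base_image = {0: b"base"}
```

Restoring `base_image` as of second 104 yields block 0 at `b"v1"` with block 1 absent, while restoring as of second 110 yields the latest state. Periodic backups can only return to the moments at which a backup happened to run.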

 

In 5.4.0, we added CDP support for high-performance storage, so core services running on ZBS and similar systems can now be protected by CDP as well. Previously, CDP targets were limited to standard storage, and the resulting performance mismatch could cause I/O backpressure and task failures. Now, with continuous snapshots on high-performance storage, VMs can preserve changes at second-level granularity.

 

Conclusion: Infrastructure Value Lies in “Silent Enablement”

 

Good infrastructure is like city utilities—unseen in use, yet quietly powering everything. The infrastructure improvements in ZStack Cloud v5.4.0 LTS aim for exactly that: making enterprise IT more reliable, efficient, and intelligent, without adding complexity.

 

Three core capability boosts:

Intelligent VMs: Smarter systems with less manual intervention

High-performance networking: Breaking ceilings to support new scenarios

Elastic storage: More choices to fit differentiated needs

 

The value of these improvements shows not just in technical metrics, but in customer success. When hospital information systems see fewer incidents thanks to better HA, when banking transactions respond faster with high-performance storage, and when manufacturing Ops teams double efficiency with automation—this is the true meaning of innovation.

 

Next Up: In the next installment, we’ll focus on intelligent operations and security compliance, detailing how ZStack Cloud v5.4.0 LTS enhances platform operability, data security, and regulatory adherence to provide comprehensive enterprise assurance.
