A decade ago, the Cloud and Infrastructure Services team explored migrating CIT’s on-premises VMware virtual machine environment to a hyper-converged solution. The prohibitive costs of the new hyper-converged infrastructures, combined with Cornell’s high disk I/O requirements, made that transition impractical.
Disk I/O refers to the input/output, or read/write, transfer of data to and from a storage device's disk. It is a critical consideration for physical servers supporting multiple virtual machines (VMs). Frequent and fast I/O operations can strain systems and reduce performance, and early hyper-converged infrastructures did not yet handle that demand as cost-effectively as the on-premises environment.
But change is constant. In November 2023, Broadcom’s acquisition of VMware introduced significant price increases for on-premises virtualization services. By then, I/O requirements had decreased as databases migrated to the cloud, Non-Volatile Memory Express (NVMe) drives had advanced across the industry, and legacy Storage Area Network (SAN) hardware was becoming more expensive. Together with the VMware price increase, these changes shifted the cost differential. Hyper-converged was back on the table.
Opportunity Aligns with Cost Savings
The Cloud and Infrastructure Services teams conducted a strategic reassessment of on-premises virtualization costs and objectives. The timing of that reassessment coincided with the expected expense of lifecycle replacements for legacy storage components. Migrating to the Nutanix hyper-converged infrastructure offered an opportunity for immediate cost savings and long-term cost-effectiveness. One of the on-premises storage virtualization components ripe for change was the SAN Volume Controller (SVC).
“Retiring the IBM SVC alone saved approximately $800,000 in licensing fees over the past four years,” noted Dave Shirk, manager of CIT’s Storage, Backup, and Servers team.
In moving away from the traditional VMware three-tier architecture of servers, network, and storage as discrete components, Shirk's team faced a major shift of focus, although some fundamentals remained the same and could be leveraged to manage the Nutanix clusters.
The decision to migrate and the migration itself occurred quickly. In just over a year, the CIT team successfully architected and deployed the new infrastructure, replacing legacy ESX hosts, VMware environments, and SAN storage.
Moving 300 Servers
After migrating several of their own servers into the new environment and launching parallel efforts to build Nutanix support into internal provisioning and management tools, the Systems Support team, led by Scott Sorrentino, began collaborating closely with customers to migrate over 300 servers to the new Nutanix platform.
"The idea of picking up 300 VMs and moving them from one hypervisor to another is something I'd never seen done in my 20 years at Cornell," said Sorrentino.
The team built and tested a process to schedule migrations in weekly batches, using a vendor-provided data replication tool and some custom scripting to prepare each VM with the operating-system-level changes needed to run under the Nutanix Acropolis hypervisor. The actual “lift and shift” was eased by continuous communication within the team and with their customers. This major server transformation was completed seamlessly over a period of fourteen weeks, with the impact on most customers coming in the form of a brief reboot.
"Cornell’s transformation to a hyper-converged environment has positioned the university for greater scalability and cost efficiency, ensuring a more resilient infrastructure for the future," said Shirk.