
Addressing Data Center Relocation Challenges: Insights from H3C's bilibili Project
Bilibili is a popular Chinese platform that blends social media with video sharing. Its content spans a wide range of topics, including activities, lifestyle, gaming, entertainment, and technology. Users can explore these topics and take part in community-driven content through professional user-generated videos (PUGV), alongside occupationally generated videos (OGV) produced by professional studios. The platform supports not only content consumption but also commercial video production, allowing creators to reach their audiences directly and monetize their work.
After 18 months of work across multiple regions and the relocation of tens of thousands of servers and switches, bilibili has successfully completed its data center relocation project. The new data center features more advanced infrastructure and enhanced technical support. The upgrade is intended to optimize the business layout, support remote (geo-distributed) multi-active operations, improve resource utilization and operational stability, and give bilibili users a better access experience.
Data center relocation is a complex, systematic undertaking. It involves not only physically moving servers, switches, routers, firewalls, and storage devices, but also migrating data and business services, migrating network connections, and adjusting the equipment room environment. Mistakes during relocation can lead to serious consequences such as equipment damage, data loss, and business interruption.
To minimize downtime and avoid any impact on or interruption to user services, bilibili worked with H3C to plan the migration and relocation process carefully. The partnership drew on H3C's extensive experience in data center migrations and its strong team support. The project team devised a comprehensive emergency plan covering a range of scenarios, including unexpected changes in business operations, adjustments to data center policies, network outages under special circumstances, changes to entry and exit procedures, and emergency repairs to transportation equipment in the data center. Each identified risk was paired with a clear and detailed emergency strategy, so that the relocation could proceed smoothly while safeguarding data security and maintaining business continuity.
During the 18-month, multi-batch rolling migration, the project team dealt with complex scenarios, long project cycles, many coordinating parties, and difficult execution. The team carried out the process step by step on-site.
For instance, one batch involved over 1,700 devices, and the entire process, from equipment removal to service relaunch, had to be completed within one week. The work included shutting down services, backing up data, dismantling servers and switches, and then transporting, installing, and racking the equipment.
All steps were carried out in an orderly manner, and the team completed every relocation task on time. The failure rate was below 0.1%, and business services restarted smoothly.
In line with the national "dual carbon" strategic goal, bilibili's new data center focuses on green energy conservation. The project builds the principles of a low-carbon economy, energy conservation, and emission reduction into its design and construction. Through careful layout planning, advanced energy-saving equipment, and efficient operation and maintenance management, the overall Power Usage Effectiveness (PUE) of the equipment room has been reduced from 1.5 to below 1.25. This improvement significantly lowers energy consumption and carbon emissions while also improving the service levels (SLA) the equipment room can deliver.
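To put the PUE figures in perspective: PUE is the ratio of total facility energy to IT equipment energy, so everything above 1.0 is overhead such as cooling and power distribution. The following is a minimal sketch of that arithmetic. The 1,000 kW IT load is a hypothetical figure chosen for illustration and is not a number from the project, and 1.25 is treated as the upper bound of the new PUE.

```python
def facility_power(it_load_kw: float, pue: float) -> float:
    """Total facility power implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_load_kw * pue


def overhead_power(it_load_kw: float, pue: float) -> float:
    """Non-IT power (cooling, power distribution, etc.) implied by the PUE."""
    return facility_power(it_load_kw, pue) - it_load_kw


IT_LOAD_KW = 1000.0  # hypothetical IT load, for illustration only
OLD_PUE = 1.5        # figure quoted in the article
NEW_PUE = 1.25       # article says "below 1.25"; used here as an upper bound

old_total = facility_power(IT_LOAD_KW, OLD_PUE)
new_total = facility_power(IT_LOAD_KW, NEW_PUE)

# Fraction of total facility power saved at the same IT load
saving_fraction = (old_total - new_total) / old_total

# Ratio of new overhead to old overhead
overhead_ratio = overhead_power(IT_LOAD_KW, NEW_PUE) / overhead_power(IT_LOAD_KW, OLD_PUE)

print(f"Total facility power: {old_total:.0f} kW -> {new_total:.0f} kW")
print(f"Saving at equal IT load: {saving_fraction:.1%}")
print(f"Overhead relative to before: {overhead_ratio:.0%}")
```

Under these assumptions, the same IT load draws about 16.7% less total power, and the non-IT overhead is cut in half. The result scales linearly, so the percentages hold for any IT load.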
Additionally, the new data center employs state-of-the-art network equipment, which greatly improves network transmission efficiency and response times. The optimization of network topology and security measures has also considerably decreased the risk of network failures and downtime.
Notably, H3C also used the relocation as an opportunity to help bilibili carry out a comprehensive overhaul of its server fleet. This included replacing faulty hardware in batches, updating problematic firmware versions, standardizing host BMC/BIOS configurations, and aligning kernel versions and system environments. These steps were taken to ensure consistency across the fleet, simplify operation and maintenance management, and ultimately improve the operating efficiency and stability of the new equipment room.
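The kind of consistency check that such standardization implies can be sketched as a comparison of each host against an agreed baseline. This is only an illustration and not the tooling used in the project: the host names, field names, and version strings below are all hypothetical, and in practice the inventory would come from whatever asset or configuration management system is in use.

```python
from typing import Dict, Tuple

HostInventory = Dict[str, Dict[str, str]]
Drift = Dict[str, Dict[str, Tuple[str, str]]]


def find_drift(hosts: HostInventory, baseline: Dict[str, str]) -> Drift:
    """Return, per host, the fields whose value differs from the baseline.

    Each entry maps field -> (actual value, expected value). A field that is
    missing on a host is reported with the actual value "<missing>".
    """
    drift: Drift = {}
    for name, attrs in hosts.items():
        diffs = {
            field: (attrs.get(field, "<missing>"), expected)
            for field, expected in baseline.items()
            if attrs.get(field, "<missing>") != expected
        }
        if diffs:
            drift[name] = diffs
    return drift


# Hypothetical target versions agreed for the fleet
baseline = {"bmc": "2.10", "bios": "5.21", "kernel": "5.10.0-100"}

# Hypothetical inventory snapshot
hosts = {
    "node-01": {"bmc": "2.10", "bios": "5.21", "kernel": "5.10.0-100"},
    "node-02": {"bmc": "2.08", "bios": "5.21", "kernel": "5.10.0-100"},
    "node-03": {"bmc": "2.10", "bios": "5.21", "kernel": "4.19.0-90"},
}

drift = find_drift(hosts, baseline)
for host, diffs in sorted(drift.items()):
    for field, (actual, expected) in sorted(diffs.items()):
        print(f"{host}: {field} is {actual}, expected {expected}")
```

Using an explicit baseline rather than inferring one from the most common value keeps the check deterministic: a fleet in which most hosts run an outdated firmware version would still be flagged.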