Jefferson Lab to lead $300M data hub

The Thomas Jefferson National Accelerator Facility will lead a $300 million to $500 million data science computing hub that will make scientific data more accessible to scientists nationwide, the U.S. Department of Energy announced Monday.

Known as the High Performance Data Facility hub (HPDF), the project will be housed at the Newport News lab in a new data center to be funded by the state, which has allocated $6 million in seed funding and committed to provide $43 million for construction. Jefferson Lab, one of several U.S. labs run by the DOE, will partner with the Lawrence Berkeley National Laboratory, the DOE’s facility in Berkeley, California. The new data center will have room for upgrades and expansions, according to the announcement.

The data center will initially be 65,000 square feet but can be expanded to 110,000 square feet, and it is expected to be operational by fiscal year 2028. The DOE estimates an operational budget of $75 million a year, beginning in FY2028.

“It opens the opportunity for significant research grants through federal agencies at our Virginia universities,” Jefferson Lab Director Stuart Henderson told Virginia Business on Monday. “The facility will be utilized by many thousands of scientists and engineers annually, including researchers [scientists and engineers] from private industry, academe, national labs and other government labs.” 

The investment includes creating 150 jobs, to be added over time as the project ramps up, including 50 for computer scientists and engineers, Henderson said. The facility will be organized as a hub-and-spoke model, with the lead infrastructure at Jefferson Lab in Newport News and mirrored at the Berkeley Lab. The spoke sites will be based on the Jefferson Lab-Berkeley Lab model.

The HPDF will be led by Amber Boehnlein, Jefferson Lab’s associate director for computational science and technology, and the lab will collaborate with several Virginia universities, including Old Dominion University, William & Mary, Virginia Tech and the University of Virginia. 

The hub’s primary mission is to create “seamless integration” of scientific data from multiple facilities, allowing researchers to use and share large, complex datasets with other scientists quickly as they pursue scientific advancements. The hub will also use “FAIR data principles, meaning data will be findable, accessible, interoperable and reusable,” according to the announcement. Artificial intelligence tools and machine learning are also expected to make data easier to find via search.

“High-quality research data is the rocket fuel of the AI era and all other forms of emerging technologies,” Geraldine Richmond, DOE’s under secretary for science and innovation, said in a statement. “At the same time, modern collaborative science demands linking distributed research resources. The High Performance Data Facility will play a central role in the operation and success of the [Integrated Research Infrastructure] program, which is designed to serve the data and analysis needs of our many DOE national laboratory user facilities and more.”