Data Architecture Studied to Speed Genomic Sequencing

Image: Full data screen (Markus Spiske, Unsplash)

29 Jan. 2019. An academic-industry team of researchers and engineers is studying better ways of structuring the large volumes of stored data that genomic sequencing requires, with the aim of speeding up the process. The project brings together researchers from the genomics institute and engineering school at the University of California, Santa Cruz with colleagues from data storage device maker Western Digital in San Jose, California, the funder of the initiative.

The research team seeks to relieve a growing bottleneck in the computing power needed to conduct genomic sequencing in the clinic for diagnosing disease. Even with modern computers, genomic sequencing requires large-scale data storage and complex algorithms that demand intensive analytical power. Today’s computational methods normally move these high volumes of data to computing sites designed for multiple business applications, which requires large bandwidth, more power, and more time to complete the sequencing. Those demands for computing resources are expected to grow quickly as precision medicine, which identifies treatments to match a patient’s unique molecular composition, continues to expand.

Western Digital’s engineers believe sequencing can be accelerated by reversing that flow and moving computing power closer to data storage devices. The project will test the company’s idea with real-life genomics data volumes, stored on different devices within a single computing system. “By moving the compute to the data, rather than the data to the compute,” says Robin O’Neill, head of emerging systems and software at Western Digital, in a company statement, “we expect to take advantage of significantly greater bandwidth access to the genomics data by the near-media compute, as well as the custom, parallel computing capabilities within each computational storage device.”

The Western Digital team, with associates from the UC Santa Cruz Genomics Institute and the university’s engineering school, is expected to identify optimal methods for partitioning genomics data across different storage devices. The researchers also anticipate studying which core functions can run on each device to accelerate performance and minimize the load on host systems. These efficiencies should also reduce the power consumed by the system and eventually lower computing costs.
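
To illustrate the general idea of moving the compute to the data, here is a minimal Python sketch. It is not Western Digital's or UC Santa Cruz's design: the `StorageDevice` class, the round-robin `partition` function, and the GC-content calculation are all hypothetical stand-ins chosen only to show how partitioning reads across devices and computing locally can shrink the amount of data sent to the host.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StorageDevice:
    """Hypothetical computational storage device holding a shard of reads."""
    name: str
    reads: List[str] = field(default_factory=list)

    def count_gc_local(self) -> Tuple[int, int]:
        """Compute next to the data; return only (gc_bases, total_bases)."""
        gc = sum(r.count("G") + r.count("C") for r in self.reads)
        total = sum(len(r) for r in self.reads)
        return gc, total


def partition(reads: List[str], devices: List[StorageDevice]) -> None:
    """Round-robin partitioning (an assumption; one of many possible schemes)."""
    for i, read in enumerate(reads):
        devices[i % len(devices)].reads.append(read)


def gc_fraction_host_side(devices: List[StorageDevice]) -> Tuple[float, int]:
    """Baseline: ship every base to the host, then compute there."""
    moved = gc = total = 0
    for dev in devices:
        for read in dev.reads:
            moved += len(read)  # every base crosses to the host
            gc += read.count("G") + read.count("C")
            total += len(read)
    return gc / total, moved


def gc_fraction_near_data(devices: List[StorageDevice]) -> Tuple[float, int]:
    """Near-data: each device returns two integers instead of its reads."""
    moved = gc = total = 0
    for dev in devices:
        dev_gc, dev_total = dev.count_gc_local()
        moved += 2  # only two summary values cross to the host
        gc += dev_gc
        total += dev_total
    return gc / total, moved
```

Both functions return the same GC fraction, along with a rough count of the values transferred to the host. In this toy model the host-side version moves every base, while the near-data version moves two numbers per device, regardless of how many reads each device stores.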

The project is funded by Western Digital, with the company also providing its Ultrastar Data60 hybrid storage platform for use by UC Santa Cruz faculty and graduate students in genomics and computer engineering. No further financial details were disclosed.

“Genomic data is on a trajectory to grow faster than almost every other type of data in the world,” notes Benedict Paten, director of the computational genomics lab at UC Santa Cruz. “Moving the compute to the data is a key strategy already being adopted at the software layer in several global genomics initiatives. With the support and close collaboration of Western Digital’s team we’ll be able to take this all the way down to the hardware layer.”
