[WIP] GPU-based whole genome analysis

10/1/25

After running a CPU-based whole-genome analysis in my previous post, I saw firsthand how slow CPU-based genome analysis is, whether single-threaded or multi-threaded. So I looked for alternatives like Parabricks DeepVariant, DRAGEN, and DNAnexus. Sentieon was on my list of software to review, but their software is extremely optimized for CPUs, and I don’t see them staying competitive as GPU/TPU production scales up alongside the neural networks that run on that hardware.

And that’s another thing I found significantly lacking: there aren’t many neural networks doing this. Whole-genome sequencing companies like Nebula and Nucleus are going after it, meaning they are building neural networks that can read DNA and describe variants, using the health market as the means to build them. It is a perfect machine learning problem.

Even the GPU-based options for variant calling and analysis are just heavily optimized GPU programs, much like the CPU ones. The difference is that they break the embarrassingly parallel sub-computations down so that they map tightly onto the specific GPU architecture the analysis runs on.
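To make "embarrassingly parallel" concrete, here is a toy sketch in Python. It is not how Parabricks or DRAGEN work internally. The function names and the mismatch-counting task are made up for illustration. The only point is that each genomic window can be processed without knowing anything about the others, which is the property a GPU program exploits when it hands each chunk to its own group of threads.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_windows(length, window_size):
    """Return half-open (start, end) windows covering [0, length)."""
    return [(start, min(start + window_size, length))
            for start in range(0, length, window_size)]


def count_mismatches(reference, sample, window):
    """Count positions in one window where the sample differs from the reference."""
    start, end = window
    return sum(1 for r, s in zip(reference[start:end], sample[start:end]) if r != s)


def mismatches_per_window(reference, sample, window_size, workers=4):
    """Process every window independently; no window needs another's result."""
    windows = split_into_windows(len(reference), window_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads stand in for GPU thread blocks here (an illustrative assumption).
        counts = list(pool.map(
            lambda w: count_mismatches(reference, sample, w), windows))
    return list(zip(windows, counts))


if __name__ == "__main__":
    reference = "ACGTACGTACGT"
    sample = "ACGAACGTTCGT"
    for window, count in mismatches_per_window(reference, sample, window_size=4):
        print(window, count)
```

Because the windows share no state, the work scales with however many workers you have. The real tools do the same thing, but with far more careful attention to memory layout and the specific GPU they target.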

So, in this project, we’re going to check out the GPU-based variant callers + analysis products and run them in a single pipeline, just like we did with the CPU-based one.