This site may earn affiliate commissions from the links on this page. Terms of use.

Dna data storage is a big deal. Partly, information technology's because we're based on DNA, and whatever research into manipulation of that molecule volition pay dividends for medicine and biology in general — just in part, it'due south also because the world's most wealthy and powerful corporations are getting discouraged at cost estimates for data storage in the futurity. Facebook, Apple, Google, the Us government, and more than are all making phenomenal investments in storage ("exabyte" is the buzzword now). But even these mega-projects can only put off the inevitable for so long; nosotros are simply producing too much data for magnetic storage to keep up, without a major unforeseen shift in the technology.

That'due south why a company like Microsoft recently decided to invest in the prospect of storing data with a totally different sort of tech: biotech. It might seem off-brand for the software giant, but teaming up with academics to accept on molecular biology has produced stunning results: The team was able to store and perfectly call up digital information with incredible storage density. According to an accompanying blog postal service, they managed to pack about 200 megabytes of data into only a fraction of a driblet of liquid, including a compressed music video from the band OK Get. Even more impressive, that information was stored in a speedily and hands accessible form, making it more akin to computer RAM, than computer storage.

So how did they accomplish this incredible feat?

First, they had to catechumen the digital lawmaking of 1's and 0's to a genetic code of A's, C's, T's, and Chiliad's, then take this lowly text file and manually construct the molecule it represents. Each of these is a feat in and of itself. Dna storage requires cutting-edge techniques in data pinch and security to pattern a sequence both info-dense plenty to realize DNA'southward potential and redundant enough to allow robust fault-checking to improve the accurateness of data retrieved down the line.

dna storage 5Very petty of the technology on display hither is new, since the most of import parts of the organization have existed much longer than mankind itself. Merely if all the information necessary to lawmaking for Albert Einstein was contained within the nucleus of every single cell of Albert Einstein'southward body, as it was, then this classical approach to data storage must have something going for it. Researchers in this field set out to understand and harness that something, and they're getting better at information technology seemingly every couple of months.

At the cease of the day, Deoxyribonucleic acid's key special attribute it data storage density: how much information can DNA fit into a given unit volume? The NSA's largest, nearly notorious data-center is an enormous, sprawling complex full of networked racks of magnetic storage drives — but co-ordinate to some estimates, Dna could take the volume of data independent in about a hundred industrial data centers and store it in a space roughly the size of a shoe box.

DNA achieves this in ii means. One, the coding units are very small-scale, less than half a nanometer to a side, where the transistors of a mod, advanced computer storage drive struggle to vanquish the 10 nanometer mark. Just the increment in storage capacity isn't only ten- or a hundred-fold, but thousands-fold. That differential arises from the 2d big advantage of Deoxyribonucleic acid: it has no problem packing three-dimensionally.

GENOME SPEED.jpg

Sequencing has gotten much faster and cheaper over fourth dimension — and that's skillful, because we need to sequence Dna information to read it!

Meet, transistors are generally aligned on a flat plane, meaning their ability to fully employ a given space is pretty low. Nosotros can of class stack many such flat boards i atop another, just at that point a new and totally debilitating trouble arises: heat. One of the most challenging parts of designing new transistor-based technologies, whether they're processors or storage devices, is oestrus. The more tightly yous pack silicon transistors, the more heat you'll create, and the harder it will be to ferry that heat away from the device. This both limits the maximum density, and requires that we supplement the price of the drives themselves with expensive cooling systems.

With its super-efficient packing structure, the DNA double helix offers a great solution. Chromatin, the Deoxyribonucleic acid-protein system that makes up chromosomes, is essentially a very complex mechanism designed to allow an inherently viscid molecule like DNA to roll up actually tight, yet withal unroll chop-chop and easily afterwards on, when certain patches of Deoxyribonucleic acid are needed by the body.

dna storage 4

Hither's a simplified wait at how DNA packs and so tightly into three-dimensional space.

This at-paw nature of the chromatin system, which allows any gene to exist "called" from whatsoever part of the genome with roughly equal efficiency, has led the researchers to dub their storage organization a DNA version of a computer'due south random access retention, or RAM. Similar RAM, the concrete location of a piece of data within the drive isn't important to the reckoner'southward ability to access that information.

DNAHowever, storing information in DNA differs from figurer RAM in some pretty pregnant ways. Nigh notable is speed; part of what makes RAM RAM is that its piece of cake-access system is also a quick access system, allowing it to hold data the calculator might need at an instant'due south detect, and make it bachelor on those timescales. On the other hand, DNA is significantly harder and slower to read than conventional figurer transistors, pregnant in terms of access speed it's actually less RAM-like than your average computer SSD or spinning magnetic hard-bulldoze.

That'due south considering the incredible abilities of development'southward data storage solution were tailored to evolution'due south unique needs, and those needs don't necessarily include performing thousands of "reads" per 2d. Regular, cellular DNA data storage has to untangle the complex chromatin structure of stable Dna, and so unwind the Deoxyribonucleic acid double helix itself, brand a re-create of the sequence of involvement, then zilch everything right support the way it was — it takes a while.

For our purposes, we must then add the extra step of reading the DNA. In this example, that's achieved past using an age-old technique in biotech labs called the polymerase chain reaction (PCR) to amplify, or repeatedly indistinguishable, the sequence nosotros want to read. The whole sample is then sequenced, and everything but the many-many-many-times repeated sequence we amplified is discarded. What remains is our sequence of involvement. These stretches of Dna are marked with footling target sequences that let the PCR proteins to bind, and the replication process to begin.

gene therapyIn cells, genes are turned "on" and "off" largely by changing the availability of these target sequences to the ever-waiting machinery of Deoxyribonucleic acid replication. This tin can be done via the winding and unwinding of chromatin, the direct addition or removal of a blocker poly peptide, or even interaction with other areas of the genome to promote or foreclose transcription. In a man-made information storage system, nosotros could theoretically make something ameliorate suited to our needs, stronger or more efficient or less wasteful on forms of security nosotros don't need for this purpose, simply that would require a level of composure in protein engineering that even so seem a ways out.

Check out our ExtremeTech Explains serial for more than in-depth coverage of today'southward hottest tech topics.

At present read: How DNA sequencing works