It is inarguable that we've recently undergone a "Greater than Moore's law" advancement in sequencing our genome. 16 years ago, sequencing an entire genome would have taken 16 years and cost $3 Billion dollars. Now you can do it at home for a cost of $1,000 and it will take about a week.
Logically, this will become a simple sensor on a device like a smartphone and it will sequence your entire genome in a few seconds. That's perhaps 5 years out at the earliest, right?
Except that it's not 5 years out anymore, it's here right now...
You can know your entire genome, and it is already small enough to fit in the free space on your cellphone or laptop.
You can get that down even further. If you simply take the deltas (differences) between you and the "reference genome". The maximum size of the delta set for a viable human is 1.4MBs. This is small enough to fit on a floppy disk. Or, a blockchain?
But what is this "reference genome?" Where did it come from?
The genome published by the HGP does not represent the sequence of every individual's genome. It is the combined mosaic of a small number of anonymous donors, all of European origin. The HGP genome is a scaffold for future work in identifying differences among individuals. Subsequent projects sequenced the genomes of multiple distinct ethnic groups, though as of today there is still only one "reference genome."
A small number of anonymous donors? All of European origin? Why?
It's an artifact of the selection process. The work was carried out in areas that were predominately European. No Nazis or Illuminati conspiracies were involved, despite the humor I was trying to convey in my last blog topic.
The fact is, these are just normal people like you and me. They are very smart and dedicated to what they do.
Still who were these donors?
In the IHGSC international public-sector Human Genome Project (HGP), researchers collected blood (female) or sperm (male) samples from a large number of donors. Only a few of many collected samples were processed as DNA resources. Thus the donor identities were protected so neither donors nor scientists could know whose DNA was sequenced. DNA clones from many different libraries were used in the overall project, with most of those libraries being created by Pieter J. de Jong's lab. Much of the sequence (>70%) of the reference genome produced by the public HGP came from a single anonymous male donor from Buffalo, New York (code name RP11).
We know that 70% of the reference genome came from a single person who is anonymous, but I'm wiling to bet it's this guy...
If you're singlehandedly responsible for sequencing 70% of the reference genome, at a time when the failure rate was enormous, you'll probably need more than a single sample.
So this is telling us that >70% of the "reference genome" came from what the future may well describe as a "genetic adam". There is a precedent in nature for it. About 10,000 - 75,000 years ago, the population was "bottle-necked" for some reason.
Why Adam/Eve or possibly Noah?
It's because Religions aren't just stories someone made up!
They are scars imprinted on all of the descendants of a rag tag band of survivors of something catastrophic. This is why these stories fascinate us endlessly and I'm willing to bet it's our spirituality that allowed us to whether that without going completely insane.
What the hell happened? We just don't know. These stories predate our known history and all we have are artifacts. Scars embedded in our language, our religions and our genome. This survival came at the cost of our genetic diversity.
You can take any 2 humans on earth and their genetic differences would fit on a floppy disk. But you can take any two puppies, born any where in the world, and their differences would require a stack of floppies two feet high.
So besides the one guy we know about, who are these other "reference people" that we have already backed up?
DNA from five different individuals were used for sequencing. The lead scientist of Celera Genomics at that time, Craig Venter, later acknowledged (in a public letter to the journal Science) that his DNA was one of 21 samples in the pool, five of which were selected for use.
Alright so Craig Venter is obviously one.
In 2007, a team led by Jonathan Rothberg published James Watson's entire genome, unveiling the six-billion-nucleotide genome of a single individual for the first time.
So James Watson is another.
Note that James Watson recieved the Nobel Prize for figuring out what DNA looked like.
These are the 2 people we know for sure and one other guy that we can be 70% certain about.
The bulk of the reference genome is actually chimera/mosaic composed of 5 people. All of whom are documented to be European descended, three of which are males.
If we needed to "restore our humanity", from this reference genome.
We're all going to end up with blond hair and blue eyes.
But why would we need to "reboot" our species? Isn't that a bit extreme?
Yes, it's extreme and that's the point. You don't take a backup of your data because you know your hard drive is going to fail tomorrow. You keep regular backups in case the unforeseen happens.
The fact is, the Human Genome Project to sequence the genome, was a "moon shot".
Just like going to the moon triggered enormous advancements in information technology. The Human Genome Project 1.0 was a "moon shot", that resulted in a "Greater than Moore's law" increase in knowledge about our genome as well as advances in the technology to sequence it.
Now we're about to undertake Human Genome moonshot 2.0
The Human Genome Project - Write (also known as Human Genome Project-Write, Genome Project-Write and HGP-Write), formally announced on 2 June 2016, is a ten year extension of the Human Genome Project, to synthesize the human genome. The human genome consists of three billion DNA nucleotides, which have been described in the Human Genome Project - Read program, completed in 2003. The newly created HGP-Write project will be managed by the Center of Excellence for Engineering Biology, a new nonprofit organization. The researchers expect that the ability to synthesize large portions of the human genome would result in many scientific and medical advances.
Genetic engineering has already reached the stage where anyone can do it at home. There are actually kits you can purchase in order to modify model organisms in novel ways. These kits have very detailed instructions because...
They're intended for Junior High students!
The "hello world" of genetic engineering is to get bacteria to glow in the dark.
What happens if you accidentally replace your own microbiome with this glow in the dark, ampicillin resistant e. coli? That's not the future, it is something actually happening right now.
Yet now we're about to moonshot the ability to synthesize our own genome, not just the genome of model organisms. But the fact is, we don't really know our own genome. If we make a mistake and end up having to "fix it", by restoring from a backup, we're all going to be using someone else's backup.
Herein lies the problem. Despite having what amounts to a "core dump", of our genome. We still don't know what the bulk of this code does and there is currently no debugger to step through it. We're going to have to reverse engineer it and that's what we're doing now.
The moonshot 3.0 we've just embarked upon will change all of that. There has never been a more exciting or interesting time to be alive. We'll get a genomic debugger and editor out of this and probably a compiler and runtime simulator as well. The truth is we've already sorted out most of the "update" mechanism, and that will only get easier as nanotech and genetics begin to merge.
We already have the editors and we already have tools to update living organisms with new code.
We just don't really know what we're looking at yet, when we stare at that code. Because mother nature is the ultimate hacker and she didn't leave any comments.
The logical extension of this is that eventually we will be able to control our own genome directly. We just need to figure out what this machinery and it's programming are actually doing. The logical net result is that we will have the ability to write "genetic algorithms and apps", to modify our own bodies directly.
You can bet we're going to be sharing these one way or the other.
Because we already do this "sharing of new code", every year when we spread the cold or flu around.
I am not afraid of the technology. However I am concerned with the consequences.
If we edit out traits that we consider negative, such as eventually turning into a fat old bald guy. What happens when the same selective pressures that caused the bottleneck, repeat again?
The solution here, is not to demonize the technology. The solution is to provide an adequate backup solution in order to save the "missing information". This should allow us to restore our individual genetic information back to the original programming that nature spent a few billion years hand optimizing to run on our unique bits of molecular machinery. Unique bits which are not included in the reference genome at all.
We should do this, not only in case we make a colossal mistake, but also in case we make a relatively minor mistake.
How would it be, to download a simple "burn fat +10%" app to our body. Only to find out years later that as a side effect, our bodies are constantly flooded with sugar, thereby turning our blood into syrup. Forcing our pancreas into a "hyper stressed" mode eventually shutting it down and making you an insulin dependent diabetic, with no possibility of a "code roll back".
Also, there's the good side of this. Someone who is a Type 1 diabetic could just download a cure from someone genetically similar with no diabetes. If it doesn't work or turns out to have nasty side effects, they can just roll it back.
So here is what I would like to accomplish.
I propose building a blockchain backed "life bank", that allows us to upload our own genomic information and place it under lock and key so that it can be read on demand if needed, but cannot be overwritten. Something we can append to, as more information about our own internal diversity becomes available. The primary purpose would of course be to protect against existential threats. But also just the occasional "whoops!". All without worrying about suddenly waking up as a 6'2" 180lbs Nordic man with blond hair and blue eyes, because we had to restore from someone else's backup.
Some final thoughts from the inventor of CRISPR/Cas9, you should hear her out too.
I am not afraid of the technology. However I am concerned with the consequences.
The next step will be our connectome.
Oh wait, we have that too.
Perhaps, we should consider taking a backup of that as well.
I'll explain more on how to do that in my next blog, if this topic generates sufficient interest.