PC: Shutterstock

When Humans Create Life: Synthetic Biology

Leveraging Cells to Do What we Want, and Expanding the DNA Alphabet to 6 Letters

Michael Trịnh
10 min read · Feb 11, 2019


3.5 billion years ago, the first known traces of life emerged on our planet.

Simple organic molecules came together through complex reactions to form genetic information in the form of ribonucleic acid (RNA). Somewhere after this point, RNA gave way to the first single-celled organism, a prokaryote.

Fast forward a few billion years, and things have started coming full circle: a distant descendant of that first prokaryote is now beginning to bring its own interpretation of life into the real world. Humans are now making progress in the field of synthetic biology, where we create entirely new kinds of organisms in order to fulfill human-set outcomes.

This can be anything from creating a cell that produces rose oil for your rose-scented perfume side business, to creating a new form of DNA which holds an exponentially higher amount of data than the DNA we all currently know.

The Synthetic Biology Thought Model

The fundamental premise of synthetic biology is to augment biological life so that it produces outcomes which we dictate. We call these forms of augmented life semi-synthetic organisms (SSOs). The orders which the synthetic biologist gives to their SSO are written in the form of DNA code, pre-written by human geneticists.

With the eventual goal of creating a synthetic organism from the bottom up, there is a common thought model which is applied when building new organisms:

  1. Synthesis: This is the writing of the actual DNA code that we want followed in synthetic cells. From small sequences to, eventually, entire genomes, this is where it all gets written up. The prelude to this step may include identifying the DNA sequences that make up certain genes by analyzing sequenced DNA.
  2. Standardize: In the standardization step, scientists try to establish minimum requirements that can be reproduced across many different synthetic biology experiments later on. A key example of this was establishing a minimal viable genome, which set the smallest number of genes (473 genes) that we could reduce a cell’s entire genome to while keeping it alive, healthy, and dividing in a controlled environment.
  3. Abstraction: This treats the functionality of the synthetic cell as the main metric of success. Skipping over technical aspects and factors where possible, abstraction leaves small-scale details until after an experiment is completed, allowing more emphasis to be put on getting an SSO working and alive first.

When combining these thought processes, we get a generalized rationale for building up new synthetic organisms. Knowing the thought process, we can then jump into the actual processes of building our SSOs.

Genetic Engineering vs Synthetic Biology

This may all sound very similar to the field of genetic engineering, where we manipulate organisms’ genomes by placing genes from other organisms in theirs. However, this is where synthetic biology differs: humans are now writing up their own gene sequences.

There may or may not be a gene that codes for a cell to produce biofuel, but we can definitely figure out how to write one up that does!

Our sequences may be based on a natural equivalent of the gene, or they may be completely unique. Either way, humans typed up the sequence, so we get to pat ourselves on the back for that. Synthetic biology explores how human-written sequences affect how the cell behaves and lives overall.

Artificial Genome Sequencing: Writing DNA Code

In order to type up and print out entire man-made genes, scientists first need to understand what they are coding for. This is done by genome sequencing, which can now be done quickly and economically.

Biobricks: Cooler than Lego

Gene sequencing allows us to identify the DNA sequences of certain genes or an entire genome, depending on the sample size. Sequencing tells us the exact order of the bases A, T, C, and G that make up individual genes, or entire genomes altogether.

The end goal, after everything, is to be left with Biobricks: small bits of synthetic DNA that, when added into a cell, alter its function and nature.

In order to end up with these biobricks, we need to kickstart the process of Artificial Genome Sequencing.

Inputting electronic data into a DNA printer: it’s less complex than you may think. (PC: Illumina)

Once the scientists know what they want to code for, it then becomes a process of typing up the appropriate biobricks (gene sequences), making them functional, and implementing them in a cell.

Printing Genetic Material: DNA Printing

Building entire genomes is not something you have time to do inefficiently. With hundreds of thousands of A, T, C, G base pairs making up even a minimal genome, the industry looks to a process called DNA printing to quickly build the biobricks we want.

It’s really in the name as to how this works: like a regular printer, the machine is fed electronic data, in this case gene sequences. However, rather than holding overpriced ink cartridges, the printer contains the four fundamental chemical bases of DNA: A, T, C, and G. The printer then reads the sequence instructions it’s fed and prints actual DNA polymers to match.

The end result? Fresh mini strands of artificial DNA, known as oligonucleotides, which we can then assemble!
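As a rough illustration of why printed oligonucleotides can later be assembled, here is a minimal Python sketch that chops a designed sequence into short pieces whose ends overlap. The function name, the oligo length, and the overlap size are all illustrative assumptions, not values from any particular DNA printer or protocol.

```python
def split_into_oligos(sequence, oligo_length=60, overlap=20):
    """Split a designed sequence into oligos whose ends overlap.

    oligo_length and overlap are illustrative defaults (assumptions),
    not parameters taken from a real synthesis platform.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    if not 0 <= overlap < oligo_length:
        raise ValueError("overlap must be smaller than oligo_length")
    sequence = sequence.upper()
    invalid = set(sequence) - set("ATCG")
    if invalid:
        raise ValueError(f"unexpected bases: {sorted(invalid)}")

    step = oligo_length - overlap  # how far each new oligo starts past the last
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + oligo_length])
        if start + oligo_length >= len(sequence):
            break  # this oligo reached the end of the design
        start += step
    return oligos


design = "ATCG" * 25  # a made-up 100-base design
for oligo in split_into_oligos(design, oligo_length=40, overlap=10):
    print(oligo)
```

Each oligo shares its last few bases with the start of the next one, which is what lets an assembly step put them back in the right order.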

Cloning DNA With Gibson Assembly

Creating new genes and DNA is really a game of numbers. The more of a certain modified gene you have in the cell, the more likely it will get replicated and transcribed. So in order to get more of our desired gene, we gotta go beyond DNA printers. We need our version of a DNA photocopier, which is essentially what cloning and its different implementations give us.

PC: Tobias Vornholt

For the sake of the synthetic biology field, let’s go with the cheaper, faster, and overall more specialized approach: Gibson Assembly.

Gibson Assembly specializes in stitching together the DNA fragments it’s given, in an order that we control by designing matching (overlapping) sequences at the fragments’ ends. This process takes advantage of three key enzymes:

  • Exonuclease for chewing back one strand at each fragment’s ends, leaving single-stranded overhangs that can pair with the matching overhang on a neighboring fragment.
  • DNA Polymerase for filling in the gaps that remain once the overhangs have paired.
  • DNA Ligase for sealing the remaining nicks, so that the joined fragments become one continuous molecule.
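To make the idea of joining fragments by their matching ends concrete, here is a small Python sketch that does the equivalent on plain text strings. It is only an analogy: it works on a single strand, ignores the enzymes and the reverse-complement strand entirely, and the function name and `min_overlap` parameter are hypothetical.

```python
def assemble(fragments, min_overlap=15):
    """Join fragments in the given order by merging their shared ends.

    A toy, single-strand stand-in for overlap-based assembly.
    min_overlap is an illustrative threshold, not a protocol value.
    """
    if not fragments:
        raise ValueError("need at least one fragment")
    assembled = fragments[0]
    for fragment in fragments[1:]:
        longest = min(len(assembled), len(fragment))
        # Look for the longest end of the construct that matches
        # the start of the next fragment.
        for size in range(longest, min_overlap - 1, -1):
            if assembled.endswith(fragment[:size]):
                assembled += fragment[size:]  # add only the new bases
                break
        else:
            raise ValueError("no overlap of at least min_overlap bases found")
    return assembled


parts = ["AAAACCCCGGGG", "CCCCGGGGTTTT"]  # made-up fragments
print(assemble(parts, min_overlap=4))  # AAAACCCCGGGGTTTT
```

Because the order is set by which ends match, changing the overlaps designed into each fragment changes the order of the final construct.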

Once the new genes have been assembled this way, cells can copy the finished construct over and over, amplifying the number of copies of the gene’s sequence. Being able to clone DNA is akin to having a 3D photocopier to accompany a 3D printer: accurate prototyping and mass production are both handled.

Redefining Genetics: 6 Base Pair DNA

Creating new organisms is pretty amazing, but one specific sub-field of synthetic biology is using DNA on an entirely new level: electronic data storage.

As a storage unit, our DNA is insanely efficient, having to contain entire genetic instructions in a tight, confined space. Or to put this potential in terms we can understand:

1 gram of DNA can store up to 215,000,000 gigabytes (215 petabytes) of raw data.

With that level of storage, you could theoretically store all of the digital information in the entire world using enough DNA to fill up the inside of a car. Humans have already managed to store the MP3 file of Martin Luther King Jr.’s I Have a Dream speech in DNA, in full quality and everything!
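To give a feel for how a file can become DNA at all, here is a minimal Python sketch that maps every two bits of data to one base. The mapping (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for illustration; real DNA storage schemes add error correction and avoid troublesome sequences, which this sketch does not attempt.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, two bits per base."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def dna_to_bytes(dna: str) -> bytes:
    """Decode a sequence produced by bytes_to_dna back into bytes."""
    if len(dna) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


encoded = bytes_to_dna(b"Hi")
print(encoded)                # CAGACGGC
print(dna_to_bytes(encoded))  # b'Hi'
```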

However, for some people, 215 petabytes per gram of DNA is not enough; they think we can make DNA even better at data storage.

Designing DNA With 6 Base Pairs!

In 2014, scientists fundamentally re-engineered the DNA molecule, taking it from four fundamental bases (A, T, C, G) to six.

Two new bases, dNaMTP and d5SICSTP (dubbed X and Y respectively), were added to the genetic code. Held together by hydrophobic interactions, the X-Y pair is still somewhat different from A-T and C-G, which are joined by hydrogen bonds.

However, the most significant area to watch is how adding X and Y bases to the genome allows for entirely new proteins. With X and Y available, brand new codons can be read by the cell’s ribosomes. From that point on, amino acids beyond the standard 20 can be built into proteins during protein synthesis.
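The jump in available codons is simple counting: a codon is three letters long, so the number of possible codons is the alphabet size cubed. The short Python sketch below works this out for the 4-letter and 6-letter alphabets; the helper name is made up for illustration.

```python
from itertools import product


def count_codons(alphabet, codon_length=3):
    """Number of distinct codons that an alphabet of bases can spell."""
    return len(alphabet) ** codon_length


natural = "ATCG"
expanded = "ATCGXY"

print(count_codons(natural))   # 64
print(count_codons(expanded))  # 216

# Codons that only exist because X or Y is available
new_codons = [
    "".join(codon)
    for codon in product(expanded, repeat=3)
    if "X" in codon or "Y" in codon
]
print(len(new_codons))  # 152
```

Those 152 extra codons are, in principle, free to be assigned to amino acids that the natural code has no room for.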

Adding bases to DNA adds entirely new dimensions of benefits: everything from greater DNA data storage capacity to an easier means for synthetic biologists to manipulate biology can follow from adding these bases to synthetic genomes.
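The storage gain can be estimated with a little information theory: a position that can hold one of N equally likely letters carries at most log2(N) bits. The Python sketch below compares 4 and 6 letters. Treat it as an idealized upper bound under that assumption, not a measured capacity, and note the function name is hypothetical.

```python
import math


def bits_per_base(alphabet_size):
    """Upper bound on the information one base position can carry."""
    return math.log2(alphabet_size)


four_letter = bits_per_base(4)  # 2.0 bits per base
six_letter = bits_per_base(6)   # roughly 2.585 bits per base

print(f"4 letters: {four_letter:.3f} bits per base")
print(f"6 letters: {six_letter:.3f} bits per base")
print(f"gain: {100 * (six_letter / four_letter - 1):.1f}% more per base")
```

By this idealized measure, each position in 6-letter DNA holds roughly 29% more information than a position in 4-letter DNA.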

The true challenge, though, is to get a functioning cell with 6-base DNA that can efficiently pass its genetic material on to its daughter cells.

In the same 2014 study, the scientists describe how they engineered modified E. coli bacteria to carry 6-base DNA. This gave rise to an entirely new strain of E. coli: the first ever organism to retain 6-base DNA and pass it on to its daughter cells.

Synthetic Life Is Already a Thing

In 2010, the world saw the first fully synthetic cell, created by Dr. Craig Venter and Dan Gibson (the guy who invented Gibson Assembly), dubbed Synthia or Syn 1.0. In the words of Dr. Venter himself:

“(Synthia) is the first self-replicating species on the planet, whose parent is a computer.”

Microscope Image of “Synthia” Cells

As you can imagine, creating an unnatural cell which can healthily function and reproduce is an immense task. Synthia had a problem: it did not function as well as its natural counterpart, nor did it replicate as fast as intended.

When X and Y base pairs are lost to mutation, we can use the gene editing features of the CRISPR-Cas9 enzyme to cut out the sequences that have lost their X-Y bases. This ends up forcing the remaining DNA to keep its X-Y bases.

Another crucial optimization step comes with smarter means of delivering new DNA. If we can deliver X and Y bases in a way that doesn’t cause toxicity, the 6-base DNA cells should then be able to proliferate more naturally. In the case of the semi-synthetic cell, a delivery mechanism based on a transporter known as PtNTT2 reduced toxicity in the cell, and helped the synthetic cells proliferate more.

Thanks to the work done in optimizing these semi-synthetic organisms, tangible progress has been made since Synthia first emerged. Using optimization processes, Dr. Gibson and his team created two successors to the original Synthia cell, dubbed Syn 2.0 and Syn 3.0.

Unlike their predecessor, versions 2.0 and 3.0 of this cell type can healthily function, proliferate, and retain/spread our synthetic DNA to their daughter cells!

Synthetic DNA Meets Cell

After the artificial genome sequencing is finished up, the most crucial step follows: introducing the ready-to-go synthetic DNA into a natural cell’s genome. A sort of DNA “hacking”, if you will.

As with everything else in synthetic biology, this follows a methodical process:

  1. A “transporter” protein is used to move the synthetic DNA into the cell.
  2. The synthetic gene gets integrated into the cell’s genome.
  3. The DNA is transcribed into messenger RNAs (mRNAs), which may be the cell’s natural mRNAs or synthetic ones that the scientists engineer.
  4. The mRNA is translated, and you get the product that you originally coded for in the artificial gene. The cell is ideally in a healthy state and dividing at expected rates, passing on your synthetic genes to its daughter cells.
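Steps 3 and 4 above, going from a gene to its product, can be sketched in a few lines of Python. This is a toy model under loud assumptions: the gene is a made-up sequence, the codon table is a tiny subset of the standard genetic code, and the function names are hypothetical.

```python
# A small subset of the standard genetic code (mRNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met",  # also the start codon
    "GCU": "Ala",
    "UGG": "Trp",
    "AAA": "Lys",
    "GGU": "Gly",
    "UAA": "STOP",
}


def transcribe(coding_strand):
    """Write the coding strand as mRNA by swapping T for U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing gets made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in this toy table")
        residue = CODON_TABLE[codon]
        if residue == "STOP":
            break
        protein.append(residue)
    return protein


synthetic_gene = "ATGGCTTGGTAA"  # made-up example gene
mrna = transcribe(synthetic_gene)
print(mrna)             # AUGGCUUGGUAA
print(translate(mrna))  # ['Met', 'Ala', 'Trp']
```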

This is a process that is still very much a work in progress, with many more areas for optimization and improvement. Keeping cells healthy tends to be a consistent issue for a lot of synthetic biology projects, which makes sense considering we do not know the function of about a quarter of the genes we’re putting in our synthetic genome.

This all points to the need for more work in identifying the roles of the genes being engineered into a cell’s minimal viable genome.

Beyond Micro-Factories and Key Takeaways

Producing ethanol and biofuels from synthetic cells may sound incredible, but that is just one of many directions where the field can go.

As scientists gain more leverage over different gene sequences that can be created, we have more potential paths to expand the field.

Everything from manipulating natural cells to act unnaturally, to building new synthetic cells from the bottom up, may very well lie in the near future. It is crucial to note that the ability to manipulate and create new forms of life has as much power to harm humanity as it does to benefit it.

Security and ethics are already huge elephants in the room for Synthetic Biology, and for good reason. Criminals and rogue states being able to manipulate life and create their own genomes, could spell a dangerous new threat for human life as a whole.

All this puts synthetic biology in a unique position to change our approach to solving problems, even in areas that may not apply much biology today.

To summarize this review, here are the key takeaways on where synthetic biology currently stands:

  • Synthetic biology is the study of how human-made genes can alter the nature and function of biological organisms, for our own intended uses.
  • An advantage of DNA-based solutions is that the cells should be able to reproduce the engineered gene, introducing it into the gene pool altogether.
  • There are two very prevalent areas in synthetic biology: Artificial Genome Sequencing (AGS), and DNA Computing.
  • Scientists have created 6-base DNA by adding two new DNA bases: dNaMTP (X) and d5SICSTP (Y).
  • In the near future, everything from hijacking cells to produce certain products such as biofuel, to building modified immune cells from the bottom up in order to battle cancer, is very possible.

Michael Trịnh

Undergraduate builder & researcher @UofT in the crossroads of bioinformatics, immunology, and genome engineering.