Author: logancollins

Guide to the Structure and Function of the Adenovirus Capsid


PDF version: Guide to the Structure and Function of the Adenovirus Capsid

For this guide, I will explain the fundamental biology of adenovirus capsid proteins with an emphasis on the context of gene therapy. While the guide is meant primarily for readers with an interest in applying adenovirus to gene therapy, it will not include much discussion of the techniques and technologies involved in engineering adenoviruses for such purposes. If you are interested in learning more about adenovirus engineering, you may enjoy my review paper “Synthetic Biology Approaches for Engineering Next-Generation Adenoviral Gene Therapies” [1]. Here, I will focus mostly on the capsid of human adenovirus serotype 5 (Ad5) since it is the type of adenovirus most commonly used in gene therapy research, but I will occasionally describe other types of adenoviruses when necessary. Many of the presented concepts remain the same or similar across other types of adenoviruses.

The adenovirus consists of an icosahedral protein capsid enclosing a double-stranded DNA (dsDNA) genome. It possesses 12 fiber proteins which protrude from the capsid and help to facilitate cellular transduction. Adenoviruses are nonenveloped and approximately 90 nm in diameter (not including the fibers). The Ad5 genome is about 36 kb in size. Major capsid proteins of the adenovirus include the hexon, penton, and fiber. The minor capsid proteins are protein IIIa, protein VI, protein VIII, and protein IX. Inside the capsid, there are core proteins including protein V, protein VII, protein μ (also known as protein X), adenovirus proteinase (AVP), protein IVa2, and terminal protein (TP) [2]. There are also many proteins expressed during adenovirus infection which are not incorporated into mature capsids, including the E1A proteins (289R, 243R, 217R, 171R, and 55R), the E1B proteins (52k and 55k), the adenoviral DNA polymerase, and more [3].

Ad5’s genome contains a variety of transcriptional units which are expressed at different times during the viral life cycle [3]. The E1A, E1B, E2A, E2B, E3, and E4 transcriptional units are expressed early during cellular infection. Their proteins are involved in DNA replication, transcriptional regulation, and suppression of host immune responses. The L1, L2, L3, L4, and L5 transcriptional units are expressed later in the life cycle. Their products include most of the capsid proteins as well as other proteins involved in packaging and assembly. Each transcriptional unit can produce multiple mRNAs through the host’s alternative splicing machinery.

Major capsid proteins


Adenovirus hexon represents the main structural component of the capsid. It is encoded as one of the products of the Ad5 L3 gene. Each capsid contains 240 trimers of the hexon protein (720 monomers) and each facet of the icosahedron consists of 12 trimers [4]. The lower part of each hexon monomer consists of two eight-stranded β-barrels linked by a β-sheet. The eight-stranded β-barrels are known as jellyroll domains. In between the β-strands, long loops are present. These loops contain the seven hypervariable regions (HVRs) of the hexon, which differ in sequence composition between distinct adenovirus types. The loops form the upper portion of each hexon. HVR1 of Ad5 includes a 32-residue acidic loop which might be involved in neutralizing host defensins. The valley between the loop towers of Ad5 has been shown to interact with coagulation factors as well as to bind to the CD46 cellular receptor as an alternative cell entry mechanism.

Here, the structure of the Ad5 hexon trimer is shown from a side view and from a top view (PDB 1P30). All β-sheets are red, α-helices are cyan, and loops are magenta. Jellyroll domains are visible at the base of the side view and the HVR loops can be seen in the upper half of the side view. In the top view, the hexagonal shape of the hexon is clearly visible. The N- and C- termini are both located near the bottom of the hexon (adjacent to the inside of the virion). Some disordered regions are shown as dashed lines.


The 12 pentons serve to fill pentagonal gaps within the icosahedral capsid (which arise due to the geometry of the hexons) [4]. Penton is encoded as one of the products of the Ad5 L2 gene. Each penton also acts as a base onto which a fiber protein is anchored. Adenovirus pentons are pentamers, with each monomeric subunit consisting of a single jellyroll domain for the lower part and both a hypervariable loop and a variable loop at the top. In Ad5 and many other human adenoviruses, the penton hypervariable loop includes an RGD amino acid sequence. RGD is both an αv integrin binding motif and a target for adenovirus neutralization by the enteric defensin HD5. Importantly, the penton’s RGD motif is essential for cellular transduction into clathrin-coated pits [5]. RGD may also play some role in endosomal escape. The other penton variable loop (distinct from the hypervariable loop) is poorly understood from a functional standpoint. Both the hypervariable loop and the variable loop might serve as decent sites for sequence modification in the context of gene therapy vectors. The penton N-terminal domain consists of an approximately 50 amino acid sequence which extends into the inside of the adenovirus virion. This sequence is mostly disordered except for the part nearest to the jellyroll domain (residues 37-51 in Ad5), which interacts with two copies of protein IIIa.

Here, the structure of the Ad5 penton is shown from side and top views (PDB 3IZO). Coloration is by subunit. In the side view, the intravirion N-terminal domains are visible at the bottom, the jellyroll domains can be seen as the groups of β-sheets in the middle, and the loops are present at the upper region. The top view clearly illustrates the pentagonal symmetry of the penton. It should be noted that, in this structure, some of the loops are missing due to the difficulty of reconstructing them at high resolution. Of special relevance here is that the loop with the RGD sequence should be located at the top of the penton (in the gap between the uppermost α-helix and a nearby loop which both terminate prematurely).


Ad5’s 12 trimeric fibers are anchored onto the tops of the pentons [4]. They are encoded as a product of the L5 gene. These fibers initiate cellular transduction through binding of the knob domain to cellular receptors. The primary receptor for Ad5 is the coxsackievirus and adenovirus receptor (CAR). That said, it should be noted that Ad5’s fiber knob can also bind to alternative receptors such as vascular cell adhesion molecule 1 and heparan sulfate proteoglycans. For Ad5, the fiber is about 37 nm in length, but other adenoviruses can have shorter or longer fibers [6]. Fibers consist of an N-terminal tail domain, a shaft domain, and a C-terminal knob (also called head) domain [4]. The three N-terminal tails anchor into some of the clefts between penton monomers, likely via a hydrophobic ring region. The shaft consists of a structure known as a trimeric β-spiral. Shaft flexibility plays a role in cellular transduction by facilitating interaction of the penton with its integrin receptor after binding of the knob to CAR. Many adenovirus fibers are known to have hinges at the third β-repeat from the N-terminal tail domain [7]. These hinges arise from an insertion of a few extra amino acids within the third β-repeat which disrupts its structure and allows for it to flex. The C-terminal knob domain consists of an antiparallel β-sandwich and is responsible for trimerization of the fiber [4]. Its C-termini are oriented back towards the capsid of the adenovirus.

Here, part of the structure of an Ad2 fiber is shown from two perspective views (PDB 1QIU). Though there are structures of the Ad5 fiber components available, only the above Ad2 fiber structure has been assembled into a complex and made publicly available. The Ad2 fiber is highly similar to the Ad5 fiber. Both Ad5 and Ad2 fibers have 22 β-repeats. Only a few β-repeats are included in the above structures, but that should be enough to grant an intuitive understanding of the general fiber organization.

Minor capsid proteins

Protein IX

Ad5 protein IX (pIX) is a 140 amino acid protein found nestled between hexons which confers greater thermostability to the capsid relative to mutants lacking pIX [4]. There are 240 copies of pIX in the capsid. It has an N-terminal domain, a rope domain, and a C-terminal domain. The N-terminal domains of three pIX monomers interlace to form a triskelion structure in the valleys between some of the hexons. The rope domain (also called linker domain) is often disordered and connects the N- and C-terminal domains. The C-terminal domain is an α-helix which forms a coiled-coil structure with the helices of other pIX monomers. This coiled coil consists of four α-helices (three parallel and one antiparallel), each from a different pIX monomer. Four triskelions and three α-helix bundles are present in each icosahedral facet of the capsid. It should be noted that some of the triskelions take on slightly different structural features depending on which hexons they are associated with within a given facet [8]. Though all of the C-termini of pIX are exposed on the capsid surface, they can still be described as resting within crevices between hexons. Because of this, spacer peptides are usually necessary when engineering Ad5 pIX-fusions such that the added protein is elevated out of the crevices [9].

Here, four copies of Ad5 pIX are shown interlacing among four hexons (top and side views) (PDB 6B1T). The C-terminal domain α-helical bundle of pIX is clearly visible. The N-terminal domain triskelion structures are not visible in these views. Hexons are portrayed in cool colors and the pIX copies are shown in magenta. Some disordered regions are shown as dashed lines.
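The copy numbers quoted so far for the hexon, penton, fiber, and pIX can be cross-checked with some simple bookkeeping. The short Python sketch below is only an illustration: the function name and dictionary keys are my own, and the only inputs are the figures stated in this guide plus the geometric facts that an icosahedron has 20 facets and 12 vertices (one penton and one fiber per vertex).

```python
# Copy-number bookkeeping for the Ad5 capsid.
# All per-facet and per-vertex figures come from the text of this guide;
# the function name and dictionary keys are illustrative only.

FACETS = 20    # an icosahedron has 20 triangular facets
VERTICES = 12  # an icosahedron has 12 vertices (one penton and one fiber each)


def capsid_copy_numbers():
    """Return monomer and oligomer counts implied by the capsid geometry."""
    hexon_trimers = FACETS * 12        # 12 hexon trimers per facet
    hexon_monomers = hexon_trimers * 3
    penton_monomers = VERTICES * 5     # each penton is a pentamer
    fiber_monomers = VERTICES * 3      # each fiber is a trimer
    # pIX counted two ways: via N-terminal triskelions and via C-terminal bundles
    pix_via_triskelions = FACETS * 4 * 3  # 4 triskelions per facet, 3 monomers each
    pix_via_bundles = FACETS * 3 * 4      # 3 helix bundles per facet, 4 helices each
    return {
        "hexon_trimers": hexon_trimers,
        "hexon_monomers": hexon_monomers,
        "penton_monomers": penton_monomers,
        "fiber_monomers": fiber_monomers,
        "pix_via_triskelions": pix_via_triskelions,
        "pix_via_bundles": pix_via_bundles,
    }


if __name__ == "__main__":
    for name, count in capsid_copy_numbers().items():
        print(f"{name}: {count}")
```

Both ways of counting pIX give 240 copies, which agrees with the total stated above, and the hexon count reproduces the 240 trimers (720 monomers) given in the hexon section.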

Protein IIIa

The Ad5 protein IIIa (pIIIa) plays a structural role in stabilizing the capsid from the inside [4]. Five copies of pIIIa are found under each vertex of the Ad5 capsid. It is 585 amino acids in length, but only residues 7 to 300 have been structurally traced at high resolution. Its N-terminal domain connects the penton and the five adjacent hexons. (These are known as the peripentonal hexons. The peripentonal hexons plus the penton are collectively named the group-of-six or GOS.) Its C-terminal domain binds protein VIII (another structural protein which will be discussed later). The traced part of the pIIIa structure consists of two globular domains connected by a long α-helix.

Above, traced parts of five pIIIa proteins are shown on the underside of a part of the Ad5 capsid (perspective is from the interior) (PDB 6B1T). Hexons are colored blue, the penton is colored yellow, and pIIIa is colored bright pink. The same structure is shown below from a side perspective.

Protein VI

Ad5 protein VI (pVI) starts out as 250 amino acids long but is cleaved by AVP at two sites, yielding multiple peptides [4]. The first site is after residue 33 and the second is after residue 239. The middle part contains a predicted amphipathic α-helix (residues 34-54) which inserts into the host endosomal membrane. This alters the membrane’s curvature and helps facilitate lysis of the endosome, allowing the adenovirus to escape into the cytosol. The middle part also contains a domain (residues 109-143) which sometimes binds to the inner surface of the capsid in the cavities between certain hexons. The N-terminal peptide pVIN also binds to cavities between hexons. It has been suggested that this affinity hides the first pVI cleavage site in these cavities, preventing release of the membrane lytic peptide. During intracellular trafficking, environmental changes may allow adenovirus protein VII (a core protein) to outcompete pVI for the binding sites between hexons, causing release of the membrane lytic peptide. Finally, the C-terminal peptide pVIC is a cofactor which helps activate AVP. The pVIC peptide binds covalently to AVP and slides along the adenoviral genome, using the DNA as a track to reach all of the substrates in the core and the inner capsid surface. There are approximately 360 copies of protein VI in the Ad5 virion. Unfortunately, high-resolution structural data on pVI are scarce due to its variable position in the adenovirus virion.
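To make the cleavage arithmetic concrete, the residue ranges of the three pVI peptides can be worked out from the two AVP cleavage sites given above. The Python sketch below is a minimal illustration under the assumption that "after residue N" means the cut falls between residues N and N+1; the helper name `cleavage_fragments` is hypothetical and is not taken from any library.

```python
# Residue ranges produced by proteolytic cleavage, using 1-based inclusive numbering.
# Assumption: a site "after residue N" cuts between residues N and N + 1.


def cleavage_fragments(length, cut_after):
    """Return a list of (start, end) residue ranges for a protein of the given
    length that is cut after each residue number in cut_after."""
    fragments = []
    start = 1
    for site in sorted(cut_after):
        # Each site must lie inside the protein and beyond the previous cut.
        if not start <= site < length:
            raise ValueError(f"invalid cleavage site: {site}")
        fragments.append((start, site))
        start = site + 1
    fragments.append((start, length))
    return fragments


# Ad5 pVI: 250 residues, cleaved after residues 33 and 239 (values from the text).
pvi_fragments = cleavage_fragments(250, [33, 239])
pvi_sizes = [end - start + 1 for start, end in pvi_fragments]

if __name__ == "__main__":
    for (start, end), size in zip(pvi_fragments, pvi_sizes):
        print(f"residues {start}-{end}: {size} aa")
```

Under that assumption, pVIN spans residues 1-33, the middle part spans residues 34-239 (206 residues, beginning with the predicted amphipathic helix at 34-54), and pVIC spans residues 240-250 (11 residues).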

Protein VIII

Ad5’s protein VIII (pVIII) also contributes to structurally stabilizing the adenoviral capsid from the interior [4]. It starts as a 227-residue protein which is cleaved by AVP at three sites, yielding two large peptides and two small peptides. The two large peptides stay together and bind between hexons. Some pVIII copies wedge between pIIIa and the peripentonal hexons, helping to connect the peripentonal hexons to the next set of surrounding hexons. Some pVIII copies are located underneath the nine hexons on the middle face of each icosahedral facet (known as the group-of-nine or GON). An interesting aspect of pVIII-hexon interactions is that pVIII can engage in β-sheet augmentation, where a β-strand from pVIII is incorporated into one of the jellyroll domains of a nearby hexon. Not much is known about the two smaller peptides from pVIII except that these peptides do not appear to bind the capsid in a symmetric fashion.

Here, the traced parts of pVIII (red) are shown interwoven into a piece of the Ad5 capsid from an interior perspective (PDB 6B1T). Hexons are shown in shades of blue, the penton is shown in yellow, and pIIIa is displayed in bright pink.

Core proteins which interact directly with the capsid

Protein V

Adenovirus protein V (pV) is a positively charged protein which can form heterodimers with the pVII core protein [4]. That said, pV exists in a dimer-monomer equilibrium, so the binding to pVII is often transient. There also are direct associations between pV and the pVI capsid protein. These associations between pVII, pV, and pVI likely act to bridge the adenovirus core with the adenovirus capsid. In addition, pV-pVII heterodimers might interact with core protein μ. Each virion contains about 150 copies of pV. Most of the copies of pV are released during the beginning of uncoating. Interestingly, pV is not essential for adenovirus capsid assembly.

Protein VII

Protein VII (pVII) is a positively charged protein which plays a central role in condensing the adenovirus genome to fit into the capsid [4]. It has many arginine residues which contribute to its positive charge. AVP cleaves pVII at residues 13 and 24. The resulting middle peptide (including amino acids 13 through 24) might compete with pVI for hexon binding sites during adenovirus assembly. As mentioned earlier, environmental changes during intracellular trafficking may allow pVII to outcompete pVI for their hexon binding sites, causing release of the membrane lytic peptide from pVI cleavage. Though pVII acts as a functional analogue of the histone, it does not share much structural similarity with histones and does not replace histones when introduced into the cellular nucleus [2]. During infection, the viral genomic DNA, complexed with pVII, is imported through nuclear pores. While pVII is important for condensing the adenoviral genome, it is not strictly required for assembly and packaging. In addition, pVII functions in signaling for the suppression of host innate immune responses. It binds to high mobility group B (HMGB) protein 1, a factor which is normally released from cells exposed to inflammation and which acts as a danger signal for the immune system. The adenoviral pVII prevents release of HMGB protein 1 and thereby dampens innate immune responses. Finally, pVII helps to regulate the progression of various steps during adenovirus genome replication.


This guide has centered on explaining the structures and functions of the Ad5 capsid proteins as well as the core proteins which are involved in key structural interactions with the capsid proteins. But this is only the beginning of learning about adenovirus biology. As mentioned in the introductory section, there are other core proteins including protein μ, the adenovirus proteinase, protein IVa2, and terminal protein which primarily interact with the adenovirus genome. Furthermore, the complex life cycle of the adenovirus requires numerous replication and packaging proteins (as well as interesting interactions with host cells) not covered here. Despite the specific focus of this guide, I hope that it is helpful to the reader for gaining a better idea of how the adenovirus capsid works. Perhaps this text will even provide a valuable bedrock of understanding for interested readers who are working on Ad5 capsid engineering projects.


[1]     L. T. Collins and D. T. Curiel, “Synthetic Biology Approaches for Engineering Next-Generation Adenoviral Gene Therapies,” ACS Nano, Aug. 2021, doi: 10.1021/acsnano.1c04556.

[2]     S. Kulanayake and S. K. Tikoo, “Adenovirus Core Proteins: Structure and Function,” Viruses, vol. 13, no. 3, 2021, doi: 10.3390/v13030388.

[3]     Y. S. Ahi and S. K. Mittal, “Components of Adenovirus Genome Packaging,” Frontiers in Microbiology, vol. 7. p. 1503, 2016, [Online]. Available:

[4]     J. Gallardo, M. Pérez-Illana, N. Martín-González, and C. San Martín, “Adenovirus Structure: What Is New?,” International Journal of Molecular Sciences, vol. 22, no. 10, 2021, doi: 10.3390/ijms22105240.

[5]     W. C. Russell, “Adenoviruses: update on structure and function,” J. Gen. Virol., vol. 90, no. 1, pp. 1–20, 2009, doi:

[6]     E. Vigne et al., “Genetic manipulations of adenovirus type 5 fiber resulting in liver tropism attenuation,” Gene Ther., vol. 10, no. 2, pp. 153–162, 2003, doi: 10.1038/

[7]     S. A. Nicklin, E. Wu, G. R. Nemerow, and A. H. Baker, “The influence of adenovirus fiber structure and function on vector development for gene therapy,” Mol. Ther., vol. 12, no. 3, pp. 384–393, Sep. 2005, doi: 10.1016/j.ymthe.2005.05.008.

[8]     V. S. Reddy and G. R. Nemerow, “Structures and organization of adenovirus cement proteins provide insights into the role of capsid maturation in virus entry and infection,” Proc. Natl. Acad. Sci., vol. 111, no. 32, pp. 11715–11720, Aug. 2014, doi: 10.1073/pnas.1408462111.

[9]     J. Vellinga et al., “Spacers Increase the Accessibility of Peptide Ligands Linked to the Carboxyl Terminus of Adenovirus Minor Capsid Protein IX,” J. Virol., vol. 78, no. 7, pp. 3470–3479, Apr. 2004, doi: 10.1128/JVI.78.7.3470-3479.2004.

Science Fiction Book Reviews


Station Eleven by Emily St. John Mandel: 98/100. Much of the essence of art is to reflect what makes us human, helping us better explain to ourselves what makes us tick. Station Eleven is a science fiction novel about a deadly flu pandemic which brings about the end of the world. Notably, it was written several years prior to the emergence of COVID-19. Emily St. John Mandel wields the premise masterfully to touch our souls and help us come to terms with human kindness, cruelty, hope, and vulnerability. Through its deep tragedy and heartfelt characters, the book manages to link questions of the individual and the global. We take a hard look at how the meaning of civilization connects to the meaning of life. Emily St. John Mandel’s prose puts billions to death. Those who survive must find purpose against the backdrop of the visceral viciousness of the apocalypse. Some immerse themselves in art, traveling the postapocalyptic wilderness and performing Shakespeare plays for pockets of survivors. Some join a religious cult led by a violent prophet who resembles history’s most monstrous men. Yet even this figure is skillfully humanized (though not exonerated) as having emerged from a frightened and damaged boy. Richly constructed character histories weave together in the end, creating a gorgeous tapestry which reveals both the inherent goodness and the intrinsic darkness of the human species. Station Eleven is lyrical, haunting, and intense. It immerses the reader in a realm which translates philosophy into the more brutally real language of emotion.

This Is How You Lose the Time War by Amal El-Mohtar and Max Gladstone: 98/100. I have a special fondness for fiction which reads like poetry. This Is How You Lose the Time War by Amal El-Mohtar and Max Gladstone represents a tour de force of far-future poetic science fiction which sparkles with imagination, intensity, and wonder. An epistolary novel, it is told through letters exchanged by a pair of time-traveling cyborg supersoldiers named Red and Blue, who start as mortal enemies on opposite sides of a war and gradually fall in love. Each letter is delivered through a distinct medium: powdered cod bone sprinkled over a biscuit, a code of mineral veins in lava, a pattern of a bee’s flight and the venom of its sting, and many more. Red and Blue often spend decades in different pasts and futures, taking on the forms of various people and animals as part of their war. Though this conflict’s degree of convolutedness is far beyond human comprehension, the authors expertly utilize lyrical language to transmit a tantalizing taste of its scope. The central characters are so far beyond human that they should seem alien to the reader, yet their emotions come across as piercing and visceral. Beyond this, the beauty of the language gives the narrative a songlike quality which instills every passage with sensation, crispness, and vivacity. In terms of symbolism and metaphor, the book contains more than enough fractal complexity to fill the Library of Congress with multilayered literary analyses. This Is How You Lose the Time War furthermore incorporates a wealth of fascinating philosophical ideas involving love, war, peace, power, and freedom which are built on top of its spectacular wordsmithing. This book makes me feel like I am sipping liquid beauty during the cool of early morning while watching the stars of an alien sky slip beneath the horizon.

Blindsight by Peter Watts: 98/100. It is difficult to describe Blindsight. I could clumsily slap labels onto the novel and call it literary psychological sci-fi horror with an emphasis on the philosophy of neuroscience. I could vaguely refer to it as a boiling froth of darkness replete with nightmarish poetics. I could say that it manages to incorporate both aliens and vampires in a terrifyingly believable fashion. I could pontificate on how the story oozes with malign hyperintelligence and conveys a sense of hurtling movement too fast to track with human eyes. Yet none of this can truly capture the frightening majesty of the narrative. More directly, Blindsight is a story about contact with aliens. After humanity first encounters the aliens, the governments of Earth send a group of cyborgs, freaks, and savants on a living spaceship to meet the aliens. The captain of this group is a vampire, a technologically resurrected predator with intelligence vastly exceeding that of any human. The protagonist (Siri Keeton) had half his brain surgically removed when he was a child, rendering him incapable of empathy and forcing him to learn how to navigate social interactions through purely algorithmic techniques. Siri’s unusual backstory and motivations are richly explored over the course of the story. The novel explores ideas surrounding radical neurodivergence, transhumanism, the effects of neurotechnology on society, intelligence, consciousness, artificial intelligence, empathy, the blurring of the human-machine divide, emotional abuse, ableism, and evolutionary biology. As the book progresses, numerous psychological and philosophical revelations accrue. The aliens are more truly alien than any other aliens I have encountered in fiction. It is through a certain aspect of these aliens that the book’s most intensely frightening philosophical proposition is unveiled, but I will not spoil that for the reader. Prepare to be deeply disturbed in the most intellectually stimulating of ways.

The Chronoliths by Robert Charles Wilson: 97/100. Science fiction is the literature of ideas. Quality science fiction links these ideas to our own lives in a meaningful fashion. The Chronoliths by Robert Charles Wilson is a novel which successfully weaves together big ideas with intensely personal trajectories of individual human lives. Through this style of writing, it allows us to see ourselves in the characters and reflect upon our roles in the epic drama of civilization and the universe. The Chronoliths blends several stories into a unified narrative. It tells the story of icy monuments which periodically materialize at various locations across the Earth, causing death and destruction where they appear. These Chronoliths have writing on them, text which proclaims future military victories by a warlord named Kuin. It tells the story of an ordinary man named Scott Warden, his efforts to protect his daughter, and how his destiny is inextricably linked to the Chronoliths by the physical forces of nature. It tells the story of a genius physicist named Sulamith Chopra who finds herself increasingly obsessed with the Chronoliths and how they influence the flow of history. It tells the story of a single mother named Ashlee and her difficult relationship with her sociopathic son Adam Mills. I am struck by the deeply human identities of all of the characters (even many of the minor characters). They feel so vividly real with their struggles, quirks, backstories, and traumas. I tangibly feel their hopes and fears as they search for purpose in the midst of a troubled world. All of this is accentuated by the lovingly detailed global setting which glows with verisimilitude. I should mention that I am a longtime fan of Robert Charles Wilson’s writings. His short piece Utriusque Cosmi is perhaps my favorite story of all time. Yet even with my high expectations going into The Chronoliths, I was nonetheless floored by its haunting beauty.

The Sparrow by Mary Doria Russell: 95/100. It is not easy to incorporate theology into science fiction without proselytizing the reader, yet The Sparrow does an elegant job of examining philosophy of religion through a first contact lens. At a deeper level, this book is about the human search for meaning and belonging in the universe, so even nonreligious readers can viscerally appreciate most of its ideas. Some other important themes include the interplay between love, trauma, guilt, faith, anger, and healing. There are also some interesting (and reasonably balanced) forays into the psychology surrounding sexual abstinence of priests. The Sparrow charts the painful recovery of the sole survivor of a mission to make first contact with aliens through visiting them directly on their home planet. The survivor is Father Emilio Sandoz and he is physically disfigured and psychologically scarred by his experiences. The novel works backwards to explain what happened to him and the rest of the crew of the mission. This book includes some extremely disturbing occurrences. I believe that these occurrences were necessary for the story, but they might be triggering to some readers, so please be aware of this. On a lighter note, Mary Doria Russell’s writing clearly demonstrates her exceptional skills as a historian. Part of what makes this story feel so real is that it contains a wealth of impeccably researched cultural depth. Latin American settings, the history of Turkey, the bureaucracy of the Roman Catholic Church, and more are covered in loving detail. Furthermore, the characters show thoroughly believable backstories, quirky personalities, and complex psychological evolution. I care about these people. The Sparrow represents one of the most philosophically rich and thought-provoking books that I have yet encountered.

The Quantum Thief by Hannu Rajaniemi: 95/100. I would characterize The Quantum Thief as the most imaginative novel I have ever read. From beginning to end, it sparkles with kaleidoscopic strangeness. Though some readers might be put off by the onslaught of unfamiliar terminology, I found the bizarre language exhilarating. It tells the tale of a gentleman thief named Jean le Flambeur who goes through a series of convoluted adventures in a hyper-futuristic postsingularity version of our own solar system. The novel explores the unreliability of memory and mind in a future where advanced neurotechnology is ubiquitous and any dividing line between biology and technology has been completely obliterated. I possess great admiration for the sheer audacity of Rajaniemi’s creativity. The walking city on Mars (called the Oubliette) where much of the story takes place is only the tip of the iceberg. When people die in that city, their minds are transferred into colossal robotic monsters known as the Quiet which toil beneath the city on the surface of Mars. A detective accesses the Oubliette’s exomemory to solve the mystery of a murdered chocolatier. The living spaceship named Perhonen flirts with the thief protagonist. Every line of the book adds more of these kinds of concepts. As the plot cascades, complex mysteries of missing memories and buried pasts unravel. All this mixes with the thrill of the heist, a cast of believable and emotionally resonant characters, a complex alien political landscape, and a sense that this futuristic society has been oddly suffused with French culture. It is difficult to properly describe the profoundly colorful weirdness of The Quantum Thief. You just have to read it for yourself.

Never Let Me Go by Kazuo Ishiguro: 95/100. For many, growing up is filled with both yearning and conflict. Never Let Me Go successfully captures the emotional intensity associated with the coming-of-age process while simultaneously investigating some dark concepts in bioethics. It is the story of Kathy, Tommy, Ruth, and a few others who grow up at an unusual English boarding school called Hailsham. The book chronicles the unfolding of their lives in a vividly believable and exquisitely detailed fashion as they hurtle towards an inevitable fate. They experience the familiar trials of growing up: navigating tricky social landscapes, falling in love, learning about the world, and forming their own identities. But there is a tragic context which overshadows these experiences. To reveal the specifics of this context would spoil some key aspects of the book, so I will only state that it explores some fascinating ideas in the area of medical science fiction. Despite the bioethics-related speculation which appears later in the novel, the narrative remains centered on the individual experiences of the characters, which fits well with its stylistic approach. Themes of mortality, love, friendship, and meaning are explored throughout. Perhaps most importantly, Never Let Me Go represents a deeply emotional story. By the end, I was weeping for the intricate characters who had decided to quietly accept something very sad indeed.

Exhalation by Ted Chiang: 95/100. As someone who was strongly influenced by Ted Chiang’s first short story collection “Stories of Your Life and Others”, I came into Exhalation with high expectations. I was not disappointed. Chiang possesses a special talent for crafting brilliant short pieces that combine intense clarity, tremendous conceptual ingenuity, and vast emotional depth. For instance, The Merchant and the Alchemist’s Gate followed the lyrical style of the classic One Thousand and One Nights, provided an uplifting narrative of loss and regret and redemption, and accessed themes of acceptance and fate. Another excellent story in the collection, The Truth of Fact, the Truth of Feeling, gave a balanced perspective on how technology influences the way our brains think and communicate while also examining both a complex relationship between a father and daughter and a linguistics-driven historical scenario. The Lifecycle of Software Objects examines the concept of raising artificially intelligent creatures as children in a highly believable fashion. Exhalation (the title story) takes place in an alternate universe populated by a very different sort of life, yet it precisely interrogates ideas of vital importance to both the grand human condition and the deeply personal. Ted Chiang has once again shown himself to be one of the greatest short form science fiction authors ever to live.

Childhood’s End by Arthur C. Clarke: 92/100. It is not easy to capture the sheer sense of awe which comes from contemplating that which is beyond human comprehension. Childhood’s End delivers a shockingly provocative glimpse into the sublime while forcing the reader to contemplate the place of humanity in the universe. As humans, many of us enjoy telling ourselves stories about loving gods. Those inclined towards Lovecraftian tales take the opposite approach, conjuring up nightmares of cosmic monsters. Arthur C. Clarke unflinchingly finds a middle ground between these extremes. At the staggering conclusion of Childhood’s End, we experience both the cold realization of our own insignificance and a spiritually satisfying transcendence. Clarke proposes that to truly understand the divine, we may need to transform into something which is no longer even remotely human. Perhaps I am in the minority in that I am not repulsed by this notion, though I certainly do have some reservations about it. This is a spectacularly thought-provoking novel. My only complaint is that the first two sections of the book are significantly less compelling than its Earth-shattering conclusion, though they are necessary to set it up. Because the story was published in 1953, it includes some very outdated sexist assumptions and racist terminology. (As a person who has read some of Clarke’s later novels, I can attest that he improved over time in this regard.) The characters and plot in the initial two-thirds of the book feel too stiff and detached for my taste. Nonetheless, this is more than made up for by the final portion of the story. If you want to think about the big questions and experience both extreme alienness and spiritual wonderment at the same time, you should read this book.

Cover image source: The Prologue and the Promise by Robert McCall

A Guide to CRISPR-Cas Nucleases


PDF version: A Guide to CRISPR-Cas Nucleases by Logan Thrasher Collins

Many different types of CRISPR-Cas nucleases possess biotechnological relevance. For a newcomer, the menagerie of Cas proteins may seem overwhelming. It can be challenging to decide which type of CRISPR system to employ in one’s research. To help address this issue, I compiled these notes. While my guide is certainly not comprehensive, it still covers a wide swath of important Cas proteins and may prove valuable as a starting point for those interested in getting a sense of the field. One should be aware that the field of CRISPR technology is moving rapidly, so some of the nucleases described here might eventually be superseded by newly discovered and/or newly engineered Cas proteins. I would also like to mention that since these notes are specifically focused on types of Cas proteins, I have omitted direct explanations of some important CRISPR technologies such as base editors, prime editors, and dead Cas systems. I also have not directly explained important CRISPR-related concepts such as non-homologous end joining (NHEJ), homology-directed repair (HDR), and adeno-associated virus (AAV) vectors. I encourage the reader to look elsewhere to learn about these subjects since they are vital for having a strong understanding of CRISPR biotechnology. I hope that you enjoy reading my notes and find them useful for your own scientific endeavors!


SpCas9 represents one of the first discovered and most commonly used CRISPR-Cas proteins (1). It comes from Streptococcus pyogenes, a gram-positive bacterial pathogen. SpCas9 employs two nuclease domains to make blunt double-stranded cuts in DNA: the HNH domain for cutting the strand which pairs with the gRNA and the RuvC domain for cutting the other strand. The protospacer adjacent motif (PAM) of SpCas9 has the sequence 5’-NGG-3’, which limits the target sites that the nuclease can find. Though wild-type (WT) SpCas9 possesses a problematic level of off-target activity, several mutant variants of the enzyme have been engineered which give it much more precision (2,3). Examples include eSpCas9-HF, eSpCas9(1.1), and HypaCas9. The eSpCas9-HF and eSpCas9(1.1) enzymes maintain robust on-target cleavage while reducing off-target effects (3). The HypaCas9 enzyme has similar properties, but with even fewer off-target effects (2).
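
To make the 5’-NGG-3’ constraint concrete, here is a minimal Python sketch that scans one DNA strand for candidate SpCas9 sites. The function name and the 20-nucleotide protospacer length are assumptions chosen for illustration rather than details taken from this guide, and the sketch only looks at the strand it is given (the reverse complement would need to be scanned separately).

```python
import re


def find_ngg_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each 5'-NGG-3' PAM on the given strand.

    ASSUMPTION: the protospacer is the `protospacer_len` bases immediately
    5' of the PAM (20 nt is a common choice, not a value from the guide).
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. within "AGGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= protospacer_len:
            start = pam_start - protospacer_len
            sites.append((start, seq[start:pam_start], match.group(1)))
    return sites


if __name__ == "__main__":
    example = "ACGT" * 5 + "AGGG"
    for start, protospacer, pam in find_ngg_sites(example):
        print(start, protospacer, pam)
```

Any stretch of sequence without a nearby GG dinucleotide yields no hits, which is the practical meaning of the PAM limiting the available target sites.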


At 1053 amino acids in length, SaCas9 is significantly smaller than SpCas9 (which is 1368 amino acids long) (4). SaCas9 comes from Staphylococcus aureus, can be used in mammalian cells, employs 5’-NNGRRT-3’ PAM sites (R is A or G), and uses RuvC and HNH domains for cutting. But without further engineering, SaCas9 has even lower target specificity than SpCas9. Fortunately, mutant versions of SaCas9 which exhibit improved targeting accuracy have been developed. Tan et al. engineered SaCas9-HF, a version of the protein which has much less off-target activity relative to the WT SaCas9 and retains its on-target activity (4). With such improvements, SaCas9-HF can serve as a useful alternative to SpCas9.
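
PAMs such as NNGRRT are written with IUPAC degenerate base codes. As a small sketch (the helper names are hypothetical and only a subset of the IUPAC codes is included), a degenerate PAM can be expanded into a regular expression and checked against a candidate site:

```python
import re

# Subset of the IUPAC nucleotide codes, enough for the PAMs quoted in this guide.
IUPAC_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "B": "[CGT]",   # anything except A
    "N": "[ACGT]",  # any base
}


def pam_to_regex(pam):
    """Compile a degenerate PAM string such as 'NNGRRT' into a regex."""
    return re.compile("".join(IUPAC_TO_REGEX[code] for code in pam.upper()))


def matches_pam(pam, site):
    """True if `site` (same length as `pam`) satisfies the degenerate PAM."""
    return pam_to_regex(pam).fullmatch(site.upper()) is not None


if __name__ == "__main__":
    print(matches_pam("NNGRRT", "ACGAGT"))  # R positions are A and G
    print(matches_pam("NNGRRT", "ACGCGT"))  # C at an R position
```

The same helper works for the other PAMs mentioned in these notes, for example NGG or TTCN.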


The LbCas12a enzyme makes staggered cuts using a single RuvC domain (and no HNH domain), uses T-rich PAM sites, and catalyzes its own crRNA maturation (5). LbCas12a comes from Lachnospiraceae bacterium ND2006. LbCas12a has another remarkable property: the binding and cleavage of target dsDNA activates a separate part of the protein which nonspecifically cleaves any ssDNA in its vicinity. This nonspecific trans-cleavage activity is thought to occur as a result of a conformational change in the LbCas12a protein which exposes its RuvC domain for broader ssDNA attack after binding to target dsDNA (6). It should be noted that other type-V Cas proteins including AsCas12a (see corresponding section), FnCas12a (from the bacterium Francisella novicida), and AaCas12b (from the bacterium Alicyclobacillus acidoterrestris) have been shown to exhibit the same capabilities (5), and many other type-V Cas proteins likely share them as well. There furthermore exist many RNA-guided RNA-targeting Cas proteins which possess similar abilities (7). The activation of type-V Cas proteins to perform indiscriminate ssDNA cleavage after exposure to target dsDNA has been exploited as a target-induced signal amplification method to develop novel molecular diagnostics (6).


The AsCas12a protein (also called Cpf1) is derived from Acidaminococcus sp. (8), a group of anaerobic gram-negative bacteria. The protein exhibits several distinctive features compared to Cas9. AsCas12a utilizes a T-rich PAM site, unlike Cas9’s G-rich PAM. This is useful since it expands the possible targets for CRISPR. In particular, the T-rich PAM of AsCas12a can be useful when dealing with organisms that have AT-rich genomes such as Plasmodium falciparum. The naturally occurring form of AsCas12a does not require a tracrRNA; instead, its CRISPR arrays are processed into just crRNAs, which serve to complete the functional AsCas12a-crRNA complex. Rather than creating blunt ends, AsCas12a makes staggered cuts with 4-5 nucleotide 5’ overhangs. This is useful since it increases the precision of non-homologous end joining (NHEJ) repair and allows insertion of DNA sequences at a chosen cut site with a desired orientation as specified by the base pairing of the insert with the overhang sequences. In addition, the AsCas12a protein employs a single RuvC domain to make its staggered cuts and does not have an HNH domain. AsCas12a has a lower tolerance for gRNA-target mismatches (9) compared to SpCas9 and therefore demonstrates greater targeting specificity. As a result, AsCas12a shows fewer off-target effects overall. But it also has a lower editing efficiency compared to Cas9 proteins, which means that fewer cells receive any edits upon introduction of the AsCas12a. As described with LbCas12a, the AsCas12a protein can also carry out nonspecific ssDNA cleavage after it binds and cuts its target dsDNA.

AsCas12a ultra

WT AsCas12a possesses high targeting specificity, shows low off-target effects, and makes 5’ overhangs which facilitate correct insert orientation (see the section on AsCas12a). These properties represent desirable qualities for therapeutic gene editing, but the low editing efficiency of AsCas12a limits its therapeutic potential. Because of this, Zhang et al. (in a collaboration between Editas and Integrated DNA Technologies) developed an engineered version of the protein which was dubbed AsCas12a ultra (9). This AsCas12a ultra protein was created using directed evolution in bacteria. It has two point mutations relative to WT AsCas12a, M537R and F870L. These mutations grant the AsCas12a ultra extremely high editing efficiency while maintaining the protein’s low level of off-target effects. For a variety of target sites, Zhang et al. demonstrated nearly 100% editing efficiency in HSPCs, iPSCs, T cells, and NK cells using AsCas12a ultra. They also showed 93% efficiency for simultaneous disruption of three genes in T cells. When performing knock-in edits, Zhang et al. achieved efficiencies of 60% in T cells, 50% in NK cells, and 30% in HSPCs. These impressive numbers illustrate the utility of AsCas12a ultra as a broadly applicable tool for therapeutic gene editing.


The AsCas12f1 protein consists of only 422 amino acids, making it one of the smallest Cas proteins known (10). It comes from a type of gram-positive iron-oxidizing bacteria called Acidibacillus sulfuroxidans. AsCas12f1 makes staggered double-stranded breaks in target DNA and recognizes 5’ T-rich PAMs. Even with minimal engineering (just the construction of gRNA from combining its tracrRNA and mature crRNA), Wu et al. showed that AsCas12f1 exhibits usable levels of activity in mammalian cells (10). When expressed directly in mammalian cells via a plasmid, the protein achieved a maximum indel efficiency of 32.8%. When delivered to mammalian cells by AAV-DJ, the maximum indel efficiency was 11.5%. The AsCas12f1 protein possesses considerable promise as a compact therapeutic gene editing tool.

Kim et al.’s engineered Un1Cas12f

At 529 amino acids in length, the Un1Cas12f nuclease represents one of the smallest Cas proteins yet discovered (11). This is useful since the small size of Un1Cas12f’s gene allows it to easily fit within AAV vectors. It comes from an uncultured archaeon and is classified as a type-V CRISPR nuclease. Type-V nucleases utilize a C-terminal RuvC domain and do not possess an HNH domain. Though the original Un1Cas12f-gRNA complex has very low editing efficiency in eukaryotic cells, Kim et al. were able to intensively engineer the gRNA using a rational design strategy and achieve an 867-fold improvement of indel frequency in mammalian cells (12). They also showed that the Un1Cas12f gene and gRNA gene could be delivered to the cells using AAVs. Because of its small size, Un1Cas12f may serve as an excellent scaffold for creating base editors and prime editors which fit inside of AAVs.
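
The packaging argument comes down to simple arithmetic: each amino acid needs three bases of coding sequence. The sketch below uses the protein lengths quoted in these notes. The AAV cargo budget of roughly 4,700 bp is an assumption that does not come from this guide, and the calculation ignores promoters, gRNA cassettes, and other elements that must share the same space.

```python
# Protein lengths (amino acids) as quoted in this guide.
PROTEIN_LENGTHS_AA = {
    "SpCas9": 1368,
    "SaCas9": 1053,
    "AsCas12f1": 422,
    "Un1Cas12f": 529,
    "DiCas7-11": 1602,
}

# ASSUMPTION (not from this guide): a total AAV cargo budget of about 4,700 bp.
AAV_CAPACITY_BP = 4700


def coding_length_bp(n_amino_acids):
    """Open reading frame length: three bases per residue plus one stop codon."""
    return 3 * (n_amino_acids + 1)


def headroom_bp(n_amino_acids, capacity_bp=AAV_CAPACITY_BP):
    """Bases left over for everything else; negative means the ORF alone is too long."""
    return capacity_bp - coding_length_bp(n_amino_acids)


if __name__ == "__main__":
    for name, aa in sorted(PROTEIN_LENGTHS_AA.items(), key=lambda item: item[1]):
        print(f"{name}: ORF {coding_length_bp(aa)} bp, headroom {headroom_bp(aa)} bp")
```

Under this assumed budget, the miniature Cas12f proteins leave around 3 kb free for regulatory elements and editor domains, whereas the DiCas7-11 reading frame alone exceeds it.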


The CasMINI protein is another engineered CRISPR nuclease derived from Cas12f (13), which comes from an uncultured archaeon. This Cas12f is the same as the Un1Cas12f used by Kim et al. (12). Since Cas12f has little to no editing activity in mammalian cells, Xu et al. used rational design to optimize the associated gRNA and employed directed evolution to optimize the protein itself (13). CasMINI, a 529 amino acid protein, was the end result of these approaches. When CasMINI was modified to make dCasMINI-VPR (the VPR is a protein fusion which activates certain genes), it performed with efficiency comparable to that of the commonly used dLbCas12a-VPR. In some cases, dCasMINI-VPR actually outperformed dLbCas12a-VPR. When dCasMINI was modified by fusing adenine base editor (ABE) domains at its N-terminus, the dCasMINI-ABE constructs performed base editing at efficiencies comparable to those of dLbCas12a-ABE proteins. Because of their small sizes, the genes encoding the dCasMINI-ABE designs could easily fit into AAV vectors, though Xu et al. did not test this in their paper. Furthermore, even the genes encoding CasMINI prime editors should fit into AAV vectors. It should be noted that the most efficient dCasMINI-ABE base editing occurred in a narrow window precisely 3-4 bp downstream of the PAM site. When CasMINI was tested for its ability to perform gene editing by making indels, it showed significantly improved activity over Cas12f, though the editing efficiencies were still fairly low at around 5-10%.


The Cas12j enzyme, also known as CasΦ, comes from the genomes of huge bacteriophages of the Biggiephage clade (14). This is remarkable since CRISPR systems have usually been found in bacteria and archaea rather than viruses (though the prevalence of such machinery in viruses is perhaps underestimated). It has been hypothesized that Biggiephages use Cas12j to cut the DNA of other competing bacteriophages. There exist subtypes of Cas12j such as Cas12j-1, Cas12j-2, and Cas12j-3. All of the Cas12j nucleases are small at between 700 and 800 amino acids in length. The Cas12j nuclease cuts target dsDNA using a single C-terminal RuvC domain. Cas12j’s RuvC domain has a small amount of homology to the TnpB protein superfamily from which type-V Cas proteins evolved, yet it still shares <7% amino acid identity overall with type-V Cas proteins. Cas12j is most closely related to a TnpB group which is distinct from the type-V enzymes. The Cas12j nuclease catalyzes its own crRNA maturation using its RuvC domain (similar to the type-V nucleases). Unlike the type-V Cas proteins, Cas12j uses the same active site for both its RuvC cleavage of target DNA and its RuvC processing of the crRNA. It employs T-rich PAM sites which have fairly minimal target requirements. For example, the PAM of the Cas12j-2 subtype is 5’-TBN-3’ (B = G, T, or C). These minimal requirements give Cas12j expanded target recognition capabilities compared to other Cas proteins. Cas12j is active in vitro as well as within bacterial, human, and plant cells. Cas12j-2 (with a gRNA) has been observed to edit up to 33% of HEK293 cells. Though this may sound somewhat low, it represents an editing efficiency comparable to that initially reported for Cas9.
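
One rough way to quantify how permissive a PAM is would be to ask how often it would occur by chance. The sketch below makes the simplifying assumption (mine, not the guide's) that all four bases are equally likely and independent at every position, which real genomes do not satisfy. The function name is hypothetical.

```python
from fractions import Fraction

# Number of concrete bases allowed by each IUPAC code used in this guide's PAMs.
IUPAC_OPTIONS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2,  # A or G
    "Y": 2,  # C or T
    "B": 3,  # C, G, or T
    "N": 4,  # any base
}


def pam_frequency(pam):
    """Chance that a random position matches `pam`, assuming uniform, independent bases."""
    freq = Fraction(1)
    for code in pam.upper():
        freq *= Fraction(IUPAC_OPTIONS[code], 4)
    return freq


if __name__ == "__main__":
    for pam in ["TBN", "NGG", "NNGRRT", "TTCN"]:
        print(pam, pam_frequency(pam))
```

Under this idealized model, a TBN PAM is expected about once every five positions on a strand, compared with once every sixteen for NGG and once every sixty-four for NNGRRT or TTCN.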


The LwaCas13a protein is a type-VI CRISPR nuclease and it cleaves RNA rather than DNA (15). It represents one of the most active types of RNA-guided RNA-targeting Cas proteins. LwaCas13a catalyzes the maturation of its own crRNA. The enzyme comes from Leptotrichia wadei, an anaerobic gram-negative bacterium found in saliva. LwaCas13a has demonstrated around 50%-80% knockdown of target RNAs in mammalian and plant cells. This is similar to the knockdown efficiencies of shRNAs, but LwaCas13a shows far fewer off-target effects. When converted into dLwaCas13a, the protein can act as an RNA imaging tool. It has also been reported to have strong potential for therapeutics. One of the most important emerging applications of LwaCas13a (and similar Cas proteins) is in diagnostics for infectious diseases (7). To do this, the LwaCas13a gRNA can be designed to target an RNA sequence from a desired pathogen. LwaCas13a can then be mixed with a short reporter RNA oligonucleotide which has a fluorophore at one end and a quencher at the other (the fluorophore is quenched by its close proximity to the quencher). If the target pathogen RNA is introduced, LwaCas13a will cleave said target RNA as well as activate nonspecific trans-cleavage activity (see section on LbCas12a), leading to cleavage of the reporter oligonucleotides. When the reporter oligonucleotides are cleaved, the fluorophore is released from the quencher, resulting in observable fluorescence. It should be noted that many CRISPR-based diagnostics require some form of target nucleic acid amplification step to increase signal prior to the usage of a Cas protein like LwaCas13a, though ways to mitigate this limitation are undergoing rapid development (16).


Kannan et al. identified Cas13bt1 and Cas13bt3 as useful RNA-targeting CRISPR nucleases since Cas13bt has some activity in human cells (17). Cas13bt1 and Cas13bt3 are small at just 804 amino acids and 775 amino acids respectively. It should be noted that Cas13bt also exhibits nonspecific trans-cleavage activity (see section on LbCas12a) after cleaving its RNA target, which may allow its usage in diagnostics. Kannan et al. took advantage of the small sizes of Cas13bt1 and Cas13bt3 to develop compact RNA base editors. They fused an ADAR2 hyperactive adenosine deaminase catalytic domain onto dCas13bt1 and dCas13bt3. The resulting constructs were respectively named REPAIR.t1 and REPAIR.t3 and were shown to facilitate adenosine to inosine conversion in target RNAs. They also fused an ADAR2dd cytidine deaminase domain (which was itself created through directed evolution) onto dCas13bt1 and dCas13bt3. The resulting constructs were respectively named RESCUE.t1 and RESCUE.t3 and were shown to facilitate conversion of cytosine to uracil in target RNAs. Due to the small sizes of Cas13bt enzymes, all of these RNA base editors were small enough to fit inside of AAV vectors even alongside gRNA encoding sequences. The authors demonstrated successful AAV-mediated delivery to cells, but the editing efficiencies were low, so further optimization will likely be necessary.


The CasX nuclease represents a distinct type of Cas protein which does not share much sequence similarity with other types of CRISPR enzymes except for a RuvC domain (18). It is an RNA-guided DNA-targeting endonuclease which has minimal nonspecific trans-cleavage activity. Using its single RuvC domain, CasX creates staggered cuts (with about 10 nucleotide overhangs) in dsDNA complementary to its gRNA and adjacent to its TTCN PAM sites. CasX nucleases are <1000 amino acids in length, which is smaller than Cas9 and Cas12a. This could be useful for AAV-mediated delivery of CasX systems. There are different subtypes of CasX which come from different bacteria. Two of the known subtypes are DpbCasX (from Deltaproteobacteria) and PlmCasX (from Planctomycetes). DpbCasX can act in human cells, though it shows limited gene editing efficiency. PlmCasX generally has higher editing efficiency in human cells and can often achieve targeted disruption of genes in around a third of transfected cells. While this level of disruption is still modest, it is similar to the levels originally found with WT Cas9 enzymes before they were optimized for gene editing.

Un1Cas12f (previously known as Cas14a)

The Cas12f proteins represent a class of small CRISPR nucleases (400-700 amino acids in length) that are capable of RNA-guided cleavage of ssDNA or, when a suitable PAM flanks the target site, dsDNA. They employ a RuvC domain for cleavage and do not possess an HNH domain. There are various subtypes of Cas12f, but Un1Cas12f (previously Cas14a1) has been studied in the most detail. Un1Cas12f was first reported to selectively cleave ssDNA and not dsDNA (19). It was also initially reported to not require a PAM site for targeting. Without the constraint of needing a PAM site for targeting, Un1Cas12f has broader possibilities for which ssDNA sequences can be targeted. However, later research revealed that Un1Cas12f can cleave dsDNA when a 5’ T-rich PAM sequence is present next to the target site (20). As with many other types of Cas proteins, Un1Cas12f exhibits nonspecific trans-cleavage activity against ssDNA (see section on LbCas12a) after cleaving its target DNA, which grants it utility as a component of diagnostics.


The Cas7-11 protein is an RNA-guided RNA-targeting CRISPR nuclease (21). It is named Cas7-11 because it arose evolutionarily from a fusion of a protein known as Cas7 with a protein known as Cas11. The DiCas7-11 enzyme comes from the gram-negative sulfate-reducing bacterium Desulfonema ishimotonii (there also exist similar types of Cas7-11 from other species). An important advantage of DiCas7-11 is that it does not have a toxic effect on host cells (bacterial or mammalian). By comparison, RNA knockdown technologies including shRNA, LwaCas13a, PspCas13b, and RfxCas13d typically cause around 30-50% host cell death. DiCas7-11 shows knockdown efficiencies similar to those of these other RNA knockdown technologies while demonstrating no detectable cellular toxicity. Unfortunately, DiCas7-11 is also fairly large at 1602 amino acids, making it difficult to package into AAV vectors. One more application of Cas7-11 is RNA editing. The creation of a dDiCas7-11 fused to a base editor domain has enabled RNA editing in mammalian cells.


3D structure images were created using PyMol.

(1)      Anders, C.; Niewoehner, O.; Duerst, A.; Jinek, M. Structural Basis of PAM-Dependent Target DNA Recognition by the Cas9 Endonuclease. Nature 2014, 513 (7519), 569–573.

(2)      Chen, J. S.; Dagdas, Y. S.; Kleinstiver, B. P.; Welch, M. M.; Sousa, A. A.; Harrington, L. B.; Sternberg, S. H.; Joung, J. K.; Yildiz, A.; Doudna, J. A. Enhanced Proofreading Governs CRISPR–Cas9 Targeting Accuracy. Nature 2017, 550 (7676), 407–410.

(3)      Slaymaker, I. M.; Gao, L.; Zetsche, B.; Scott, D. A.; Yan, W. X.; Zhang, F. Rationally Engineered Cas9 Nucleases with Improved Specificity. Science 2016, 351 (6268), 84–88.

(4)      Tan, Y.; Chu, A. H. Y.; Bao, S.; Hoang, D. A.; Kebede, F. T.; Xiong, W.; Ji, M.; Shi, J.; Zheng, Z. Rationally Engineered Staphylococcus Aureus Cas9 Nucleases with High Genome-Wide Specificity. Proc. Natl. Acad. Sci. 2019, 116 (42), 20969–20976.

(5)      Chen, J. S.; Ma, E.; Harrington, L. B.; Da Costa, M.; Tian, X.; Palefsky, J. M.; Doudna, J. A. CRISPR-Cas12a Target Binding Unleashes Indiscriminate Single-Stranded DNase Activity. Science 2018, 360 (6387), 436–439.

(6)      Nalefski, E. A.; Patel, N.; Leung, P. J. Y.; Islam, Z.; Kooistra, R. M.; Parikh, I.; Marion, E.; Knott, G. J.; Doudna, J. A.; Le Ny, A.-L. M.; Madan, D. Kinetic Analysis of Cas12a and Cas13a RNA-Guided Nucleases for Development of Improved CRISPR-Based Diagnostics. iScience 2021, 24 (9), 102996.

(7)      Kellner, M. J.; Koob, J. G.; Gootenberg, J. S.; Abudayyeh, O. O.; Zhang, F. SHERLOCK: Nucleic Acid Detection with CRISPR Nucleases. Nat. Protoc. 2019, 14 (10), 2986–3012.

(8)      Zetsche, B.; Gootenberg, J. S.; Abudayyeh, O. O.; Slaymaker, I. M.; Makarova, K. S.; Essletzbichler, P.; Volz, S. E.; Joung, J.; van der Oost, J.; Regev, A.; Koonin, E. V.; Zhang, F. Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System. Cell 2015, 163 (3), 759–771.

(9)      Zhang, L.; Zuris, J. A.; Viswanathan, R.; Edelstein, J. N.; Turk, R.; Thommandru, B.; Rube, H. T.; Glenn, S. E.; Collingwood, M. A.; Bode, N. M.; Beaudoin, S. F.; Lele, S.; Scott, S. N.; Wasko, K. M.; Sexton, S.; Borges, C. M.; Schubert, M. S.; Kurgan, G. L.; et al. AsCas12a Ultra Nuclease Facilitates the Rapid Generation of Therapeutic Cell Medicines. Nat. Commun. 2021, 12 (1), 3908.

(10)    Wu, Z.; Zhang, Y.; Yu, H.; Pan, D.; Wang, Y.; Wang, Y.; Li, F.; Liu, C.; Nan, H.; Chen, W.; Ji, Q. Programmed Genome Editing by a Miniature CRISPR-Cas12f Nuclease. Nat. Chem. Biol. 2021.

(11)    Okano, K.; Sato, Y.; Hizume, T.; Honda, K. Genome Editing by Miniature CRISPR/Cas12f1 Enzyme in Escherichia Coli. J. Biosci. Bioeng. 2021, 132 (2), 120–124.

(12)    Kim, D. Y.; Lee, J. M.; Moon, S. Bin; Chin, H. J.; Park, S.; Lim, Y.; Kim, D.; Koo, T.; Ko, J.-H.; Kim, Y.-S. Efficient CRISPR Editing with a Hypercompact Cas12f1 and Engineered Guide RNAs Delivered by Adeno-Associated Virus. Nat. Biotechnol. 2021.

(13)    Xu, X.; Chemparathy, A.; Zeng, L.; Kempton, H. R.; Shang, S.; Nakamura, M.; Qi, L. S. Engineered Miniature CRISPR-Cas System for Mammalian Genome Regulation and Editing. Mol. Cell 2021.

(14)    Pausch, P.; Al-Shayeb, B.; Bisom-Rapp, E.; Tsuchida, C. A.; Li, Z.; Cress, B. F.; Knott, G. J.; Jacobsen, S. E.; Banfield, J. F.; Doudna, J. A. CRISPR-CasΦ from Huge Phages Is a Hypercompact Genome Editor. Science 2020, 369 (6501), 333–337.

(15)    Abudayyeh, O. O.; Gootenberg, J. S.; Essletzbichler, P.; Han, S.; Joung, J.; Belanto, J. J.; Verdine, V.; Cox, D. B. T.; Kellner, M. J.; Regev, A.; Lander, E. S.; Voytas, D. F.; Ting, A. Y.; Zhang, F. RNA Targeting with CRISPR–Cas13. Nature 2017, 550 (7675), 280–284.

(16)    Kaminski, M. M.; Abudayyeh, O. O.; Gootenberg, J. S.; Zhang, F.; Collins, J. J. CRISPR-Based Diagnostics. Nat. Biomed. Eng. 2021, 5 (7), 643–656.

(17)    Kannan, S.; Altae-Tran, H.; Jin, X.; Madigan, V. J.; Oshiro, R.; Makarova, K. S.; Koonin, E. V; Zhang, F. Compact RNA Editors with Small Cas13 Proteins. Nat. Biotechnol. 2021.

(18)    Liu, J.-J.; Orlova, N.; Oakes, B. L.; Ma, E.; Spinner, H. B.; Baney, K. L. M.; Chuck, J.; Tan, D.; Knott, G. J.; Harrington, L. B.; Al-Shayeb, B.; Wagner, A.; Brötzmann, J.; Staahl, B. T.; Taylor, K. L.; Desmarais, J.; Nogales, E.; Doudna, J. A. CasX Enzymes Comprise a Distinct Family of RNA-Guided Genome Editors. Nature 2019, 566 (7743), 218–223.

(19)    Harrington, L. B.; Burstein, D.; Chen, J. S.; Paez-Espino, D.; Ma, E.; Witte, I. P.; Cofsky, J. C.; Kyrpides, N. C.; Banfield, J. F.; Doudna, J. A. Programmed DNA Destruction by Miniature CRISPR-Cas14 Enzymes. Science 2018, 362 (6416), 839–842.

(20)    Karvelis, T.; Bigelyte, G.; Young, J. K.; Hou, Z.; Zedaveinyte, R.; Budre, K.; Paulraj, S.; Djukanovic, V.; Gasior, S.; Silanskas, A.; Venclovas, Č.; Siksnys, V. PAM Recognition by Miniature CRISPR–Cas12f Nucleases Triggers Programmable Double-Stranded DNA Target Cleavage. Nucleic Acids Res. 2020, 48 (9), 5016–5023.

(21)    Özcan, A.; Krajeski, R.; Ioannidi, E.; Lee, B.; Gardner, A.; Makarova, K. S.; Koonin, E. V; Abudayyeh, O. O.; Gootenberg, J. S. Programmable RNA Targeting with the Single-Protein CRISPR Effector Cas7-11. Nature 2021, 597 (7878), 720–725.

Resource: List of Biotechnology Companies to Watch


PDF version: List of Biotechnology Companies to Watch

I created this list to serve as a resource to help people learn about and keep track of key biotechnology companies. Some of these are emerging startups, some are established giants, and some provide useful services. Though this list is far from comprehensive, I have tried to cover as many of the key players as possible. In the next iteration of this list, I would especially like to add more agricultural biotechnology companies. It is also important to realize that this landscape is constantly changing, so some of the information on this list will eventually transition into antiquity (this current version was written over the course of 2021 and early 2022). I think many people will find my compilation both interesting and useful. I hope you enjoy delving into the exciting world of biotechnology!  

AblynxNanobodies as therapeutics and as laboratory reagents.
AgeX TherapeuticsTreating aging using stem cell therapies, induced tissue regeneration, related methods.
AgriseaDeveloping salt-tolerant rice for ocean agriculture.
Early stage as of May 2021.
AllonniaEngineering microorganisms and enzymes to degrade environmental pollutants.
Funded by the Ferment Consortium of Ginkgo Bioworks.
AsimovDeveloping computer aided design tools for synthetic biology, making host cell lines for viral vector and biologics manufacturing, constructing genetic parts database.
One of the co-founders is Christopher Voigt.
James Collins is on the scientific advisory board.
Beam TherapeuticsDeveloping base editor technologies towards therapeutic applications.
David Liu and Feng Zhang are among the co-founders.
BiogenLarge pharmaceutical company focusing on developing treatments for neurological diseases.
Has made moves towards developing gene therapy pipelines for treating neurological diseases, though the company has experienced some setbacks in this space (i.e. failed clinical trials).
BioMarin PharmaceuticalEnzyme replacement therapies for rare diseases.
During April 2021, announced a collaboration with the Allen Institute to develop AAV gene therapies for rare diseases of the brain.
Bionaut LabsMicrorobotics as a new paradigm for drug delivery.
BioVivaDeveloping gene therapies to treat aging, offers tests for determining biological age.
Elizabeth Parrish (the company’s CEO) tested an experimental gene therapy on herself and reports positive results, though some people question whether the therapy actually worked.
George Church and Aubrey de Grey are on the scientific advisory board.
Anders Sandberg is the company’s ethics advisor.
CapsigenEngineering superior AAV gene therapy vectors through a proprietary method called Transcription-Dependent Directed Evolution (TRADETM).
Have developed greatly improved neurotrophic AAVs.
Entered into a partnership with Biogen during May of 2021 to develop AAV gene therapies that treat various brain and neuromuscular disorders.
CATALOGBuilding a DNA-based platform for massive digital data storage and computation.
Cortical LabsDeveloping hybrid bioelectronic devices which incorporate cultured biological neurons to perform computational tasks. These devices are power efficient, scalable, robust to physical damage, and have the potential for fluid adaptation to many different computational problems.
Creative BiolabsCustom services for antibody engineering, membrane protein production and characterization, bioconjugation, gene therapy development, viral vector engineering, cell therapy development, molecular dynamics simulations, drug development consulting, and more.
CultivariumDeveloping molecular techniques, hardware platforms, and software tools to accelerate adoption of non-model microorganisms for biotechnology.
Cultivarium is a focused research organization (FRO), so it possesses a distinct funding approach and different goals compared to traditional startups. For more information, see this open access article describing FROs in Nature.
Dyno TherapeuticsUsing deep learning to improve properties of AAV capsids as a platform technology for gene therapy.
George Church is one of the co-founders.
E11 BioBuilding moonshot technologies involving superior molecular barcoding, spatial -omics, and viral circuit tracing to help neuroscientists map the brain. Has a long-term goal of mapping brains at the one-hundred billion neuron scale.
E11 Bio is a focused research organization (FRO), so it possesses a distinct funding approach and different goals compared to traditional startups. For more information, see this open access article describing FROs in Nature.
Editas MedicineCRISPR-based gene therapy.
George Church, David Liu, Jennifer Doudna, Feng Zhang, and J. Keith Joung are the co-founders.
Eikon TherapeuticsSuperior drug discovery platform which leverages high-throughput automated super-resolution microscopy for tracking single protein movements in living cells.
Eric Betzig is one of the advisors.
GATTAquantDNA origami imaging probes, fluorescence microscopy reagents.
First commercial application of DNA origami.
GenScriptServices in artificial DNA synthesis, synthetic biology, antibodies, cell therapies, enzyme engineering, etc.
Ginkgo BioworksSynthetic biology, biomanufacturing, microorganism design, enzyme engineering, etc.
Acquired Gen9 in 2017.
HelixNanoDeveloping an mRNA-based SARS-CoV-2 vaccine which might protect from all possible variants of the virus.
Pivoted from original plan of developing cancer vaccines using the same technology.
Co-founded by Hannu Rajaniemi, who is also a successful science fiction author.
George Church is an advisor.
ImmunaiCombining multi-omic single cell profiling technologies and machine learning to comprehensively map the immune system and thereby enable greatly improved immunotherapies as well as accelerate clinical trials and avoid costly failures.
Impossible FoodsUses synthetic biology and biochemical engineering to develop plant-based substitutes for meat products.
Their signature product is the Impossible Burger. They also make a product which mimics sausages.
One notable strategy employed by Impossible Foods is production of leghemoglobin in yeast. This compound gives a meaty flavor when added to their food products. They also add other plant-based compounds to mimic the fats found in animal meat.
Intellia Therapeutics: Developing therapies which employ CRISPR gene editing technology.
Has conducted some successful clinical trials using CRISPR gene therapy to treat transthyretin amyloidosis (as of February 2022, this is not yet FDA approved though).
Also working on CRISPR therapeutics for engineering T cells towards targeting acute myeloid leukemia.
Partnered with Regeneron, Novartis, and others.
Jennifer Doudna was one of the co-founders.
Kernel: Neurotechnology, noninvasive brain-computer interfaces, invasive neural prostheses.
Some noninvasive products anticipated to be released during 2021.
Founded by Bryan Johnson who personally invested $54 million.
Raised an additional $53 million from outside investors.
Early goal is to help treat brain disease, has ambitions to enable human enhancement.
Laronde: Developing therapies which utilize circular RNAs (Laronde calls these “endless RNAs”) as expression vehicles for proteins. Such circular RNAs are much more stable and less immunogenic than linear RNAs.
Ligandal: Peptide nanoparticles for targeted CRISPR-Cas gene therapy delivery, immunotherapy, hematological gene therapy, aging treatments.
Founded by Andre Watson.
Mammoth Biosciences: CRISPR-based diagnostics.
Jennifer Doudna is one of the co-founders.
ManifoldBio: System for barcoding protein therapeutics to enable high-throughput design and testing in complex environments, machine learning to optimize drug design.
George Church is one of the co-founders.
Moderna: Biomedical technologies which utilize mRNA inside of lipid nanoparticles; application areas include drug discovery, drug development, and vaccines.
Major player in COVID-19 pandemic since it was one of the first companies which developed and distributed SARS-CoV-2 vaccines to the world.
Nautilus Biotechnology: Developing a high-throughput single-molecule proteomics platform which integrates many novel techniques to decipher protein networks and thereby help accelerate basic science, new therapeutics, and new diagnostics.
Neuralink: High-bandwidth brain-machine interfaces, surgical robots which implant the interfaces in a manner resembling a sewing machine.
Early goal is to help treat brain disease, has ambitions to enable human enhancement.
Founded by Elon Musk and others, highly publicized by Elon Musk.
Has done testing on rats, pigs, monkeys, and other animals as of April 2021.
Openwater: Portable medical imaging technologies which employ novel optoelectronics, lasers, and holographic systems.
Wearable imaging technologies which could be 1,000x cheaper than MRI and achieve similar or better results.
Has speculated that their technology might eventually allow telepathic communication.
Organovo: 3D tissue bioprinting for in vivo clinical applications, in vitro tissue models for disease modeling and toxicology.
Long-term goal is to print entire human organs for transplants.
Oxford Nanopore Technologies: Portable nanopore sequencing devices, high-throughput desktop nanopore sequencing devices, sample preparation kits.
The company states that they have the first and only nanopore DNA and RNA sequencing platform as of May 2021.
Oxitec: Genetically modified male insects which curb the reproduction of populations of their species in the wild, acting as a precise and environmentally friendly way of controlling dangerous pests that spread disease or destroy crops.
After years of battles with activists and regulatory bodies, the company will release 750 million genetically modified mosquitos in the Florida Keys (the first time this has been done in the U.S.) with the goal of reducing rates of illnesses such as yellow fever and dengue. 
Panacea Longevity: Enhancing longevity and health using a fasting-mimetic metabolite supplementation.
Early stage as of May 2021.
Proteinea: Mass-produced insect larvae as an affordable way of manufacturing recombinant proteins.
Early stage as of May 2021.
Repair Biotechnologies: Developing a cholesterol degrading platform therapy which can reverse atherosclerosis.
The CEO, who is known as Reason, is outspoken about the need to combat aging.
Has preclinical proof-of-concept as of May 2021.
Resilience: New manufacturing platforms to service partners for development and scaling of gene therapies, cell therapies, vaccines, protein therapies, and more.
Received $800 million in funding during 2020.
Sherlock Biosciences: CRISPR-based diagnostics.
Feng Zhang is one of the co-founders.
Somalogic: Proteomics platform called SomaScan for protein biomarker discovery which aids researchers in the development of new diagnostics.
SomaScan is an aptamer-based platform which can simultaneously measure 7,000 protein biomarkers.
Founded by Larry Gold, who is the inventor of SELEX.
Synthego: CRISPR genome engineering services, custom cell lines, custom screening libraries, CRISPR reagents and kits, aiding both academic researchers and clinical drug developers.
Syzygy Plasmonics: Developing a photocatalytic reactor system which leverages a nanoparticle-based plasmonic photocatalyst. The photocatalyst consists of a larger light-harvesting plasmonic nanoparticle decorated with smaller catalytic nanoparticles. Their first product will be a clean hydrogen fuel production system which does not rely on petroleum.
More of a chemical engineering company than a biotechnology company, but their technology may eventually have applications in biology.
Tilibit Nanosystems: Service which gives researchers predesigned and custom DNA origami nanostructures, including ones with chemical modifications.
Founded by Hendrik Dietz, who was CEO from 2012-2014. He is now a scientific advisor.
Twist Bioscience: Artificial DNA synthesis services. Synthetic biology towards insulin manufacturing in yeast, scalable spider silk manufacturing, combating malaria, and DNA data storage.
Emily Leproust is a co-founder.
Vault Pharma: Protein vault nanocompartments as a drug delivery platform to treat cancers and other diseases, protein vaults as a vaccine platform.
Co-founded by Leonard Rome.
VectorBuilder: Services in vector cloning, virus packaging, library construction, cell lines, etc.
Zymergen: Synthetic biology, metabolic engineering, biomanufacturing of materials and compounds as a substitute for chemical engineering practices.
64x Bio: High-throughput screening and computational design of new mammalian cell lines for manufacturing gene and cell therapies.
George Church and Pamela Silver are among the co-founders.
10x Genomics: Spatial transcriptomics, genomics, proteomics, immune cell profiling, etc.
Acquired ReadCoor and Cartana in 2020.

Cover image is a photograph of a biopharmaceutical manufacturing facility from

Notes on Quantum Mechanics

PDF version: Notes on Quantum Mechanics – By Logan Thrasher Collins

The Schrödinger equation and wave functions

Overview of the Schrödinger equation and wave functions

Quantum mechanical systems are described in terms of wave functions Ψ(x,y,z,t). Unlike classical functions of motion, wave functions determine the probability that a given particle may occur in some region. The way that this is achieved involves integration and will be discussed later in these notes.

To find a wave function, one must solve the Schrödinger equation for the system in question. There are time-dependent and time-independent versions of the Schrödinger equation. The time-dependent version is given in 1D and 3D by the first pair of equations below and the time-independent version is given in 1D and 3D by the second pair of equations below. Here, ℏ is h/2π (and h is Planck’s constant), V is the particle’s potential energy, E is the particle’s total energy, Ψ is a time-dependent wave function, ψ is a time-independent wave function, and m is the mass of the particle. After this point, these notes will focus on 1D cases unless otherwise specified (it will often be relatively straightforward to extrapolate to the 3D case).
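For reference, the standard textbook forms of these four equations, written with the symbols defined above, can be sketched as follows. This assumes the notes follow the usual conventions: the first pair is the time-dependent equation in 1D and 3D, and the second pair is the time-independent equation in 1D and 3D.

```latex
\begin{align}
i\hbar \frac{\partial \Psi}{\partial t} &= -\frac{\hbar^2}{2m}\frac{\partial^2 \Psi}{\partial x^2} + V\Psi
  && \text{(time-dependent, 1D)} \\
i\hbar \frac{\partial \Psi}{\partial t} &= -\frac{\hbar^2}{2m}\nabla^2 \Psi + V\Psi
  && \text{(time-dependent, 3D)} \\
-\frac{\hbar^2}{2m}\frac{d^2 \psi}{dx^2} + V\psi &= E\psi
  && \text{(time-independent, 1D)} \\
-\frac{\hbar^2}{2m}\nabla^2 \psi + V\psi &= E\psi
  && \text{(time-independent, 3D)}
\end{align}
```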

For a wave function to make physical sense, it needs to satisfy the constraint that the integral of |Ψ|^2 = Ψ*Ψ from –∞ to ∞ must equal 1. This reflects the probabilistic nature of quantum mechanics; the probability that a particle may be found anywhere in space must be 1. For this reason, one must usually find a (possibly complex) normalization constant A after finding the wave function solution to the Schrödinger equation. This is accomplished by solving the following integral for A. Here, Ψ* is the complex conjugate of the wave function without the normalization constant and Ψ is the wave function without the normalization constant.
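As a minimal numerical sketch of normalization, the snippet below finds the constant A for an unnormalized Gaussian. The trial function exp(–x^2), the integration range, and the helper name `integrate` are illustrative assumptions and do not come from any particular physical system in these notes.

```python
import math

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoid rule for a real-valued function f on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def unnormalized_psi(x):
    # Hypothetical trial wave function chosen only for illustration.
    return math.exp(-x * x)

# Integral of |psi|^2; the tails beyond |x| = 10 are negligible for this function.
norm_sq = integrate(lambda x: abs(unnormalized_psi(x)) ** 2, -10.0, 10.0)

# Choosing A (real and positive here) so that the integral of |A psi|^2 equals 1.
A = 1.0 / math.sqrt(norm_sq)

print(A)                        # numerical normalization constant
print((2.0 / math.pi) ** 0.25)  # closed-form value for this Gaussian
```

Any overall phase factor could multiply A without changing the normalization, which is why the constant is only determined up to a phase.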

To obtain solutions to the time-dependent Schrödinger equation, one must first solve the time-independent Schrödinger equation to get ψ(x). The general solution for the time-dependent Schrödinger equation is any linear combination of the product of ψ(x) with an exponential term (see below). The coefficients c_n can be real or complex.

Physically, |c_n|^2 represents the probability that a measurement of the system’s energy would return a value of E_n. As such, the sum of the |c_n|^2 values over all n is equal to 1. In addition, note that each Ψ_n(x,t) = ψ_n(x)e^(–iE_n t/ℏ) is known as a stationary state. These solutions are called stationary states because the expectation values of measurable quantities are independent of time when the system is in a stationary state (as a result of the time-dependent term canceling out).

Using wave functions

Once a wave function is known, it can be used to learn about the given quantum mechanical system. Though wave functions specify the state of a quantum mechanical system, this state usually cannot undergo measurement without altering the system, so the wave function must be interpreted probabilistically. The way the probabilistic interpretation is achieved will be explained over the course of this section.

Before going further, it will be useful to understand some methods from probability. First, the expectation value is the average of all the possible outcomes of a measurement as weighted by their likelihood (it is not the most likely outcome as the name might suggest). Next, the standard deviation σ describes the spread of a distribution about an average value. Note that the square of the standard deviation is called the variance.

Equations for the expectation value and standard deviation are given as follows. The first equation computes the expectation value for a discrete variable j. Here, P(j) is the probability of measurement f(j) for a given j. The second equation is a convenient way to compute the standard deviation σ associated with the expectation value for j. The third equation computes the expectation value for a continuous function f(x). Here, ρ(x) is the probability density of x. When ρ(x) is integrated over an interval a to b, it gives the probability that measurement x will be found over that interval. The fourth equation is the same as the second equation, but finds the standard deviation σ for the continuous variable x.
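The discrete case can be illustrated with a short sketch. It assumes the "convenient" form of the standard deviation is the usual σ^2 = ⟨j^2⟩ – ⟨j⟩^2, and it uses a fair six-sided die as a made-up distribution.

```python
import math

# Fair six-sided die: each outcome j has probability 1/6 (illustrative choice).
outcomes = [1, 2, 3, 4, 5, 6]
P = {j: 1.0 / 6.0 for j in outcomes}

def expectation(f):
    """<f(j)> = sum over j of f(j) * P(j)."""
    return sum(f(j) * P[j] for j in outcomes)

mean_j = expectation(lambda j: j)          # <j>
mean_j2 = expectation(lambda j: j * j)     # <j^2>
sigma = math.sqrt(mean_j2 - mean_j ** 2)   # sigma^2 = <j^2> - <j>^2

print(mean_j, sigma)
```

Note that the expectation value 3.5 is not an outcome the die can actually show, which matches the earlier remark that the expectation value is an average rather than the most likely result.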

In quantum mechanics, operators are employed in place of measurable quantities such as position, momentum, and energy. These operators play a special role in the probabilistic interpretation of wave functions since they help one to compute an expectation value for the corresponding measurable quantity.

To compute the expectation value for a measurable quantity Q in quantum mechanics, the following equation is used. Here, Ψ is the time-dependent wave function, Ψ* is the complex conjugate of the time-dependent wave function, and Q̂ is the operator corresponding to Q.

Any quantum operator which corresponds to a classical dynamical variable can be expressed in terms of the position x and the momentum operator –iℏ(∂/∂x). By rewriting a given classical expression in terms of position x and momentum p and then replacing every p within the expression by –iℏ(∂/∂x), the corresponding quantum operator is obtained. Below, a table of common quantum mechanical operators in 1D and 3D is given.

Heisenberg uncertainty principle

The Heisenberg uncertainty principle explains why quantum mechanics requires a probabilistic interpretation. According to the Heisenberg uncertainty principle, the more precisely the position of a particle is determined via some measurement, the less precisely its momentum can be known (and vice versa). The Heisenberg uncertainty principle is quantified by the following equation.

The reason for the Heisenberg uncertainty principle comes from the wave nature of matter (and not from the observer effect). For a sinusoidal wave, the wave itself is not really located at any particular site; it is instead spread out across the cycles of the sinusoid. For a pulse wave, the wave can be localized to the site of the pulse, but it does not really have a wavelength. There are also intermediate cases where the wavelength is somewhat poorly defined and the location is somewhat well-defined or vice versa. Since the wavelength of a particle is related to the momentum by the de Broglie formula p = h/λ = 2πℏ/λ, this means that the interplay between the wavelength and the position applies to momentum and position as well. The Heisenberg uncertainty principle quantifies this interplay.
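The tradeoff can be checked numerically for one concrete case. The sketch below assumes the standard form of the principle, σ_x σ_p ≥ ℏ/2, and uses a normalized Gaussian wave function, which is known to saturate the bound. The width parameter `a`, the use of natural units (ℏ = 1), and the helper names are illustrative assumptions.

```python
import math

HBAR = 1.0  # natural units, an assumption made for convenience

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

a = 0.7  # arbitrary width parameter for the illustrative Gaussian

def psi(x):
    # Normalized real Gaussian: (2a/pi)^(1/4) * exp(-a x^2)
    return (2.0 * a / math.pi) ** 0.25 * math.exp(-a * x * x)

def dpsi(x):
    # Analytic derivative of psi with respect to x
    return -2.0 * a * x * psi(x)

L = 12.0  # integration cutoff; the Gaussian tails are negligible beyond it
mean_x2 = integrate(lambda x: x * x * psi(x) ** 2, -L, L)
# For a real, localized psi: <p> = 0 and <p^2> = hbar^2 * integral of (dpsi/dx)^2
mean_p2 = HBAR ** 2 * integrate(lambda x: dpsi(x) ** 2, -L, L)

sigma_x = math.sqrt(mean_x2)  # <x> = 0 by symmetry
sigma_p = math.sqrt(mean_p2)
product = sigma_x * sigma_p

print(product, HBAR / 2.0)
```

Changing `a` makes the Gaussian narrower or wider in position, and σ_p changes in the opposite direction so that the product stays at ℏ/2.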

Some simple quantum mechanical systems

Infinite square well

The infinite square well is a system for which a particle’s V(x) = 0 when 0 ≤ x ≤ a and its V(x) = ∞ otherwise. Because the potential energy is infinite outside of the well, the probability of finding the particle there is zero. Inside the well, the time-independent Schrödinger equation is given as follows. This equation has the same form as the equation of motion for the classical simple harmonic oscillator.

For the infinite square well, certain boundary conditions apply. In order for the wave function to be continuous, the wave function must equal zero once it reaches the walls, so ψ(0) = ψ(a) = 0. The general solution to the infinite square well differential equation is given as the first equation below. The boundary condition ψ(0) = 0 is employed in the second equation below. Since the coefficient B = 0, there are only sine solutions to the equation. Furthermore, if ψ(a) = 0, then A sin(ka) = 0. This means that k = nπ/a (where n = 1, 2, 3…) as given by the third equation below. The fourth equation below shows that this set of values for k leads to a set of possible discrete energy levels for the system.
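To get a feel for the size of these energy levels, the sketch below evaluates the standard result E_n = ℏ^2 k^2/(2m) with k = nπ/a. The choice of an electron in a 1 nm wide well is a hypothetical example and does not come from these notes.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def well_energy(n, mass, width):
    """E_n = hbar^2 k^2 / (2 m) with k = n pi / a for the infinite square well."""
    k = n * math.pi / width
    return (HBAR * k) ** 2 / (2.0 * mass)

a = 1e-9  # hypothetical 1 nm well
levels_eV = [well_energy(n, M_E, a) / EV for n in (1, 2, 3)]

for n, energy in zip((1, 2, 3), levels_eV):
    print(n, energy)
```

For these assumed values the ground state comes out at roughly 0.38 eV, and the higher levels are 4 and 9 times larger.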

To find the constant A, the wave function ψ = A sin(nπx/a) must undergo normalization. As mentioned earlier, normalization is achieved by setting the normalization integral equal to 1 and solving for the constant A. Note that the time-independent wave function can be utilized in the normalization integral since the exponential component of the time-dependent wave function would cancel anyway.

Using this information, the wave functions for the infinite square well particle system are obtained. The time-independent and time-dependent wave functions are both displayed below at left and right respectively.

This infinite set of wave functions has some important properties. They possess discrete energies that scale as n^2 (and n = 1 is the ground state). The wave functions are also orthonormal. This property is described by the following equation. Here, δ_mn is the Kronecker delta and is defined below.
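Orthonormality can be spot-checked numerically. The sketch below assumes the normalized stationary states take the standard form ψ_n(x) = (2/a)^0.5 sin(nπx/a) and integrates products of them over the well. The well width and the helper names are arbitrary choices for illustration.

```python
import math

a = 2.0  # arbitrary well width for the check

def psi(n, x):
    """Normalized stationary state sqrt(2/a) * sin(n pi x / a)."""
    return math.sqrt(2.0 / a) * math.sin(n * math.pi * x / a)

def overlap(m, n, steps=4000):
    """Trapezoid estimate of the integral of psi_m * psi_n over [0, a]."""
    h = a / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        x = i * h
        total += weight * psi(m, x) * psi(n, x)
    return total * h

# Rows and columns correspond to m, n = 1, 2, 3.
table = [[round(overlap(m, n), 6) for n in range(1, 4)] for m in range(1, 4)]
for row in table:
    print(row)
```

The resulting table should match the Kronecker delta: ones on the diagonal (m = n) and zeros elsewhere.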

Another important property of these wave functions is completeness. This means that any function can be expressed as a linear combination of the time-independent wave functions ψ_n. The reason for this remarkable property is that the general solution (see below) is equivalent to a Fourier series.

The first equation below can be employed to compute the nth coefficient c_n. Here, f(x) = Ψ(x,0) which is an initial wave function. Note that the initial wave function can be any function Ψ(x,0) and the result will generate coefficients for that starting point. This first equation is derived using the orthonormality of the solution set. Note that the formula applies to most quantum mechanical systems since the properties of orthonormality and completeness hold for most quantum mechanical systems (though there are some exceptions). The second equation below computes the c_n coefficients specifically for the infinite square well system.
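A worked example may help here. The sketch below assumes the infinite square well coefficients take the standard form c_n = (2/a)^0.5 ∫ sin(nπx/a) Ψ(x,0) dx over the well, and it uses a hypothetical initial state Ψ(x,0) = A x(a – x), a common textbook choice that is not taken from these notes.

```python
import math

a = 1.0
A = math.sqrt(30.0 / a ** 5)  # normalizes x(a - x) on [0, a]

def initial_state(x):
    # Hypothetical initial wave function Psi(x, 0) = A x (a - x)
    return A * x * (a - x)

def coefficient(n, steps=4000):
    """c_n = sqrt(2/a) * integral of sin(n pi x / a) * Psi(x, 0) over [0, a]."""
    h = a / steps
    total = 0.0
    for i in range(1, steps):  # the integrand vanishes at both endpoints
        x = i * h
        total += math.sin(n * math.pi * x / a) * initial_state(x)
    return math.sqrt(2.0 / a) * total * h

coeffs = [coefficient(n) for n in range(1, 8)]
total_probability = sum(c * c for c in coeffs)

print(coeffs)
print(total_probability)
```

The even-n coefficients vanish by symmetry, the n = 1 term dominates, and the |c_n|^2 values add up to very nearly 1 after only a few terms, consistent with their interpretation as probabilities.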

Quantum harmonic oscillator

For the quantum harmonic oscillator, the potential energy in the Schrödinger equation is given by V(x) = 0.5kx^2 = 0.5mω^2x^2. This means that the following time-independent Schrödinger equation needs to be solved.

There are two main methods for solving this differential equation. These include a ladder operator approach and a power series approach. Both of these methods are quite complicated and will not be covered here. The solutions for n = 0, 1, 2, 3, 4, 5 are given below. Here, H_n(y) is the nth Hermite polynomial. The first five Hermite polynomials and the corresponding energies for the system are given in the table. Note that the discrete energy levels for the quantum harmonic oscillator follow the form (n + 0.5)ℏω.
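The Hermite polynomials and energy levels are easy to generate programmatically. The sketch below assumes the physicists' convention for H_n(y), which is the one normally used for the harmonic oscillator, and relies on the standard recurrence H_(k+1)(y) = 2y H_k(y) – 2k H_(k–1)(y). The function names are illustrative.

```python
def hermite(n, y):
    """Physicists' Hermite polynomial H_n(y) via H_{k+1} = 2y H_k - 2k H_{k-1}."""
    h_prev, h_curr = 1.0, 2.0 * y  # H_0(y) and H_1(y)
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h_curr = h_curr, 2.0 * y * h_curr - 2.0 * k * h_prev
    return h_curr

def oscillator_energy(n, hbar_omega=1.0):
    """E_n = (n + 0.5) * hbar * omega, in units of hbar*omega by default."""
    return (n + 0.5) * hbar_omega

for n in range(6):
    print(n, hermite(n, 1.5), oscillator_energy(n))
```

As a spot check, H_2(y) = 4y^2 – 2 and H_3(y) = 8y^3 – 12y, so at y = 1.5 the function should return 7 and 9. The energies are evenly spaced by ℏω, with a nonzero ground state energy of 0.5ℏω.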

As with any quantum mechanical system, the quantum harmonic oscillator is further described by the general time-dependent solution. To identify the coefficients c_n for this general solution, Fourier’s trick is employed (see previous section) where f(x) is once again any initial wave function Ψ(x,0).

Quantum free particle

Though the classical free particle is a simple problem, there are some nuances which arise in the case of the quantum mechanical free particle which greatly complicate the system.

To start, the Schrödinger equation for the quantum free particle is given in the first equation below. Here, k = (2mE)^0.5/ℏ. Note that V(x) = 0 since there is no external potential acting on the particle. The second equation below is a general time-independent solution to the system in exponential form. The third equation below is the time-dependent solution to the system where the terms are multiplied by e^(–iEt/ℏ). Realize that this general solution can be written as a single term by redefining k as ±(2mE)^0.5/ℏ. When k > 0, the solution is a wave propagating to the right. When k < 0, the solution is a wave propagating to the left.

The speed of these propagating waves can be found by dividing the coefficient of t (which is ℏk^2/2m) by the coefficient of x (which is k). Since this is speed, the direction of the wave does not matter, so one can take the absolute value of k. By contrast, the speed of a classical particle is found by solving E = 0.5mv^2, which gives the puzzling result that the classical particle moves twice as fast as the quantum wave.

Another challenge associated with the quantum free particle is that its wave function is non-normalizable (as shown below). Because of this, one can conclude that free particles cannot exist in stationary states. Equivalently, free particles never exhibit definite energies.

To resolve these issues with the quantum free particle, it has been found that the wave function of a quantum free particle actually carries a range of energies and speeds known as a wave packet. The solution for this wave packet involves the integral given by the first equation below and a function ϕ(k) given by the second equation below. This second equation allows one to determine ϕ(k) to fit a desired initial wave function Ψ(x,0). It was obtained using a mathematical tool called Plancherel’s theorem.

The above solution to the quantum free particle is now normalizable. Furthermore, the issue with the speed of the quantum free particle having a value twice as large as the speed of the classical free particle is fixed by considering a phenomenon known as group velocity. The waveform of the particle is an oscillating sinusoid (see image). This waveform includes an envelope, which represents the overall shape of the oscillations rather than the individual ripples. The group velocity vg is the speed of this envelope while the phase velocity vp is the speed of the ripples. It can be shown using the definitions of phase velocity and group velocity (see below) that the group velocity is twice the phase velocity, resolving the problem with the particle speed. The group velocity of the envelope is thus what actually corresponds to the speed of the particle.
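The factor of two can be verified with a few lines of code. The sketch below assumes the usual definitions v_p = ω/k and v_g = dω/dk together with the free-particle dispersion relation ω = ℏk^2/(2m). Natural units and the wave number `k0` are arbitrary illustrative choices.

```python
HBAR = 1.0  # natural units (assumption)
M = 1.0

def omega(k):
    """Free-particle dispersion relation: omega = hbar k^2 / (2m)."""
    return HBAR * k * k / (2.0 * M)

def phase_velocity(k):
    """Speed of the individual ripples: omega / k."""
    return omega(k) / k

def group_velocity(k, dk=1e-6):
    """Speed of the envelope: d(omega)/dk, via a central finite difference."""
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)

k0 = 3.0                      # arbitrary central wave number of the packet
v_classical = HBAR * k0 / M   # classical speed p/m with p = hbar k

print(phase_velocity(k0), group_velocity(k0), v_classical)
```

The group velocity matches the classical speed p/m, while the phase velocity is half of it.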

Interlude on bound states and scattering states

To review, the solutions to the Schrödinger equation for the infinite square well and quantum harmonic oscillator were normalizable and labeled by a discrete index n while the solution to the Schrödinger equation for the free particle was not normalizable and was labeled by a continuous variable k.

The solutions which are normalizable and labeled by a discrete index are known as bound states. The solutions which are not normalizable and are labeled by a continuous variable are known as scattering states.

Bound states and scattering states are related to certain classical mechanical phenomena. Bound states correspond to a classical particle in a potential well where the energy is not large enough for the particle to escape the well. Scattering states correspond to a particle which might be influenced by a potential but has a large enough energy to pass through the potential without getting trapped.

In quantum mechanics, bound states occur when E < V(∞) and E < V(–∞) since the phenomenon of quantum tunneling allows quantum particles to leak through any finite potential barrier. Scattering states occur when E > V(∞) or E > V(–∞). Since most potentials go to zero at infinity or negative infinity, this simplifies to bound states happening when E < 0 and scattering states happening when E > 0.

The infinite square well and the quantum harmonic oscillator represent bound states since V(x) goes to ∞ when x → ±∞. By contrast, the quantum free particle represents a scattering state since V(x) = 0 everywhere. However, there are also potentials which can result in both bound and scattering states. These kinds of potentials will be explored in the following sections.

Delta-function well

Recall that the Dirac delta function δ(x) is an infinitely high and infinitely narrow spike at the origin with an area equal to 1 (the area is obtained by integrating). The spike appears at the point a along the x axis when δ(x – a) is used. One important property of the Dirac delta function is that f(x)δ(x – a) = f(a)δ(x – a). By integrating both sides of the equation of this property, one can obtain the following useful expression. Note that a ± ϵ is used as the bounds since any positive value ϵ will then allow the bounds to encompass the Dirac delta function spike.

The delta-function well is a potential of the form –αδ(x) where α is a positive constant. As a result, the time-independent Schrödinger equation for the delta-function well system is given as follows. This equation has solutions that yield bound states when E < 0 and scattering states when E > 0.

For the bound states where E < 0, the general solutions are given by equations below. The substitution κ is defined by the first equation below, the second equation below is the general solution for x < 0, and the third equation below is the general solution for x > 0. (Since E is assumed to have a negative value, κ is real and positive). Note that V(x) = 0 for x < 0 and x > 0. In the solution for x < 0, the Ae^(–κx) term explodes as x → –∞, so A must equal zero. In the solution for x > 0, the Ge^(κx) term explodes as x → ∞, so G must equal zero.

To combine these equations, one must use appropriate boundary conditions at x = 0. For any quantum system, ψ is continuous and dψ/dx is continuous except at points where the potential is infinite. The requirement for ψ to exhibit continuity means that F = B at x = 0. As a result, the solution for the bound states can be concisely stated as follows. In addition, a plot of the delta-function well’s bound state time-independent wave function is given below.

The presence of the delta function influences the energy E. To find the energy, one can integrate the time-independent Schrödinger equation for the delta-function well system. By making the bounds of integration ±ϵ and then taking the limit as ϵ approaches zero, the integral works only on the negative spike of the delta function at x = 0. The result for the energy is at the end of the following set of equations.

As seen above, the delta-function well only exhibits a single bound state energy E. By normalizing the wave function ψ(x) = Be^(–κ|x|), the constant B is found (as seen in the first equation below). The second equation below describes the single bound state wave function and reiterates the single bound state energy associated with this wave function.
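The single bound state can be checked numerically. The sketch below assumes the standard results for this system: κ = mα/ℏ^2 (from the jump in dψ/dx at the origin), E = –mα^2/(2ℏ^2), and B = κ^0.5. Natural units and the value of α are illustrative assumptions.

```python
import math

HBAR = 1.0   # natural units (assumption)
M = 1.0
ALPHA = 1.0  # strength of the delta-function well (illustrative value)

kappa = M * ALPHA / HBAR ** 2                 # fixed by the jump in dpsi/dx at x = 0
energy = -M * ALPHA ** 2 / (2.0 * HBAR ** 2)  # the single bound state energy
B = math.sqrt(kappa)                          # normalization constant

def psi(x):
    """Bound state wave function B * exp(-kappa |x|)."""
    return B * math.exp(-kappa * abs(x))

# Check normalization with a uniform-grid sum. The grid includes the cusp at
# x = 0, and the endpoint terms are negligibly small.
L, steps = 30.0, 60000
h = 2.0 * L / steps
norm = sum(psi(-L + i * h) ** 2 for i in range(steps + 1)) * h

print(kappa, energy, norm)
```

The energy also agrees with E = –ℏ^2κ^2/(2m), which is just the definition of κ rearranged.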

For the scattering states where E > 0, the general solutions are given by equations below. The substitution k is defined by the first equation below, the second equation below is the general solution for x < 0, and the third equation below is the general solution for x > 0. (Since E is assumed to have a positive value, k is real and positive). Note that V(x) = 0 for x < 0 and x > 0. None of the terms explode this time, so none of the terms can be ruled out as equal to zero.

As a consequence of the requirement for ψ(x) to be continuous at x = 0, the following equation involving the constants A, B, F, and G must hold true. This is the first boundary condition.

There is also a second boundary condition which involves dψ/dx. Recall the following step (see first equation below) from the process of integrating the Schrödinger equation. To implement this step, the derivatives of ψ(x) (see second equation below) are found and then the limits of these derivatives from the left and right directions are taken (see third equation below). Since ψ(0) = A + B as seen in the equation above, the second boundary condition can be given as the final equation below.

By rearranging the final equation above and substituting in a parameter β = mα/(ℏ^2 k), the following expression is obtained. This expression is a compact way of writing the second boundary condition.

These two boundary conditions provide two equations, but there are four unknowns in these equations (five unknowns if k is included). Despite this, the physical significance of the unknown constants can be helpful. When e^(ikx) is multiplied by the factor for time-dependence e^(–iEt/ℏ), it gives rise to a wave propagating to the right. When e^(–ikx) is multiplied by the factor for time-dependence e^(–iEt/ℏ), it gives rise to a wave propagating to the left. As a result, the constants describe the amplitudes of various waves. A is the amplitude of a wave moving to the right on the x < 0 side of the delta-function potential, B is the amplitude of a wave moving to the left on the x < 0 side of the delta-function potential, F is the amplitude of a wave moving to the right on the x > 0 side of the delta-function potential, and G is the amplitude of a wave moving to the left on the x > 0 side of the delta-function potential.

In a typical experiment on this type of system, particles are fired from one side of the delta-function potential, the left or the right. If the particles are coming from the left (moving to the right), the term with G will equal zero. If the particles are coming from the right (moving to the left), the term with A will equal zero. This can be understood intuitively by examining the figure above.

As an example, for the case of particles fired from the left (moving to the right), A is the amplitude of the incident wave, B is the amplitude of the reflected wave, and F is the amplitude of the transmitted wave. The equations of the two boundary conditions are reiterated in the first line below. By solving these equations, the second line of expressions is found. Since the probability of finding a particle at a certain location is |Ψ|^2, the relative probability R of an incident particle undergoing reflection and the relative probability T of an incident particle undergoing transmission are given by the third line of expressions below.

Also for the example case of particles fired from the left (moving to the right), by substituting back from β = mα/(ℏ^2 k) and k = (2mE)^0.5/ℏ to get the expressions in terms of energy, the following equations are obtained for the reflection and transmission relative probabilities.
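The reflection and transmission probabilities can be evaluated with complex arithmetic. The sketch below assumes the standard solution of the two boundary conditions for G = 0, namely B = iβA/(1 – iβ) and F = A/(1 – iβ), so that R = β^2/(1 + β^2) and T = 1/(1 + β^2). Natural units, the value of α, and the function name `scatter` are illustrative assumptions.

```python
import math

HBAR = 1.0   # natural units (assumption)
M = 1.0
ALPHA = 1.0  # strength of the delta-function well (illustrative value)

def scatter(E):
    """Amplitudes and probabilities for a particle of energy E > 0 fired from the left."""
    k = math.sqrt(2.0 * M * E) / HBAR
    beta = M * ALPHA / (HBAR ** 2 * k)
    A = 1.0 + 0.0j                          # incident amplitude, set to 1
    B = 1j * beta / (1.0 - 1j * beta) * A   # reflected amplitude
    F = 1.0 / (1.0 - 1j * beta) * A         # transmitted amplitude
    R = abs(B) ** 2 / abs(A) ** 2
    T = abs(F) ** 2 / abs(A) ** 2
    return A, B, F, R, T

for E in (0.1, 0.5, 2.0, 10.0):
    _, _, _, R, T = scatter(E)
    print(E, R, T, R + T)
```

Transmission becomes more likely as the energy increases, and R + T = 1 at every energy, as expected for probabilities of two mutually exclusive outcomes.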

By performing the same process, but with A = 0 instead of G = 0, corresponding equations can be found for the case of particles fired from the right (moving towards the left).

It is important to note that, since these scattering wave functions are not normalizable, they do not actually represent possible particle states. To solve this problem, one must construct normalizable linear combinations of the stationary states in a manner similar to that performed with the quantum free particle system. In this way, wave packets will occur and the actual particles will be described by the range of energies of the wave packets. Because the actual normalizable system exhibits a range of energies, the probabilities R and T should be thought of as approximate measures of reflection and transmission for particles with energies in the vicinity of E.

Finite square well

The finite square well is a system for which a particle’s V(x) = –V_0 when –a ≤ x ≤ a and its V(x) = 0 otherwise. For this system, the Schrödinger equation is given as follows for the conditions x < –a, –a ≤ x ≤ a, and x > a. Note that the equations for x < –a and x > a are the same since V(x) = 0 in both cases (but the boundary conditions will differ as will be explained soon). As with the delta-function well, the finite square well has both bound states (with E < 0) and scattering states (with E > 0). First, the bound states with E < 0 will be considered. In this case, the Schrödinger equations for the finite square well are as follows.

For the cases of x < –a and x > a where V(x) = 0, the general solutions to the Schrödinger equation are respectively Ae^(–κx) + Be^(κx) and Fe^(–κx) + Ge^(κx) where A, B, F, and G are arbitrary constants. In the x < –a case, the Ae^(–κx) term blows up as x → –∞, making this term physically invalid. As a result, the physically admissible solution is ψ(x) = Be^(κx). In the x > a case, the Ge^(κx) term blows up as x → ∞, making this term physically invalid. As a result, the physically admissible solution is ψ(x) = Fe^(–κx). For the case of –a ≤ x ≤ a, the general solution to the Schrödinger equation is ψ(x) = C sin(lx) + D cos(lx). Note that, because E must be greater than the minimum potential energy V_min = –V_0, the value of l ends up real and positive (even though E is negative). These solutions are summarized by the following equations.

Since the potential V(x) is an even function (symmetric about the y axis), one can choose to write the solutions to the wave function as either even or odd. This comes from some properties of the time-independent Schrödinger equation. Next, it is again important to constrain these solutions using the boundary conditions which require the continuity of ψ(x) and dψ/dx at ±a.

For the even solutions, the constant C in ψ(x) = C sin(lx) + D cos(lx) is zero. Because C = 0, the remaining equation is the even function ψ(x) = D cos(lx) for –a ≤ x ≤ a. So, the continuity of ψ(x) and dψ/dx at +a necessitates the following two equations to hold true. The third equation comes from dividing the second equation by the first equation to solve for κ.

For the odd solutions, the constant D in ψ(x) = C sin(lx) + D cos(lx) is zero. Because D = 0, the remaining equation is the odd function ψ(x) = C sin(lx) for –a ≤ x ≤ a. So, the continuity of ψ(x) and dψ/dx at +a necessitates the following two equations to hold true. The third equation comes from dividing the second equation by the first equation to solve for κ.

As κ and l are both functions of E, the κ = l tan(la) and κ = –l cot(la) equations can be solved for E. To do this, it is convenient to use the notation z = la and z_0 = (a/ℏ)(2mV_0)^0.5. Simplifying the κ = l tan(la) and κ = –l cot(la) equations using this notation gives the following results. These equations can be solved numerically for z or graphically for z by looking for points of intersection (after obtaining z, E is easily computed).

Let us consider the tan(z) equation. There are two limiting cases of interest. These include a well which is wide and deep and a well which is shallow and narrow. Though not included in these notes, similar calculations can be performed for the –cot(z) equation.

For a wide and deep well, the value of z0 is large. Intersections between the curves of tan(z) and ((z0/z)² – 1)^0.5 then occur just slightly below zn = nπ/2 for odd n (the even values of n arise from the –cot(z) equation for the odd solutions). This leads to the following equations which describe values of En. From this outcome, it can be seen that as V0 → ∞ the finite square well approaches the infinite square well case with an infinite number of bound states. However, for any finite square well, there are only a finite number of bound states.
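
Setting zn = la ≈ nπ/2 and using the standard definition of l (an assumption carried over from Griffiths), the energies measured from the bottom of the well can be written in LaTeX as:

```latex
E_n + V_0 \approx \frac{n^2 \pi^2 \hbar^2}{2m(2a)^2},
\qquad n = 1, 3, 5, \ldots
```

The right-hand side has the form of the infinite square well energies for a well of width 2a.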

For a shallow and narrow well, the value of z0 is small. As the value of z0 decreases, fewer and fewer bound states exist. Once z0 is smaller than π/2, there is only one bound state (which is an even bound state). Interestingly, no matter how small the well, this one bound state always persists.
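
The counting of bound states can be checked numerically. The following is a minimal sketch, not part of the original notes, which finds the roots of both the tan(z) and –cot(z) equations by bisection. The function names (`bisect`, `bound_state_z`, `energy_over_V0`) are illustrative, and the equations are multiplied through by z·cos(z) or z·sin(z) to avoid the poles of tan and cot.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bound_state_z(z0):
    """Return (z, parity) pairs solving the even and odd matching conditions."""
    def even(z):
        # tan(z) = sqrt((z0/z)^2 - 1), multiplied through by z*cos(z)
        return z * math.sin(z) - math.sqrt(z0**2 - z**2) * math.cos(z)

    def odd(z):
        # -cot(z) = sqrt((z0/z)^2 - 1), multiplied through by z*sin(z)
        return z * math.cos(z) + math.sqrt(z0**2 - z**2) * math.sin(z)

    roots = []
    k = 0
    # Each interval [k*pi/2, (k+1)*pi/2] below z0 holds exactly one root,
    # alternating between even (k even) and odd (k odd) states.
    while k * math.pi / 2 < z0:
        lo = k * math.pi / 2
        hi = min((k + 1) * math.pi / 2, z0)
        f = even if k % 2 == 0 else odd
        roots.append((bisect(f, lo, hi), "even" if k % 2 == 0 else "odd"))
        k += 1
    return roots

def energy_over_V0(z, z0):
    """E/V0 from z = l*a, using E + V0 = V0 * (z/z0)^2."""
    return (z / z0) ** 2 - 1.0
```

For example, `bound_state_z(1.0)` returns a single even state since z0 < π/2, while larger values of z0 return alternating even and odd states.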

The scattering states, which occur when E > 0, will now be considered. In this case, the Schrödinger equations for the finite square well are as follows.
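
Written out in LaTeX for the regions outside and inside the well (a reconstruction from the definition of the potential given in these notes):

```latex
\begin{aligned}
-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} &= E\psi,
& x < -a \ \text{or}\ x > a \\
-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} - V_0\psi &= E\psi,
& -a \le x \le a
\end{aligned}
```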

The general solutions to the Schrödinger equation for the finite square well’s scattering states are as follows.
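
In LaTeX, and assuming the standard wave number k for the exterior regions (a symbol not defined in this part of the notes), the general solutions take the form:

```latex
\psi(x) =
\begin{cases}
A e^{ikx} + B e^{-ikx}, & x < -a \\
C\sin(lx) + D\cos(lx), & -a \le x \le a \\
F e^{ikx} + G e^{-ikx}, & x > a
\end{cases}
\qquad
k \equiv \frac{\sqrt{2mE}}{\hbar}, \quad
l \equiv \frac{\sqrt{2m(E + V_0)}}{\hbar}
```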

But recall that in a typical scattering experiment, particles are fired from one side of the potential, the left or the right (as was the case for the delta-function potential). Here it will be assumed that the particles are fired from the left side of the well (moving towards the right). Note that similar calculations could be performed for the opposite case. With this assumption, one can see that the coefficient A represents the incident (from the left) wave’s amplitude, the coefficient B represents the reflected wave’s amplitude, and the coefficient F represents the transmitted (to the right) wave’s amplitude. Finally, the coefficient G = 0 since there is not an incident wave from the right moving towards the left.

There are four boundary conditions, continuity of ψ(x) at ±a and continuity of dψ/dx at ±a. These boundary conditions yield the following equations.
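
With G = 0, and k denoting the exterior wave number (an assumed symbol, following Griffiths), the four matching conditions can be sketched in LaTeX as:

```latex
\begin{aligned}
A e^{-ika} + B e^{ika} &= -C\sin(la) + D\cos(la)
& &(\psi \text{ at } x = -a) \\
ik\left[A e^{-ika} - B e^{ika}\right] &= l\left[C\cos(la) + D\sin(la)\right]
& &(d\psi/dx \text{ at } x = -a) \\
C\sin(la) + D\cos(la) &= F e^{ika}
& &(\psi \text{ at } x = +a) \\
l\left[C\cos(la) - D\sin(la)\right] &= ik F e^{ika}
& &(d\psi/dx \text{ at } x = +a)
\end{aligned}
```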

With the above equations, one can eliminate C and D and subsequently solve the system for B and F. This yields the equations below for B and F.
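
The results for B and F, as given in Griffiths and written here in LaTeX with the same assumed symbols k and l, are:

```latex
\begin{aligned}
B &= i\,\frac{\sin(2la)}{2kl}\left(l^2 - k^2\right) F \\
F &= \frac{e^{-2ika}\,A}{\cos(2la) - i\,\dfrac{k^2 + l^2}{2kl}\,\sin(2la)}
\end{aligned}
```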

As with the delta-function well, a transmission coefficient T = |F|²/|A|² can be computed across the finite square well. Recall that T represents the probability of the particle undergoing transmission across the well (in this case when moving from the left side to the right side). The probability of the particle undergoing reflection is R = 1 – T.

Since 1/T is given by the expression below, the probability of transmission is T = 1 whenever the sine squared term is zero.
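
Taking |F|²/|A|² from the standard expression for F and rewriting k and l in terms of E and V0 gives, in LaTeX:

```latex
T^{-1} = 1 + \frac{V_0^2}{4E(E + V_0)}
\sin^2\!\left(\frac{2a}{\hbar}\sqrt{2m(E + V_0)}\right)
```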

Recall that a sine (or sine squared) term is zero when its argument equals nπ, where n is any integer.
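
Applied to the argument of the sine in the standard expression for 1/T (assumed here to be (2a/ћ)(2m(E + V0))^0.5, as in Griffiths), this condition reads in LaTeX:

```latex
\frac{2a}{\hbar}\sqrt{2m(E_n + V_0)} = n\pi
\quad \Longrightarrow \quad
E_n + V_0 = \frac{n^2 \pi^2 \hbar^2}{2m(2a)^2}
```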

Remarkably, the above equation is the same as the one which describes the infinite square well’s energies. But realize that, for the finite square well, this only holds in the case of T = 1.
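
As a numerical illustration, the sketch below evaluates T(E) from the standard finite square well formula and lists the energies at which perfect transmission occurs. The function names and the default units ћ = m = 1 are assumptions made for this example and are not part of the original notes.

```python
import math

def transmission(E, V0, a, m=1.0, hbar=1.0):
    """Transmission coefficient T for E > 0, well depth V0, half-width a."""
    phase = (2.0 * a / hbar) * math.sqrt(2.0 * m * (E + V0))
    inv_T = 1.0 + V0**2 / (4.0 * E * (E + V0)) * math.sin(phase) ** 2
    return 1.0 / inv_T

def resonance_energies(V0, a, n_max, m=1.0, hbar=1.0):
    """(n, E_n) pairs with E_n + V0 = n^2 pi^2 hbar^2 / (2 m (2a)^2), E_n > 0."""
    out = []
    for n in range(1, n_max + 1):
        E = (n * math.pi * hbar) ** 2 / (2.0 * m * (2.0 * a) ** 2) - V0
        if E > 0:  # only scattering states (E > 0) are relevant here
            out.append((n, E))
    return out

if __name__ == "__main__":
    V0, a = 10.0, 1.0
    for n, E in resonance_energies(V0, a, n_max=5):
        print(n, E, transmission(E, V0, a))
```

Only those values of n for which En > 0 correspond to scattering states, which is why the helper discards the lower values of n.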

Reference: Griffiths, D. J., & Schroeter, D. F. (2018). Introduction to Quantum Mechanics (3rd ed.). Cambridge University Press. 10.1017/9781316995433

Cover image source: