Human Variation

“villages in a dish”

 
 

I often had dreams as a small child that the Hamburglar and Chuck E. Cheese would visit me at night to play. This wasn’t coming out of nowhere, exactly. My home in Columbus, Ohio was sandwiched between strip malls and fast food restaurants, so I imagined that these fictional characters were neighbors. I was a bit odd to say the least, but isn’t every child odd in their own unique ways?

It was around this time that I started going to school. Like a big kid, I used to say. The administrators suggested I skip kindergarten and first grade based on my scores on some aptitude test, but my mother knew better: “He has absolutely no social skills. He is going to kindergarten with kids his age.” I attended a series of private Catholic schools over the years, starting with Notre Dame elementary (1990-1992), then St. Cecilia (1992-2000), St. Charles Prep (2000-2004), and finally the University of Notre Dame (2004-2008). I have fond memories of this time, but if I were to describe it in one word, it would be homogeneous. I wasn’t comfortable being a weirdo, and I didn’t look like most of the other kids.

August 25, 1991—First day of school. I looked like a baby Newt Gingrich back then.

I was often one of a handful of minorities in the classroom or on the playground. As a Salvadoran-American high school student, I got so tired of correcting people when they called me Mexican that I eventually just started telling people I was Mexican. My nickname on the baseball team was “Mestizo.” They meant no harm, and I was generally treated well. I learned to embrace my differences during this time and entered my adult years fascinated with human variation.

We humans are characterized by an immense diversity in our physical and psychological traits. Some can run fast, others can run for long distances, a few can do both. Some can write staggering works of heartbreaking genius on paper, while others express themselves through music and dance. Some think that Alfonso Cuarón’s masterpiece Children of Men is the best film of the 21st century, while others do not, and they are wrong.

This variation doesn’t simply exist; in many ways it is also the reason that we exist. Genetic and phenotypic variation helped our ancestors survive intense selective pressures and countless existential crises, and the same can be said for every species that currently inhabits this unrelenting planet of ours.

Countless studies over the past half century have concluded that most human features are best described on a spectrum, and where you lie on that spectrum is determined by a symphony of genetic and environmental factors with a bit of randomness sprinkled in for good measure. To complicate matters further, the respective contribution of genes and environment differs for each trait in each population under investigation.

But where does all of this diversity come from? Are our differences determined at conception? At birth? At the age of 25? Somewhere in between, or all of the above? We are lucky enough to live in a time when we have access to techniques that can help us start to answer these questions (though by no means will we have all the answers in our lifetimes).

Thanks to advances in human genome sequencing, geneticists are able to perform massive genome-wide association studies (GWAS) that try to find relationships between certain genetic polymorphisms and the variance in a given human trait. These experiments have detected numerous novel variants potentially involved in complex neurological or psychological traits and their associated diseases. However, they often lack clear mechanistic links between variants and the traits under investigation. Variant discovery is far outpacing our ability to explain the underlying biology.

There is a considerable amount of evidence to suggest that many of our differences arise during the earliest stages of neurodevelopment, which we can now model using human pluripotent stem cell-derived neural cells. I am currently studying the impact of human genetic variation on phenotypic diversity using a brand-new in vitro culturing strategy known as “village-in-a-dish.” This method, which was developed as part of a long-standing collaboration between the Eggan lab and Steve McCarroll’s group at Harvard Medical School, allows us to pool stem cell lines and their derivatives from multiple human donors into the same culture environment. In doing so, we are able to minimize the technical and biological variation that often plagues multi-line comparative investigations and cut down on the financial costs of culturing hundreds of cell lines independently.

Once the mixed-donor villages are constructed, we can then challenge the cells in any way we want before running (1) low-coverage whole genome sequencing-based Census-seq analysis to infer the breakdown of the village on a donor-by-donor basis at any given time and/or (2) single-cell RNA sequencing-based Dropulation analysis to probe the transcriptional profiles of each cell followed by donor re-identification. Both analyses work only if we have prior genotype information for each donor (i.e., whole genome sequencing or SNP array data), which also comes in handy when we run expression quantitative trait loci (eQTL) detection pipelines. Using these methods, we can, for example, calculate the change in relative proliferation rates over time of the donors in a village through repeated Census-seq collections, measure the relationships between gene expression and those growth rates, and attempt to identify genetic variants that can explain each of these phenomena. Essentially, we have a system for performing a GWAS in a dish with the added bonus of generating data that can functionally link variants to cellular behaviors. It doesn’t quite get us to understanding the exact molecular mechanisms connecting a genetic variant to a complex human behavior or disease, but it is a start.
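For readers who like to see the arithmetic, here is a minimal Python sketch of how a relative proliferation rate could be derived from two Census-seq snapshots. It is not the actual Census-seq pipeline: the function name, the donor labels, and the fractions are all made up for illustration, and it assumes that the donor fractions at each timepoint have already been estimated.

```python
import math


def relative_growth_rates(fractions_t0, fractions_t1, days):
    """Per-donor log2 fold-change in village share per day.

    A positive value means the donor is expanding relative to the
    village as a whole; a negative value means it is being outgrown.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    rates = {}
    for donor, f0 in fractions_t0.items():
        f1 = fractions_t1[donor]
        if f0 <= 0 or f1 <= 0:
            raise ValueError(f"non-positive fraction for {donor}")
        rates[donor] = math.log2(f1 / f0) / days
    return rates


# Hypothetical donor fractions at day 0 and day 7 (illustrative values only)
day0 = {"donor_A": 0.25, "donor_B": 0.25, "donor_C": 0.50}
day7 = {"donor_A": 0.50, "donor_B": 0.125, "donor_C": 0.375}

rates = relative_growth_rates(day0, day7, days=7)
for donor, rate in sorted(rates.items()):
    print(f"{donor}: {rate:+.3f} log2 fold-change per day")
```

Because the fractions in a village always sum to one, these rates are relative: a donor can show a negative rate while still dividing, simply because its neighbors are dividing faster.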

I applied these new technologies to my Zika work after pondering a suggestion from a reviewer of one of my manuscripts. They asked if we had observed any differences in Zika infectivity across cell lines. I looked into the patient data and it was quite clear: some fetuses fare better than others when their mothers are infected. There are many possible reasons for this that we can’t model in a dish, but perhaps there was a genetic component that could explain a fraction of this variance. I separately infected a couple of dozen human neural progenitor cell lines with Zika and was shocked by what I saw. Some lines were completely resistant to infection, while others were demolished.

I immediately created a village of neural progenitors using my SNaP method. We ran Dropulation on the pool and detected hundreds of eQTLs. We then cross-referenced the eQTL genes to the list of top hits from my CRISPR-Cas9 Zika survival screen, which narrowed down the set of potentially functionally relevant genes to twelve. One gene, IFITM3, stood out in particular because it was one of the strongest “protectors” against infection observed in my screen. Other groups have shown similar relationships between IFITM3 and some human viruses, and concluded that greater amounts of this protein in a cell result in more protection from infection, and vice versa. In our case, it was clear that a single nucleotide polymorphism in the promoter region of this gene correlated with expression. We had a target. Now it was time to determine if donor genotype at this locus could explain differential viral susceptibility.

I infected the village with Zika, waited two days, and then FACS-sorted the cells based on their levels of viral envelope protein. We then ran Census-seq to determine the donor composition of each sorted fraction before grouping our results by IFITM3 SNP genotype. It was clear that the donors with the genotype associated with higher IFITM3 levels were significantly more resistant to infection than the donors with the allele associated with lower IFITM3 levels. We confirmed these findings by re-analyzing our individually cultured progenitor cell infectivity data grouped by genotype. We discovered that this single base pair locus (out of billions) was able to explain over 50% of the variance in this phenotype.
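To make "variance explained" concrete, here is a minimal Python sketch that computes R-squared for a simple linear regression of a phenotype on genotype dosage (0, 1, or 2 copies of an allele). The additive dosage coding is an assumption made for this example, not a description of the analysis used in the study, and the infection values are invented.

```python
def variance_explained(dosages, phenotypes):
    """R-squared from a simple linear regression of phenotype on dosage.

    With a single predictor this equals the squared Pearson correlation.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage and phenotype must both vary")
    return (sxy ** 2) / (sxx * syy)


# Hypothetical data: copies of the protective allele per donor line, and the
# fraction of that line's cells infected (illustrative values only)
protective_copies = [0, 0, 1, 1, 2, 2]
fraction_infected = [0.9, 0.7, 0.5, 0.6, 0.2, 0.1]

r2 = variance_explained(protective_copies, fraction_infected)
print(f"Variance explained by genotype: {r2:.1%}")
```

An R-squared above 0.5 from one locus is unusual for a complex trait, which is what makes a result like the one described above stand out.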

You are probably saying, “Well, I hope I have the protective allele.” Maybe you do and maybe you don’t. Only genomic sequencing can answer that question with a decent level of certainty for you. There are some clues though. Individuals of Asian and African descent rarely possess the risk allele. In fact, in some of these populations, risk allele frequency is below 1%. It is a different story, however, if you are of European descent. These populations show a risk allele frequency as high as 46%!
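An allele frequency is not the same thing as the fraction of people who carry the allele. As a rough back-of-the-envelope sketch, and assuming Hardy-Weinberg equilibrium (random mating, no selection at this locus, which real populations only approximate), the expected genotype proportions can be worked out in a few lines of Python. The function name is made up for this example.

```python
def hardy_weinberg_genotypes(risk_allele_freq):
    """Expected genotype proportions under Hardy-Weinberg equilibrium."""
    if not 0.0 <= risk_allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = risk_allele_freq      # frequency of the risk allele
    p = 1.0 - q               # frequency of the protective allele
    return {
        "no_risk_alleles": p * p,
        "one_risk_allele": 2 * p * q,
        "two_risk_alleles": q * q,
    }


# 46% is the upper-end risk allele frequency quoted above
expected = hardy_weinberg_genotypes(0.46)
carriers = 1.0 - expected["no_risk_alleles"]

for genotype, proportion in expected.items():
    print(f"{genotype}: {proportion:.1%}")
print(f"at least one risk allele: {carriers:.1%}")
```

Under these idealized assumptions, a population with a 46% risk allele frequency would have roughly 71% of individuals carrying at least one copy of the risk allele and about 21% carrying two.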

How could there be such high variability in allele frequency across populations? This is a bit out of my area of expertise, but it is fun to speculate that over countless generations the protective allele was selected for in populations that inhabited areas where mosquito-borne flaviviruses (Zika, West Nile, Yellow Fever, etc.) are endemic. The mosquitoes that transmit these types of viruses do not live (and most likely have never lived) in Norway or Sweden, but they are common in Central Africa and East Asia. As a result, that selective pressure may never have existed in northern European populations. That could change, of course, as increased globalization and climate change continue to drive these mosquito vectors further north into Europe and the United States.

What will it mean for us when these viruses are introduced to naïve populations with high risk allele frequencies?