Look-alikes identified by facial recognition algorithms show genetic similarities despite being unrelated

In a recent study published in Cell Reports, researchers used facial recognition (FR) algorithms to identify look-alike humans who were not genetically related and then characterized them with multi-omics analyses.

Study: Lookalike humans identified by facial recognition algorithms show genetic similarities. Image Credit: The Faces/Shutterstock


Historically, research on facial morphology focused on craniofacial abnormalities. Nowadays, smartphones and closed-circuit television (CCTV) cameras use facial recognition software, which has sparked interest in normal facial variation.

Because suitable human samples are difficult to find, studies have not sufficiently characterized unrelated humans who objectively share facial features. Once identified, such a set of individuals could help determine how much genomics, epigenomics, and microbiomics each contribute to resemblance between humans.

About the study

In the current study, researchers obtained portraits of 32 look-alike pairs photographed by François Brunelle, a Canadian artist. First, they determined an objective measure of “likeness” for these look-alike pairs. To do so, they applied three facial recognition methods, which yielded varying results: the custom deep convolutional neural network Custom-Net, the MatConvNet algorithm, and the Microsoft Oxford Project face API.

As a control for high similarity scores, the team ran the facial recognition algorithms on photographs of monozygotic twins obtained from the University of Notre Dame’s twin database. The researchers then performed multi-omics analyses on deoxyribonucleic acid (DNA) extracted from the saliva of the 32 look-alike individuals (16 pairs) to obtain three levels of biological information:

i) genomics,

ii) epigenomics, and

iii) microbiomics.

For genomic information, the researchers used a single nucleotide polymorphism (SNP) microarray that interrogated 4,327,108 genetic variants selected from the International HapMap and 1000 Genomes projects and targeted genetic variation down to a minor allele frequency (MAF) of 1%. Additionally, they assessed the enrichment of facial genes among 19,277 SNPs from 3,730 genes by applying a hypergeometric test and a Monte Carlo simulation with 10,000 iterations.
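To make the two enrichment tests more concrete, here is a minimal Python sketch of how a hypergeometric upper-tail p-value and an empirical Monte Carlo p-value can be computed for a gene set. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the random seed, and the toy gene counts are hypothetical and are not taken from the study.

```python
import math
import random


def hypergeom_upper_tail(total_genes, facial_genes, selected, observed):
    """P(X >= observed) when `selected` genes are drawn without replacement
    from `total_genes`, of which `facial_genes` are annotated as facial."""
    denominator = math.comb(total_genes, selected)
    upper = min(selected, facial_genes)
    numerator = sum(
        math.comb(facial_genes, i) * math.comb(total_genes - facial_genes, selected - i)
        for i in range(observed, upper + 1)
    )
    return numerator / denominator


def monte_carlo_p(total_genes, facial_genes, selected, observed,
                  iterations=10_000, seed=0):
    """Empirical p-value: share of random gene sets that contain at least
    `observed` facial genes (with the usual +1 correction)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        # Genes are labelled 0..total_genes-1; the first `facial_genes`
        # labels are arbitrarily treated as the facial genes.
        drawn = rng.sample(range(total_genes), selected)
        if sum(1 for gene in drawn if gene < facial_genes) >= observed:
            hits += 1
    return (hits + 1) / (iterations + 1)


# Toy numbers (hypothetical, NOT the study's counts).
p_exact = hypergeom_upper_tail(total_genes=200, facial_genes=40, selected=30, observed=15)
p_empirical = monte_carlo_p(total_genes=200, facial_genes=40, selected=30, observed=15)
print(p_exact, p_empirical)
```

With 10,000 iterations, the smallest empirical p-value this sketch can report is 1/10,001, which is why simulation-based p-values are usually given as an upper bound rather than an exact figure.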

For epigenetic analyses, the researchers used a DNA methylation microarray that assessed more than 850,000 5′-cytosine-phosphate-guanine-3′ (CpG) sites. They calculated absolute age differences within each of the 16 look-alike pairs for both chronological and epigenetic age, that is, age according to date of birth and age according to the DNA methylation clock, respectively.

For microbiome analysis, they performed direct ribosomal ribonucleic acid (rRNA) sequencing. Finally, the team examined 68 biometric and lifestyle attributes for all look-alike pairs.

Study results

The authors noted that all three FR systems grouped together 16 of the 32 look-alike pairs (50%) and that, according to MatConvNet, the similarity scores of these pairs were close to those obtained from monozygotic twins. Conversely, among the 16 look-alike pairs that the three FR networks did not group together, only one pair (6.2%) clustered together. Genomic analyses revealed that nine of the 16 look-alike pairs (56.2%) clustered together in an unsupervised clustering heatmap with bootstrapping; the authors therefore considered them “ultra” look-alikes.

Principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) showed that these “ultra” look-alikes also resembled each other in their genotypes. In terms of population stratification, 13 of the 16 look-alike pairs were of European ancestry, one was East Asian, one Hispanic, and one South-Central Asian. Seven of the 13 European-ancestry look-alike pairs did not cluster genetically, pointing to sources other than ancestry for the genetic variation shared between look-alike pairs.

The number of SNP positions shared within look-alike pairs was significantly higher than the number shared by random pairs of non-look-alikes. In none of the Monte Carlo iterations did a random gene set contain more facial genes than the set represented by the 19,277 selected SNPs. The 1,794 facial genes in the selection of 19,277 SNPs constituted 26% of all facial genes in the study array (hypergeometric test p: 6.31e−172; empirical Monte Carlo p

The aging process alters facial morphology, and DNA methylation, an indicator of biological age, may or may not be directly related to chronological age. It is, however, associated with genetic variation in humans. Despite evidence of epigenetic variation in human populations, only one look-alike pair clustered together by DNA methylation. This pair also clustered by SNPs, suggesting that their similar epigenetic profile was likely due to their underlying shared genetics, an interpretation also supported by the analysis of CpGs near SNPs.

Moreover, the “ultra” look-alike pairs showed similar epigenetics. Three of the nine “ultra” look-alike pairs clustered when the analysis was restricted to CpG sites within a +100 base pair window of the 19,277 SNPs. Thus, DNA methylation, as a marker of biological age and of methylation quantitative trait loci (meQTLs), might also reflect phenotypic similarities in “ultra” look-alikes.

Regarding alpha diversity, based on the types of bacteria in the oral samples, only one look-alike pair clustered together. However, this pair did not cluster in SNP genotyping. Quantitatively, based on the abundance of bacterial strains in the samples, again only one look-alike pair clustered together (6.25%, 1/16). Although the microbiome analysis grouped only one look-alike pair, the “ultra” look-alike pairs had similar weights, which is relevant because microbiome composition could be linked to obesity.
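The article does not state which diversity metric the study used. Purely as an illustration of what alpha diversity measures within a single sample, the sketch below computes two common measures, richness (the number of taxa present) and the Shannon index, from per-taxon read counts. The function names and the example counts are hypothetical.

```python
import math


def richness(counts):
    """Number of taxa with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa present in the sample."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts per bacterial taxon for two saliva samples.
sample_a = [120, 80, 40, 10, 0]
sample_b = [240, 5, 3, 2, 0]

print(richness(sample_a), shannon_index(sample_a))
print(richness(sample_b), shannon_index(sample_b))
```

The two toy samples contain the same number of taxa, but the second is dominated by one taxon and therefore has a lower Shannon index. This is the difference between looking only at which bacteria are present and also taking their abundance into account.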


Taken together, the study findings support an important role of genetic, epigenetic, and microbiome components in determining human facial features. Curiously, the same biological determinants are responsible for human physical and behavioral attributes. Thus, the study data could provide a molecular basis for future applications in the fields of biomedicine, evolution, and forensics. However, future studies would require collaborative efforts to predict human facial structure based on the multi-omics landscape of individuals.

Sharon D. Cole