Single Cell Experiments: How to Detect Impurities
I shared it here first.
A few months ago, we received an excellent question from a few colleagues. Here are some resources that might help you understand some of the ideas behind single-cell experimental designs.
Barnyard metrics
Multi-species mixing experiments, also known as barnyard experiments, are used to test for impurities in single-cell encapsulation. You can think of a barnyard as a quality check for single-cell encapsulation (and cell integrity). Usually, cells from two different species, typically human and mouse, are used.
When do you design a barnyard experiment?
When you invent or develop a droplet encapsulation technology (e.g., Drop-seq, 10x Chromium, PIP-seq), you usually need this kind of validation. Moreover, if you modify an established protocol, especially if you are worried that the change might cause cell breakage or lysis, you might want to run standard barnyard metrics as well.
How exactly does it work?
Cells of different species are mixed in a 1:1 ratio before single-cell encapsulation. Later, the reads are usually aligned to a hybrid (e.g., human and mouse together) reference genome. A cell barcode whose reads map substantially to both species points to impurities/crosstalk/doublets/multiplets, so you can detect them easily.
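To make the idea concrete, here is a minimal Python sketch of how barcodes could be labeled after hybrid alignment. It assumes you already have per-barcode UMI counts split by species; the function names, the 90% purity cutoff, and the 100-UMI minimum are illustrative assumptions, not values from any specific pipeline. The multiplet estimate is a first-order correction: same-species doublets are invisible in a barnyard, so the observed mixed fraction is divided by the share of doublets expected to be cross-species under random pairing.

```python
from collections import Counter


def classify_barcode(human_umis, mouse_umis, purity=0.9, min_umis=100):
    """Label one cell barcode from its species-specific UMI counts.

    purity and min_umis are hypothetical cutoffs chosen for illustration.
    """
    total = human_umis + mouse_umis
    if total < min_umis:
        return "low_count"  # too few UMIs to make a call
    human_fraction = human_umis / total
    if human_fraction >= purity:
        return "human"
    if human_fraction <= 1 - purity:
        return "mouse"
    return "mixed"  # substantial signal from both species


def estimate_multiplet_rate(labels):
    """Rough total multiplet rate from the observed mixed-species fraction.

    Assumes cells pair at random, so a fraction 2*p*(1-p) of doublets is
    cross-species (visible); the rest are same-species and go unnoticed.
    """
    counts = Counter(labels)
    n_human, n_mouse, n_mixed = counts["human"], counts["mouse"], counts["mixed"]
    called = n_human + n_mouse + n_mixed
    if called == 0 or n_human == 0 or n_mouse == 0:
        return None  # cannot estimate without both species present
    p_human = n_human / (n_human + n_mouse)
    visible_share = 2 * p_human * (1 - p_human)
    return min(1.0, (n_mixed / called) / visible_share)


# Toy (human_umis, mouse_umis) pairs, made up for illustration only.
toy_counts = [(950, 20), (15, 1200), (600, 550), (30, 10), (2000, 40), (25, 900)]
labels = [classify_barcode(h, m) for h, m in toy_counts]
print(labels)
print(estimate_multiplet_rate(labels))
```

With a true 1:1 mix, the visible share is 0.5, which is why the total multiplet rate is often approximated as twice the observed mixed-species rate.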
However, RNA leakage prior to encapsulation (called ambient RNA, which might come from stressed or less viable cells) or a heterogeneous cell population (e.g., smaller cells captured together with bigger cells) might mislead such an analysis.
Read length, sequencing depth, and library complexity matter too: short reads or shallow sequencing might not provide enough resolution for multi-species RNA-seq alignments (please check out the further reading). Please make sure that you actually need such an experiment. Apart from this, there are tools that estimate doublets based on the unique genes/reads detected per cell barcode.
(A little self-criticism) As molecular biologists, we sometimes prefer established technologies/methods over newer approaches simply to follow the methods from previous papers, which might cause us to miss alternative options.
So, what might be the alternative?
Cell Hashing
Yup, there are groups of smart people who keep coming up with excellent ideas and advancing the field. It is like magic (good science is hard to distinguish from magic, huh?).
Basically, you label each sample with hashtag oligos (HTOs) before mixing the populations. Instead of optimizing the workflow for multi-species mixing and downstream hybrid alignment, you take advantage of hashing the cells beforehand (so that they already carry unique sample barcodes). Then, you demultiplex the supposedly single cells encapsulated in the droplets by their HTOs, which lets you detect doublets or multiplets in general. It might also be useful for distinguishing low-quality cells from ambient RNA.
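The demultiplexing logic can be sketched in a few lines of Python. This is a deliberately simplified illustration, assuming you already have per-cell HTO counts and a fixed count threshold per hashtag; the function name, the HTO names, and the thresholds are all hypothetical. Real tools such as Seurat's HTODemux derive the thresholds from the data instead of hard-coding them.

```python
def call_hashtags(hto_counts, thresholds):
    """Classify one cell barcode from its hashtag oligo (HTO) counts.

    hto_counts: dict mapping HTO name -> count for this cell barcode.
    thresholds: dict mapping HTO name -> minimum count to call it positive
                (hypothetical fixed values; real tools fit these from data).
    """
    positive = [name for name, count in hto_counts.items()
                if count >= thresholds[name]]
    if not positive:
        return ("negative", positive)  # no hashtag: empty droplet or low quality
    if len(positive) == 1:
        return ("singlet", positive)  # exactly one sample of origin
    return ("multiplet", positive)  # cells from two or more samples


# Toy data, made up for illustration only.
thresholds = {"HTO1": 50, "HTO2": 50, "HTO3": 50}
cells = {
    "AAAC": {"HTO1": 300, "HTO2": 4, "HTO3": 7},
    "CCGT": {"HTO1": 210, "HTO2": 180, "HTO3": 3},
    "GGTA": {"HTO1": 2, "HTO2": 5, "HTO3": 1},
}
for barcode, counts in cells.items():
    print(barcode, call_hashtags(counts, thresholds))
```

Note that, like a barnyard, hashing only exposes multiplets formed by cells from different samples; two cells carrying the same hashtag still look like a singlet.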
Want to learn more? This approach is also used to reduce cost, because you can combine multiple conditions/cell types/samples in one 10x load (let’s say 3k cells for three different conditions, unless you are modifying the reaction mixture for each separately). Furthermore, CITE-seq was originally developed to detect cell surface markers and relate them to gene expression in a multimodal experimental setup using antibody-derived tags (ADTs).
I want to expand the content of this post to total-RNA-seq discussions as well (not sure when yet, but stay tuned!).
Further Reading/References:
- Estimating the frequency of multiplets in single-cell RNA sequencing from cell-mixing experiments, https://peerj.com/articles/5578/#results
- Ambient RNA can come from RNA released earlier due to stress or apoptosis (if not from a wetting failure); one way to limit it is to start with highly viable cells before mixing the reagents for loading onto the chip, https://www.10xgenomics.com/blog/faqs-about-single-cell-sample-preparation-covering-the-basics
- Best practices on the differential expression analysis of multi-species RNA-seq, https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02337-8
- Optimised Splitting of Mixed-Species RNA Sequencing Data, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9081140/
- Does Cellranger DNA support multi-species (barnyard) experiments?, https://kb.10xgenomics.com/hc/en-us/articles/360012401791-Does-Cellranger-DNA-support-multi-species-barnyard-experiments
- Mixing mouse and human 10x single cell RNAseq data, https://divingintogeneticsandgenomics.com/post/mixing-mouse-and-human-10x-single-cell-rnaseq-data/
- Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics, https://genomebiology.biomedcentral.com/articles/10.1186/s13059-018-1603-1
- You might have heard of CITE-seq but not be fully aware of its advantages: https://cite-seq.com/cell-hashing/
- Simultaneous epitope and transcriptome measurement in single cells, https://www.nature.com/articles/nmeth.4380
- How to incorporate cell hashing in the data analysis workflow: https://satijalab.org/seurat/articles/hashing_vignette.html
- Where to find commercial antibodies for cell hashing: https://www.biolegend.com/en-ie/protocols/totalseq-a-dual-index-protocol