
Statistical algorithms for extracting biological insights from single-cell RNA-seq and spatial transcriptomics data


We develop algorithms that use statistical models and statistical tests to find significant expression patterns in high-throughput data. These patterns may indicate biological processes that are active in the studied tissue or cells.

One of these algorithms is SPIRAL, which relies on Gaussian statistics to detect structures in single-cell, bulk, and spatial transcriptomics data. Each structure is a group of genes that are active simultaneously in a specific population of cells or spots. Together, these structures offer a comprehensive view of the tissue biology. SPIRAL is available at https://spiral.technion.ac.il/ (the website is currently accessible only from within the Technion).

A second algorithm uses the minimum-hypergeometric (mHG) test to find activated microRNAs in single-cell or spatial data. Since microRNAs are known to down-regulate their targets, we predict a microRNA’s level of activity from the expression pattern of its targets. Because most current gene expression experiments do not measure the expression of microRNAs themselves, information about their level of activity can be highly insightful and contribute to our understanding of the pathways activated in the tissue.
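As an illustration of the idea, here is a minimal Python sketch of the standard minimum-hypergeometric statistic applied to a ranked gene list. It is not the group's implementation. It assumes that genes are ranked from lowest to highest expression (so that down-regulated targets concentrate at the top) and that each gene carries a 0/1 flag marking it as a target of the microRNA. The function names `hypergeom_tail` and `mhg_statistic` are hypothetical. Note that the returned minimum is a test statistic, not a p-value: it still needs correction for having scanned all cutoffs.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n genes are drawn from N genes, B of which are targets."""
    total = comb(N, n)
    hits = sum(comb(B, i) * comb(N - B, n - i) for i in range(b, min(n, B) + 1))
    return hits / total


def mhg_statistic(is_target):
    """Minimum hypergeometric tail probability over all cutoffs of a ranked list.

    is_target: list of 0/1 flags, ordered from the lowest-expressed gene to
    the highest-expressed gene (an assumption of this sketch).
    Returns (minimum tail probability, cutoff at which it is attained).
    """
    N = len(is_target)
    B = sum(is_target)
    best_p, best_n = 1.0, 0
    b = 0  # number of targets seen among the top n genes
    for n, flag in enumerate(is_target, start=1):
        b += flag
        p = hypergeom_tail(N, B, n, b)
        if p < best_p:
            best_p, best_n = p, n
    return best_p, best_n


# Toy ranked list: 10 genes, 4 targets, mostly among the lowest-expressed genes.
ranked_flags = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
statistic, cutoff = mhg_statistic(ranked_flags)
print(statistic, cutoff)
```

A small statistic suggests that the targets are concentrated among the lowest-expressed genes, which is consistent with an active microRNA in that cell or spot.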

We also developed piHG: a statistical test for evaluating the partial agreement of sets. In biology, the task of merging sets of items is very common: merging lists of proteins from several replicates of an experiment, or merging lists of genes reported by different papers. However, in the absence of statistical evaluation protocols, the merging rule is usually chosen arbitrarily. piHG evaluates the significance of the size of a partial intersection set, formed by assembling all items that are members of at least s out of n sets, and advises the user on the statistically preferred merging option for the specific data set.
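To make the notion of a partial intersection concrete, the following Python sketch builds the set of items that appear in at least s out of n input sets. It covers only the set construction, not the piHG significance test itself. The function name `partial_intersection` and the toy replicate lists are hypothetical.

```python
from collections import Counter


def partial_intersection(sets, s):
    """Return all items that are members of at least s of the given sets."""
    # Count, for every item, the number of sets that contain it.
    counts = Counter(item for group in sets for item in set(group))
    return {item for item, count in counts.items() if count >= s}


# Toy example: protein lists from three replicates of an experiment.
replicates = [
    {"A", "B", "C"},
    {"B", "C", "D"},
    {"C", "D", "E"},
]

for s in range(1, len(replicates) + 1):
    print(s, sorted(partial_intersection(replicates, s)))
```

With s = 1 the result is the union of all sets, and with s = n it is the strict intersection. The intermediate values of s are the merging options between these two extremes.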

