27. Neighborhood analysis
27.1. Motivation
After annotating cell types or cell states in the dataset (or spots, depending on the technology at hand), we can quantify whether such annotations are spatially enriched and analyze cellular neighborhoods across the tissue.
Cellular neighborhood analysis is a good starting point for various downstream tasks as it can help to understand the cellular composition of the tissue and identify candidates for more in-depth analysis. For example, it can help to find candidates for cell-cell communication based on spatial proximity, or spatial regions and clusters for identification of spatially variable genes.
Neighborhood analysis is often performed through spatial statistics [Gelfand et al., 2010], which are quantitative scores that can be used to identify spatial neighborhoods in the tissue. Here, we’ll take a look at various spatial statistics implemented in Squidpy [Palla et al., 2022].
27.2. Environment setup and data
We first load the respective packages needed in this tutorial and the dataset.
import scanpy as sc
import squidpy as sq
sc.settings.verbosity = 3
sc.settings.set_figure_params(dpi=80, facecolor="white")
The dataset used in this tutorial consists of one tissue slide from one mouse and is provided by 10x Genomics Space Ranger 1.1.0. The dataset was pre-processed in Squidpy, which provides a loading function for it.
adata = sq.datasets.visium_hne_adata()
27.3. Identifying interactions between spatial communities
To quantify whether annotations are spatially enriched, we can compute a neighborhood enrichment score, which helps us identify clusters that are neighbors in the tissue of interest. In short, it is an enrichment score on the spatial proximity of clusters: if observations (cells or spots) belonging to one cluster are often close to observations belonging to another cluster, the pair receives a high score and is considered enriched. Conversely, if they are far apart and therefore seldom neighbors, the score is low and the pair is considered depleted. The score is computed with a permutation test, and you can set the number of permutations with the n_perms argument (default is 1000).
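To make the permutation idea concrete, here is a minimal NumPy sketch on a toy graph. It is not Squidpy's implementation: the function name nhood_enrichment_sketch, the toy line-shaped "tissue", and the choice of 200 permutations are all assumptions made for illustration. It counts edges between each pair of clusters, repeats the count after shuffling the labels, and standardizes the observed count against the shuffled ones.

```python
import numpy as np


def nhood_enrichment_sketch(adj, labels, n_clusters, n_perms=200, seed=0):
    """Toy permutation z-score for cluster-cluster neighbor counts (illustrative only)."""
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(adj)  # one (row, col) pair per directed edge

    def count_pairs(lab):
        counts = np.zeros((n_clusters, n_clusters))
        # Add 1 to counts[a, b] for every edge from cluster a to cluster b.
        np.add.at(counts, (lab[rows], lab[cols]), 1)
        return counts

    observed = count_pairs(labels)
    # Null distribution: shuffle labels over observations, keep the graph fixed.
    perms = np.stack([count_pairs(rng.permutation(labels)) for _ in range(n_perms)])
    std = perms.std(axis=0)
    zscore = (observed - perms.mean(axis=0)) / np.where(std > 0, std, np.nan)
    return observed, zscore


# Hypothetical toy tissue: 12 observations on a line, each linked to its direct neighbors.
n_obs = 12
adj = np.zeros((n_obs, n_obs), dtype=int)
idx = np.arange(n_obs - 1)
adj[idx, idx + 1] = 1
adj[idx + 1, idx] = 1
labels = np.array([0] * 6 + [1] * 6)  # two spatially separated clusters

observed, zscore = nhood_enrichment_sketch(adj, labels, n_clusters=2)
print(observed)         # raw neighbor counts per cluster pair
print(np.sign(zscore))  # positive: enriched, negative: depleted
```

In this toy example the two clusters touch at a single edge, so the within-cluster entries come out enriched and the between-cluster entries depleted.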
Since the function works on a spatial connectivity matrix (spatial graph), we need to compute that as well. This can be done with squidpy.gr.spatial_neighbors().
sq.gr.spatial_neighbors(adata)
Creating graph using `grid` coordinates and `None` transform and `1` libraries.
Adding `adata.obsp['spatial_connectivities']`
`adata.obsp['spatial_distances']`
`adata.uns['spatial_neighbors']`
Finish (0:00:01)
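The log above reports two matrices, a connectivity matrix and a distance matrix. As a rough illustration of what such a pair can look like, the sketch below builds both by hand for a small square grid, connecting observations that lie within a fixed radius of each other. This is an assumption-laden stand-in, not how Squidpy constructs its graph: the helper radius_graph_sketch, the 3 x 3 grid, and the radius of 1.0 are all hypothetical.

```python
import numpy as np


def radius_graph_sketch(coords, radius):
    """Connect every pair of distinct observations closer than `radius` (illustrative only)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    # 1 where two different observations are within the radius, 0 elsewhere.
    connectivities = ((dist <= radius) & (dist > 0)).astype(int)
    # Keep distances only for connected pairs.
    distances = dist * connectivities
    return connectivities, distances


# Hypothetical 3 x 3 square grid with unit spacing.
xs, ys = np.meshgrid(np.arange(3), np.arange(3))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)

conn, dists = radius_graph_sketch(coords, radius=1.0)
print(conn.sum(axis=1))  # number of neighbors per observation
```

The center of the grid ends up with four neighbors and each corner with two, which is the kind of structure the downstream statistics operate on.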
We can now run the neighborhood enrichment test by providing the annotation key in adata.obs.
sq.gr.nhood_enrichment(adata, cluster_key="cluster")
Calculating neighborhood enrichment using `1` core(s)
Adding `adata.uns['cluster_nhood_enrichment']`
Finish (0:00:00)
The method added adata.uns['cluster_nhood_enrichment'] to our AnnData object.
adata.uns["cluster_nhood_enrichment"]
{'zscore': array([[ 70.22155762, -13.6041789 , -0.4258425 , 2.94739732,
-9.03708733, -12.44368289, -8.94699794, -12.42056733,
-9.76328672, -8.53570662, -5.34106034, -6.81758862,
-10.46793569, -13.79518632, -11.85984283],
[-13.6041789 , 74.34650092, -6.37460975, -9.11260235,
1.92787693, -12.0461501 , -11.90907953, -11.42070949,
-8.79949063, -7.03693412, -5.27623436, -6.33541649,
-3.39758871, -13.09498141, -11.02100047],
[ -0.4258425 , -6.37460975, 71.02312621, -10.2557408 ,
-8.38252823, -3.56904737, -11.60339076, -11.17301353,
-8.63097371, -8.04109503, -4.88912876, -6.29768401,
-7.40871581, -12.6590886 , -10.5828295 ],
[ 2.94739732, -9.11260235, -10.2557408 , 57.83928709,
14.41213846, -9.54187862, -9.20031921, -9.36357782,
-7.19965825, -6.49172687, -4.07412227, -5.14772134,
-7.55767703, -10.34784232, -8.81768536],
[ -9.03708733, 1.92787693, -8.38252823, 14.41213846,
36.34406673, -8.27754053, -7.55077944, -8.38725173,
-6.28914454, -1.78180094, -3.50081508, -4.46780626,
-6.80187795, -9.09205304, -7.97723212],
[-12.44368289, -12.0461501 , -3.56904737, -9.54187862,
-8.27754053, 56.46317761, -1.65912586, -9.83917564,
3.34500357, 4.76287001, -4.4421212 , -6.07070432,
-4.06673556, -9.01905311, -7.63068773],
[ -8.94699794, -11.90907953, -11.60339076, -9.20031921,
-7.55077944, -1.65912586, 56.52983503, -10.53203586,
-6.95861812, -2.59374816, 26.16381548, 13.58097471,
-8.86853162, -11.81470655, -10.82483187],
[-12.42056733, -11.42070949, -11.17301353, -9.36357782,
-8.38725173, -9.83917564, -10.53203586, 78.38442071,
-3.86826297, -4.37445824, -4.5991668 , -5.83739543,
-8.6848676 , -11.53420802, -6.35636122],
[ -9.76328672, -8.79949063, -8.63097371, -7.19965825,
-6.28914454, 3.34500357, -6.95861812, -3.86826297,
61.62040415, -4.40694372, -3.54475788, -2.20267766,
-6.88126696, -1.30472155, -2.3125057 ],
[ -8.53570662, -7.03693412, -8.04109503, -6.49172687,
-1.78180094, 4.76287001, -2.59374816, -4.37445824,
-4.40694372, 42.4700539 , -3.1814847 , -4.0503999 ,
4.9198482 , -8.01018986, -4.18579912],
[ -5.34106034, -5.27623436, -4.88912876, -4.07412227,
-3.50081508, -4.4421212 , 26.16381548, -4.5991668 ,
-3.54475788, -3.1814847 , 45.15843296, -1.28213058,
-3.87010332, -4.99305736, -4.42082271],
[ -6.81758862, -6.33541649, -6.29768401, -5.14772134,
-4.46780626, -6.07070432, 13.58097471, -5.83739543,
-2.20267766, -4.0503999 , -1.28213058, 64.52229328,
-4.87520289, -5.13157661, -5.494849 ],
[-10.46793569, -3.39758871, -7.40871581, -7.55767703,
-6.80187795, -4.06673556, -8.86853162, -8.6848676 ,
-6.88126696, 4.9198482 , -3.87010332, -4.87520289,
72.72449982, -10.10115791, -8.53496568],
[-13.79518632, -13.09498141, -12.6590886 , -10.34784232,
-9.09205304, -9.01905311, -11.81470655, -11.53420802,
-1.30472155, -8.01018986, -4.99305736, -5.13157661,
-10.10115791, 80.27730927, -3.61634144],
[-11.85984283, -11.02100047, -10.5828295 , -8.81768536,
-7.97723212, -7.63068773, -10.82483187, -6.35636122,
-2.3125057 , -4.18579912, -4.42082271, -5.494849 ,
-8.53496568, -3.61634144, 67.62254826]]),
'count': array([[1352, 1, 145, 128, 4, 4, 40, 0, 0, 0, 0,
0, 0, 0, 0],
[ 1, 1290, 67, 9, 87, 0, 0, 0, 0, 7, 0,
0, 55, 0, 0],
[ 145, 67, 1140, 0, 2, 83, 3, 0, 0, 0, 0,
0, 18, 0, 0],
[ 128, 9, 0, 678, 140, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0],
[ 4, 87, 2, 140, 330, 0, 6, 0, 0, 20, 0,
0, 0, 0, 0],
[ 4, 0, 83, 0, 0, 878, 92, 7, 90, 84, 1,
0, 43, 35, 24],
[ 40, 0, 3, 0, 6, 92, 872, 0, 13, 34, 128,
104, 0, 3, 0],
[ 0, 0, 0, 0, 0, 7, 0, 1080, 31, 19, 0,
0, 0, 0, 29],
[ 0, 0, 0, 0, 0, 90, 13, 31, 544, 7, 0,
10, 0, 64, 39],
[ 0, 7, 0, 0, 20, 84, 34, 19, 7, 314, 0,
0, 63, 1, 17],
[ 0, 0, 0, 0, 0, 1, 128, 0, 0, 0, 120,
3, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 104, 0, 10, 0, 3,
280, 0, 9, 0],
[ 0, 55, 18, 0, 0, 43, 0, 0, 0, 63, 0,
0, 736, 0, 0],
[ 0, 0, 0, 0, 0, 35, 3, 0, 64, 1, 0,
9, 0, 1382, 72],
[ 0, 0, 0, 0, 0, 24, 0, 29, 39, 17, 0,
0, 0, 72, 902]], dtype=uint32)}
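The added object contains two arrays. The first, stored under zscore, contains the enrichment z-score for each pair of clusters. The second, stored under count, contains the enrichment count for each pair of clusters, that is, the number of neighboring observations observed between them in the spatial graph.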
Finally, we’ll directly visualize the results with squidpy.pl.nhood_enrichment().
From the above plot, we can see that there seems to be an enrichment for clusters of the Pyramidal_layer and Dentate_gyrus. By looking at the spatial scatterplot above, we can confirm that these clusters are indeed “neighbors” as their members are often close.
A similar approach to this problem is to compute what we call an interaction matrix, that is, the sum of all connecting observations between clusters in the tissue. The approach is related to the neighborhood enrichment analysis, yet it is not a test and should be viewed as a simple summary statistic of the spatial graph. Let’s take a look at what the interaction matrix looks like for the dataset.
sq.gr.interaction_matrix(adata, cluster_key="cluster")
Adding `adata.uns['cluster_interactions']`
The function added adata.uns['cluster_interactions'] to our AnnData object, which contains the number of interactions between two clusters with respect to the provided spatial connectivities graph.
adata.uns["cluster_interactions"]
array([[1.352e+03, 1.000e+00, 1.450e+02, 1.280e+02, 4.000e+00, 4.000e+00,
4.000e+01, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00,
0.000e+00, 0.000e+00, 0.000e+00],
[1.000e+00, 1.290e+03, 6.700e+01, 9.000e+00, 8.700e+01, 0.000e+00,
0.000e+00, 0.000e+00, 0.000e+00, 7.000e+00, 0.000e+00, 0.000e+00,
5.500e+01, 0.000e+00, 0.000e+00],
[1.450e+02, 6.700e+01, 1.140e+03, 0.000e+00, 2.000e+00, 8.300e+01,
3.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00,
1.800e+01, 0.000e+00, 0.000e+00],
[1.280e+02, 9.000e+00, 0.000e+00, 6.780e+02, 1.400e+02, 0.000e+00,
0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00,
0.000e+00, 0.000e+00, 0.000e+00],
[4.000e+00, 8.700e+01, 2.000e+00, 1.400e+02, 3.300e+02, 0.000e+00,
6.000e+00, 0.000e+00, 0.000e+00, 2.000e+01, 0.000e+00, 0.000e+00,
0.000e+00, 0.000e+00, 0.000e+00],
[4.000e+00, 0.000e+00, 8.300e+01, 0.000e+00, 0.000e+00, 8.780e+02,
9.200e+01, 7.000e+00, 9.000e+01, 8.400e+01, 1.000e+00, 0.000e+00,
4.300e+01, 3.500e+01, 2.400e+01],
[4.000e+01, 0.000e+00, 3.000e+00, 0.000e+00, 6.000e+00, 9.200e+01,
8.720e+02, 0.000e+00, 1.300e+01, 3.400e+01, 1.280e+02, 1.040e+02,
0.000e+00, 3.000e+00, 0.000e+00],
[0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 7.000e+00,
0.000e+00, 1.080e+03, 3.100e+01, 1.900e+01, 0.000e+00, 0.000e+00,
0.000e+00, 0.000e+00, 2.900e+01],
[0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 9.000e+01,
1.300e+01, 3.100e+01, 5.440e+02, 7.000e+00, 0.000e+00, 1.000e+01,
0.000e+00, 6.400e+01, 3.900e+01],
[0.000e+00, 7.000e+00, 0.000e+00, 0.000e+00, 2.000e+01, 8.400e+01,
3.400e+01, 1.900e+01, 7.000e+00, 3.140e+02, 0.000e+00, 0.000e+00,
6.300e+01, 1.000e+00, 1.700e+01],
[0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 1.000e+00,
1.280e+02, 0.000e+00, 0.000e+00, 0.000e+00, 1.200e+02, 3.000e+00,
0.000e+00, 0.000e+00, 0.000e+00],
[0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00,
1.040e+02, 0.000e+00, 1.000e+01, 0.000e+00, 3.000e+00, 2.800e+02,
0.000e+00, 9.000e+00, 0.000e+00],
[0.000e+00, 5.500e+01, 1.800e+01, 0.000e+00, 0.000e+00, 4.300e+01,
0.000e+00, 0.000e+00, 0.000e+00, 6.300e+01, 0.000e+00, 0.000e+00,
7.360e+02, 0.000e+00, 0.000e+00],
[0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 3.500e+01,
3.000e+00, 0.000e+00, 6.400e+01, 1.000e+00, 0.000e+00, 9.000e+00,
0.000e+00, 1.382e+03, 7.200e+01],
[0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 2.400e+01,
0.000e+00, 2.900e+01, 3.900e+01, 1.700e+01, 0.000e+00, 0.000e+00,
0.000e+00, 7.200e+01, 9.020e+02]])
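The values printed here match the count array returned by the neighborhood enrichment step, which is consistent with the interaction matrix being a plain tally of graph edges per cluster pair. A minimal sketch of such a tally, assuming a binary adjacency matrix and integer cluster labels (the helper interaction_matrix_sketch and the six-observation toy graph are hypothetical and not taken from Squidpy), could look like this:

```python
import numpy as np


def interaction_matrix_sketch(adj, labels, n_clusters):
    """Count graph edges between every pair of clusters (illustrative only)."""
    # One-hot encode the labels: shape (n_obs, n_clusters).
    onehot = np.eye(n_clusters, dtype=int)[labels]
    # Entry [a, b] sums adj[i, j] over all i in cluster a and j in cluster b.
    return onehot.T @ adj @ onehot


# Hypothetical toy graph: six observations on a line, linked to their direct neighbors.
n_obs = 6
toy_adj = np.zeros((n_obs, n_obs), dtype=int)
idx = np.arange(n_obs - 1)
toy_adj[idx, idx + 1] = 1
toy_adj[idx + 1, idx] = 1
toy_labels = np.array([0, 0, 0, 1, 1, 1])

interactions = interaction_matrix_sketch(toy_adj, toy_labels, n_clusters=2)
print(interactions)
```

In this sketch each undirected edge inside a cluster is counted twice on the diagonal (once per direction), and the single edge linking the two clusters appears once in each off-diagonal entry.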
We can visualize the results with squidpy.pl.interaction_matrix().
For the dataset, we roughly recapitulate the neighborhood enrichment test, yet we seem to not observe a particularly strong interaction between the Pyramidal_layer and Dentate_gyrus clusters.
One explanation for such a result is that the number of observations of such clusters is low, hence the low number of interactions.
27.4. Co-occurrence across spatial dimensions
Another spatial statistic that can be computed on cell type annotations in spatial coordinates is the co-occurrence score [Palla et al., 2022, Tosti et al., 2021]. It indicates whether clusters co-occur with each other at increasing distances across the tissue. The co-occurrence score is defined as:
\(\frac{p(exp|cond)}{p(exp)}\)
where \(p(exp|cond)\) is the conditional probability of observing a cluster \(exp\) conditioned on the presence of a cluster \(cond\), whereas \(p(exp)\) is the probability of observing \(exp\) in the radius size of interest. The score is computed across increasing radii around each observation (i.e. spots here) in the tissue.
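The ratio above can be illustrated with a small hand-rolled computation. The sketch below is one possible reading of the definition, not Squidpy's implementation, which may differ in details such as how distance intervals are chosen. The function co_occurrence_sketch, the toy coordinates, and the radii are hypothetical. For each radius, it estimates \(p(exp)\) as the fraction of all within-radius neighbors that belong to \(exp\), and \(p(exp|cond)\) as the same fraction restricted to neighbors of \(cond\) observations.

```python
import numpy as np


def co_occurrence_sketch(coords, labels, cond, exp, radii):
    """Ratio p(exp | cond) / p(exp) at several radii (illustrative only)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # an observation is not its own neighbor
    is_cond = labels == cond
    is_exp = labels == exp
    scores = []
    for r in radii:
        within = dist <= r  # within[i, j]: j lies within radius r of i
        n_pairs = within.sum()
        n_pairs_cond = within[is_cond].sum()
        n_exp = within[:, is_exp].sum()
        if n_pairs == 0 or n_pairs_cond == 0 or n_exp == 0:
            scores.append(np.nan)  # ratio undefined at this radius
            continue
        p_exp = n_exp / n_pairs
        p_exp_given_cond = within[is_cond][:, is_exp].sum() / n_pairs_cond
        scores.append(p_exp_given_cond / p_exp)
    return np.array(scores)


# Hypothetical toy tissue: ten observations on a line, two clusters side by side.
toy_coords = np.column_stack([np.arange(10), np.zeros(10)]).astype(float)
toy_labels = np.array([0] * 5 + [1] * 5)

scores = co_occurrence_sketch(toy_coords, toy_labels, cond=0, exp=0, radii=[1.0, 9.0])
print(scores)
```

With these toy inputs the score is above 1 at the small radius, where cluster 0 mostly neighbors itself, and drops below 1 once the radius covers the whole tissue.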
sq.gr.co_occurrence(adata, cluster_key="cluster")
sq.pl.co_occurrence(adata, cluster_key="cluster", clusters="Cortex_1", figsize=(8, 5))
Here, we chose the cluster Cortex_1 to show how, at close distances, it co-occurs with the other Cortex clusters, as expected.
27.5. Key takeaways
Spatial statistics and neighborhood analysis can be a good starting point for analyzing spatial omics data.
Analysis tools like Squidpy provide several spatial statistics that help to understand the neighborhood structure in spatial omics datasets.
27.6. References
A.E. Gelfand, M. Fuentes, P. Guttorp, and P. Diggle. Handbook of Spatial Statistics. Chapman & Hall/CRC Handbooks of Modern Statistical Methods. Taylor & Francis, 2010. ISBN 9781420072877. URL: http://books.google.com/books?id=EFbbcMFZ2mMC.
Giovanni Palla, Hannah Spitzer, Michal Klein, David Fischer, Anna Christina Schaar, Louis Benedikt Kuemmerle, Sergei Rybakov, Ignacio L. Ibarra, Olle Holmberg, Isaac Virshup, Mohammad Lotfollahi, Sabrina Richter, and Fabian J. Theis. Squidpy: a scalable framework for spatial omics analysis. Nature Methods, 19(2):171–178, Feb 2022. URL: https://doi.org/10.1038/s41592-021-01358-2, doi:10.1038/s41592-021-01358-2.
Luca Tosti, Yan Hang, Olivia Debnath, Sebastian Tiesmeyer, Timo Trefzer, Katja Steiger, Foo Wei Ten, Sören Lukassen, Simone Ballke, Anja A. Kühl, Simone Spieckermann, Rita Bottino, Naveed Ishaque, Wilko Weichert, Seung K. Kim, Roland Eils, and Christian Conrad. Single-nucleus and in situ RNA–sequencing reveal cell topographies in the human pancreas. Gastroenterology, 160(4):1330–1344.e11, 2021. URL: https://www.sciencedirect.com/science/article/pii/S0016508520353993, doi:10.1053/j.gastro.2020.11.010.
27.7. Contributors
27.7.2. Reviewers
Anna Schaar
Lukas Heumos