ErasmusMC-Bioinformatics/Spatial-Cookbook

Spatial Cookbook

This repository explains the basics of analyzing Visium CytAssist data generated by PARTS. The workflow is currently under construction. A provisional workflow figure can be found below:

[Figure: provisional workflow]

The core workflow consists of steps that need to be executed for every analysis. Its result is processed and clustered Visium data with differentially expressed genes for each cluster. Three additional workflows can then be followed, depending on your research question: clonal analysis, tumor microenvironment analysis, and analysis of spatial niches.

Recreate environment

This project uses renv to manage R package dependencies. To replicate the environment:

1. Clone the repository

git clone git@github.com:ErasmusMC-Bioinformatics/Spatial-Cookbook.git
cd Spatial-Cookbook

2. Open the project in R or RStudio

Open the .Rproj file in RStudio (if available), or set the working directory to the project root.

3. Restore the R environment

In the R console run:

install.packages("renv")  # if renv is not already installed
renv::restore()

All packages will now be installed with the exact versions listed in the renv.lock file.

Core workflow

0 Space Ranger and alignment

The data provided by PARTS consists of the raw sequencing files, images and data processed through Space Ranger. Before continuing with the downstream analysis it is important to check the fiducial alignment and whether you would like to align a high-resolution H&E image to the data instead of the lower-quality CytAssist image.

0.1 Check fiducial alignment

Space Ranger uses an automated image detection algorithm to correctly align the count data of the spots with the tissue image. However, in some cases (for example when the fiducial markers are obstructed) this automated alignment does not work properly. Therefore, it is important to check the alignment before continuing with downstream analysis. You can check the alignment in the web_summary.html file. Examples of errors are shown below:

[Figure: example of misaligned fiducials]
[Figure: example of wrong orientation of image]
[Figure: example of wrong tissue detection]

If any of these issues occur, perform a manual fiducial alignment in Loupe Browser using the CytAssist image. The process is well described here: https://www.10xgenomics.com/support/software/space-ranger/latest/analysis/inputs/image-fiducial-alignment. This can be combined with the manual alignment of the H&E image, which is described in part 0.2.

0.2 Alignment with H&E (optional, but highly recommended)

While this part is optional, aligning your Visium data with a high-resolution H&E image makes it much easier to annotate your data and to relate expression to the tissue.

When you would like to align your microscope image, you will have to re-run Space Ranger. There are two options for re-running Space Ranger:

When you are happy with the alignment results you can continue to step 1.

1 Preprocessing & quality control

The first step in the downstream analysis is to check the quality of your sample and filter out low-quality spots. An R markdown script for this process can be found here: 01_Preprocessing_and_QC.Rmd. This script uses the Seurat package to visualize multiple quality metrics and to filter spots based on thresholds. It expects raw Visium data in Space Ranger output format. Evaluate the QC plots and choose filtering thresholds that are appropriate for your data. The resulting RDS files are used as input for step 2.
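As a rough illustration of threshold-based spot filtering (this is not the content of 01_Preprocessing_and_QC.Rmd), the sketch below flags spots on three commonly used metrics. The helper names `flag_spots` and `filter_visium`, the metadata column names (Seurat's defaults for an assay named "Spatial"), the `^MT-` mitochondrial pattern (human gene symbols) and all threshold values are assumptions made for this example. Pick thresholds from your own QC plots.

```r
# Hypothetical helper: flag spots that pass user-chosen QC thresholds.
# `meta` is a data frame with one row per spot.
flag_spots <- function(meta, min_counts, min_features, max_percent_mt) {
  meta$nCount_Spatial >= min_counts &
    meta$nFeature_Spatial >= min_features &
    meta$percent_mt <= max_percent_mt
}

# Hypothetical wrapper showing how the flags could be applied to a
# Space Ranger output folder. Seurat is only needed when this is called.
filter_visium <- function(spaceranger_outs, min_counts, min_features,
                          max_percent_mt) {
  obj <- Seurat::Load10X_Spatial(data.dir = spaceranger_outs)
  # Assumes human gene symbols; adjust the pattern for other species.
  obj[["percent_mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")
  keep <- flag_spots(obj[[]], min_counts, min_features, max_percent_mt)
  subset(obj, cells = colnames(obj)[keep])
}

# Toy metadata for three spots (made-up values, for illustration only).
toy <- data.frame(
  nCount_Spatial = c(5000, 120, 8000),
  nFeature_Spatial = c(2500, 90, 3000),
  percent_mt = c(3, 2, 45),
  row.names = c("spot1", "spot2", "spot3")
)
passing <- flag_spots(toy, min_counts = 500, min_features = 250,
                      max_percent_mt = 20)
```

In the toy data only the first spot passes: the second has too few counts and the third has a high mitochondrial fraction.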

2 Normalization

The second step in the downstream analysis is to normalize your data. An R markdown script for this process can be found here: 02_Normalization.Rmd. This script uses the Seurat package to normalize the data with the SCTransform method. It expects filtered Seurat objects in RDS format and saves normalized Seurat objects in RDS format.
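A minimal sketch of this read, normalize, save cycle could look as follows. The function name `normalize_sample`, the file paths and the assay name "Spatial" (the default used by Seurat when loading Visium data) are assumptions; the actual script may use different settings.

```r
# Hypothetical helper: normalize one filtered Seurat object with SCTransform.
# Seurat is only needed when this function is called.
normalize_sample <- function(in_rds, out_rds, assay = "Spatial") {
  obj <- readRDS(in_rds)   # filtered object from step 1
  # SCTransform stores its results in a new assay called "SCT".
  obj <- Seurat::SCTransform(obj, assay = assay, verbose = FALSE)
  saveRDS(obj, out_rds)    # normalized object for step 3
  invisible(obj)
}
```

It would be called once per sample, for example `normalize_sample("sample1_filtered.rds", "sample1_normalized.rds")`, where both file names are placeholders.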

3 Dimensionality reduction & clustering

The third step in the downstream analysis is to reduce the dimensionality of your data and apply clustering. An R markdown script for this process can be found here: 03_Dimensionality_reduction_and_clustering.Rmd. This script uses the Seurat package to reduce dimensionality with Principal Component Analysis (PCA) and to cluster with the Leiden algorithm. It expects SCTransform-normalized Seurat objects in RDS format and saves clustered Seurat objects in RDS format for the resolutions specified by the user.
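The PCA and Leiden steps can be sketched with standard Seurat calls. The function name `cluster_sample`, the number of principal components and the resolution values are assumptions for illustration, not the settings of the script. Depending on your Seurat version, the Leiden option may require an additional backend such as the Python `leidenalg` module.

```r
# Hypothetical helper: PCA followed by Leiden clustering at several resolutions.
# Seurat is only needed when this function is called.
cluster_sample <- function(obj, resolutions = c(0.4, 0.8), n_pcs = 30) {
  obj <- Seurat::RunPCA(obj, assay = "SCT", npcs = n_pcs, verbose = FALSE)
  obj <- Seurat::FindNeighbors(obj, reduction = "pca", dims = seq_len(n_pcs))
  # algorithm = 4 selects the Leiden algorithm in FindClusters.
  # One metadata column is added per resolution.
  obj <- Seurat::FindClusters(obj, resolution = resolutions, algorithm = 4)
  obj
}
```

Comparing the cluster assignments across resolutions on the tissue image is a practical way to choose a resolution that matches the histology.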

4 Gene expression analysis

The fourth step in the downstream analysis is to determine differentially expressed genes per cluster. An R markdown script for this process can be found here: 04_Expression_analysis.Rmd. This script uses the FindMarkers function from Seurat to determine differentially expressed genes. It also detects spatially variable genes.
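One possible way to run FindMarkers for every cluster and collect the results in a single table is sketched below. The helper names, the toy marker tables, the number of features tested and the choice of Moran's I for spatially variable genes are assumptions; the script may use other options.

```r
# Pure helper: stack per-cluster marker tables (named by cluster) into one table.
combine_markers <- function(per_cluster) {
  stacked <- lapply(names(per_cluster), function(cl) {
    m <- per_cluster[[cl]]
    m$gene <- rownames(m)
    m$cluster <- rep(cl, nrow(m))
    m
  })
  out <- do.call(rbind, stacked)
  rownames(out) <- NULL
  out
}

# Hypothetical wrapper: each cluster is compared against all other spots.
# Seurat is only needed when this function is called.
markers_per_cluster <- function(obj, ...) {
  clusters <- levels(Seurat::Idents(obj))
  per_cluster <- lapply(clusters, function(cl) {
    Seurat::FindMarkers(obj, ident.1 = cl, ...)
  })
  names(per_cluster) <- clusters
  combine_markers(per_cluster)
}

# Hypothetical wrapper: spatially variable genes among the top variable features.
spatial_genes <- function(obj, n_features = 1000) {
  feats <- utils::head(Seurat::VariableFeatures(obj), n_features)
  Seurat::FindSpatiallyVariableFeatures(
    obj, assay = "SCT", features = feats, selection.method = "moransi"
  )
}

# Toy marker tables (made-up values) to show the shape of the combined output.
toy <- list(
  "0" = data.frame(avg_log2FC = c(1.2, 0.8), row.names = c("GENE_A", "GENE_B")),
  "1" = data.frame(avg_log2FC = 2.0, row.names = "GENE_C")
)
combined <- combine_markers(toy)
```

The combined table has one row per gene and cluster, which makes it easy to filter or export.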

Clonal analysis workflow

This workflow is relevant when studying cancer subclones.

5 CNV inference

The first step in the clonal analysis workflow is to determine copy number variations from the expression data. An R markdown script for this process can be found here: 05_CNV_inference.Rmd. This script uses the SPATA2 implementation of inferCNV.

6 Phylogenetic tree construction

The second step in the clonal analysis workflow is to construct a phylogenetic tree based on the previously determined copy number variation profiles. An R markdown script for this process can be found here: 06_Phylogenetic_tree_construction.Rmd. This script uses the output of the previous step and the dendextend package to visualize a phylogenetic tree labeled with histological annotations.
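To give an idea of how a tree can be built from CNV profiles and coloured by annotation, the sketch below uses hierarchical clustering on simulated data. Hierarchical clustering with Ward linkage is a stand-in chosen for this example; the tree-building method of the script is not specified here. The CNV values, spot names, annotation labels and colours are all made up.

```r
set.seed(1)

# Toy CNV profiles (made-up values): 10 spots x 20 genes, two simulated clones.
n_genes <- 20
clone_a <- c(rep(0.5, 10), rep(0, 10))    # gain in the first 10 genes
clone_b <- c(rep(0, 10), rep(-0.5, 10))   # loss in the last 10 genes
cnv <- rbind(
  t(replicate(5, clone_a + rnorm(n_genes, sd = 0.05))),
  t(replicate(5, clone_b + rnorm(n_genes, sd = 0.05)))
)
rownames(cnv) <- paste0("spot", 1:10)
annotation <- rep(c("tumor", "stroma"), each = 5)  # hypothetical histology labels

# Hierarchical clustering on Euclidean distances between CNV profiles.
tree <- hclust(dist(cnv), method = "ward.D2")
dend <- as.dendrogram(tree)
clones <- cutree(tree, k = 2)  # cut the tree into two candidate clones

# Colour the leaf labels by annotation if dendextend is installed.
if (requireNamespace("dendextend", quietly = TRUE)) {
  palette <- c(tumor = "firebrick", stroma = "steelblue")
  leaf_order <- order.dendrogram(dend)  # colours must follow the leaf order
  dendextend::labels_colors(dend) <- unname(palette[annotation[leaf_order]])
  plot(dend)
}
```

In this toy example the two simulated clones end up in separate branches, so the label colours split cleanly across the tree.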

7 Deconvolution

Independent of the research question, spot deconvolution is a crucial step in analysing Visium data, as it estimates which cell types are present in each spot and in what abundance. Two methods are available for this step.

A Jupyter notebook is available for deconvolution with the Python-based method TANGRAM: 07_01_B_TANGRAM.ipynb. A YAML file is provided to create a conda environment in which the notebook can be executed: TANGRAM_env.yml. To convert the input RDS files to the AnnData objects required by TANGRAM, a conversion script is available in R: 07_01_A_Seurat_to_AnnData.R. Note that TANGRAM is computationally intensive and will likely need GPUs to run within a reasonable timeframe, depending on the size of your reference scRNA-seq data.

An R markdown script is also available for the R-based deconvolution method CARD: 07_02_Deconvolution_CARD.Rmd. CARD is less computationally demanding than TANGRAM.
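For orientation, a CARD run roughly follows the pattern below, based on the usage shown in the CARD package tutorial. The wrapper name `run_card`, the metadata column names and the filtering values are assumptions and may differ from what 07_02_Deconvolution_CARD.Rmd uses.

```r
# Hypothetical wrapper around a basic CARD deconvolution.
# The CARD package is only needed when this function is called.
#   sc_count:         gene x cell count matrix of the scRNA-seq reference
#   sc_meta:          data frame with cell type and sample columns, one row per cell
#   spatial_count:    gene x spot count matrix of the Visium sample
#   spatial_location: data frame with columns x and y, row names = spot names
run_card <- function(sc_count, sc_meta, spatial_count, spatial_location,
                     ct_varname = "cellType", sample_varname = "sampleInfo") {
  card_obj <- CARD::createCARDObject(
    sc_count = sc_count,
    sc_meta = sc_meta,
    spatial_count = spatial_count,
    spatial_location = spatial_location,
    ct.varname = ct_varname,
    ct.select = unique(sc_meta[[ct_varname]]),
    sample.varname = sample_varname,
    minCountGene = 100,
    minCountSpot = 5
  )
  card_obj <- CARD::CARD_deconvolution(CARD_object = card_obj)
  # Spots x cell types matrix of estimated proportions.
  card_obj@Proportion_CARD
}
```

The returned matrix can be added to the Seurat object as metadata to plot cell type proportions on the tissue.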
