sclet: A Lightweight Toolkit for Single-Cell Data Analysis
Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University
guangchuangyu@gmail.com
2026-06-15
Preface
sclet is a state-aware single-cell and spatial transcriptomics analysis toolkit built around SingleCellExperiment. At one level, it provides a coherent set of high-level analysis verbs for routine workflows such as preprocessing, dimensionality reduction, clustering, annotation, visualization, and downstream interpretation. At another level, it introduces a more opinionated design: analysis results should not be scattered across ad hoc slots and forgotten parameter choices. Instead, they should be recorded, inspectable, and reusable as part of a coherent workflow.
That design principle is what makes sclet different. Throughout this book, you will see that the package is not organized as a thin compatibility layer for another framework. Instead, it treats SingleCellExperiment as the primary analysis object and builds a unified state-aware workflow around it. Active layers, reductions, graphs, identities, and higher-level analysis records are all tracked in a shared contract. This makes routine analysis easier, and it also makes the workflow easier to explain, debug, extend, and eventually hand off to others.
The goal of this book is therefore twofold. First, it serves as a practical guide for users who want to analyze single-cell and spatial transcriptomics data within a SingleCellExperiment-native workflow without writing endless glue code. Second, it explains the package architecture clearly enough that advanced users and contributors can understand how the pieces fit together.
0.1 Mainline Map
This book treats sclet as a map of analysis mainlines rather than a bag of disconnected modules.
- Data ingress and object interoperability: start by reading data, moving between SCE, Seurat, and AnnData, and establishing one clean object of record.
- Core single-cell workflow: run QC, normalization, HVG selection, dimensional reduction, clustering, and identity management.
- Integration and batch correction: align multiple samples into a reusable corrected view for downstream analysis.
- Cell identity and reference mapping: answer “what are these cells?” through annotation, mapping, and confidence inspection.
- Cell fate and dynamic processes: answer “where are these cells going?” with trajectory, velocity, and fate-oriented tools.
- Program, regulon, and mechanistic interpretation: explain states through pathway activity, regulon scoring, and mechanism-oriented summaries.
- Differential analysis and functional interpretation: connect marker discovery, pseudobulk testing, and enrichment into one interpretation chain.
- Cell-cell communication and microenvironment interaction: move from isolated cell states to interaction programs and ligand-receptor logic.
- State priority and perturbation sensitivity: a reserved user-facing route for prioritizing sensitive or rare states, to be filled in as the workflow layer matures.
- Spatial context and niche analysis: connect deconvolution, niche interpretation, and perturbation-aware spatial reasoning.
- Multimodal expansion: a reserved documentation path for future ADT, ATAC, and multiome support.
0.2 How This Book Is Organized
The table of contents follows the same logic.
First come the foundations chapters, which explain the state contract and provenance-aware automation. These are short, but they make the rest of the package much easier to understand.
Next come the mainline chapters, ordered to match the typical analysis story: data ingress -> core workflow -> integration -> reference mapping -> dynamics -> program/regulon -> differential interpretation -> communication -> state priority -> spatial -> multimodal.
Finally, the appendix and specialized extension chapters collect cross-cutting plotting helpers, metacells, Milo, phenotype association, interactive exploration, SVP-related material, and AI-assisted analysis. These chapters still matter, but they are not part of the default reading path.
0.3 Reading Paths
You do not need to read every chapter linearly. A few good entry routes are:
- New users: read the foundations, then follow the first four mainlines in order.
- Annotation-first users: jump from the core workflow to integration and reference mapping.
- Dynamics-focused users: start with the core workflow, then read the trajectory, program/regulon, and spatial chapters together.
- Mechanism-first users: combine the program/regulon chapter with the differential interpretation and communication chapters.
- Contributors: read the foundations first, then inspect the mainline chapter that best matches the feature you want to extend.
0.4 Example Conventions
This book uses two kinds of examples.
First, many chapters contain directly runnable examples based on bundled datasets or dependencies that are available in the GitHub Actions build environment. Whenever practical, these examples are executed during book rendering so that the documentation also serves as a lightweight regression test for the package.
Second, some chapters contain demonstration-only examples that are shown but not executed during book rendering. This is intentional. In practice, these examples fall into one of four categories: they depend on external reference files or databases, on interactive sessions, on AI-specific configuration, or on heavier Python-backed workflows that are still less stable than the R / Bioconductor-native parts of the book.
When an example is not executed during the book build, the surrounding text explains why. The code is still written as real usage code, so readers can copy, adapt, and run it in their own environment when they are ready.