4 Recommended usage

sclet is still centered on SingleCellExperiment (SCE). The recommended way to use the package is to treat the SCE object as the single source of truth and to run the sclet analysis functions on it for common analysis steps.

4.1 For end users

The shortest recommended workflow is to use the standard pipeline entry point:

library(sclet)

# Run the full standard pipeline
sce <- RunStandardPipeline(sce)

# Print a structured summary of the analysis steps
PipelineSummary(sce)

Under the hood, RunStandardPipeline() calls the following steps in order:

sce <- NormalizeData(sce)
sce <- FindVariableFeatures(sce)
sce <- ScaleData(sce)
sce <- RunPCA(sce)
sce <- FindNeighbors(sce)
sce <- FindClusters(sce)
sce <- RunUMAP(sce)

If scrapper is installed, FindVariableFeatures(sce, method = "scrapper") is also supported as an optional HVG backend. sclet still writes the result back into the same HVG state and rowData, so downstream code continues to use VariableFeatures() in the usual way.

Because scrapper is an optional dependency, users need to install it explicitly if they want this backend instead of the default scran path.
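A script can choose the backend defensively rather than failing when scrapper is missing. The following is a minimal sketch, not part of sclet: the helper name find_hvg_optional() is hypothetical, and it relies only on the two FindVariableFeatures() call forms described above plus base R's requireNamespace().

```r
# Hypothetical helper (not part of sclet): use the scrapper backend when the
# package is installed, otherwise fall back to the default HVG path.
find_hvg_optional <- function(sce) {
  if (requireNamespace("scrapper", quietly = TRUE)) {
    sclet::FindVariableFeatures(sce, method = "scrapper")
  } else {
    sclet::FindVariableFeatures(sce)
  }
}

# Usage (assumes `sce` is an already normalized SingleCellExperiment):
# sce  <- find_hvg_optional(sce)
# hvgs <- VariableFeatures(sce)
```

Either branch writes to the same HVG state, so the code that follows does not need to know which backend ran.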

In current sclet, user-facing analysis verbs are standardized on the Run* style. Use RunPCA(), RunMilo(), and RunCellChat() as the recommended public API.

The other important convention is that sclet now distinguishes between:

a physical assay, which is where the matrix is actually stored in the SCE object
a logical layer, which is the user-facing expression view that downstream analysis functions consume

For example, after normalization and scaling, a typical object may expose:

counts
logcounts
scaled

and DefaultLayer(object) tells you which one is currently recommended for downstream use.
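As a rough illustration of inspecting layers, the sketch below collects the available layers and the current default into one list. The helper name describe_layers() is hypothetical, and the sketch assumes that Layers(), like DefaultLayer(), takes the SCE object as its first argument.

```r
# Hypothetical helper (not part of sclet): summarize the expression layers.
# Assumption: Layers() and DefaultLayer() both accept the SCE object as
# their first argument.
describe_layers <- function(sce) {
  list(
    available = sclet::Layers(sce),       # logical layers exposed by the object
    default   = sclet::DefaultLayer(sce)  # layer recommended for downstream use
  )
}

# Usage (for example after NormalizeData() and ScaleData()):
# str(describe_layers(sce))
```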

For interactive exploration, it is better to rely on the state-aware accessors instead of reaching directly into object internals:

DefaultAssay(), DefaultLayer(), DefaultReduction(), DefaultGraph(), and ActiveIdent() expose the current active view.
Layers() and LayerData() expose the available expression layers and their matrices.
CommandLog() shows which analysis functions have been applied.
PipelineSummary() gives a clear textual report of the executed pipeline steps.
Status() gives a quick user-facing snapshot of the current object state.
get_hvg(), get_graph(), get_milo(), get_trajectory(), get_cellchat(), get_integration(), get_annotation(), and get_mapping() expose structured analysis records.
get_analysis_context() gives a lightweight summary of the current active view plus the active integration/annotation/mapping and other analysis records.
has_*() helpers are the preferred way to check whether an analysis result is available.

You can also use the integrated plotting functions that automatically consume the active state:

# Plot the active dimensional reduction colored by active identity
CellDimPlot(sce)

# Plot feature expression on the active dimensional reduction
FeatureDimPlot(sce, features = c("GeneA", "GeneB"))

# Plot a grouped heatmap of features across cell identities
GroupHeatmap(sce, features = c("GeneA", "GeneB", "GeneC"))

# Plot cell statistics (e.g., proportion of clusters across conditions)
CellStatPlot(sce, split.by = "Condition")

sclet also supports annotation and reference mapping:

# Full annotation via SingleR
sce <- RunSingleR(sce, ref = ref_dataset, labels = ref_dataset$CellType)

# Lightweight label transfer via KNN in shared feature space
sce <- RunKNNPredict(sce, ref = ref_dataset, labels = "CellType", k = 5)

# Visualize query and reference cells in the same reduction space
ProjectionPlot(query = sce, ref = ref_dataset, reduction = "UMAP")

This means the recommended mental model is simple: keep one SCE object, run the sclet analysis functions in sequence, and inspect state through the exported accessors instead of through ad hoc metadata fields.

In practical terms:

use RunPCA(layer = "scaled") if you want PCA to be built explicitly from the scaled layer
use FindMarkers() without an explicit assay/layer if you want it to follow the current layer logic, with a safe fallback away from the scaled layer
use RunSingleR() and RunCellChat() without manual matrix extraction when the current DefaultLayer() already reflects the desired biological view
after BatchRemover(), use get_integration() and DefaultLayer() to understand which corrected view is currently active
after RunSingleR(), use get_annotation() to inspect the annotation record, and get_mapping() to inspect the corresponding reference-mapping record
use Status() when you want the shortest high-level snapshot of the current object state before drilling into specific accessors
use get_analysis_context() when you want one compact snapshot of the current layer/reduction/graph/ident combination together with the active analysis records
after RunSuperCell(), use get_supercell() to inspect the returned metacell object’s aggregation record and parent-child provenance
after RunCellChat(), use get_cellchat() to inspect the active communication record, or pass id = ... when you keep multiple CellChat runs
after RunMilo(), use get_milo() to inspect the active DA record, or pass id = ... when you keep multiple Milo runs
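The "active record, or pass id = ..." pattern described for get_cellchat() and get_milo() can be wrapped in one small helper. This is a sketch under stated assumptions: get_record() is a hypothetical name, the run name "run_a" is invented for illustration, and the getters are assumed to take the SCE object first and an optional id argument.

```r
# Hypothetical helper (not part of sclet): read the active record when no id
# is given, otherwise read the named record.
# Assumption: `getter` has the form getter(sce) / getter(sce, id = ...).
get_record <- function(sce, getter, id = NULL) {
  if (is.null(id)) {
    getter(sce)
  } else {
    getter(sce, id = id)
  }
}

# Usage:
# milo_active <- get_record(sce, sclet::get_milo)
# chat_run_a  <- get_record(sce, sclet::get_cellchat, id = "run_a")
```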

When analysis is performed on top of a corrected layer, downstream states now carry that provenance forward. In other words, PCA, graph construction, clustering, and UMAP can all record that they depend on the active integration state rather than behaving like disconnected one-off steps.

For multi-sample workflows, the provenance chain can now start even earlier: sce_merge() stores a merged_inputs integration record, and the subsequent BatchRemover() state can point back to that merge step.

4.2 For pipeline authors

If you are writing a reusable script or package around sclet, prefer patterns that are stable under future extension:

Pass the SCE object through each step and return the updated object explicitly.
Prefer exported entry points such as RunPCA() and RunUMAP() over calling lower-level functions directly when sclet already provides them.
Read results through accessors (get_*(), has_*(), Default*(), ActiveIdent(), Layers(), LayerData()) instead of assuming a fixed metadata layout.
If old metadata still needs to be read while migrating historical objects, keep that logic centralized in shared internal helpers instead of re-implementing direct metadata(...) reads in multiple accessors.
When using RunCellChat(), request return = "sce" or return = "both" if downstream code needs the updated SCE with recorded analysis state.
When a module can produce multiple records of the same type, give each run a stable name and read it back through get_*(id = ...) instead of assuming only one result exists.
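The first two recommendations can be sketched as a small step runner that threads the SCE object through exported entry points and returns it explicitly. Both run_steps() and standard_steps() are hypothetical names, not sclet functions; the step list mirrors the standard pipeline shown in 4.1, with RunPCA() pinned to the scaled layer as an illustrative choice.

```r
# Hypothetical helper (not part of sclet): apply each step to the object in
# order and return the updated object explicitly.
run_steps <- function(sce, steps) {
  for (step in steps) {
    sce <- step(sce)
  }
  sce
}

# Steps mirroring the standard pipeline. The list is built inside a function,
# so sclet is only required when standard_steps() is actually called.
standard_steps <- function() {
  list(
    sclet::NormalizeData,
    sclet::FindVariableFeatures,
    sclet::ScaleData,
    function(sce) sclet::RunPCA(sce, layer = "scaled"),
    sclet::FindNeighbors,
    sclet::FindClusters,
    sclet::RunUMAP
  )
}

# Usage:
# sce <- run_steps(sce, standard_steps())
```

Keeping the steps in a plain list makes it easy to insert, drop, or reorder a step without touching the runner itself.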