Advanced single-cell data analysis: constructing a maturation trajectory | Identifying a maturation trajectory

Identifying a maturation trajectory.

To assign each cell a maturation score that is proportional to its developmental progress, we first performed dimensionality reduction as described above using all genes that were detected in at least 2% of the cells (8,014 genes). This resulted in four significant dimensions. We then fit a principal curve (R package princurve, smoother = ‘lowess’, f = 1/3) through the data. The maturation score of a cell is then the arc length from the beginning of the curve to the point at which the cell projects onto the curve.
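The arc-length definition of the maturation score can be illustrated with a minimal sketch. It assumes the fitted principal curve is already available as an ordered list of points (a polyline) and does not reproduce the princurve fit itself; the function name `project_to_polyline` and the toy coordinates are hypothetical.

```python
import math

def project_to_polyline(point, curve):
    """Arc length from curve[0] to the point on the polyline closest to `point`."""
    best_dist2 = math.inf
    best_arc = 0.0
    arc_start = 0.0  # arc length accumulated up to the start of the current segment
    for a, b in zip(curve, curve[1:]):
        seg = [bj - aj for aj, bj in zip(a, b)]
        seg_len2 = sum(s * s for s in seg)
        if seg_len2 == 0.0:
            continue  # skip degenerate (zero-length) segments
        seg_len = math.sqrt(seg_len2)
        # Position of the orthogonal projection along the segment, clamped to [0, 1]
        t = sum((pj - aj) * sj for pj, aj, sj in zip(point, a, seg)) / seg_len2
        t = max(0.0, min(1.0, t))
        closest = [aj + t * sj for aj, sj in zip(a, seg)]
        dist2 = sum((pj - cj) ** 2 for pj, cj in zip(point, closest))
        if dist2 < best_dist2:
            best_dist2 = dist2
            best_arc = arc_start + t * seg_len
        arc_start += seg_len
    return best_arc

# Toy example: an L-shaped curve in two dimensions and three cells (hypothetical data)
curve = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
cells = [(0.5, 0.2), (1.3, 0.5), (0.9, 1.4)]
scores = [project_to_polyline(c, curve) for c in cells]
```

In the analysis described above the curve lives in the four significant dimensions rather than two; the projection logic is the same in any dimension.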

The resulting curve is directionless, so we assign the ‘beginning’ of the curve so that the expression of Nes is negatively correlated with maturation. Nes is a known ventricular zone marker and therefore should only be highly expressed early in the trajectory. Maturation scores are normalized to the interval [0, 1]. In an independent analysis, we also used Monocle2 to order cells along a pseudo-time. We used Monocle version 2.3.6 with expression response variable set to negative binomial. We estimated size factors and dispersion using the default functions.
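The orientation and normalization of the principal-curve scores described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: it uses Pearson correlation (the text does not say which correlation measure was used), assumes the arc lengths are not all identical, and the names `orient_and_normalize`, `arc` and `nes` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def orient_and_normalize(arc_lengths, marker_expr):
    """Rescale arc lengths to [0, 1] and flip them if the early marker
    correlates positively with the score."""
    lo, hi = min(arc_lengths), max(arc_lengths)
    scores = [(s - lo) / (hi - lo) for s in arc_lengths]
    if pearson(scores, marker_expr) > 0:
        # The marker should be high early, so reverse the direction of the curve
        scores = [1.0 - s for s in scores]
    return scores

# Toy data: arc lengths measured from the "wrong" end, and Nes expression per cell
arc = [4.0, 3.0, 1.0, 0.0]
nes = [5.0, 4.0, 1.0, 0.0]
maturation = orient_and_normalize(arc, nes)
```

After the call, cells with high Nes expression receive scores near 0 and cells with low Nes expression receive scores near 1.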

For ordering cells, we reduced the set of genes based on results of the Monocle dispersionTable function, and only considered 718 genes with a mean expression above 0.01 and an empirical dispersion at least twice as large as the fitted dispersion. Dimensionality reduction was carried out using the default method (DDRTree).
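The gene filter described above amounts to two threshold checks per gene. The sketch below assumes the dispersion table has been exported as a list of records whose field names follow the columns of Monocle's dispersionTable output (`gene_id`, `mean_expression`, `dispersion_fit`, `dispersion_empirical`). The comparison operator for mean expression is an assumption, because the symbol is missing from the source text, and the gene names and values are made up for illustration.

```python
def select_ordering_genes(disp_table, min_mean=0.01, min_ratio=2.0):
    """Keep genes that are expressed above `min_mean` on average and whose
    empirical dispersion is at least `min_ratio` times the fitted dispersion."""
    return [
        row["gene_id"]
        for row in disp_table
        if row["mean_expression"] > min_mean
        and row["dispersion_empirical"] >= min_ratio * row["dispersion_fit"]
    ]

# Hypothetical dispersion table with three genes
disp_table = [
    {"gene_id": "geneA", "mean_expression": 0.50,
     "dispersion_fit": 1.0, "dispersion_empirical": 2.5},   # passes both checks
    {"gene_id": "geneB", "mean_expression": 0.005,
     "dispersion_fit": 1.0, "dispersion_empirical": 4.0},   # mean too low
    {"gene_id": "geneC", "mean_expression": 0.30,
     "dispersion_fit": 1.0, "dispersion_empirical": 1.5},   # not overdispersed enough
]
ordering_genes = select_ordering_genes(disp_table)
```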

Defining mitotic and postmitotic populations.

We observed a sharp transition point along the maturation trajectory at which cells uniformly transitioned into a postmitotic state, corresponding to the loss of proliferation potential and exit from the cell cycle (Fig. 1f, Extended Data Fig. 1).

We therefore subdivided the maturation trajectory into a mitotic and postmitotic phase to facilitate downstream analyses. We defined cells with a high phase-specific enrichment score (score > 2, see section ‘Removal of cell cycle effect’) as being in the S or the G2/M phase.

We then fitted a smooth curve (loess, span = 0.33, degree = 2) to the number of cells in the S and G2/M phases as a function of maturation score. The point where this curve falls below half the global average marks the dividing threshold (Fig. 1f).
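The thresholding logic above can be sketched in a few lines. Two simplifications are assumed and should not be read as the authors' method: a centered moving average stands in for the loess fit, and the smoothed quantity is a per-cell 0/1 indicator of being in S or G2/M (score > 2), so that the global average is the overall fraction of cycling cells. All names and values are hypothetical.

```python
def smooth(values, half_window):
    """Centered moving average; a crude stand-in for a loess fit."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def dividing_threshold(maturation, is_cycling, half_window=1):
    """Smallest maturation score at which the smoothed cycling indicator
    falls below half of its global average, or None if it never does."""
    order = sorted(range(len(maturation)), key=lambda i: maturation[i])
    x = [maturation[i] for i in order]
    y = [1.0 if is_cycling[i] else 0.0 for i in order]
    fitted = smooth(y, half_window)
    cutoff = 0.5 * sum(y) / len(y)
    for xi, fi in zip(x, fitted):
        if fi < cutoff:
            return xi
    return None

# Toy data: ten cells along the trajectory with phase-specific enrichment scores
maturation = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
s_score = [3.1, 0.5, 2.4, 0.2, 0.1, 0.3, 0.0, 0.2, 0.1, 0.0]
g2m_score = [0.4, 2.8, 0.3, 2.2, 0.5, 0.1, 0.2, 0.0, 0.3, 0.1]

# A cell counts as cycling if either phase score exceeds 2
is_cycling = [max(s, g) > 2 for s, g in zip(s_score, g2m_score)]
threshold = dividing_threshold(maturation, is_cycling)
```

Cells with a maturation score below `threshold` would then be treated as mitotic and the rest as postmitotic.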