Stability Selection and Consensus Clustering in R: The R Package sharp
Barbara Bodinier et al.
Abstract
The R package sharp (Stability-enHanced Approaches using Resampling Procedures) provides an integrated framework for stability-enhanced variable selection, graphical modeling and clustering. In stability selection, a feature selection algorithm is combined with a resampling technique to estimate feature selection probabilities. Features with selection proportions above a threshold are considered stably selected. Similarly, in consensus clustering, a clustering algorithm is applied to multiple subsamples of items to compute co-membership proportions. The consensus clusters are then obtained by clustering the items using the co-membership proportions as a measure of similarity. We calibrate the hyper-parameters of stability selection (or consensus clustering) jointly by maximizing a consensus score calculated under the null hypothesis of equiprobability of selection (or co-membership), which characterizes instability. The package offers flexibility in the modeling, includes diagnostic and visualization tools, and allows for parallelization.
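The resampling step of stability selection described above can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the sharp package. The function names (`select_top_k`, `selection_proportions`, `stable_set`), the top-k marginal-correlation selector, the simulated data, and the fixed threshold of 0.9 are all assumptions made for illustration. In particular, the threshold is fixed by hand here, whereas the abstract describes calibrating the hyper-parameters by maximizing a consensus score.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_top_k(X, y, k):
    """Hypothetical base selector: keep the k features most correlated with y."""
    p = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def selection_proportions(X, y, k, n_resamples=100, fraction=0.5, seed=0):
    """Apply the selector to random subsamples and return per-feature selection proportions."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    size = max(2, int(fraction * n))
    counts = [0] * p
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)  # subsample without replacement
        chosen = select_top_k([X[i] for i in idx], [y[i] for i in idx], k)
        for j in chosen:
            counts[j] += 1
    return [c / n_resamples for c in counts]


def stable_set(proportions, threshold):
    """Indices of features whose selection proportion reaches the threshold."""
    return [j for j, pr in enumerate(proportions) if pr >= threshold]


# Simulated data (an assumption for illustration): only features 0 and 1 drive y.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(100)]
y = [2 * row[0] - 2 * row[1] + data_rng.gauss(0, 1) for row in X]

proportions = selection_proportions(X, y, k=2)
stable = stable_set(proportions, threshold=0.9)
print(proportions)
print(stable)
```

Because the selector keeps exactly `k` features on every subsample, the selection proportions sum to `k`. Consensus clustering follows the same pattern, with pairwise co-membership counts of items taking the place of per-feature selection counts.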