
Statistics for high-dimensional data:
P-values and Stability Selection

Peter Bühlmann and Sara van de Geer
Seminar für Statistik, ETH Zürich
May 2012


P-values for high-dimensional linear models

Y = Xβ^0 + ε

goal: statistical hypothesis testing
H_{0,j}: β^0_j = 0, or H_{0,G}: β^0_j = 0 for all j ∈ G

background: if we could handle the asymptotic distribution of the Lasso β̂(λ) under the null hypothesis
❀ could construct p-values

this is very difficult!
the asymptotic distribution of β̂ has some point mass at zero, ...
Knight and Fu (2000) for p < ∞ and n → ∞;
Montanari (2010, 2012) ... for random design with i.i.d. columns

not practical at the moment


Variable selection

Example: motif regression
for finding HIF1α transcription factor binding sites in DNA sequences
Müller, Meier, PB & Ricci

Y_i ∈ R: univariate response measuring the binding intensity of HIF1α on coarse DNA segment i (from CHIP-chip experiments)

X_i = (X_i^(1), ..., X_i^(p)) ∈ R^p:
X_i^(j) = abundance score of candidate motif j in DNA segment i
(using sequence data and computational biology algorithms, e.g. MDSCAN)


question: relation between the binding intensity Y and the abundance of short candidate motifs?
❀ a linear model is often reasonable
“motif regression” (Conlon, X.S. Liu, Lieb & J.S. Liu, 2003)

Y = Xβ + ε, n = 287, p = 195

goal: variable selection
❀ find the relevant motifs among the p = 195 candidates


Motif regression
for finding HIF1α transcription factor binding sites in DNA sequences

Y_i ∈ R: univariate response measuring binding intensity on coarse DNA segment i (from CHIP-chip experiments)
X_i^(j) = abundance score of candidate motif j in DNA segment i

variable selection in the linear model
Y_i = β_0 + Σ_{j=1}^p β_j X_i^(j) + ε_i, i = 1, ..., n = 287, p = 195

❀ Lasso selects 26 covariates and R² ≈ 50%
i.e. 26 interesting candidate motifs
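This selection step can be reproduced in outline with standard tools. Below is a minimal sketch at the same dimensions (n = 287, p = 195); the data are synthetic stand-ins since the HIF1α data set is not part of these slides, and the sparsity level s0 and coefficient values are illustrative assumptions.

```python
# Lasso variable selection at motif-regression dimensions (sketch).
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, s0 = 287, 195, 6                    # s0 is an assumed sparsity level
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s0] = 1.0                           # a few "relevant motifs"
y = X @ beta + rng.standard_normal(n)

lasso = LassoCV(cv=10).fit(X, y)          # lambda chosen by cross-validation
S_hat = np.flatnonzero(lasso.coef_ != 0)  # the selected set S_hat
print(f"Lasso selects {S_hat.size} covariates, R^2 = {lasso.score(X, y):.2f}")
```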


motif regression: estimated coefficients β̂(λ̂_CV), original data
[figure: estimated coefficients (0.00-0.20) plotted against the variable index (0-200)]

which variables in Ŝ are false positives?
(p-values would be very useful!)


(Multi) sample splitting
an early but sub-ideal proposal:
◮ select variables on the first half of the sample ❀ Ŝ
◮ compute OLS for the variables in Ŝ on the second half of the sample
❀ p-values P_j based on the Gaussian linear model:
if j ∈ Ŝ: P_j from the t-statistic
if j ∉ Ŝ: P_j = 1 (i.e. if β̂_j = 0)

Bonferroni-“style” corrected p-values:
P_corr,j = min(P_j · |Ŝ|, 1)
❀ (conservative) familywise error control with P_corr,j (j = 1, ..., p)
(Wasserman & Roeder, 2008)
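A hedged sketch of this single-split procedure: Lasso selection on one half, OLS t-test p-values on the other half, then the Bonferroni-style correction by |Ŝ|. The helper name `split_pvalues` and the use of LassoCV for the selection step are my own illustrative choices, not prescribed by the slides.

```python
# Single sample splitting (Wasserman & Roeder, 2008) -- sketch.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LassoCV

def split_pvalues(X, y, rng):
    n, p = X.shape
    idx = rng.permutation(n)
    i1, i2 = idx[: n // 2], idx[n // 2:]
    # selection on the first half
    S = np.flatnonzero(LassoCV(cv=5).fit(X[i1], y[i1]).coef_ != 0)
    pvals = np.ones(p)                       # P_j = 1 for j not in S_hat
    if 0 < S.size < i2.size - 1:             # enough df for OLS on half 2
        ols = sm.OLS(y[i2], sm.add_constant(X[np.ix_(i2, S)])).fit()
        pvals[S] = ols.pvalues[1:]           # t-test p-values, skip intercept
    # Bonferroni-"style" correction by |S_hat|
    return np.minimum(pvals * max(S.size, 1), 1.0)
```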


this is a “P-value lottery”
motif regression example: p = 195, n = 287
adjusted p-values for the same important variable over different random sample splits
[figure: histogram of the adjusted p-values, spread over the whole range (0, 1)]

❀ improve by aggregating over many sample splits
(which improves “efficiency” as well)


Sample splitting multiple times
run the sample-splitting procedure B times and do a non-trivial aggregation of the p-values

p-values: P_j^(1), ..., P_j^(B)
goal: aggregate P_j^(1), ..., P_j^(B) into a single p-value P_final,j
problem: dependence among P_j^(1), ..., P_j^(B)


define
Q^(j)(γ) = q_γ({P_corr,b^(j) / γ; b = 1, ..., B}),
where q_γ(·) is the empirical γ-quantile function

e.g. γ = 1/2: aggregation with the median
❀ (conservative) familywise error control for any fixed value of γ

what is the best γ? it really matters
❀ can “search” for it and correct with an additional factor


“adaptively” aggregated p-value:
P_final^(j) = (1 − log(γ_min)) · inf_{γ ∈ (γ_min, 1)} Q^(j)(γ),
Q^(j)(γ) = q_γ({P_corr,b^(j) / γ; b = 1, ..., B})

❀ reject H_0^(j): β_j = 0 ⇐⇒ P_final^(j) ≤ α

P_final^(j) roughly equals a raw p-value based on sample size ⌊n/2⌋, multiplied by a factor ≈ (5 − 10) · |Ŝ|
(which is to be compared with p)
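A sketch of this aggregation step, assuming the B×p matrix of split-wise corrected p-values has already been computed (e.g. with the splitting sketch above). Approximating the infimum over γ ∈ (γ_min, 1) by a finite grid is an implementation choice of mine, not part of the slides.

```python
# Adaptive quantile aggregation of p-values over B sample splits (sketch).
import numpy as np

def aggregate_pvalues(P, gamma_min=0.05):
    """P: array of shape (B, p); row b holds the corrected p-values P_corr,b."""
    gammas = np.linspace(gamma_min, 1.0, 20)  # grid stand-in for (gamma_min, 1)
    # Q^(j)(gamma): empirical gamma-quantile of {P_corr,b / gamma}, capped at 1
    Q = np.stack([np.minimum(np.quantile(P / g, g, axis=0), 1.0)
                  for g in gammas])
    # pay the (1 - log(gamma_min)) factor for searching over gamma
    return np.minimum((1 - np.log(gamma_min)) * Q.min(axis=0), 1.0)
```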


for the familywise error rate (FWER) = P[at least one false positive selection]

Theorem
Consider the Gaussian linear model (with fixed design) and assume:
◮ lim_{n→∞} P[Ŝ ⊇ S_0] = 1 (screening property)
◮ |Ŝ| < ⌊n/2⌋ (sparsity property)
Then: strong control for either the familywise error rate or the false discovery rate


motif regression example
p = 195, n = 287
[figure: estimated coefficients (0.00-0.20) against the variable index, with two variables marked]

◦: variable/motif with FWER-adjusted p-value 0.006
◦: p-value clearly larger than 0.05
(this variable corresponds to a known true motif)


discussion: multi sample splitting
◮ assumes P[Ŝ ⊇ S_0] → 1
❀ requires the beta-min condition
such an assumption should be avoided for hypothesis testing (because whether β_j^0 is smallish or sufficiently large is the essence of the question in testing)
◮ necessarily requires design conditions; but this is unavoidable


Stability Selection (Meinshausen & PB, 2010)
which allows one to go way beyond linear models

selection of “features” from the set {1, ..., p}, e.g.:
◮ variable selection in regression or classification
◮ edge selection in a graph
◮ membership in a cluster
◮ ...

selection procedure: Ŝ^λ ⊆ {1, ..., p}, λ a tuning parameter
prime example: the Lasso for selecting variables in a linear model


subsampling:
◮ draw a subsample of size ⌊n/2⌋ without replacement, denoted by I* ⊆ {1, ..., n}, |I*| = ⌊n/2⌋
◮ run the selection algorithm Ŝ^λ(I*) on I*
◮ do these steps many times and compute the relative selection frequencies
Π̂_j^λ = P*(j ∈ Ŝ^λ(I*)), j = 1, ..., p
(P* is w.r.t. subsampling)

one could also use bootstrap sampling with replacement...
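A minimal sketch of this subsampling loop with the Lasso as the selector; the regularization value `lam` and B = 100 subsamples are assumptions for illustration, not values from the slides.

```python
# Relative selection frequencies Pi_hat_j via subsampling (sketch).
import numpy as np
from sklearn.linear_model import Lasso

def selection_frequencies(X, y, lam, B=100, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(B):
        I = rng.choice(n, size=n // 2, replace=False)  # subsample I*
        coef = Lasso(alpha=lam).fit(X[I], y[I]).coef_  # run S_hat^lambda on I*
        counts += coef != 0                            # was j selected?
    return counts / B                                  # Pi_hat_j, j = 1..p
```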


Stability selection
Ŝ_stable = {j; Π̂_j^λ ≥ π_thr}
choice of π_thr ❀ see later


if we consider many regularization parameters: {Ŝ^λ; λ ∈ Λ}
(Λ can be discrete or continuous)
Ŝ_stable = {j; max_{λ∈Λ} Π̂_j^λ ≥ π_thr}
see also Bach (2009) for a related proposal
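Combining the two previous definitions, the grid version is a one-liner on top of the frequency matrix; `freqs_per_lambda` is an assumed precomputed array (one row per λ, e.g. from repeated calls to the subsampling sketch above).

```python
# Stable set over a grid Lambda of regularization values (sketch).
import numpy as np

def stable_set(freqs_per_lambda, pi_thr):
    """freqs_per_lambda: array (len(Lambda), p) of Pi_hat_j, one row per lambda."""
    return np.flatnonzero(freqs_per_lambda.max(axis=0) >= pi_thr)
```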


Choice of the threshold π_thr ∈ (0, 1)?


How to choose the threshold π_thr?

denote by V = |S_0^c ∩ Ŝ_stable| = number of false positives
consider a selection procedure which selects q variables
(e.g. the top 50 variables when running the Lasso over many λ's)

Theorem (Meinshausen & PB, 2010)
main assumption: exchangeability condition
in addition: Ŝ is better than “random guessing”
Then:
E[V] ≤ q² / ((2π_thr − 1) p)
i.e. finite-sample control, even if p ≫ n

❀ choose the threshold π_thr to control e.g. E[V] ≤ 1 or P[V > 0] ≤ E[V] ≤ α
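Solving the bound E[V] ≤ q²/((2π_thr − 1) p) ≤ α for the threshold gives π_thr = (1 + q²/(αp))/2, as in the small worked example below; the function name and the choice q = 10 are illustrative.

```python
# Threshold pi_thr implied by the bound E[V] <= q^2 / ((2*pi_thr - 1) * p).
def threshold_for(q, p, alpha=1.0):
    pi_thr = 0.5 * (1 + q**2 / (alpha * p))
    if not 0.5 < pi_thr < 1:
        raise ValueError("bound not attainable: reduce q or relax alpha")
    return pi_thr

# motif-regression dimensions: p = 195, q = 10 per subsample, E[V] <= 1
print(threshold_for(q=10, p=195))   # ~0.756
```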


note the generality of the Theorem...
◮ it works for any method which is better than “random guessing”
◮ it works not only for regression but also for “any” discrete structure estimation problem (whenever there is an include/exclude decision)
❀ variable selection, graphical modeling, clustering, ...

and hence there must be a fairly strong condition...
Exchangeability condition:
the distribution of {I_{j ∈ Ŝ^λ}; j ∈ S_0^c} is exchangeable
note: this is only a requirement on the noise variables


Many simulations!
[table/figure: simulation scenarios (A)-(G) with varying p (100-1000), n (100-1000), sparsity s (4-50) and snr (0.5, 2), reporting the number of exchangeability violations and the maximal correlation per scenario, together with panels of P(first 0.1s correct) and P(first 0.4s correct)]

even if the exchangeability condition fails:
(conservative) error control for E[V] holds


Motif regression (n = 287, p = 195)
◮ two stable motifs with stability selection (w.r.t. E[V] ≤ 1)
◮ multi sample splitting finds only the one motif which is not biologically validated


Riboflavin data (n = 71, p = 4088): control for E[V] ≤ 1
[figure: stability selection with E[V] ≤ 1]


Graphical modeling using the GLasso
(Rothman, Bickel, Levina & Zhu, 2008; Friedman, Hastie & Tibshirani, 2008)

infer the conditional independence graph using ℓ1-penalization,
i.e. infer the zeroes of Σ^{-1} from X_1, ..., X_n i.i.d. ∼ N_p(0, Σ):
Σ^{-1}_{jk} ≠ 0 ⇔ X^(j) ⊥̸ X^(k) | X^({1,...,p}\{j,k}) ⇔ edge j − k

gene expression data ⇒ zero-pattern of Σ^{-1}?
[figures: estimated graphs for the Graphical Lasso (λ = 0.46, 0.448, 0.436, 0.424) and for Stability Selection]
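A hedged sketch of stability selection for edge selection, using scikit-learn's GraphicalLasso as the ℓ1-penalized estimator of Σ^{-1}; the tolerance 1e-8 for calling an entry nonzero and B = 50 subsamples are my own illustrative choices.

```python
# Edge selection frequencies for the graphical lasso via subsampling (sketch).
import numpy as np
from sklearn.covariance import GraphicalLasso

def edge_frequencies(X, alpha, B=50, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros((p, p))
    for _ in range(B):
        I = rng.choice(n, size=n // 2, replace=False)    # subsample I*
        prec = GraphicalLasso(alpha=alpha).fit(X[I]).precision_
        counts += np.abs(prec) > 1e-8                    # edge j-k present?
    freqs = counts / B                                   # Pi_hat per entry
    np.fill_diagonal(freqs, 0.0)                         # diagonal is no edge
    return freqs
```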


sub-problem of the riboflavin data
p = 160, n = 115
stability selection with E[V] ≤ 5
[figures: estimated graphs when varying the regularization parameter λ in the ℓ1-penalization; Graphical Lasso (λ = 0.46, 0.448, 0.436, 0.424, 0.412, 0.4) vs. Stability Selection]

with stability selection: the choice of the initial λ tuning parameter does not matter much (as proved by our theory); one only needs to fix the finite-sample control


permutation of variables: varying the regularization parameter for the null case
[figures: estimated graphs for the Graphical Lasso (λ = 0.065, 0.063, 0.061, 0.059, 0.057, 0.055) vs. Stability Selection]

with stability selection: the number of false positives is indeed controlled (as proved by our theory)
and here: the exchangeability condition holds


Predicting causal effects from observational data

Causal gene ranking:

     Gene        summary   median   expression   error rate   name
                 rank      effect                (PCER)
 1   AT2G45660    1        0.60     5.07         0.0017       AGL20 (SOC1)
 2   AT4G24010    2        0.61     5.69         0.0021       ATCSLG1
 3   AT1G15520    2        0.58     5.42         0.0017       PDR12
 4   AT3G02920    5        0.58     7.44         0.0024       replication protein-related
 5   AT5G43610    5        0.41     4.98         0.0101       ATSUC6
 6   AT4G00650    7        0.48     5.56         0.0020       FRI
 7   AT1G24070    8        0.57     6.13         0.0026       ATCSLA10
 8   AT1G19940    9        0.53     5.13         0.0019       AtGH9B5
 9   AT3G61170    9        0.51     5.12         0.0034       protein coding
10   AT1G32375   10        0.54     5.21         0.0031       protein coding
11   AT2G15320   10        0.50     5.57         0.0027       protein coding
12   AT2G28120   10        0.49     6.45         0.0026       protein coding
13   AT2G16510   13        0.50     10.7         0.0023       AVAP5
14   AT3G14630   13        0.48     4.87         0.0039       CYP72A9
15   AT1G11800   15        0.51     6.97         0.0028       protein coding
16   AT5G44800   16        0.32     6.55         0.0704       CHR4
17   AT3G50660   17        0.40     7.60         0.0059       DWF4
18   AT5G10140   19        0.30     10.3         0.0064       FLC
19   AT1G24110   20        0.49     4.66         0.0059       peroxidase, putative
20   AT1G27030   20        0.45     10.1         0.0059       unknown protein

• biological validation by gene knockout experiments in progress
❀ see Friday...


The Lasso and its stability path, and why Stability Selection works so well

riboflavin example: n = 71, p = 4099
sparsity s_0 “=” 6 (6 “relevant” genes; all other variables permuted)
[figures: Lasso regularization path vs. Stability selection path]

with stability selection: the 4-6 “true” variables stick out much more clearly from the noise covariates


stability selection cannot be reproduced by simply selecting the right penalty with the Lasso
stability selection provides a fundamentally new solution


Leo Breiman
and providing error control in terms of E[V] (❀ conservative FWER control)


providing error control:
E[V] ≤ q² / ((2π_thr − 1) p)
◮ super-simple!
◮ over-simplistic?


Comparative conclusions

without conditions on the “design matrix” or “covariance structure” (etc.):
one cannot assign strength or uncertainty to a variable or a structural component (e.g. an edge in a graph) in a high-dimensional setting
and these conditions are typically uncheckable...


three “easy to use” methods in comparison:

method                   assumptions                 applicability
multi sample splitting   compatibility condition,    GLMs
                         beta-min condition
stability selection      exchangeability condition   “all”

the fewer assumptions, the more trustworthy the statistical inference!
“yet”, given the necessity of often uncheckable “design conditions”:
confirmatory high-dimensional inference remains challenging
