09.07.2015 Views

Methodical Monte Carlo Experiments


What is this Presentation?
- Vision
- Issues
Descriptive K.U.


What are we talking about?
- Many data sets are generated
- Procedures are applied to data
- Results of procedures are compared


My Vision Thing
DO NOT:
- Think of the MC experiment as "One Giant Sequential Script" of commands
- Generate a massive block of data that needs to be saved and re-loaded
DO:
- Create a simulation made up of separate functional components
  1. Reproducibly generate data for one "run" of the simulation.
  2. Design a function that accepts 1 data set and analyzes it.
  3. Design a procedure to repeat steps 1 and 2.
  4. Harvest estimates, summarize the results.
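The four DO steps can be sketched as a skeleton like the one below. This is only an illustration: the function names (generate_data, analyze_data, one_run), the seed, the number of runs, and the parameter values are all assumptions, not part of the original slides.

```r
# Hypothetical skeleton of a modular Monte Carlo experiment.
set.seed(12345)

# Step 1: reproducibly generate data for one run
generate_data <- function(n = 100, intercept = 2, slope = 5, sd_error = 3) {
  x <- rnorm(n, mean = 50, sd = 100)
  y <- intercept + slope * x + rnorm(n, mean = 0, sd = sd_error)
  data.frame(x = x, y = y)
}

# Step 2: accept exactly one data set and analyze it
analyze_data <- function(dat) {
  coef(lm(y ~ x, data = dat))
}

# Step 3: one run = generate, then analyze; repeat the run 50 times
one_run <- function(i) analyze_data(generate_data())
estimates <- t(vapply(1:50, one_run, numeric(2)))

# Step 4: harvest the estimates and summarize them
colMeans(estimates)
```

Each run generates and analyzes its own data set, so no large block of data has to be stored between steps.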


Here's what NOT to do
This is an understandable approach, one I have used with beginners.
> s <- 10
> q <- 0.345
> stde <- 20
> set.seed(2341234)
> mydatasets <- vector("list", 100)
> for (i in 1:100) {
      x <- 18 + 43 * runif(1000)
      y <- s + q * x + rnorm(1000, mean = 0, sd = stde)
      mydf <- data.frame(x, y)
      mydatasets[[i]] <- mydf
  }
> myregressions <- lapply(mydatasets, function(mydf) lm(y ~ x, data = mydf))


Inspectorate That!
The best thing about this example: it will improve your R data management skills.
Investigate just one example output, take the 33rd:
> the33rdreg <- myregressions[[33]]
> attributes(the33rdreg)
$names
 [1] "coefficients"  "residuals"     "effects"       "rank"          "fitted.values" "assign"
 [7] "qr"            "df.residual"   "xlevels"       "call"          "terms"         "model"

$class
[1] "lm"

> summary(the33rdreg)

Call:
lm(formula = y ~ x, data = mydf)

Residuals:
    Min      1Q  Median      3Q     Max
-63.188 -12.509  -0.074  12.390  63.328

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.12946    2.09755   5.306 1.38e-07 ***
x            0.33875    0.05016   6.754 2.45e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 19.04 on 998 degrees of freedom
Multiple R-squared: 0.0437, Adjusted R-squared: 0.04275
F-statistic: 45.61 on 1 and 998 DF, p-value: 2.445e-11

> sum33 <- summary(the33rdreg)
> attributes(sum33)
$names
 [1] "call"          "terms"         "residuals"     "coefficients"  "aliased"       "sigma"
 [7] "df"            "r.squared"     "adj.r.squared" "fstatistic"    "cov.unscaled"

$class
[1] "summary.lm"

> coef(the33rdreg)
(Intercept)           x
 11.1294615   0.3387549
> coef(the33rdreg)[2]
        x
0.3387549
> coef33 <- coef(the33rdreg)
> coef(sum33)
              Estimate Std. Error  t value     Pr(>|t|)
(Intercept) 11.1294615 2.09754569 5.305945 1.380912e-07
x            0.3387549 0.05015926 6.753586 2.445331e-11
> coef(sum33)[2, 1]
[1] 0.3387549
> sum33$sigma
[1] 19.04095
> sum33$r.square
[1] 0.04370492


Collect and Summarize
> sestimates <- numeric(100)
> for (i in 1:100) {
      sestimates[i] <- coef(myregressions[[i]])[1]
  }
> hist(sestimates, prob = TRUE, xlab = "estimates of the intercept", s,
      main = "Sampling Distribution of Estimated Intercepts",
      xlim = 1.5 * range(sestimates))
> lines(density(sestimates), col = "red", lty = 2)
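The collection loop can also be written without pre-allocating and indexing by hand. The sketch below rebuilds a list of 100 fits so that it runs on its own, then pulls the intercepts out with vapply(). The variable names follow the slides, but generating and fitting inside a single lapply() is a simplification made here for brevity.

```r
# Rebuild 100 regressions so the example is self-contained
set.seed(2341234)
s <- 10
q <- 0.345
stde <- 20
myregressions <- lapply(1:100, function(i) {
  x <- 18 + 43 * runif(1000)
  y <- s + q * x + rnorm(1000, mean = 0, sd = stde)
  lm(y ~ x)
})

# vapply() checks that every fit yields exactly one numeric value
sestimates <- vapply(myregressions,
                     function(m) coef(m)[["(Intercept)"]],
                     numeric(1))
summary(sestimates)
```

Selecting the coefficient by name rather than by position guards against picking the wrong term if the model formula changes.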


Critiques
- Do we need to simulate all of the data at once? Why?
- How selective should we be with saving result objects?
- Advantage of this approach
  - The random number stream is drawn from continuously, so repetition is possible
  - But only if we don't alter N in each group
- It doesn't generate & analyze one data set in one step
- If we hope to parallelize, want to divide work (don't loop 100 times, then loop again 100 times)
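The caveat about altering N can be made concrete with a small sketch. The helper draw_sets() is hypothetical, not from the slides: it draws several data sets one after another from a single random number stream.

```r
# Hypothetical helper: sequential draws from one stream, one vector per size
draw_sets <- function(sizes, seed = 2341234) {
  set.seed(seed)
  lapply(sizes, function(n) runif(n))
}

same_n  <- draw_sets(c(1000, 1000))
again   <- draw_sets(c(1000, 1000))
changed <- draw_sets(c(999, 1000))   # first data set is one observation shorter

# Same seed and same sizes: the second data set is reproduced
identical(same_n[[2]], again[[2]])

# Different N in the first data set: the second one no longer matches,
# because it starts at a different position in the stream
identical(same_n[[2]], changed[[2]])
```

Saving the generator state at the start of each run, rather than relying on one seed for the whole script, removes this dependence on what earlier runs consumed.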


About Random Record Keeping in R
- Chambers, Software For Data Analysis, says we should keep track of the random generator's state so we can pick up the generator whenever we want and get "the next" number.
- p. 230 describes the "SoDA" package's "simulationResult" function.
- simulationResult has the effect of running a simulation, but it keeps as recorded values the state of the random number generator at the start and at the end of the function.
- Read the code for "simulationResult"; it is simply using an S4 class to achieve the effect of the following function.


Record Keeping, Doing it Myself
doSomething <- function(whatever) {
    inState <- .Random.seed     # generator state before any draws
    ## results <- whatever code you want that draws random numbers
    outState <- .Random.seed    # generator state after the draws
    list(inState = inState, outState = outState, results = results)
}
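A runnable version of this template might look like the sketch below. The argument n and the use of rnorm() as the "whatever code" are assumptions for illustration. The state is read from and written to the global environment explicitly, since a plain assignment to .Random.seed inside a function would only create a local variable.

```r
doSomething <- function(n) {
  # .Random.seed exists only after the generator has been seeded or used
  if (!exists(".Random.seed", envir = globalenv())) runif(1)
  inState <- get(".Random.seed", envir = globalenv())
  results <- rnorm(n)   # stand-in for any code that draws random numbers
  outState <- get(".Random.seed", envir = globalenv())
  list(inState = inState, outState = outState, results = results)
}

set.seed(4343432)
first <- doSomething(5)

# Put the generator back where the first call started, then repeat the call
assign(".Random.seed", first$inState, envir = globalenv())
second <- doSomething(5)

identical(first$results, second$results)
```

Because outState is saved as well, a later call can also resume from where the first one stopped.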


Example: OLS
> a <- 2
> b <- 5
> stde <- 3
> N <- 100
> Nexp <- 30
> set.seed(4343432)
> getPhonyData <- function(N, a, b, stde) {
      inState <- .Random.seed
      x <- rnorm(N, mean = 50, sd = 100)
      y <- a + b * x + rnorm(N, mean = 0, sd = stde)
      outState <- .Random.seed
      list(dat = data.frame(input = x, output = y),
           inState = inState, outState = outState)
  }


Note: Output is a list including one data frame, plus the in and out states of the generator.


Create a function that analyzes an input data set
> analyzePhonyData <- function(dat) {
      mymod <- lm(output ~ input, data = dat)
      mymod    # return the fitted model visibly
  }
I wonder if I should return the regression and the data object grouped together, as in list(mymod, dat), or perhaps just list(summary(mymod), dat).


Orchestrate One Run of the Exercise
> conductSim <- function(i) {
      d1 <- getPhonyData(N, a, b, stde)
      res <- analyzePhonyData(d1$dat)
      list(res = res, d = d1)
  }


Note structure of Output Object
Output is a list that includes 2 list objects:
- res: regression result object
- d: another list containing
  - dat
  - inState
  - outState
Should consider "throwing away" dat, since it could be regenerated from "inState".
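Regenerating dat from inState could look like the following sketch. The helper regenerateData() is hypothetical. getPhonyData() is repeated from the slides so the example runs on its own, with the state read from the global environment explicitly.

```r
getPhonyData <- function(N, a, b, stde) {
  inState <- get(".Random.seed", envir = globalenv())
  x <- rnorm(N, mean = 50, sd = 100)
  y <- a + b * x + rnorm(N, mean = 0, sd = stde)
  outState <- get(".Random.seed", envir = globalenv())
  list(dat = data.frame(input = x, output = y),
       inState = inState, outState = outState)
}

# Hypothetical helper: restore a saved state, then redraw the same data.
# Side effect: the global generator state is changed.
regenerateData <- function(inState, N, a, b, stde) {
  assign(".Random.seed", inState, envir = globalenv())
  getPhonyData(N, a, b, stde)$dat
}

set.seed(4343432)
d1 <- getPhonyData(N = 100, a = 2, b = 5, stde = 3)
dat_again <- regenerateData(d1$inState, N = 100, a = 2, b = 5, stde = 3)

identical(d1$dat, dat_again)
```

This only works if the parameters (N, a, b, stde) are stored or otherwise known alongside inState.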


Run that 100 times
> mysims <- lapply(1:100, conductSim)
Note: could have used replicate() instead.
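A sketch of the replicate() alternative follows, with the functions from the slides repeated in compact form so it is self-contained. Passing simplify = FALSE keeps the result as a list, like lapply(). The comparison of coefficients at the end is an addition made here to check that both approaches draw the same numbers when started from the same seed.

```r
a <- 2
b <- 5
stde <- 3
N <- 100

getPhonyData <- function(N, a, b, stde) {
  inState <- get(".Random.seed", envir = globalenv())
  x <- rnorm(N, mean = 50, sd = 100)
  y <- a + b * x + rnorm(N, mean = 0, sd = stde)
  outState <- get(".Random.seed", envir = globalenv())
  list(dat = data.frame(input = x, output = y),
       inState = inState, outState = outState)
}
analyzePhonyData <- function(dat) lm(output ~ input, data = dat)
conductSim <- function(i) {
  d1 <- getPhonyData(N, a, b, stde)
  list(res = analyzePhonyData(d1$dat), d = d1)
}

set.seed(4343432)
sims_lapply <- lapply(1:100, conductSim)

set.seed(4343432)
# conductSim() never uses its argument, so it can be called without one
sims_rep <- replicate(100, conductSim(), simplify = FALSE)

coef_lapply <- sapply(sims_lapply, function(s) coef(s$res))
coef_rep <- sapply(sims_rep, function(s) coef(s$res))
identical(coef_lapply, coef_rep)
```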


Students like For Loops Instead
Note: Reset system seed from saved state of first run
> .Random.seed <- mysims[[1]]$d$inState
> mysims2 <- list()
> for (i in 1:100) {
      mysims2[[i]] <- conductSim(i)
  }
> all.equal(mysims, mysims2)
[1] TRUE
Equivalence of collections confirmed.


Reformat as "matrix" of "list elements"
> regs <- do.call("rbind", mysims)
> dim(regs)
[1] 100   2
> names(regs[[1, 2]])
[1] "dat"      "inState"  "outState"
> names(regs[[1, 1]])
 [1] "coefficients"  "residuals"     "effects"       "rank"          "fitted.values" "assign"
 [7] "qr"            "df.residual"   "xlevels"       "call"          "terms"         "model"
> is.array(regs)
[1] TRUE
> is.list(regs[, 1])
[1] TRUE
So, for example, regs[[1,1]] returns the first regression object.


"Inspect the 33rd element in column 1"
> summary(regs[[33, 1]])

Call:
lm(formula = output ~ input, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max
-6.9005 -1.6788  0.2666  1.7737  8.2765

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.273297   0.371428    6.12 1.93e-08 ***
input       5.002491   0.003373 1483.23  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.092 on 98 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 2.2e+06 on 1 and 98 DF, p-value: < 2.2e-16


"lapply: Get Summaries For All Regression Objects"
lapply returns a list of regression summaries.
> lsimres <- lapply((regs[, 1]), summary)


"sapply: Get Summaries For All Regression Objects"
sapply returns a simplified structure of the same regression summaries.
> ssimres <- sapply((regs[, 1]), summary)
> is.matrix(ssimres)
[1] TRUE
> dim(ssimres)
[1]  11 100
> rownames(ssimres)
 [1] "call"          "terms"         "residuals"     "coefficients"  "aliased"       "sigma"
 [7] "df"            "r.squared"     "adj.r.squared" "fstatistic"    "cov.unscaled"


"Want to plot some R-squares?"
ssimres[8,] returns a list of 100 rsquares.
R's unlist function strips out the list structure, giving back a vector.
> rsquares <- ssimres[8, ]
> ursquares <- unlist(rsquares)
> hist(ursquares, xlab = "100 estimated R-squares", prob = T,
      ylab = "proportion", main = "")
> lines(density(ursquares), col = "red", lty = 2)
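An alternative that skips the matrix of summaries and the unlist() step is to extract the R-square from each fit directly. The sketch below builds its own list of 100 fits (an assumption made here so that it runs stand-alone, using the parameter values from the OLS example) and uses vapply(), which returns a plain numeric vector.

```r
set.seed(4343432)
fits <- lapply(1:100, function(i) {
  x <- rnorm(100, mean = 50, sd = 100)
  y <- 2 + 5 * x + rnorm(100, mean = 0, sd = 3)
  lm(y ~ x)
})

# Select the component by name instead of by row number 8
rsq <- vapply(fits, function(m) summary(m)$r.squared, numeric(1))
range(rsq)
```

Selecting "r.squared" by name avoids depending on the order of the components in a summary.lm object.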


The R-square Histogram
[Figure: histogram of the R-squares with density curve; x-axis "100 estimated R-squares" (about 0.99993 to 0.99998), y-axis "proportion"]


Critiques
See my R/WorkingExamples/stackListItems.R code.


Example: OLS


Better Check That


Take What You Need


Sapply might be better


Example: OLS
