Transcription of Using R for Introductory Calculus and Statistics - …
Using R for Introductory Calculus and Statistics
Daniel Kaplan
Macalester College
August 9, 2007

Slide 1/35 Daniel Kaplan Using R for Introductory Calculus and Statistics

Background
- I have been using R for 11 years for introductory […]
- […] years ago we started to revise our year-one introductory curriculum: Calculus and […]
- […] and Statistics topics were entirely unrelated.
- […] theme of the revision was applied multivariate […]
- […] ties together the Calculus and Statistics […]
- […] wanted a computing platform that could support both Calculus and Statistics.
- There is still resistance from faculty who do not appreciate the value of an integrated approach and who want to use a package that they are familiar with: Mathematica, Excel, SPSS, STATA

Slide 2/35

Applied Calculus: Goals
- Intended for students who do not plan to take a multi-course calculus […]
- […] them the math they need to work in their field of interest, rather than the foundation for future math courses they will never take.

Slide 3/35

Applied Calculus: Topics
- Change: ordinary, partial, and directional derivatives
- […]: including fitting and constrained […]
- […]:
  - function building blocks: linear, polynomial, exp, sin, power-law
  - functions of multiple variables
  - difference & differential equations & the phase plane
  - units and […]
- […]: polynomials to 2nd order in two variables, e.g., bicycle speed as function of hill steepness and gear. There is an interaction between steepness and gear.

Slide 4/35

Introduction to Statistical Modeling: Goals
Give students the conceptual understanding and specific skills they need to address real statistical issues in their fields of […]
- […] explicitly that client fields routinely work with multiple […]
- […] provides the foundations for doing […]
- […] to provide a unified framework that applies to many different fields using different methods and […]
[…] of the conventional course.
- It assumes that we need to teach students about t-tests, […] absurdly, that they can figure out the multivariate stuff on their own.

Slide 5/35

Introduction to Statistical Modeling: Topics
- Linear models: interpretation of terms (incl. interaction terms), meaning of coefficients, fitting
- Issues of collinearity: Simpson's paradox, degrees of freedom, […]
- […] inferential techniques:
  - Bootstrapping and simulation to develop concepts
  - "Black box" normal theory results
  - Anova
- Theory is presented in a geometrical […]

Slide 6/35

Who takes these courses?
- More than 100 students each year (out of a class size of 450).
- Calculus and Statistics required for the biology […]
- […] majors take it before […]
- […] majors are required to take Statistics (very unusual!). They take it after linear […]
- […] 2/3 of Calculus students have had some Calculus in high school.
- […] 1/3 of Statistics students have had an AP-type statistics course in high school.

Slide 7/35

What Makes R Effective?
- Free, multi-platform
- Powerful & integrated with […]
- […] based & modeling language
- Extensible, programmable
- Functional style, incl. lazy evaluation. This allows sensible command-line […]

Slide 8/35

Example from Calculus: Functions
What students need to know about functions:
- Functions take one or more arguments and return a […]
- […] a function describes the […]
- […] a function to arguments produces the […]
R supports definition with little syntactical overhead
    f = function(x){ x^2 + 2*x }
and application is very easy
    > f(3)
    [1] 15
R emphasizes that the function itself is a thing, distinct from its application:
    > f
    function(x){ x^2 + 2*x }

Slide 9/35

Functions: What's missing
Simple support for multivariate functions with vector arguments. It would be nice to be able to say,
    f = function([x,y,z]){ x^2 + 2*x*y + sqrt(z)*x }
Currently, I have to say
    f = function(v){ v[1]^2 + 2*v[1]*v[2] + sqrt(v[3])*v[1] }
This isn't terrible, but it's hard to read and introduces more syntax and concepts (e.g., indexing)

Slide 10/35

Vectors: What's Missing?
- Simple, concise operations for assembling matrices. It's ugly to say:
      > M = cbind( rbind(1,2,3), rbind(6,5,4) )
           [,1] [,2]
      [1,]    1    6
      [2,]    2    5
      [3,]    3    4
- Matlab-like consistency. If you extract a column from a matrix, it should be a column.
  NOT
      > M[,1]
      [1] 1 2 3

Slide 11/35

Example from Calculus: Differentiation
What students need to know about the derivative […]
- […] a function as input, produces a function as output
- […] function gives the slope of the input function at any […]
[…] PRIMARILY:
- Algebraic algorithms for transforms: e.g., x^n → n x^(n-1)
- The theory of the […]
[…] simple differentiation operator:
    D = function(f,delta=.000001){
      function(x){ (f(x+delta) - f(x-delta))/(2*delta)} }

Slide 12/35

Using D
    > f = function(x){x^2 + 2*x}
    > plot(f, 0, 10)
    > plot(D(f), 0, 10)
[Figure: f(x) plotted against x]
[Figure: D(f)(x) plotted against x]
Numerical pathology of D(D(f))
    > plot(D(D(f)), 0, 10)
[Figure: D(D(f))(x) plotted against x]

Slide 13/35
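The pathology on the "Using D" slide can be reproduced without a plot. The following is a minimal sketch, not the author's code beyond the definition of D itself: the grid `xs`, the helper `D3`, and the larger step 1e-3 are illustrative assumptions. Note that defining `D` this way masks the built-in `stats::D` in the current session.

```r
# Central-difference operator from the slide: takes a function, returns a function.
D <- function(f, delta = 1e-6) {
  function(x) (f(x + delta) - f(x - delta)) / (2 * delta)
}

f <- function(x) x^2 + 2 * x

# First derivative: the exact answer is 2*x + 2, so 8 at x = 3.
D(f)(3)

# Second derivative: the exact answer is 2 everywhere.
xs <- seq(0, 10, by = 0.1)   # illustrative grid, not from the slides

# Nesting D with the default delta divides the rounding error by delta twice.
err_small <- max(abs(D(D(f))(xs) - 2))

# A larger step (1e-3 is an assumed, illustrative choice) tames the noise.
D3 <- function(g) D(g, delta = 1e-3)
err_large <- max(abs(D3(D3(f))(xs) - 2))

# Worst-case error of the second derivative under each step size.
c(default_delta = err_small, larger_delta = err_large)
```

For this quadratic the central difference has no truncation error, so what remains is floating-point rounding, which is what makes the nested operator look noisy at the default step.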
Why not the built-in D?
- It doesn't reinforce the notion of an operator on […]
- […]'s too complicated.
      > g = deriv( ~ sin( 3*x), "x" )
      > g
      expression({
          .expr1 <- 3 * x
          .value <- sin(.expr1)
          .grad <- array(0, c(length(.value), 1), list(NULL, c("x")))
          .grad[, "x"] <- cos(.expr1) * 3
          attr(.value, "gradient") <- .grad
          .value
      })
      > x = 7
      > eval(g)
      [1] […]
      attr(,"gradient")
           x
      [1,] […]
[…] need to understand better the relationship between functions and formulas, and operations on formulas for extracting […]

Slide 14/35
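For comparison, here is a hedged sketch of how the built-in symbolic tools can be made to return a function. `stats::D` and `deriv` (with its `function.arg` argument) are standard R; the helper `as_function_of_x` is a hypothetical name introduced only for this illustration and is not from the slides.

```r
# Built-in symbolic derivative: expression in, expression out.
dexpr <- stats::D(quote(sin(3 * x)), "x")
print(dexpr)   # an unevaluated call, not a function

# Hypothetical helper (not from the slides): wrap the expression as a function of x.
as_function_of_x <- function(e) {
  function(x) eval(e, list(x = x))
}
dsin3 <- as_function_of_x(dexpr)
dsin3(7)

# deriv() can return a function directly when function.arg = TRUE.
g <- deriv(~ sin(3 * x), "x", function.arg = TRUE)
val <- g(7)
as.numeric(val)          # the value sin(3 * 7)
attr(val, "gradient")    # 1 x 1 matrix holding the derivative at x = 7
```

The extra wrapping step is exactly the overhead the slide objects to: the built-in tools operate on expressions and formulas, so the "function in, function out" picture has to be rebuilt by hand.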
Example: Fitting Linear Models
R makes this amazingly easy.
    > g = […]( […] )
    family father mother sex height nkids
    […]
    and so on
    > lm( height ~ sex + father, data=g)
    (Intercept) sexM […]
    […]
    > lm( height ~ sex + father + mother, data=g)
    (Intercept) sexM father […]
    […]

Slide 15/35

Operating on the results of linear modeling
Sum of squares relationship.
    > sum( g$height^2)
    [1] 4013892
    > m1 = lm( height ~ sex + father, data=g)
    > sum( m1$fitted^2) + sum( m1$resid^2)
    [1] 4013892
    > m2 = lm( height ~ sex + father + mother, data=g)
    > sum( m2$fitted^2) + sum( m2$resid^2)
    [1] 4013892
Orthogonality of fitted and residual
    > sum( m2$fitted * m2$resid )
    [1] […] -- essentially 0

Slide 16/35

Modeling: What's missing
Syntax is not forgiving of small mistakes:
- Mis-spelled column name:
      > sum( g$heights )
      [1] 0
      > sum( g$height )
      [1] […]
- […] argument confounding. You flip 50 fair coins. Where's the 10th percentile on the number of heads?
      > qbinom( .10, size=50, prob=.5)
      [1] 20
      > qbinom( .10, size=50, p=.5)
      [1] 5
  The second call returns 5 because p=.5 exactly matches qbinom's first argument p, leaving .10 to be taken as prob.

Slide 17/35

Standard summaries are very easy
    > m3 = lm( height ~ sex + father + mother + nkids, data=g)
    > summary(m3)
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)     […]                      < 2e-16
    […]
    father          […]                      < 2e-16
    mother          […]                      < 2e-16
    nkids           […]
    > anova(m3)
                 Df  Sum Sq Mean Sq F value  Pr(>F)
    (Intercept)   1 4002377 4002377 […]e+05 <2e-16
    sex           1    5875    5875 […]e+03 <2e-16
    father        1    1001    1001 […]e+02 <2e-16
    mother        1     490     490 […]e+02 <2e-16
    nkids         1      12      12 […]e+00 […]
    Residuals   893    4137       5
Note: I added the Intercept term to the Anova table.
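With the Intercept row included, the Sum Sq column of the table above adds up to 4013892, the same total as sum(g$height^2) on the sum-of-squares slide. The slides do not show how the row was added, so the following is only one possible sketch, on simulated data: the data frame `d`, its columns, and the coefficients used to generate it are all assumptions made for illustration.

```r
# Simulated data standing in for the height data (all values are made up).
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 60 + 2 * d$x1 - d$x2 + rnorm(n)

m <- lm(y ~ x1 + x2, data = d)
a <- anova(m)   # sequential sums of squares; no intercept row by default

# Sum of squares carried by the intercept: the squared length of the
# projection of y onto the constant vector, which is n * mean(y)^2.
ss_intercept <- n * mean(d$y)^2

# Prepend the intercept to the Sum Sq column of the standard table.
ss <- c("(Intercept)" = ss_intercept, setNames(a[["Sum Sq"]], rownames(a)))
ss

# With the intercept included, the column adds up to sum(y^2) ...
all.equal(sum(ss), sum(d$y^2))

# ... which is the same total as fitted plus residual sums of squares,
all.equal(sum(fitted(m)^2) + sum(resid(m)^2), sum(d$y^2))

# because fitted values and residuals are orthogonal (essentially 0).
sum(fitted(m) * resid(m))
```

Computing the intercept term by hand keeps the standard `anova()` output untouched; whether the original table was produced this way is not stated in the slides.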