Monday, February 4, 2013

Booting up SAS

Sassy

(*Note:  Though this class is primarily focused on learning and manipulating data using the SAS or JMP statistical packages, I will be programming and posting solutions in R.  I may try to post equivalent solutions in SAS simultaneously for those who are interested in learning both.  R is free and does not require 22 Gazigabytes. )

T-Test:
History for the nerds-
http://en.wikipedia.org/wiki/William_Sealy_Gosset

Basic t-test with calculator-
http://www.stattools.net/tTest_Exp.php

More detailed explanation-
http://simon.cs.vt.edu/SoSci/converted/T-Dist/activity.html
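
Since solutions here will be in R, here is a minimal sketch of a two-sample t-test. The two vectors are made-up numbers purely for illustration:

```r
# Two made-up samples (illustrative only)
a <- c(5.1, 4.9, 5.4, 5.0, 5.3, 4.8)
b <- c(5.6, 5.8, 5.5, 5.9, 5.7, 5.4)

# Welch two-sample t-test (R's default does not assume equal variances)
result <- t.test(a, b)
result$statistic  # the t value
result$p.value    # the p value
```

Add `var.equal = TRUE` to get the classic Student's t-test that the calculator link above performs.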

Regression + ANOVA = ANCOVA

Regression:


covariance = \frac{\sum{(x-\bar{x})(y-\bar{y})}}{n-1}


regression coefficient = \frac{cov(x,y)}{var(x)} = \frac{\sum{(x-\bar{x})(y-\bar{y})}}{\sum{(x-\bar{x})^2}}

(*Note:  The n or n-1 cancels when the cov is divided by the var, so whether the correction is applied is irrelevant — as long as the same choice is made in both the numerator and the denominator)
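
The cancellation in that note is easy to verify in R: cov() and var() both use n-1, so their ratio matches the slope that lm() fits by least squares. The x and y below are made-up numbers:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)

b <- cov(x, y) / var(x)   # regression coefficient by hand
fit <- lm(y ~ x)          # R's built-in least-squares fit
coef(fit)["x"]            # same slope as b
```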




Regression explained:
http://www.law.uchicago.edu/files/files/20.Sykes_.Regression.pdf

more simply:
http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm
http://easycalculation.com/statistics/learn-regression.php

And explained well:
http://www.sjsu.edu/faculty/gerstman/StatPrimer/regression.pdf

Goodness of fit explained:
http://www.mathworks.com/help/curvefit/evaluating-goodness-of-fit.html

Regression in SAS:
http://www.ats.ucla.edu/stat/sas/webbooks/reg/chapter1/sasreg1.htm
http://www.youtube.com/watch?v=Bzm8TJYFZcs

Regression in R
http://msenux.redwoods.edu/math/R/regression.php

Model I and II regressions:
http://www.mbari.org/staff/etp3/regress/about.htm


WOOOOO!







HW # 1

To those who unfortunately are reading this as opposed to vacationing in Vegas,



Andrew Jones
Biometry
2/3/13



Small Arabinose Negative Lineages vs. Large Arabinose Negative Lineages

Average of Small-

0.765635645
0.890993539
0.860948991
0.886212273
0.859489471
0.934212218
0.945863536
0.999423109
0.899233247
0.787217193
0.938261524
0.984696833
0.83820725
0.827858702

(∑Obs)/n where n = 14.
(1) Mean = .887
(2) Var = (∑(obs-µ)^2)/(n-1)
=(.01473 + .00002 + .00068 + .00000 + .00076 + .00228 + .00346 + .01263 + .00015 + .00996 + .00263 + .00954 + .00238 + .00350)/13
=.00482


Average of Large-

0.887503593
0.907561395
0.914647822
0.877401142
0.920149004
0.907823388
0.880485947
0.896073919
0.88584494
0.954043492
0.852222311
0.9171615
0.861517592
0.942045965
0.916409303

(∑Obs)/n where n = 15.
(1) Mean = .901
(2) Var = (∑(obs-µ)^2)/(n-1)

=(.00016 + .00005 + .00021 + .00053 + .00040 + .00006 + .00040 + .00002 + .00021 + .00289 + .00231 + .00028 + .00151 + .00174)/14
=.00077

(3) Mean of means= (.901 + .887)/2 = 0.894

(4) Variance of mean of means = ((.901-.894)^2 + (.887-.894)^2)/(2-1) = .000085 (computed with unrounded means; ≈ .0001 with the rounded values above)

(5) Grand Mean

0.765635645
0.890993539
0.860948991
0.886212273
0.859489471
0.934212218
0.945863536
0.999423109
0.899233247
0.787217193
0.938261524
0.984696833
0.83820725
0.827858702
0.887503593
0.907561395
0.914647822
0.877401142
0.920149004
0.907823388
0.880485947
0.896073919
0.88584494
0.954043492
0.852222311
0.9171615
0.861517592
0.942045965
0.916409303
(∑ of all 29 observations above)/29

=.8945

(6) Variance

(.0165 + .00001 + .00112 + .00007 + .00122 + .00158 + .00264 + .01102 + .00002 + .01150 + .00192 + .00814 + .00316 + .00443 + .00005 + .00017 + .00041 + .00029 + .00066 + .00018 + .00020 + .00000 + .00007 + .0036 + .00178 + .00052 + .00108 + .00227 + .00048)/28

=.00268

(7) The Weird One
Obs - .8945 (each observation minus the grand mean):

-0.128817625
-0.003459731
-0.033504279
-0.008240997
-0.034963799 
0.039758948 
0.051410266
0.104969839 
0.004779977
-0.107236077 
0.043808254 
0.090243563
-0.056246020
-0.066594568
-0.006949677 
0.013108125 
0.020194552
-0.017052128 
0.025695734 
0.013370118
-0.013967323
0.001620649
-0.008608330 
0.059590222
-0.042230959 
0.022708230
-0.032935678 
0.047592695
0.021956030

(-0.128817625 + -0.003459731 + -0.033504279 + -0.008240997 + -0.034963799 + 0.039758948 + 0.051410266 + 0.104969839 + 0.004779977 + -0.107236077 + 0.043808254 + 0.090243563 + -0.056246020 + -0.066594568 + -0.006949677 + 0.013108125 + 0.020194552 + -0.017052128 + 0.025695734 + 0.013370118 + -0.013967323 + 0.001620649 + -0.008608330 + 0.059590222 + -0.042230959 + 0.022708230 + -0.032935678 + 0.047592695 + 0.021956030)/29

= 1.15 x 10^-17 (effectively zero: deviations from the mean always sum to exactly zero, and the leftover is floating-point rounding)



(8) (-0.128817625- 1.15 x 10^-17)^2 + (-0.003459731- 1.15 x 10^-17) ^2 + (-0.033504279- 1.15 x 10^-17) ^2 + (-0.008240997- 1.15 x 10^-17) ^2 + (-0.034963799- 1.15 x 10^-17) ^2  + (0.039758948- 1.15 x 10^-17) ^2  + (0.051410266- 1.15 x 10^-17) ^2 + (0.104969839- 1.15 x 10^-17) ^2 + (0.004779977- 1.15 x 10^-17) ^2 + (-0.107236077- 1.15 x 10^-17) ^2  + (0.043808254- 1.15 x 10^-17) ^2  + (0.090243563- 1.15 x 10^-17) ^2 + (-0.056246020- 1.15 x 10^-17) ^2 + (-0.066594568- 1.15 x 10^-17) ^2 + (-0.006949677- 1.15 x 10^-17) ^2  + (0.013108125- 1.15 x 10^-17) ^2  + (0.020194552- 1.15 x 10^-17) ^2 + (-0.017052128- 1.15 x 10^-17) ^2  + (0.025695734- 1.15 x 10^-17) ^2  + (0.013370118- 1.15 x 10^-17) ^2 + (0.013967323- 1.15 x 10^-17) ^2 + (0.001620649- 1.15 x 10^-17) ^2 + (-0.008608330- 1.15 x 10^-17) ^2 + (0.059590222- 1.15 x 10^-17) ^2 + (-0.042230959- 1.15 x 10^-17) ^2+ (0.022708230- 1.15 x 10^-17) ^2 + (-0.032935678- 1.15 x 10^-17) ^2 + (0.047592695- 1.15 x 10^-17) ^2 + (0.021956030- 1.15 x 10^-17) ^2

All divided by 29

=0.0026

(Dividing by n = 29 gives the population variance; dividing by n-1 = 28 instead reproduces the .00268 from step 6.)
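
All of the arithmetic above can be checked in a few lines of R, using the same observations (mean() and var() apply the n-1 correction automatically):

```r
small <- c(0.765635645, 0.890993539, 0.860948991, 0.886212273, 0.859489471,
           0.934212218, 0.945863536, 0.999423109, 0.899233247, 0.787217193,
           0.938261524, 0.984696833, 0.838207250, 0.827858702)
large <- c(0.887503593, 0.907561395, 0.914647822, 0.877401142, 0.920149004,
           0.907823388, 0.880485947, 0.896073919, 0.885844940, 0.954043492,
           0.852222311, 0.917161500, 0.861517592, 0.942045965, 0.916409303)

mean(small); var(small)        # ~0.887, ~0.0048
mean(large); var(large)        # ~0.901, ~0.0008

all_obs <- c(small, large)
mean(all_obs)                  # grand mean, ~0.8945
var(all_obs)                   # ~0.0027
sum(all_obs - mean(all_obs))   # ~0: deviations from the mean cancel
```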

magic



Monday, January 28, 2013

Hypothesis Testing

Chapters 1, 2, 4, 5, and 6 covered.  Chapter 3 is 'bonus'.

Bayesian (boo) vs. Frequentists (yea!):

http://oikosjournal.wordpress.com/2011/10/11/frequentist-vs-bayesian-statistics-resources-to-help-you-choose/

Bayes' Theorem explained:

http://betterexplained.com/articles/an-intuitive-and-short-explanation-of-bayes-theorem/ 

For R programmers check the following link:

http://meandering-through-mathematics.blogspot.com/2011/05/bayesian-probability.html
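
Bayes' theorem itself is one line of arithmetic. A sketch in R with made-up numbers (a disease with 1% prevalence and a test that is 95% sensitive with a 10% false-positive rate):

```r
p_disease <- 0.01
p_pos_given_disease <- 0.95
p_pos_given_healthy <- 0.10

# Total probability of a positive test (law of total probability)
p_pos <- p_pos_given_disease * p_disease +
         p_pos_given_healthy * (1 - p_disease)

# Bayes' theorem: P(disease | positive test)
p_disease_given_pos <- p_pos_given_disease * p_disease / p_pos
p_disease_given_pos   # ~0.088: most positives are false positives
```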

Resource summing up hypothesis testing:
http://www.sjsu.edu/faculty/gerstman/StatPrimer/hyp-test.pdf

Type I and Type II error:
Type I- Rejecting a true null hypothesis.  Mistakenly accepting the significance of our result (a false positive).
Type II- The opposite: failing to reject a false null hypothesis.  Mistakenly dismissing a real effect (a false negative).

For a video on type I error:

http://www.khanacademy.org/math/probability/statistics-inferential/hypothesis-testing/v/type-1-errors

(Aside ** A link for the Bonferroni correction explained:

http://www.aaos.org/news/aaosnow/apr12/research7.asp  )


The null hypothesis for s's and g's:
http://www.null-hypothesis.co.uk/science//item/what_is_a_null_hypothesis

What is a model anyway?:
http://www.sportsci.org/resource/stats/models.html







Friday, January 25, 2013

Stats 1/25/2013

Big N little n What begins with those?
Nine new neckties and a nightshirt and a nose.



Big N = Population.    Little n = sample.

(n-1) explained:

And if you are really bored at night:

Dividing standard deviation by the mean is the coefficient of variation.  Great for analyzing variation between populations. 

c_v = \frac{\sigma}{\mu}






Standard Error of the Mean (SEM) = \frac{s}{\sqrt{n}}

**(n-1) again for samples.**

Standard error is what is typically used instead of standard deviations.  As such, error bars in graphs are typically calculated using the standard error.  
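
A quick R sketch contrasting the two quantities (the data are made up):

```r
x <- c(4.2, 5.1, 4.8, 5.5, 4.9, 5.2, 4.6, 5.0)

s   <- sd(x)                # sample standard deviation (uses n - 1)
sem <- s / sqrt(length(x))  # standard error of the mean

s
sem                         # always smaller than s for n > 1
```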





Kurtosis:

Kurtosis measures the 'peakedness' of a distribution: how heavy its tails are relative to the normal (leptokurtic = peaked and heavy-tailed; platykurtic = flat and light-tailed).







Next week!!  Hypothesis testing and the assumption of our distributions.










Friday, January 18, 2013

Stats 1/18/2013




How to Look at Graphs: Frequency Distributions



Bin size matters...changing it can turn bins into classes.
A truly random distribution should look somewhat clumped.  Data that appear evenly dispersed may actually be hyper-dispersed, which is a non-random separation of the data. For an example, check out this site:  http://2600hertz.wordpress.com/2010/03/12/how-random-is-random/

Mean, Median, and Mode:
http://www.fgse.nova.edu/edl/secure/stats/lesson1.htm

Geometric Mean:
http://www.cliffsnotes.com/study_guide/Geometric-Mean.topicArticleId-18851,articleId-18817.html


The range shows the distance between the most extreme values.

And the standard (NOT AVERAGE) deviation:

s = \sqrt{\frac{\sum{(x-\bar{x})^2}}{n-1}}

NOTE** the n-1 (vs. n) is used for samples versus the entire population.  See fudge factors next week

Or the variance:

s^2 = \frac{\sum{(x-\bar{x})^2}}{n-1}


To compare deviations of two different populations that may be on different scales, use the coefficient of variation:

CV = \frac{s}{\bar{x}}

To analyze which of two samples from two different populations differs 'more' from the mean, use the standard score:

z = \frac{x - \bar{x}}{s}
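
Both comparisons are one-liners in R. A sketch with made-up samples on very different scales (the `cv` and `z` helpers below are hypothetical names, not base R functions):

```r
# Two made-up samples on very different scales
mouse_mass    <- c(19, 21, 20, 22, 18)             # grams
elephant_mass <- c(4800, 5200, 5000, 5100, 4900)   # kilograms

# Coefficient of variation: sd relative to the mean, so units drop out
cv <- function(x) sd(x) / mean(x)
cv(mouse_mass)      # ~0.079
cv(elephant_mass)   # ~0.032: elephants are relatively less variable

# Standard score: how many sd's one observation sits from its sample mean
z <- function(obs, x) (obs - mean(x)) / sd(x)
z(22, mouse_mass)       # ~1.26
z(5200, elephant_mass)  # ~1.26: equally 'extreme' despite the scales
```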

TYPES OF DISTRIBUTIONS:
Poisson:

P(X = k) = \frac{m^k e^{-m}}{k!}




As the mean m gets large (around m = 8 and up), the Poisson distribution becomes nearly symmetric and closely approximates the normal distribution.
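
That claim is easy to eyeball in R: compare Poisson probabilities at m = 8 with a normal curve sharing the same mean and variance (for a Poisson, both equal m):

```r
m <- 8
k <- 0:20

pois_probs  <- dpois(k, lambda = m)                # Poisson probabilities
norm_approx <- dnorm(k, mean = m, sd = sqrt(m))    # matching normal curve

round(cbind(k, pois_probs, norm_approx), 3)  # the two columns nearly agree
max(abs(pois_probs - norm_approx))           # small worst-case discrepancy
```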

Friendly fudge factor next week!!




Wednesday, January 16, 2013

There are three kinds of lies: lies, damned lies, and statistics



There are three kinds of lies: lies, damned lies, and statistics -Marky Mark Twain



An observation (~individual) defined:
http://epp.eurostat.ec.europa.eu/statistics_explained/index.php/Glossary:Observation_unit

A sample (~population) defined:
http://www.stats.gla.ac.uk/steps/glossary/sampling.html

"PCA principle components analysis is regression in more than two dimensions" - Francisco Moore

Repeated measures will be revisited and can be seen here:
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ClinStat/repmeas.PDF

Types of variables:
http://www.unesco.org/webworld/idams/advguide/Chapt1_3.htm

Meristic for fish people:
http://en.wikipedia.org/wiki/Meristics

A paper on error and philosophy:
http://www.ets.org/Media/Research/pdf/PICANG12.pdf


'Never ever ever ever ever use a derived variable in stats.  Unless you have to.  Distributions get wonky.'  paraphrased  -Francisco B.G. Moore, the University of Akron Summit on Statistical Analysis, 1-16-13