How do the t-distribution and standard normal distribution differ, and why is the t-distribution used more?




For statistical inference (e.g., hypothesis testing or computing confidence intervals), why do we use the t-distribution instead of the standard normal distribution? My class started with the standard normal distribution and shifted to the t-distribution, and I am not fully sure why. Is it because t-distributions can a) deal with small sample sizes (because they give more emphasis to the tails) or b) be more robust to a non-normally distributed sample?





Possibly related: stats.stackexchange.com/questions/285649/…
– Henry
Aug 22 at 23:09





Searches like stats.stackexchange.com/search?q=t-distribution+normal and stats.stackexchange.com/search?q=t-test+normal will include a number of relevant posts (and a lot of other hits so you may need to add further keywords to reduce the clutter).
– Glen_b
Aug 23 at 2:18




2 Answers



The normal distribution (which is almost certainly returning in later chapters of your course) is much easier to motivate than the t distribution for students new to the material. The reason why you are learning about the t distribution is more or less for your first reason: the t distribution takes a single parameter—sample size minus one—and more correctly accounts for uncertainty due to (small) sample size than the normal distribution when making inferences about a sample mean of normally-distributed data, assuming that the true variance is unknown.
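A small simulation can make the "more correctly accounts for uncertainty" point concrete. This is only a sketch under stated assumptions: the data are normal, the sample size is n = 5, the interval is the sample mean plus or minus a critical value times s/sqrt(n), and the helper name `coverage` and all of its parameters are made up for illustration. It uses only the Python standard library, so the t critical value is typed in from a standard t table rather than computed.

```python
import random
import statistics


def coverage(n, crit, trials=20000, mu=0.0, sigma=1.0, seed=0):
    """Fraction of intervals xbar +/- crit * s / sqrt(n) that contain mu.

    Hypothetical helper for illustration: normal data, sigma treated as
    unknown and estimated by the sample standard deviation s.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # n - 1 denominator
        half_width = crit * s / n ** 0.5
        if abs(xbar - mu) <= half_width:
            hits += 1
    return hits / trials


n = 5
z_crit = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
t_crit = 2.776  # 97.5th percentile of t with n - 1 = 4 df (standard t table)

# Same seed, so both intervals are evaluated on the same simulated samples.
coverage_z = coverage(n, z_crit)
coverage_t = coverage(n, t_crit)
print(f"nominal 95% interval with z critical value: coverage {coverage_z:.3f}")
print(f"nominal 95% interval with t critical value: coverage {coverage_t:.3f}")
```

With these settings the z-based interval should cover the true mean noticeably less often than 95% of the time, while the t-based interval should land close to 95%. Increasing `n` shrinks the gap.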



With increasing sample size, the t and standard normal distributions are approximately equally robust with respect to deviations from normality (as sample size increases, the t distribution converges to the standard normal distribution). Nonparametric tests (which I start teaching about halfway through my intro stats course) are generally much more robust to non-normality than tests based on either the t or normal distributions.



Finally, you are likely going to learn tests and confidence intervals for many different distributions by the end of your course (F, $\chi^2$, rank distributions, at least in their table p-values, for example).





Thank you so much for this awesome response. I now get that t-distributions can better account for small sample sizes. However, if the sample size is large (> 30), it doesn't matter whether we use a t or standard normal distribution, right?
– Jane Sully
Aug 22 at 19:26





they become very similar when the degrees of freedom rise.
– Bernhard
Aug 22 at 19:37





@JaneSully Sure, but, for inference about means of normal data, it is never wrong to use the t distribution.
– Alexis
Aug 22 at 21:13





(Also, when/if you like an answer enough to say that it has answered your question, you can "accept" it by clicking on the check mark to the top left of the question. :).
– Alexis
Aug 22 at 21:24





I disagree with this statement: "the t distribution takes a single parameter—sample size minus one—and more correctly accounts for uncertainty due to (small) sample size than the normal distribution when making inferences about a sample mean of normally-distributed data." E.g. see this lecture: onlinecourses.science.psu.edu/stat414/node/173 There's no need for t-distribution on Gaussian data when standard deviation is known. The key here is whether you do or do not know the variance, not the n-1 adjustment
– Aksakal
Aug 23 at 3:49




The t-distribution is used in inference instead of the normal because the theoretical distribution of some estimators is normal (Gaussian) only when the standard deviation is known; when it is unknown, the theoretical distribution is Student t.



We rarely know the standard deviation. Usually we estimate it from the sample, so for many estimators it is theoretically more sound to use the Student t distribution rather than the normal.



Some estimators are consistent, i.e., in layman's terms, they get better as the sample size increases. The Student t distribution approaches the normal as the sample size grows.
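To see how quickly the Student t approaches the normal, one can compare the two densities directly. The sketch below is illustrative only: it evaluates the textbook Student t density through `math.lgamma` (so large degrees of freedom do not overflow) and reports the largest gap to the standard normal density on a grid from -5 to 5. The function names, the grid, and the list of degrees of freedom are arbitrary choices for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


grid = [i / 10 for i in range(-50, 51)]  # x from -5 to 5 in steps of 0.1
max_gap = {}
for df in (1, 5, 30, 100, 1000):
    max_gap[df] = max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)
    print(f"df={df:>4}: largest density gap = {max_gap[df]:.5f}")
```

The gap should shrink steadily as the degrees of freedom grow, which is the sense in which the two distributions "become very similar" for large samples.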



Example: sample mean



Consider the mean $\mu$ of the population from which the sample $x_1, x_2, \dots, x_n$ is drawn. We can estimate it using the usual average estimator, $\bar x = \frac{1}{n}\sum_{i=1}^n x_i$, which you may call the sample mean.



If we want to make inference statements about the mean, such as whether the true mean $\mu<0$, we can use the sample mean $\bar x$, but we need to know its distribution. It turns out that if we knew the standard deviation $\sigma$ of the $x_i$, then the sample mean would be distributed around the true mean according to a Gaussian: $\bar x \sim \mathcal N(\mu, \sigma^2/n)$, for large enough $n$.



The problem is that we rarely know $\sigma$, but we can estimate it from the sample as $\hat\sigma$ using one of the estimators. In this case the distribution of the sample mean, once standardized with $\hat\sigma$ in place of $\sigma$, is no longer Gaussian but closer to a Student t distribution.
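Written out, the two cases look as follows. This is a sketch under explicit assumptions that go beyond the text above: the $x_i$ are independent draws from a normal distribution, and $\hat\sigma$ is taken to be the usual sample standard deviation with the $n-1$ denominator (one possible choice of estimator).

$$
Z = \frac{\bar x - \mu}{\sigma/\sqrt{n}} \sim \mathcal N(0, 1),
\qquad
T = \frac{\bar x - \mu}{\hat\sigma/\sqrt{n}} \sim t_{n-1},
\qquad
\hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar x)^2 .
$$

The only difference between $Z$ and $T$ is that $\sigma$ is replaced by $\hat\sigma$. The extra randomness in the denominator is what gives $T$ heavier tails, and it fades as $n$ grows because $\hat\sigma$ settles near $\sigma$.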





