One of the most common statistical problems is testing hypotheses about the mean of the distribution from which a sample was drawn.
The one-sample Student's t-test checks the hypothesis that the mean of a random variable X equals a given value μ. The sample tested should be drawn from a normally distributed random variable. The test calculates the t-statistic:
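For reference, the one-sample t-statistic in its standard textbook form is given below. The symbols x̄ (sample mean), s (sample standard deviation) and x_i (the i-th observation) are notation assumed here for illustration.

```latex
t = \frac{\bar{x} - \mu}{s / \sqrt{N}},
\qquad
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2
```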
If X is normally distributed, the t-statistic follows Student's distribution with N-1 degrees of freedom. The p-value corresponding to the observed value of the t-statistic can therefore be obtained from Student's distribution.
Note #1
If X is not normal, t has an unknown distribution and, strictly speaking, the t-test is inapplicable. However, by the central limit theorem, the distribution of t approaches the normal distribution as the sample size increases. Therefore, if the sample is large, the t-test can be used even when X is not normal. There is no general rule for how large is large enough: it depends on how far X deviates from the normal distribution. Some sources claim that N should be greater than 30, but sometimes even this size is not enough. Alternatively, a non-parametric test can be used: the sign test or the Wilcoxon signed-rank test.
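Subroutine StudentTTest1 returns three p-values: one for the two-tailed test (null hypothesis: the mean equals μ), one for the left-tailed test (null hypothesis: the mean is greater than or equal to μ), and one for the right-tailed test (null hypothesis: the mean is less than or equal to μ).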
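The relation between the t-statistic and the two-tailed, left-tailed and right-tailed p-values can be sketched in plain Python. This is a minimal illustration, not the ALGLIB implementation: the function names `t_pdf`, `t_cdf` and `one_sample_t_test` are hypothetical, and the Student's distribution CDF is approximated with Simpson's rule as a stand-in for a library routine.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), approximated by Simpson's rule on [0, |t|] plus symmetry.

    Assumption: numerical integration replaces a library CDF; steps must be even.
    """
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


def one_sample_t_test(xs, mu):
    """Return (t, both_tails, left_tail, right_tail) for H0: mean == mu.

    Requires at least two observations with non-zero variance.
    """
    n = len(xs)
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)              # sample standard deviation (N-1 divisor)
    t = (mean - mu) / (s / math.sqrt(n))
    left = t_cdf(t, n - 1)                # small when the mean is well below mu
    right = 1 - left                      # small when the mean is well above mu
    both = 2 * min(left, right)           # small when the mean differs from mu
    return t, both, left, right


if __name__ == "__main__":
    t, both, left, right = one_sample_t_test([4.9, 5.1, 5.3, 4.8, 5.4], 5.0)
    print(f"t={t:.4f} both={both:.4f} left={left:.4f} right={right:.4f}")
```

A small p-value in a given tail is evidence against the corresponding null hypothesis; the null hypothesis is rejected when the p-value falls below the chosen significance level.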
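The two-sample pooled t-test checks the hypothesis that the means of two random variables X and Y, represented by samples xS and yS, are equal. The test works correctly under the following conditions: both random variables are normally distributed, their variances are equal (or differ only slightly), and the samples are independent.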
The test calculates the t-statistic:
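For reference, the pooled two-sample t-statistic in its standard textbook form is given below. The symbols x̄, ȳ (sample means) and s_p (pooled standard deviation) are notation assumed here for illustration.

```latex
t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{N_X} + \frac{1}{N_Y}}},
\qquad
s_p^2 = \frac{\sum_{i=1}^{N_X} (x_i - \bar{x})^2 + \sum_{j=1}^{N_Y} (y_j - \bar{y})^2}{N_X + N_Y - 2}
```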
If X and Y are normally distributed, the t-statistic follows Student's distribution with NX+NY-2 degrees of freedom. The p-value corresponding to the observed value of the t-statistic can therefore be obtained from Student's distribution.
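As a minimal sketch, the pooled statistic and its degrees of freedom can be computed as follows. The function name `pooled_t_statistic` is hypothetical and is not part of ALGLIB; the code assumes each sample has at least two observations and that the pooled variance is non-zero.

```python
import math
import statistics


def pooled_t_statistic(xs, ys):
    """Return (t, df) for the two-sample t-test with a pooled variance."""
    nx, ny = len(xs), len(ys)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    # Sums of squared deviations from each sample's own mean.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    df = nx + ny - 2
    pooled_var = (ss_x + ss_y) / df
    t = (mean_x - mean_y) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df


if __name__ == "__main__":
    t, df = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
    print(t, df)
```

The p-values then follow from the CDF of Student's distribution with `df` degrees of freedom, evaluated at `t`.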
Note #2
If X or Y is not normal, t has an unknown distribution and, strictly speaking, the t-test is inapplicable. However, by the central limit theorem, the distribution of t approaches the normal distribution as the sample sizes increase. Therefore, if the samples are large enough, the t-test can be used even when X or Y is not normal. There is no general rule for what values of NX and NY are large enough: they depend on how far X and Y deviate from the normal distribution. Some sources claim that NX+NY should be greater than 40, but sometimes even these sizes are not enough. If you are not confident that the distributions are normal, it is better to use a non-parametric test: the Mann-Whitney U-test.
Subroutine StudentTTest2 returns three p-values: one each for the two-tailed, left-tailed and right-tailed tests.
The two-sample unequal-variance t-test checks the hypothesis that the means of two random variables X and Y, represented by samples xS and yS, are equal. The test works correctly under the following conditions: both random variables are normally distributed and the samples are independent.
Equality of variances is not required.
The test calculates the t-statistic:
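For reference, the unequal-variance (Welch) t-statistic in its standard textbook form is given below. The symbols x̄, ȳ (sample means) and s_X², s_Y² (sample variances with N-1 divisors) are notation assumed here for illustration.

```latex
t = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_X^2}{N_X} + \frac{s_Y^2}{N_Y}}}
```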
If X and Y are normally distributed, the t-statistic approximately follows Student's distribution with DF degrees of freedom:
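A common way to write DF is the Welch-Satterthwaite approximation shown below. This is the standard textbook form, assumed here for illustration; a library may use an algebraically equivalent expression. DF is generally not an integer.

```latex
DF = \frac{\left(\frac{s_X^2}{N_X} + \frac{s_Y^2}{N_Y}\right)^2}
          {\frac{\left(s_X^2 / N_X\right)^2}{N_X - 1} + \frac{\left(s_Y^2 / N_Y\right)^2}{N_Y - 1}}
```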
The p-value corresponding to the observed value of the t-statistic can therefore be obtained from Student's distribution.
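As a minimal sketch, the unequal-variance statistic and an approximate DF can be computed as follows. The function name `welch_t_statistic` is hypothetical and is not part of ALGLIB, and the DF expression is the Welch-Satterthwaite approximation, which is an assumption about the form used. Each sample needs at least two observations, and at least one sample must have non-zero variance.

```python
import math
import statistics


def welch_t_statistic(xs, ys):
    """Return (t, df) for the two-sample t-test without assuming equal variances."""
    nx, ny = len(xs), len(ys)
    # Estimated variance of each sample mean: s^2 / N.
    vx = statistics.variance(xs) / nx
    vy = statistics.variance(ys) / ny
    t = (statistics.fmean(xs) - statistics.fmean(ys)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation; usually not an integer.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


if __name__ == "__main__":
    t, df = welch_t_statistic([1.0, 2.0, 3.0], [0.0, 10.0])
    print(t, df)
```

When the sample sizes and sample variances are equal, DF reduces to NX+NY-2 and the statistic coincides with the pooled one; when one sample is far noisier than the other, DF drops towards that sample's own N-1.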
Note #3
If X or Y is not normal, t has an unknown distribution and, strictly speaking, the t-test is inapplicable. However, by the central limit theorem, the distribution of t approaches the normal distribution as the sample sizes increase. Therefore, if the samples are large enough, the t-test can be used even when X or Y is not normal. There is no general rule for what values of NX and NY are large enough: they depend on how far X and Y deviate from the normal distribution. Some sources claim that NX+NY should be greater than 40, but sometimes even these sizes are not enough. If you are not confident that the distributions are normal, it is better to use a non-parametric test: the Mann-Whitney U-test.
Subroutine UnequalVarianceTTest returns three p-values: one each for the two-tailed, left-tailed and right-tailed tests.
This article is licensed for personal use only.