With this post I am expanding on the previous post on split testing. In particular, the last post assumed that the test groups had distinct and unknown means and variances. While that case is interesting, it might not be useful. The problem is that it tests whether the mean or the variance differs between the groups; life is too short to care about differences in variance.
Here I give a table of evidence ratios for three types of tests. First, where the variance of the two groups is known. Second, where the variances of the two groups are unknown but assumed to be the same. And third, where the variances are unknown and allowed to be different (the previously presented case).
Before jumping into that, it is useful (though maybe not interesting? Look, you chose to read this) to review the paradigm of Bayesian evidence as used in split testing.
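Bayes' theorem for the parameters $\theta$ of a model $\mathcal{H}$ given data $D$ is

$$P(\theta \mid D, \mathcal{H}) = \frac{\pi(\theta)\,\mathcal{L}(\theta)}{E_{\mathcal{H}}}$$

Here $\pi(\theta)$ is the prior distribution of $\theta$, $\mathcal{L}(\theta)$ is the likelihood of the parameter vector $\theta$ (i.e., the probability of the data given the model) and $E_{\mathcal{H}}$ is the Bayesian evidence. The subscript $\mathcal{H}$ denotes the choice of parameterization. The posterior distribution, $P(\theta \mid D, \mathcal{H})$, is a distribution! That might seem obvious; however, it has a significant consequence: distributions must sum / integrate to 1. Thus $\int P(\theta \mid D, \mathcal{H})\,d\theta = 1$. Which in turn means that $E_{\mathcal{H}} = \int \pi(\theta)\,\mathcal{L}(\theta)\,d\theta$.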
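Returning to Bayes' theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

If we relabel A, B as the model $\mathcal{H}$ and data $D$ then

$$P(\mathcal{H} \mid D) = \frac{P(D \mid \mathcal{H})\,P(\mathcal{H})}{P(D)}$$

i.e., $P(D \mid \mathcal{H}) = E_{\mathcal{H}}$. The probability of the data given the model is the evidence: the integral over all parameters $\theta$ of the prior $\pi(\theta)$ times the likelihood $\mathcal{L}(\theta)$. If two models are given equal prior probability, the ratio of their posterior probabilities is just the ratio of their evidences.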
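To make the "evidence is an integral" point concrete, here is a minimal numerical sketch. It assumes a Gaussian likelihood with known standard deviation and a flat prior on the mean over a made-up range; the function names, the data and the prior range are all illustrative choices for this example and do not come from the derivation above.

```python
import math

def likelihood(data, mu, sigma):
    """Gaussian likelihood of the data for mean mu and known std sigma."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return (2 * math.pi * sigma ** 2) ** (-n / 2) * math.exp(-ss / (2 * sigma ** 2))

def integrate(f, lo, hi, steps=4000):
    """Midpoint-rule integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# Illustrative (made-up) data and prior range.
data = [1.2, 0.7, 1.9, 1.1, 0.4]
sigma = 1.0
lo, hi = -10.0, 10.0

def prior(mu):
    """Flat prior on the mean: an assumption of this sketch."""
    return 1.0 / (hi - lo)

# Evidence: integral of prior * likelihood over the parameter.
evidence = integrate(lambda mu: prior(mu) * likelihood(data, mu, sigma), lo, hi)

# Posterior = prior * likelihood / evidence, so it integrates to 1.
posterior_mass = integrate(
    lambda mu: prior(mu) * likelihood(data, mu, sigma) / evidence, lo, hi
)

print(evidence, posterior_mass)
```

The second number printed should be 1 up to rounding, which is just the normalization argument above done numerically. A grid integral like this only works for a handful of parameters; it is here to show the idea, not as a recommended way to compute evidences.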
The evidence ratio of two groups of sizes N and M with unknown means and the same known variance to one group with known variance is:
The evidence ratio of two groups of sizes N and M with unknown means and the same unknown variance to one group with unknown variance is:
The evidence ratio of two groups of sizes N and M with unknown means and different unknown variances to one group with unknown variance is:
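As an illustration of the first (known variance) case, here is a minimal sketch of a log evidence ratio. It assumes a flat prior of width `prior_width` on every mean, wide enough to contain essentially all of the likelihood. That prior is an assumption of this example, so the constant terms may differ from the expressions in this post. The function name and the data are made up.

```python
import math

def log_evidence_ratio_known_variance(a, b, sigma, prior_width):
    """log(E_two_means / E_one_mean) for Gaussian data with known std sigma.

    Assumes (for this sketch only) a flat prior of width prior_width on
    every mean, wide enough that the likelihood is negligible at its edges.
    """
    n, m = len(a), len(b)
    mean_a = sum(a) / n
    mean_b = sum(b) / m
    var = sigma ** 2
    # Penalty for the extra free parameter (the second mean).
    occam = -math.log(prior_width) + 0.5 * math.log(
        2 * math.pi * var * (n + m) / (n * m)
    )
    # Reward for fitting the two group means separately.
    fit = n * m * (mean_a - mean_b) ** 2 / (2 * var * (n + m))
    return occam + fit

# Illustrative (made-up) groups of sizes N = 6 and M = 5.
group_a = [1.2, 0.7, 1.9, 1.1, 0.4, 1.5]
group_b = [2.9, 3.4, 2.2, 3.1, 2.6]

log_ratio = log_evidence_ratio_known_variance(
    group_a, group_b, sigma=1.0, prior_width=20.0
)
print(log_ratio)
```

With these made-up numbers the log ratio comes out positive, so the two-mean model is favored. Feeding the same group in twice gives a negative value, because the extra mean buys no improvement in fit and only pays the prior penalty. The unknown-variance cases need a further integral over the variance and are not sketched here.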
A few final notes: these equations look ugly, but they are actually very easy to interpret. If the ratio is greater than 1 then the data (observations) are more likely to have come from the more complex model, and if it is less than 1 they are more likely to have come from the simpler model. None of the straw man garbage of the t-test. Also, expanding these to multiple groups does not fall into the p-value problem of multiple comparisons. Finally, you probably want to compute the numerators and denominators in logs, as the values get really big / small for any reasonably sized data set.
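The point about working in logs is easy to see numerically. The sketch below again assumes a Gaussian likelihood with known standard deviation and a flat prior on the mean. The data are simulated, and the helper names (`log_likelihood`, `logsumexp`, `log_evidence`) are just illustrative.

```python
import math
import random

def log_likelihood(data, mu, sigma):
    """Log of the Gaussian likelihood; sums logs instead of multiplying densities."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - ss / (2 * sigma ** 2)

def logsumexp(values):
    """log(sum(exp(v))) computed stably by factoring out the largest term."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_evidence(data, sigma, lo, hi, steps=600):
    """Log evidence for a flat prior on the mean over [lo, hi] (midpoint rule)."""
    width = (hi - lo) / steps
    log_prior = -math.log(hi - lo)
    terms = [
        log_prior + log_likelihood(data, lo + (i + 0.5) * width, sigma) + math.log(width)
        for i in range(steps)
    ]
    return logsumexp(terms)

random.seed(0)
data = [random.gauss(0.3, 1.0) for _ in range(2000)]  # made-up data

# Naive product of densities: underflows to exactly 0.0 for this many points.
naive = 1.0
for x in data:
    naive *= math.exp(-0.5 * (x - 0.3) ** 2) / math.sqrt(2 * math.pi)

log_E = log_evidence(data, sigma=1.0, lo=-1.0, hi=2.0)
print(naive, log_E)
```

The naive product is zero in floating point while the log evidence is a perfectly ordinary finite number. An evidence ratio is then a difference of two log evidences, and it only needs exponentiating at the very end, if at all.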