Prof. Dr. Estate Khmaladze: K. Pearson (1900) and the modern theory of distribution-free testing (Not a historic talk and not a review)

05.12.2018

Prof. Dr. Estate Khmaladze
School of Mathematics and Statistics
Victoria University of Wellington

Date: December 5, 2018
Time: 09.00 - 10.00
Place: Y27 H28

Title: K. Pearson (1900) and the modern theory of distribution-free testing (Not a historic talk and not a review)

Given a discrete distribution $(p_k)_{k=1}^m$, $m<\infty$, and the frequencies of the corresponding disjoint events in $n$ independent trials, K. Pearson in 1900 introduced a quadratic form to test whether the given discrete distribution is the true one, that is, whether it agrees well with the frequencies. This quadratic form, known as the “chi-square statistic”, has a crucial property: its limit distribution, as $n\to\infty$, does not depend on $(p_k)_{k=1}^m$, provided that this is the true distribution. The property is called “(asymptotic) distribution-freeness”.
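The distribution-free property can be checked numerically. The sketch below assumes the usual form of the statistic, $\sum_{k=1}^m (\nu_k - n p_k)^2/(n p_k)$, where $\nu_k$ denotes the observed frequency of event $k$ (the symbol $\nu_k$ and all function names are illustrative choices, not taken from the talk). It simulates the statistic under two quite different distributions and compares the averages; this is a rough illustration, not a proof.

```python
import random

def pearson_chi_square(counts, probs):
    """Pearson's statistic: sum over k of (nu_k - n*p_k)^2 / (n*p_k)."""
    n = sum(counts)
    return sum((nu - n * p) ** 2 / (n * p) for nu, p in zip(counts, probs))

def simulate_mean_statistic(probs, n=500, reps=400, seed=0):
    """Average the statistic over `reps` samples of size n drawn from `probs`."""
    rng = random.Random(seed)
    cells = range(len(probs))
    total = 0.0
    for _ in range(reps):
        counts = [0] * len(probs)
        for k in rng.choices(cells, weights=probs, k=n):
            counts[k] += 1
        total += pearson_chi_square(counts, probs)
    return total / reps

# Two very different hypothetical distributions on m = 4 events.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.70, 0.15, 0.10, 0.05]
print(simulate_mean_statistic(uniform), simulate_mean_statistic(skewed))
```

Both averages should come out close to $m-1=3$, the mean of the limiting chi-square distribution with $m-1$ degrees of freedom, whichever of the two distributions generated the data.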

After about 120 years and an enormous amount of research, the chi-square statistic remained the only statistic with this property.

The situation changed in 2013, and we now have a complete class of distribution-free tests. The idea that led to this result is remarkably simple and could have been used many years ago, but was not. In recent years we have tried to demonstrate that the approach works in many parts of testing theory: for example, in testing parametric families of distributions, in parametric regression, in martingale models for point processes and, now, for Markov chains.
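The abstract does not spell out the idea, so the following is only a sketch of one construction that has the stated property, under assumptions made here for illustration. The vector of normalized frequencies $(\nu_k - n p_k)/\sqrt{n p_k}$ is always orthogonal to the unit vector $(\sqrt{p_k})_{k=1}^m$. Reflecting it so that $(\sqrt{p_k})$ is carried to a fixed unit vector $r$ gives a vector orthogonal to $r$, whose limit distribution no longer involves $(p_k)$. The names `normalized_frequencies` and `rotate`, and the choice of $r$, are hypothetical.

```python
import math

def normalized_frequencies(counts, probs):
    """Components (nu_k - n*p_k) / sqrt(n*p_k)."""
    n = sum(counts)
    return [(nu - n * p) / math.sqrt(n * p) for nu, p in zip(counts, probs)]

def rotate(y, probs, r):
    """Apply the reflection that swaps the unit vectors sqrt(p) and r."""
    q = [math.sqrt(p) for p in probs]
    d = [qi - ri for qi, ri in zip(q, r)]
    denom = 1.0 - sum(qi * ri for qi, ri in zip(q, r))
    if denom < 1e-12:  # sqrt(p) already coincides with r: nothing to do
        return list(y)
    coef = sum(yi * di for yi, di in zip(y, d)) / denom
    return [yi - coef * di for yi, di in zip(y, d)]

# Hypothetical data: n = 1000 trials on m = 4 events.
probs = [0.70, 0.15, 0.10, 0.05]
counts = [680, 160, 110, 50]
r = [0.5, 0.5, 0.5, 0.5]  # assumed reference unit vector, the same for every (p_k)

y = normalized_frequencies(counts, probs)
z = rotate(y, probs, r)
print(sum(zi * ri for zi, ri in zip(z, r)))        # inner product with r
print(sum(v * v for v in y), sum(v * v for v in z))  # squared lengths before and after
```

The reflection preserves length, so the sum of squares of the rotated vector is still the chi-square statistic, while other functionals of the rotated vector give further statistics whose limit distribution does not depend on $(p_k)$.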

In the main part of this talk I will present a result for Markov chains: a result on testing a parametric family of transition matrices with test statistics that are asymptotically distribution-free, meaning that their asymptotic distribution does not depend on the parametric family being tested. This is done in a broader framework. We do not consider particular test statistics; instead, we construct an “obvious” empirical process, which compares the frequencies of jumps from $i$ to $j$ with their probabilities $p_i(j)$, and then transform it into another form of empirical process, free from $p_i(j)$. Any statistic based on the new process will then also have an asymptotic distribution free from $p_i(j)$.
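As a rough illustration of the first step only, the comparison of jump frequencies with $p_i(j)$ might be computed as follows. The normalization by $\sqrt{N_i\,p_i(j)}$, where $N_i$ is the number of jumps out of state $i$, is an assumption made here by analogy with the chi-square components; the function name `jump_components` is hypothetical, and a fully specified transition matrix stands in for the parametric family.

```python
import math
from collections import Counter

def jump_components(path, P):
    """Normalized differences between observed jump counts and N_i * p_i(j).

    path: observed sequence of states (integers 0..m-1).
    P:    hypothesized transition matrix, P[i][j] = p_i(j).
    """
    jumps = Counter(zip(path, path[1:]))  # number of jumps i -> j
    visits = Counter(path[:-1])           # N_i: number of jumps out of i
    comps = {}
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            if p > 0 and visits[i] > 0:
                expected = visits[i] * p
                comps[(i, j)] = (jumps[(i, j)] - expected) / math.sqrt(expected)
    return comps

# Hypothetical two-state chain and a short observed path.
P = [[0.5, 0.5], [0.5, 0.5]]
path = [0, 1, 0, 1, 0]
print(jump_components(path, P))
```

The limit distribution of these components still depends on $p_i(j)$; the transformation described in the talk is what removes that dependence, and it is not reproduced here.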