Prof. Dr. Estate Khmaladze: K. Pearson (1900) and the modern theory of distribution-free testing (Not a historic talk and not a review)
05.12.2018
Prof. Dr. Estate Khmaladze
School of Mathematics and Statistics
Victoria University of Wellington
Date: December 5, 2018
Time: 09.00 - 10.00
Place: Y27 H28
Title: K. Pearson (1900) and the modern theory of distribution-free testing
(Not a historic talk and not a review)
Given a discrete distribution $(p_k)_{k=1}^m$, $m<\infty$, and given the frequencies
of the corresponding disjoint events in $n$ independent trials, K. Pearson in
1900 invented a quadratic form to test whether the given discrete distribution is
the true one, that is, whether it agrees well with the frequencies. The quadratic form,
which became known as the “chi-square statistic”, has a crucial property: its limit
distribution, as $n\to\infty$, does not depend on $(p_k)_{k=1}^m$, provided it is the
true distribution. This property is called “(asymptotic) distribution-freeness”.
After about 120 years and an enormous amount of research, the chi-square statistic
remained the only statistic with this property.
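To make the distribution-freeness property concrete, here is a minimal Python sketch. It assumes the standard form of Pearson's statistic, $\sum_k (\nu_k - n p_k)^2 / (n p_k)$, where $\nu_k$ denotes the observed frequencies (the symbol $\nu_k$ and all function names are illustrative choices, not taken from the talk). The simulation draws samples from two different distributions with $m=3$ and summarizes the statistic under each; the threshold 5.991 is the 95% quantile of the chi-square distribution with $m-1=2$ degrees of freedom.

```python
import random
from collections import Counter


def pearson_chi_square(counts, probs):
    """Pearson's statistic: sum over k of (nu_k - n*p_k)^2 / (n*p_k)."""
    n = sum(counts)
    return sum((nu - n * p) ** 2 / (n * p) for nu, p in zip(counts, probs))


def simulate_statistics(probs, n, reps, rng):
    """Draw `reps` samples of size n from `probs`; return the statistic of each."""
    m = len(probs)
    stats = []
    for _ in range(reps):
        tally = Counter(rng.choices(range(m), weights=probs, k=n))
        counts = [tally[k] for k in range(m)]
        stats.append(pearson_chi_square(counts, probs))
    return stats


def summarize(stats, threshold=5.991):
    """Return the sample mean and the fraction of values above `threshold`."""
    mean = sum(stats) / len(stats)
    tail = sum(s > threshold for s in stats) / len(stats)
    return mean, tail


if __name__ == "__main__":
    rng = random.Random(0)
    # Two quite different true distributions, both with m = 3 outcomes.
    for probs in ([1 / 3, 1 / 3, 1 / 3], [0.1, 0.3, 0.6]):
        mean, tail = summarize(simulate_statistics(probs, 200, 2000, rng))
        print(probs, "mean:", round(mean, 2), "tail fraction:", round(tail, 3))
```

If the property holds, both lines of output should show a mean close to 2 and a tail fraction close to 0.05, regardless of which distribution generated the data.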
The situation changed in 2013, and now we have a complete class of distribution-free
tests. The idea which led to this result is remarkably simple and could have been
used many years ago, but wasn’t. In recent years we have tried to demonstrate that
the approach works for many parts of testing theory: for example, in testing
parametric families of distributions, in parametric regression, in martingale
models for point processes and, now, for Markov chains.
In the main part of this talk I will present a result for Markov chains: that is,
a result on testing a parametric family of transition matrices with test statistics
which are asymptotically distribution-free, meaning that their asymptotic distribution
does not depend on the parametric family one is testing. This is done in a broader
framework: we do not consider particular test statistics; instead we construct an “obvious”
empirical process, which compares the frequencies of jumps from $i$ to $j$ with their
probabilities $p_i(j)$, and then transform it into another form of empirical process,
free from $p_i(j)$. Any statistic based on the new process will then also have an asymptotic
distribution free from $p_i(j)$.
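The first step, comparing jump frequencies with $p_i(j)$, can be sketched in Python as follows. This is only an illustration under stated assumptions: the chi-square-type normalization $(N_{ij} - N_i\,p_i(j)) / \sqrt{N_i\,p_i(j)}$, the symbols $N_{ij}$ and $N_i$, and all function names are hypothetical choices, not the construction used in the talk. The subsequent transformation into a process free from $p_i(j)$ is not shown.

```python
import random
from collections import Counter


def simulate_chain(P, n, rng, start=0):
    """Simulate n jumps of a Markov chain with transition matrix P."""
    path = [start]
    for _ in range(n):
        i = path[-1]
        path.append(rng.choices(range(len(P)), weights=P[i], k=1)[0])
    return path


def jump_counts(path, m):
    """N[i][j] = number of observed jumps from state i to state j."""
    tally = Counter(zip(path, path[1:]))
    return [[tally[(i, j)] for j in range(m)] for i in range(m)]


def normalized_differences(counts, P):
    """Assumed normalization: (N_ij - N_i * p_i(j)) / sqrt(N_i * p_i(j)).

    Entries with N_i * p_i(j) == 0 are set to 0.0 to avoid division by zero.
    """
    result = []
    for row_counts, row_probs in zip(counts, P):
        n_i = sum(row_counts)  # number of visits to state i followed by a jump
        result.append([
            (c - n_i * p) / (n_i * p) ** 0.5 if n_i * p > 0 else 0.0
            for c, p in zip(row_counts, row_probs)
        ])
    return result


if __name__ == "__main__":
    # Hypothetical 3-state transition matrix, used only for illustration.
    P = [[0.5, 0.3, 0.2],
         [0.1, 0.6, 0.3],
         [0.4, 0.4, 0.2]]
    path = simulate_chain(P, 5000, random.Random(0))
    for row in normalized_differences(jump_counts(path, len(P)), P):
        print([round(x, 2) for x in row])
```

The limit behaviour of these differences depends on $p_i(j)$, which is why a further transformation is needed before statistics based on them become distribution-free.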