The well-known hidden Markov model (HMM) is a two-dimensional stochastic process (X,Y), where Y is a Markov chain and, conditionally on Y, the X-process consists of independent random variables, with the distribution of X_t depending on Y_t only. Over the last decades, HMMs have become very popular stochastic models, with applications in speech recognition, signal processing, linguistics, computational molecular biology and so on. Often the Y-process is unobserved (hidden), and the goal of inference is to estimate its unobserved realization from a realization of the X-process. This task is called the segmentation problem, and the standard ways to solve it are to use either the maximum likelihood (so-called Viterbi) path or the pointwise maximum likelihood (so-called PMAP) path.
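In symbols, the HMM definition and the two standard segmentation paths can be written as follows. The notation is an assumption made here for illustration: x_{1:n} and y_{1:n} denote realizations of length n, p denotes a generic density or probability mass function, and the PMAP path is read as the path that maximizes each posterior marginal separately.

```latex
\begin{align*}
p(x_{1:n}, y_{1:n})
  &= p(y_1)\, p(x_1 \mid y_1) \prod_{t=2}^{n} p(y_t \mid y_{t-1})\, p(x_t \mid y_t), \\
\hat{y}^{\mathrm{Viterbi}}
  &= \arg\max_{y_{1:n}} \; p(x_{1:n}, y_{1:n}), \\
\hat{y}^{\mathrm{PMAP}}_t
  &= \arg\max_{y} \; P(Y_t = y \mid X_{1:n} = x_{1:n}), \qquad t = 1, \dots, n.
\end{align*}
```

The Viterbi path maximizes the probability of the whole path, while the PMAP path maximizes the expected number of correctly estimated states; the two need not coincide.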
A trivial but important property of an HMM is that the process Z=(X,Y) is itself a Markov process on a product state space. This observation makes it natural to enlarge the class of HMMs to the class of pairwise Markov models (PMMs) as follows: Z=(X,Y) is a PMM if Z has the Markov property. PMMs thus form a much larger class of models, of which HMMs are only a small subclass. We briefly discuss several PMMs, such as Markov switching models and HMMs with dependent noise. It is important to note that if (X,Y) is a Markov process, then neither X nor Y need have the Markov property, but conditionally on X the Y-process is Markov, and vice versa.
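A short way to see both the inclusion of HMMs and the conditional Markov property is to write out the joint law of a PMM. The notation (x_{1:n}, y_{1:n} for realizations, p for a generic density) is assumed here for illustration.

```latex
\begin{align*}
\text{PMM:} \quad
p(x_{1:n}, y_{1:n})
  &= p(x_1, y_1) \prod_{t=2}^{n} p(x_t, y_t \mid x_{t-1}, y_{t-1}), \\
\text{HMM as a special case:} \quad
p(x_t, y_t \mid x_{t-1}, y_{t-1})
  &= p(y_t \mid y_{t-1})\, p(x_t \mid y_t), \\
\text{conditionally on } X: \quad
p(y_{1:n} \mid x_{1:n})
  &\propto p(x_1, y_1) \prod_{t=2}^{n} p(x_t, y_t \mid x_{t-1}, y_{t-1}).
\end{align*}
```

For fixed x_{1:n}, the last expression is a product of factors that each involve only the consecutive pair (y_{t-1}, y_t), which is why Y is an (in general inhomogeneous) Markov chain given X. Summing the first expression over y_{1:n} gives no such factorization in general, which is why X alone need not be Markov.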
It turns out that many good properties of HMMs are mainly due to the Markov property of Z, and hence these properties carry over to PMMs as well. In particular, the well-known Viterbi and forward-backward algorithms apply, so the standard segmentation approaches can be used for PMMs. Moreover, PMMs provide a rather flexible and realistic model for the homology of random sequences. A triplet Markov model (TMM), introduced by W. Pieczynski, is a three-dimensional Markov process (X,Y,U), where, as before, X stands for the observations and Y is the hidden state sequence of interest. In addition, there is another hidden component U. Since conditionally on U the pair (X,Y) is an inhomogeneous PMM, the U-component models changes of environment. Adding the U-component makes the model considerably more flexible.
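To illustrate how the Viterbi and forward-backward recursions carry over, here is a minimal sketch for a PMM in which both X and Y take finitely many values. The function names, the array layout `q[x_prev, y_prev, x, y]` for the transition kernel and `pi[x, y]` for the initial law are assumptions made for this example, not notation from the text. The helper `hmm_as_pmm` shows how an ordinary HMM fits the same interface.

```python
import numpy as np


def pmm_viterbi(x, pi, q):
    """Most probable hidden path y_{1:n} given observations x_{1:n}.

    pi[x, y]              : P(X_1 = x, Y_1 = y)
    q[x_prev, y_prev, x, y]: P(X_t = x, Y_t = y | X_{t-1}, Y_{t-1})
    """
    x = [int(v) for v in x]
    n, m = len(x), pi.shape[1]
    with np.errstate(divide="ignore"):  # log(0) = -inf is acceptable here
        log_pi, log_q = np.log(pi), np.log(q)
    delta = log_pi[x[0]].copy()          # best log-score ending in each state
    back = np.zeros((n, m), dtype=int)   # backpointers
    for t in range(1, n):
        # scores[y_prev, y]: extend the best path ending in y_prev by y
        scores = delta[:, None] + log_q[x[t - 1], :, x[t], :]
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0)
    path = np.empty(n, dtype=int)
    path[-1] = delta.argmax()
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


def pmm_pmap(x, pi, q):
    """Pointwise posterior maximizer and the marginals P(Y_t = y | x_{1:n})."""
    x = [int(v) for v in x]
    n, m = len(x), pi.shape[1]
    alpha = np.zeros((n, m))
    beta = np.ones((n, m))
    alpha[0] = pi[x[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, n):                # forward pass, rescaled at each step
        alpha[t] = alpha[t - 1] @ q[x[t - 1], :, x[t], :]
        alpha[t] /= alpha[t].sum()
    for t in range(n - 2, -1, -1):       # backward pass, rescaled at each step
        beta[t] = q[x[t], :, x[t + 1], :] @ beta[t + 1]
        beta[t] /= beta[t].sum()
    post = alpha * beta
    post /= post.sum(axis=1, keepdims=True)
    return post.argmax(axis=1), post


def hmm_as_pmm(p0, A, B):
    """Embed an HMM (initial law p0, transitions A[y_prev, y], emissions B[y, x])
    into the PMM layout used above; the kernel then ignores x_prev."""
    m, k = B.shape
    pi = (p0[:, None] * B).T                  # pi[x, y] = p0[y] * B[y, x]
    step = np.einsum("ab,bc->acb", A, B)      # step[y_prev, x, y] = A * B
    q = np.broadcast_to(step[None], (k, m, k, m)).copy()
    return pi, q
```

The only change from the HMM recursions is that the transition factor at time t depends on the observed pair (x_{t-1}, x_t) rather than on x_t alone, so the cost remains linear in the sequence length.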
We give a general approach to the risk-based segmentation problem that also applies to PMMs and TMMs, discuss the weaknesses of the standard approaches and introduce a way to overcome them. We also discuss the asymptotics of Viterbi segmentation for PMMs.