Pearson Fitting

Karl Pearson showed that if we know the first four moments of a distribution, we can construct a density function that is consistent with those moments. This gives a neat way to build density functions that approximate a given set of data. For instance, suppose that for a given data set:

[Graphics:Images/index_gr_1.gif]

denoting estimates of the mean and of the second, third and fourth central moments. The Pearson family consists of 7 main Types, so our first task is to find out which Type these data are consistent with. We do this with mathStatica's PearsonPlot function:

[Graphics:Images/index_gr_2.gif]
[Graphics:Images/index_gr_3.gif]

[Graphics:Images/index_gr_4.gif]

Fig. 1: The [Graphics:Images/index_gr_5.gif] chart for the Pearson system
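The zones in a chart like Fig. 1 can also be located numerically. The following is a minimal Python sketch, not mathStatica's own code: it assumes the usual definitions beta1 = mu3^2/mu2^3 and beta2 = mu4/mu2^2 and uses Pearson's kappa criterion to pick the Type. The function name pearson_type and the tolerance tol are illustrative choices.

```python
def pearson_type(mu2, mu3, mu4, tol=1e-9):
    """Return the Pearson main Type implied by the central moments.

    Hypothetical helper: classifies via Pearson's kappa criterion,
    kappa = beta1 (beta2 + 3)^2 / (4 (4 beta2 - 3 beta1) (2 beta2 - 3 beta1 - 6)).
    """
    if mu2 <= 0:
        raise ValueError("mu2 must be positive")
    beta1 = mu3 ** 2 / mu2 ** 3  # squared skewness
    beta2 = mu4 / mu2 ** 2       # kurtosis
    if beta2 < beta1 + 1:
        raise ValueError("no distribution has beta2 < beta1 + 1")

    # This expression is zero exactly on the Type III (gamma) line.
    line = 2 * beta2 - 3 * beta1 - 6

    if beta1 < tol:  # symmetric cases sit on the beta1 = 0 axis
        if abs(beta2 - 3) < tol:
            return "Normal"
        return "II" if beta2 < 3 else "VII"
    if abs(line) < tol:
        return "III"

    # 4*beta2 - 3*beta1 is always positive, so kappa has the sign of `line`.
    kappa = beta1 * (beta2 + 3) ** 2 / (4 * (4 * beta2 - 3 * beta1) * line)
    if kappa < 0:
        return "I"
    if abs(kappa - 1) < tol:
        return "V"
    return "IV" if kappa < 1 else "VI"


# Central moments of a Gamma distribution with shape 2 and scale 1.
print(pearson_type(2.0, 4.0, 24.0))  # prints III
```

Types III and V are curves rather than regions in the chart, so with estimated moments they are only ever reached to within the tolerance; in practice estimated moments fall in the Type I, IV or VI zones.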

The big black dot in Fig. 1, which marks our data, lies in the Type I zone. The fitted Pearson density [Graphics:Images/index_gr_6.gif] and its domain are then given by:

[Graphics:Images/index_gr_7.gif]
[Graphics:Images/index_gr_8.gif]
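To see how a density and its domain can follow from the moments alone, recall that members of the Pearson family satisfy a differential equation of the form f'(x)/f(x) = -(a + x)/(c0 + c1 x + c2 x^2). The Python sketch below assumes that parameterisation, with x measured from the mean, and uses the standard moment solution for a, c0, c1, c2. The function names are hypothetical, and this is not necessarily how mathStatica computes its result. For a Type I fit, the domain is the interval between the two real roots of the quadratic, shifted back by the mean.

```python
import math


def pearson_coefficients(mu2, mu3, mu4):
    """Return (a, c0, c1, c2) of the Pearson equation, x measured from the mean."""
    denom = 10 * mu2 * mu4 - 12 * mu3 ** 2 - 18 * mu2 ** 3
    if denom == 0:
        raise ValueError("degenerate moments: coefficients are undefined")
    c0 = mu2 * (4 * mu2 * mu4 - 3 * mu3 ** 2) / denom
    c1 = mu3 * (mu4 + 3 * mu2 ** 2) / denom
    c2 = (2 * mu2 * mu4 - 3 * mu3 ** 2 - 6 * mu2 ** 3) / denom
    return c1, c0, c1, c2  # a equals c1 when x is centred at the mean


def type_one_domain(mean, mu2, mu3, mu4):
    """Return (lower, upper) limits of a Type I (or II) fit."""
    _, c0, c1, c2 = pearson_coefficients(mu2, mu3, mu4)
    # Type I and II are the cases where the quadratic has two real roots
    # of opposite sign, which happens exactly when c0 * c2 < 0.
    if c0 * c2 >= 0:
        raise ValueError("moments do not correspond to a Type I or II fit")
    r = math.sqrt(c1 ** 2 - 4 * c0 * c2)
    roots = sorted([(-c1 - r) / (2 * c2), (-c1 + r) / (2 * c2)])
    return mean + roots[0], mean + roots[1]


# Check against a known case: Beta(2, 2) on [0, 1] has
# mean 1/2, mu2 = 1/20, mu3 = 0 and mu4 = 3/560.
print(type_one_domain(0.5, 1 / 20, 0.0, 3 / 560))
```

The Beta(2, 2) check should recover limits very close to 0 and 1, up to floating-point rounding.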

The data used to create this example are grouped data giving the number of sick people (freq) at different ages (X):

[Graphics:Images/index_gr_9.gif]
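Moment estimates of the kind used at the start of this example can be computed from grouped data by weighting each value of X by its frequency. Here is a minimal Python sketch: the function name grouped_moments is hypothetical, the ages and frequencies shown are made-up placeholders rather than the data above, and no grouping correction is applied because the source does not mention one.

```python
def grouped_moments(values, freqs):
    """Return (mean, mu2, mu3, mu4) for grouped data.

    values: the X value representing each group
    freqs:  the observed frequency of each group
    """
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    n = sum(freqs)
    if n <= 0:
        raise ValueError("total frequency must be positive")

    # Frequency-weighted mean.
    mean = sum(f * x for x, f in zip(values, freqs)) / n

    def central(r):
        # r-th central moment, weighted by frequency (divisor n).
        return sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / n

    return mean, central(2), central(3), central(4)


# Illustrative placeholder data only (not the data set in this example).
ages = [20, 30, 40, 50, 60]
freq = [3, 8, 12, 6, 1]
mean, mu2, mu3, mu4 = grouped_moments(ages, freq)
print(mean, mu2, mu3, mu4)
```

These estimates use the divisor n; an unbiased-style estimator would use slightly different divisors, which matters little for large frequency totals.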

We can easily compare the histogram of the empirical data with our fitted Pearson pdf:

[Graphics:Images/index_gr_10.gif]

[Graphics:Images/index_gr_11.gif]

Fig. 2: The data histogram and the fitted Pearson pdf