[R] How can I use IPF function correctly?

David L Carlson dcarlson at tamu.edu
Fri Jul 27 23:11:43 CEST 2012


It is not clear what you are trying to do. The ipf() function you are using
seems to be the one included in package cat for imputing missing values for
categorical variables. You have not read the instructions for ipf()
carefully: you have entered the marginal values rather than the dimensions
that define each margin, and you have given ipf() a 2 way table but
misspecified a three way model. No wonder it is confused. Function loglin(),
which is part of the included stats package, also does iterative
proportional fitting.
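
As a minimal sketch of loglin() doing iterative proportional fitting on a
three way table, the following uses the HairEyeColor data set that ships with
R (chosen here only for illustration, it is not the data from the question)
and fits the model containing all two way margins. The eps and iter values
are assumptions picked to get a tight fit, not recommended settings.

```r
# HairEyeColor is a built-in 4 x 4 x 2 table of counts (Hair x Eye x Sex).
# Fit the model with all two way margins by iterative proportional fitting.
fm <- loglin(HairEyeColor,
             margin = list(c(1, 2), c(1, 3), c(2, 3)),
             fit = TRUE, eps = 1e-6, iter = 100, print = FALSE)

# The fitted table should reproduce each observed two way margin.
obs12 <- apply(HairEyeColor, c(1, 2), sum)
fit12 <- apply(fm$fit, c(1, 2), sum)
max_diff <- max(abs(obs12 - fit12))

fm$df   # residual degrees of freedom of the fitted model
```

Each element of margin names the variables (by position) whose joint totals
the fit must match, which is why dimensions are given and not the totals
themselves.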

Iterative proportional fitting (ipf) is used for fitting models for
categorical data when there are three or more variables. There is no need
for ipf on a table with two variables, since the fitted values can be
calculated directly.
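
To illustrate the direct calculation for a two way table, here is a small
sketch. The counts are hypothetical values made up for the example, and the
model shown is the usual independence model (row margin plus column margin),
which is an assumption about what "directly calculated" refers to.

```r
# Hypothetical counts for a 3 x 2 table (illustration only).
counts <- matrix(c(12, 6,
                   10, 2,
                    4, 8), nrow = 3, ncol = 2, byrow = TRUE)

# Expected counts under independence:
# (row total * column total) / grand total, no iteration required.
direct <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# The same model through loglin(): fit the row margin and the column margin.
ll <- loglin(counts, margin = list(1, 2), fit = TRUE, print = FALSE)

# Compare the two sets of fitted values, ignoring attributes.
all.equal(as.vector(direct), as.vector(ll$fit))
```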

Your example data do not include the raw counts (as they should), but
percentages for each of the 3 x 2 cells (I assume, since they sum to 100).
The marginal values you list (again percentages) are for a model assuming
equal margins. The fitted value for each cell is easily computed as
1/3*1/2*100 (one third in each row by one half in each column times 100). So
each cell should be 16.667 percent of the total. Using loglin() that would
be specified as follows:

> loglin(raw, margin=list(0), fit=TRUE)
0 iterations: deviation  
$lrt
[1] 25.87661

$pearson
[1] 23.80933

$df
[1] 5

$margin
[1] 0

$fit
         [,1]     [,2]
[1,] 16.66667 16.66667
[2,] 16.66667 16.66667
[3,] 16.66667 16.66667

The lrt and pearson statistics are not valid because you are not using
original counts. Note that the number of iterations is 0 because in a 2 way
model the values are directly computed.
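
The numbers in that output can also be reproduced by hand, which may make it
clearer what is being computed. This is only a sketch: raw is the matrix of
percentages from the original message, and the two statistics are written out
from their usual definitions (so, as noted, they are not valid tests here
because the inputs are percentages and not counts).

```r
# The percentages from the original message, entered as a 3 x 2 matrix.
raw <- matrix(c(28.571, 14.286, 23.809, 4.762, 9.523, 19.049),
              nrow = 3, ncol = 2, byrow = TRUE)

# Equal margins: every cell gets the same share of the total.
fit <- matrix(sum(raw) / length(raw), nrow = nrow(raw), ncol = ncol(raw))

# Likelihood ratio and Pearson statistics from their usual definitions.
lrt     <- 2 * sum(raw * log(raw / fit))
pearson <- sum((raw - fit)^2 / fit)
df      <- length(raw) - 1   # six cells, one fitted total

round(c(lrt = lrt, pearson = pearson, df = df), 5)
```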

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Miao Zhang
> Sent: Friday, July 27, 2012 6:52 AM
> To: r-help at r-project.org
> Subject: [R] How can I use IPF function correctly?
> 
> Hi All,
> I am trying to create a simple example by using the ipf function in R, but I
> could not get it to work successfully... I am very new to R; could anyone
> help to instruct me about this ipf function?
>  Actually, this is what I mean
> 
>                   50   | 50
>               ----------------------
>         33.4| 28.57 | 14.29
>         33.3| 23.81 | 4.762
>         33.3| 9.523 | 19.05
>               ----------------------
> A 3*2 matrix
> raw<-matrix(c(28.571,14.286,23.809,4.762,9.523,19.049),3, 2,byrow=TRUE)
> the sum of margin (the value I am setting as the target)
> m<-c(33.4,50,0,33.3,50,0,33.3,50)
> then call ipf function:
>  fit1<-ipf(table, margins=m,start=raw,eps = 1e-04, maxits = 50, showits = TRUE)
> I could calculate it by hand with 7 iterations, but in the end I am hoping
> to get R's built-in ipf function to do it. What should I put for "table"
> here?
> Thanks in advance!
> Mandy


