[R] norm package prelim.norm

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Thu Feb 2 02:42:16 CET 2006


On 01-Feb-06 Ted Harding wrote:
> On 01-Feb-06 Elizabeth Lawson wrote:
>> Hey everyone!  I hope someone can help with this question.  I have a
>> matrix of all zeros and ones and I would like to identify all unique
>> patterns in the rows and the number of times each pattern occurs.   I
>> changed all zeros to NA and tried to use prelim.norm to identify all
>> patterns of missing data in the rows.  I got the message 
>>    
>>   Warning message:
>> NAs introduced by coercion 
>> 
>>   Any ideas of how to get this to work?  Or is there any way to
>> identify all the unique patterns in a huge matrix? ( 10000 x 71)
>>    
>>   Thanks for any suggestions!!
>>    
>>   Elizabeth Lawson
> 
> I think Chuck Cleland has pretty well answered it: Don't worry
> about the warning, since I'm pretty sure it is generated when
> prelim.norm is calculating something else (e.g. the covariance
> matrix) and it is not related to generating prelim.norm(X)$r
> which is the list of patterns and the numbers of times they occur.
> 
> Best wishes,
> Ted.

Sorry -- I should have read the detail of your original message
more carefully. In short, you have too many columns for prelim.norm
to work.

The long answer: prelim.norm analyses the missing data patterns
by representing the locations of NAs as integers, where the jth
bit in the binary representation of the integer is 1 for an NA,
0 for a non-NA. Hence the representation of the pattern runs out
of steam when there are more than a certain number of columns,
corresponding to the highest power of 2 that can be represented
as an integer in R.

  .Machine$integer.max
  [1] 2147483647

  2^31 -1
  [1] 2147483647

so that prelim.norm can only encode NA-patterns in an R integer
for up to 31 columns. More than that, and it will not work properly
or at all.
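
To make the limit concrete, here is a rough sketch of this kind of
encoding. It is not the actual code inside 'norm'; the helper name
encode_na_pattern is made up for illustration, and it assumes the
jth column contributes 2^(j-1) to the pattern code.

```r
# Hypothetical helper (not from the norm package): encode each row's
# NA pattern as a sum of powers of 2, one bit per column.
encode_na_pattern <- function(m) {
  p <- ncol(m)
  weights <- 2^(seq_len(p) - 1)            # 1, 2, 4, ..., 2^(p-1), as doubles
  as.vector((is.na(m) + 0) %*% weights)    # one numeric code per row
}

# Worst case: every entry in the row is NA, so every bit is set.
all_na_31 <- matrix(NA_real_, nrow = 1, ncol = 31)
all_na_32 <- matrix(NA_real_, nrow = 1, ncol = 32)

code31 <- encode_na_pattern(all_na_31)     # 2^31 - 1
code32 <- encode_na_pattern(all_na_32)     # 2^32 - 1

fits31 <- code31 <= .Machine$integer.max   # still representable
fits32 <- code32 <= .Machine$integer.max   # too large for an R integer

# Coercing an out-of-range value to integer gives NA (with a warning,
# suppressed here so the example runs quietly).
as_int32 <- suppressWarnings(as.integer(code32))
```

With 31 columns the largest code still fits; with 32 the coercion
to integer produces NA, which is consistent with the warning quoted
in the original question.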

Check:

  X<-matrix(sample(c(0,1),87,replace=TRUE),ncol=29)
  Y<-X; Y[Y==0]<-NA
  prelim.norm(Y)$r
  [...] (no warning, 3 rows)

  X<-matrix(sample(c(0,1),90,replace=TRUE),ncol=30)
  Y<-X; Y[Y==0]<-NA
  prelim.norm(Y)$r
  [...] (no warning, 3 rows)

  X<-matrix(sample(c(0,1),93,replace=TRUE),ncol=31)
  Y<-X; Y[Y==0]<-NA
  prelim.norm(Y)$r
  [...] (no warning, 3 rows)

  X<-matrix(sample(c(0,1),96,replace=TRUE),ncol=32)
  Y<-X; Y[Y==0]<-NA
  prelim.norm(Y)$r
  [...] (3 rows, "Warning message: NAs introduced by coercion")

  X<-matrix(sample(c(0,1),99,replace=TRUE),ncol=33)
  Y<-X; Y[Y==0]<-NA
  prelim.norm(Y)$r
  [...] (2 rows, "Warning message: NAs introduced by coercion")

  X<-matrix(sample(c(0,1),102,replace=TRUE),ncol=34)
  Y<-X; Y[Y==0]<-NA
  prelim.norm(Y)$r
  [...] (1 row, "Warning message: NAs introduced by coercion")

(Try a few of these for yourself; it is very unlikely that you would
get only 1 or 2 distinct rows when you have 3 rows of 30+ 0s and 1s
sampled at random, so the missing rows indicate that distinct
patterns are being merged).
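
As a rough check on "very unlikely", the chance that n random rows
of p independent 0/1 entries are not all distinct can be worked out
directly. The function name p_dup is made up for this sketch, and
it assumes each entry is 0 or 1 with probability 1/2, as with the
sample() calls above.

```r
# Probability that n random 0/1 rows of length p contain at least one
# duplicated row. The k-th row must avoid the k-1 rows already drawn.
p_dup <- function(p, n = 3) {
  k <- seq_len(n) - 1          # 0, 1, ..., n-1
  1 - prod(1 - k / 2^p)
}

p30 <- p_dup(30)               # roughly 3 / 2^30
p32 <- p_dup(32)               # roughly 3 / 2^32
```

Both values are far below one in a hundred million, so seeing fewer
than 3 rows in the output is not plausibly due to chance.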

A similar issue came up some time ago (I can't locate the thread
in the archive at the moment) in connection with the 'mix'
package.

However, you can have as many columns as you like if you use
'unique' to identify the distinct patterns of 0s and 1s, rather
than using 'prelim.norm'.
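
A minimal sketch of that approach, on a small made-up 10 x 71
matrix built from three known patterns (the names base, idx, key_all
and key_pat are just for this example). Counting the occurrences by
pasting each row into a string is one possible way of doing it, not
the only one.

```r
# Three distinct 0/1 patterns across 71 columns (illustrative data).
base <- rbind(rep(c(0, 1), length.out = 71),
              rep(c(1, 0), length.out = 71),
              rep(1, 71))

# Repeat them 5, 3 and 2 times to get a 10 x 71 matrix.
idx <- rep(1:3, times = c(5, 3, 2))
X <- base[idx, ]

# Distinct rows, in order of first appearance.
patterns <- unique(X)

# Count occurrences: collapse each row to a string key and tabulate.
key_all <- apply(X, 1, paste, collapse = "")
key_pat <- apply(patterns, 1, paste, collapse = "")
counts  <- as.vector(table(key_all)[key_pat])
```

Here 'patterns' holds one row per distinct pattern and 'counts'
gives the number of times each occurs, in the same order. The same
code should apply unchanged to a 10000 x 71 matrix.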

Hoping this helps,
Ted.



--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 02-Feb-06                                       Time: 01:42:13
------------------------------ XFMail ------------------------------



