[R] Removing Rows/Records from a Table
Marc Schwartz
MSchwartz at mn.rr.com
Sat Apr 15 19:01:56 CEST 2006
On Sat, 2006-04-15 at 08:19 -0700, Peter Lauren wrote:
> I would like to selectively remove rows from a table.
> I had hoped that I could create a table and
> selectively add rows with something like
> > NewTable<-table(nrow=100, ncol=4)
> > NewTable[1,]<-OldTable[10,]
>
> but that doesn't work. The former call gives
> > NewTable
>      ncol
> nrow  4
>   100 1
> while the latter call gives a table the length of
> OldTable. Making a matrix, m, with the desired
> table entries and doing
> > NewTable<-table(m)
> also doesn't work.
>
> Can anyone suggest the best way for me to do what I
> want to do?
>
> Many thanks in advance,
> Peter Lauren.
First, I think that we need to clarify terminology, as you seem to be
mixing tables and matrices (or at least, the intention of the table()
function). See ?table and ?matrix.
The table() function creates and returns a contingency table based upon
the [cross-]tabulation of one or more objects, such as vectors, factors
or lists. The contingency table interprets these objects as factors,
generating the frequency counts of each combination of the factor
levels. See ?factor for more information.
So...for example, we can generate a table of the counts of each unique
element (some of which may repeat) in a single vector:
set.seed(1)
vec <- sample(letters[1:4], 10, replace = TRUE)
> vec
[1] "b" "b" "c" "d" "a" "d" "d" "c" "c" "a"
> table(vec)
vec
a b c d
2 2 3 3
Or...we can generate a 2d contingency table of the cross-tabulation of
two vectors:
set.seed(2)
vec2 <- sample(LETTERS[1:4], 10, replace = TRUE)
> vec2
[1] "A" "C" "C" "A" "D" "D" "A" "D" "B" "C"
> table(vec, vec2)
   vec2
vec A B C D
  a 0 0 1 1
  b 1 0 1 0
  c 0 1 1 1
  d 2 0 0 1
So, here we have the counts of the combinations of letters found in the
two vectors, based upon, in effect, pairing the two vectors element by
element, in order. It may be easier to visualize them together in this
fashion (see ?rbind):
> rbind(vec, vec2)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
vec  "b"  "b"  "c"  "d"  "a"  "d"  "d"  "c"  "c"  "a"
vec2 "A"  "C"  "C"  "A"  "D"  "D"  "A"  "D"  "B"  "C"
Note, for example, that there are 2 occurrences of 'd' paired with 'A',
in columns 4 and 7, which is reflected in the lower left hand corner of
the table above.
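The pairing can also be checked directly. In the rough sketch below, the
two vectors are typed in literally (copied from the output above) rather
than regenerated with sample(), since the values drawn for a given seed
may differ between R versions:

```r
# Vectors copied from the printed output above, so that the example does
# not depend on the random number generator
vec  <- c("b", "b", "c", "d", "a", "d", "d", "c", "c", "a")
vec2 <- c("A", "C", "C", "A", "D", "D", "A", "D", "B", "C")

# Positions where 'd' in vec is paired with 'A' in vec2
d_with_A <- which(vec == "d" & vec2 == "A")
d_with_A

# The same count, read from the contingency table by level name
tab <- table(vec, vec2)
tab["d", "A"]
```

The positions returned by which() are the column numbers in the rbind()
display, and the length of that result is the count in the table cell.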
The table() function does not just create an n-dimensional matrix but
actually manipulates the data passed to it to create the counts in the
resultant contingency table.
Note that in the case of the second example, the result is an object of
class 'table', which is in essence, a 2d integer matrix of the counts,
with additional attributes (see ?str for more information):
str(table(vec, vec2))
 int [1:4, 1:4] 0 1 0 2 0 0 1 0 1 1 ...   # This is the result matrix
 - attr(*, "dimnames")=List of 2          # These are the row/col names
  ..$ vec : chr [1:4] "a" "b" "c" "d"
  ..$ vec2: chr [1:4] "A" "B" "C" "D"
 - attr(*, "class")= chr "table"          # This shows the object class
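As a small illustration of the "matrix plus attributes" point, the
following sketch strips the class from a 2d table and looks at what is
left. The vectors are entered literally so that the result does not
depend on the random number generator, and the name 'm' is just for
illustration:

```r
vec  <- c("b", "b", "c", "d", "a", "d", "d", "c", "c", "a")
vec2 <- c("A", "C", "C", "A", "D", "D", "A", "D", "B", "C")
tab  <- table(vec, vec2)

# A 2d table carries a 'dim' attribute of length 2, so it passes as a matrix
is.matrix(tab)
dim(tab)

# Dropping the "table" class leaves the integer counts and the dimnames
m <- unclass(tab)
inherits(m, "table")
dimnames(m)
```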
Now, let's contrast that process with the creation of an integer
matrix.
vec3 <- 1:10
> vec3
[1] 1 2 3 4 5 6 7 8 9 10
> matrix(vec3, ncol = 2, nrow = 5)
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
Note that we have taken the 1d vector and converted it to a 2d matrix
with 2 columns and 5 rows. There is no manipulation of the data, simply
a restructuring of the object. By default, the matrix is created in
column order. We can change the order of creation by using 'byrow':
> matrix(vec3, ncol = 2, nrow = 5, byrow = TRUE)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
[4,]    7    8
[5,]    9   10
And...as a quick shortcut, we can also do this, which yields the same
result as the first use of matrix() above:
dim(vec3) <- c(5, 2)
> vec3
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
This simply shows that a matrix is a vector with a 'dim' attribute.
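To make that concrete, here is a short sketch (the names 'x' and 'same'
are only for illustration) that adds and then removes the 'dim'
attribute on a plain vector:

```r
x <- 1:10
attributes(x)          # NULL: a plain vector carries no attributes

# Adding a 'dim' attribute is all it takes to make it a matrix
dim(x) <- c(5, 2)
attributes(x)

# Compare with the object built by matrix()
same <- identical(x, matrix(1:10, nrow = 5, ncol = 2))
same

# Removing the attribute turns it back into a plain vector
dim(x) <- NULL
is.matrix(x)
```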
Now, back to the original question, which is the removal (or addition)
of rows (or columns) from a matrix, whether that matrix is the result of
a matrix() type operation or of using the table() function.
Let's take the result of the table operation in the second example:
tab <- table(vec, vec2)
> tab
   vec2
vec A B C D
  a 0 0 1 1
  b 1 0 1 0
  c 0 1 1 1
  d 2 0 0 1
Now, we want to remove the third row:
> tab[-3, ]
   vec2
vec A B C D
  a 0 0 1 1
  b 1 0 1 0
  d 2 0 0 1
The same syntax can be used on the integer matrix we created above:
mat <- matrix(vec3, ncol = 2, nrow = 5)
> mat
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
> mat[-3, ]
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    4    9
[4,]    5   10
So, in both cases, we can manipulate the resultant object by using
standard object indexing. See ?Extract for more information.
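Since the question was about removing rows selectively, a few other
indexing forms may be useful. This is only a sketch on a small matrix;
the object names are made up for the example:

```r
mat <- matrix(1:10, nrow = 5, ncol = 2)

# Drop several rows at once with a vector of negative indices
fewer_rows <- mat[-c(1, 3), ]

# Keep only the rows that satisfy a condition (here: first column is even)
even_rows <- mat[mat[, 1] %% 2 == 0, ]

# Keeping a single row returns a plain vector unless drop = FALSE is given
one_row_vec <- mat[2, ]
one_row_mat <- mat[2, , drop = FALSE]

dim(fewer_rows)
even_rows
dim(one_row_mat)
```

The drop = FALSE form is worth keeping in mind whenever the selection
might leave only one row, since later code that expects a matrix could
otherwise be handed a vector.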
The key is that the table() function does not just create a matrix (in
the case of two or more objects being passed), but that it actually
manipulates those objects internally to create a contingency table.
Thus, the result of your first example:
NewTable <- table(nrow = 100, ncol = 4)
is the creation of a table, based upon passing two objects:
nrow <- 100
ncol <- 4
resulting in:
> NewTable
     ncol
nrow  4
  100 1
showing that there is 1 occurrence of the combination of 100 with 4.
The result is _not_ a matrix with 100 rows and 4 columns.
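If the goal was an empty 100 x 4 structure to be filled in row by row,
something along these lines might be closer to what was intended. Note
that 'OldTable' below is a hypothetical stand-in, since its contents were
not shown in the original post:

```r
# Hypothetical stand-in for the poster's OldTable: 20 rows, 4 columns
OldTable <- matrix(1:80, nrow = 20, ncol = 4)

# A 100 x 4 matrix of missing values, to be filled in selectively
NewTable <- matrix(NA, nrow = 100, ncol = 4)

# Copy row 10 of the old object into row 1 of the new one
NewTable[1, ] <- OldTable[10, ]

dim(NewTable)
NewTable[1, ]
```

That said, it is usually simpler to index the rows that are wanted (or
drop the ones that are not) directly from the existing object than to
fill a preallocated one.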
The matrix() function restructures the object passed to it, without
manipulating the object's elements.
You can also add rows and/or columns to a matrix by using the rbind()
and cbind() functions, respectively. See ?rbind, which will bring up the
help for both functions.
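A minimal sketch of both, using a small matrix like the one above (the
names 'mat_r' and 'mat_c' are only for illustration):

```r
mat <- matrix(1:10, nrow = 5, ncol = 2)

# Append a sixth row; its length should match the number of columns
mat_r <- rbind(mat, c(11L, 12L))

# Append a third column; its length should match the number of rows
mat_c <- cbind(mat, 101:105)

dim(mat_r)
dim(mat_c)
```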
HTH,
Marc Schwartz