[R] Delete the first instances of the unique values of a vector in R

Wed Jan 11 18:33:54 CET 2017

I think you should probably go read some introductory material on R.
There are lots of good references out there. R does not work in the
same way as MATLAB.

You should probably also read the posting guide, and this article on
making good reproducible examples:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

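Meanwhile, it sounds like you might want duplicated(), which flags every
value that has already appeared earlier in the vector. Negating it marks
the first instance of each value: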

> vec <- c(1, 4, 4, 4, 4, 4, 6, 6)
> !duplicated(vec)
[1]  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
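
To drop those first instances from a data frame, one option is to index the
rows with duplicated() directly. The following is only a minimal sketch: the
data frame is a small made-up stand-in for your rwrdatafile, and the column
names id and value are my own placeholders.

```r
# Hypothetical stand-in for rwrdatafile; only the second column matters here.
rwrdatafile <- data.frame(
  id    = 1:9,
  value = c(1, 4, 4, 4, 4, 4, 4, 6, 6)
)

# duplicated() is FALSE at the first occurrence of each value and TRUE
# afterwards, so using it as a row index keeps everything except the
# first instances.
result <- rwrdatafile[duplicated(rwrdatafile[, 2]), ]
print(result)
```

Note that a value which occurs only once, like 1 here, disappears entirely,
which matches what you describe.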

On Tue, Jan 10, 2017 at 4:39 PM, Tunga Kantarcı <tungakantarci at gmail.com> wrote:
> Consider a data frame which I name as rwrdatafile. It includes several
> variables stored in columns. For each variable there are 1000
> observations and hence 1000 rows. The interest lies in the values of
> the second column of this data frame, that is in rwrdatafile[,2]. What
> I am trying to accomplish is to delete the rows of the data frame if
> it is the first instance of a unique value in rwrdatafile[,2]. That
> is, the values stored in rwrdatafile[,2] look like
>
> 1
> 4
> 4
> 4
> 4
> 4
> 4
> 6
> 6
>
> and the routine should delete 1 (and the other values in that row),
> the first 4 (and the other values in that row), and the first 6 (and
> the other values in that row). I did an online search, and indeed
> there are similar examples, but they did not help for what I am trying
> to achieve. What is specific to what I am trying to achieve is that
> the routine should use a for loop. I have written a routine that is
> not using a for loop and it works fine and I paste it below
> (Vector-oriented coding in R). I need to write a for loop that
> accomplishes the same task. In fact, I have written this for loop but
> it has a problem (Scalar-oriented coding in R pasted below). Note that
> the data stored in rwrdatafile[,2] has three unique values (there are
> more but for making the example that does not matter) which are 1, 4,
> 6. The for loop I have written first determines the number of unique
> values in rwrdatafile[,2], with length(unique(rwrdatafile[,2])), and
> uses that number in the sequence of the for loop. The length is 3 so
> the sequence is 1:3. But there is a catch! When 1 is deleted (and
> other values row wise), the length decreases to 2 but the for loop
> attempts 3 and therefore it returns NULL at the end of the loop.
> Therefore I subtract 1 from the length. But this is not good coding. I
> wondered about the NULL result and it took me a while to figure out
> the problem; worse, I might never have found it at all.
> So the for loop here is not reliable because it requires that the user
> knows that there are multiple instances of the unique values (so
> multiple instances of 1). How can I fix the problem? The restriction I
> have is that I need to keep the for loop and it should resemble the
> for loop I have written for MATLAB (pasted below). The aim is to
> translate the MATLAB routine as close as possible in R. So I do not
> want to deviate (much) from the MATLAB version of the code because
> otherwise I cannot compare the routines while I am teaching this. That
> is, I need to use a function in the for loop in R that is as close as
> possible to the find function (with the first option) of MATLAB.
>
> # Scalar-oriented coding in R
> length(unique(rwrdatafile[,2]))
> for (i in 1:(.Last.value-1)){
>   rwrdatafile = rwrdatafile[-(which(rwrdatafile[,2] ==
> unique(rwrdatafile[,2])[i])[1]),]
> }
>
> # Vector-oriented coding in R
> unique(rwrdatafile[,2])
> tag = match(.Last.value,rwrdatafile[,2])
> rwrdatafile = rwrdatafile[!row.names(rwrdatafile) %in% tag,]
>
> # Scalar-oriented coding in MATLAB
> unique(mwmatfile.data(:,2));
> for i = ans'
>     mwmatfile.data(find(mwmatfile.data(:,2) == i,1,'first'),:) = [];
> end
>
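
If the for loop has to stay for teaching purposes, one possible approach is
to compute the unique values once, before the loop, and iterate over the
values themselves rather than over positions. That is what the MATLAB
version does with "for i = ans'". Recomputing unique() inside the loop is
what makes the index drift: once the only 1 is gone, unique() returns a
shorter vector and position i no longer points at the value you meant. The
sketch below assumes the column has no NA values; the data frame and the
names vals, v and first are made up for illustration.

```r
# Hypothetical stand-in for rwrdatafile; only the second column matters here.
rwrdatafile <- data.frame(
  id    = 1:9,
  value = c(1, 4, 4, 4, 4, 4, 4, 6, 6)
)

# Compute the unique values once, before any rows are removed.
vals <- unique(rwrdatafile[, 2])

for (v in vals) {
  # which(...)[1] plays the role of MATLAB's find(..., 1, 'first').
  first <- which(rwrdatafile[, 2] == v)[1]
  # Drop that single row; all other rows are kept.
  rwrdatafile <- rwrdatafile[-first, ]
}

print(rwrdatafile)
```

Because each value is visited exactly once and only one row is removed per
visit, which() always finds at least one match, so there is no need to
subtract 1 from the loop length.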

-- 
Sarah Goslee
http://www.functionaldiversity.org