[R] If cycle takes too much time...

marco.guerzoni at unito.it marco.guerzoni at unito.it
Fri Jan 25 12:16:49 CET 2013

On 25.01.2013 12:08, Berend Hasselman wrote:
> On 25-01-2013, at 10:25, marcoguerzoni <marco.guerzoni at unito.it> 
> wrote:
>> Dear all,
>> thank you for reading.
>> I have a dataset of artists and where and when they had an 
>> exhibition.
>> I'd like to create an affiliation network in the form of a matrix, 
>> telling me
>> which artists have been in the same institution at the same time.
>> I managed to do it, but given that I have 96000 observations the 
>> program takes
>> 30 months to complete.
>> Here is what I have done.
>> The data look like this:
>> Artist <-c(1,2,3,2,4,4,5)
>> Begin <- as.Date(c('2006-08-23', '2006-03-21', '2006-03-06', 
>> '2006-01-13',
>> '2006-05-20', '2006-07-13', '2006-07-20'))
>> End <- as.Date(c('2006-10-23', '2006-11-30', '2006-05-06', 
>> '2006-12-13',
>> '2006-09-20', '2006-08-13', '2006-09-20'))
>> Istitution <- c(1, 2, 2, 1, 1, 2, 1)
>> Artist is the name of the artist, Begin and End are the when, and 
>> Istitution is
>> the where.
>> My if statement is working:
>> # number of unique artists
>> c <- unique(Artist)
>> d <- length(c)
>> a <-length(Artist)
>> B <- mat.or.vec(d,d)
>> for(i in 1:d) {
>> for(j in 1:d) {
>> if (Istitution[i]  == Istitution[j]) {
>> if (Begin[i] <= End[j])
>> {
>> if (End[i]-Begin[j] >= 0) {
>> B[i,j] <- B[i,j]+1
>> B[i,i] <- 0
>> }
>> }
>> else{
>> if (End[j]-Begin[i] >= 0) {
>> B[i,j] <- B[i,j]+1
>> B[i,i] <- 0
>> }
>> }
>>  }
>>   }
>> print(i)
>>    }
>> Do you have a way to make the program simpler and faster?
> It is not clear why you are only using the unique artists.
> You shouldn't be using "c" as variable name. It is a builtin 
> function.
> Since the result is symmetric you can change the j-loop  to for(j in
> (i+1):d).
> After the loop you can do
> B[lower.tri(B)] <- t(B)[lower.tri(B)]
> to fill the remainder of the matrix B. This would certainly be more
> efficient.
> But I don't quite understand what you are trying to do.
> With your example you could compute the result you desire.
> Gerrit's answer is concise.
> Berend

Thank you Berend,

What I would like to have is a symmetric matrix, where rows and columns 
are artists and the value is 1 (or TRUE) if they had an exhibition at 
the same time and in the same place.
My inelegant code is working, but for 96000 observations it requires 
months and months. Gerrit's solution is very elegant, but I run out of memory...

The problem is the size. I am looking for a way to divide Gerrit's 
solution into smaller steps which can be handled.
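
One possible way to split the work into smaller steps, as a minimal sketch 
only: two artists can only be linked if they exhibited in the same 
institution, so the comparison can be done one institution at a time, and 
the result can be kept as a list of artist pairs instead of a full 
artist-by-artist matrix. The helper name overlap_pairs is hypothetical (it 
is not from this thread), and the sketch assumes that no single institution 
has so many exhibitions that an n x n logical matrix for it exceeds memory.

```r
# Example data from the original post (variable names kept as posted).
Artist <- c(1, 2, 3, 2, 4, 4, 5)
Begin <- as.Date(c("2006-08-23", "2006-03-21", "2006-03-06", "2006-01-13",
                   "2006-05-20", "2006-07-13", "2006-07-20"))
End <- as.Date(c("2006-10-23", "2006-11-30", "2006-05-06", "2006-12-13",
                 "2006-09-20", "2006-08-13", "2006-09-20"))
Istitution <- c(1, 2, 2, 1, 1, 2, 1)

# Hypothetical helper: for the exhibitions of ONE institution, return a
# two-column matrix of distinct artist pairs whose date ranges overlap.
overlap_pairs <- function(artist, begin, end) {
  if (length(artist) < 2) return(NULL)
  b <- as.numeric(begin)
  e <- as.numeric(end)
  # le[i, j] is TRUE when begin[i] <= end[j]; two intervals overlap when
  # this holds in both directions.
  le <- outer(b, e, "<=")
  ov <- le & t(le)
  # The relation is symmetric, so only the upper triangle is needed.
  idx <- which(ov & upper.tri(ov), arr.ind = TRUE)
  if (nrow(idx) == 0) return(NULL)
  a1 <- artist[idx[, 1]]
  a2 <- artist[idx[, 2]]
  keep <- a1 != a2                      # ignore an artist paired with itself
  unique(cbind(pmin(a1, a2), pmax(a1, a2))[keep, , drop = FALSE])
}

# Process each institution separately and stack the resulting pairs.
groups <- split(seq_along(Artist), Istitution)
pairs <- do.call(rbind, lapply(groups, function(k) {
  overlap_pairs(Artist[k], Begin[k], End[k])
}))
pairs <- unique(pairs)                  # same pair may occur in several places

# For this small example the dense matrix can still be built from the pairs.
ids <- sort(unique(Artist))
B <- matrix(0L, length(ids), length(ids), dimnames = list(ids, ids))
p <- cbind(match(pairs[, 1], ids), match(pairs[, 2], ids))
B[p] <- 1L
B[p[, 2:1, drop = FALSE]] <- 1L         # mirror to the lower triangle
```

With the full data the last step would probably be skipped or replaced: a 
dense matrix over all artists may itself be too large, so keeping the pair 
list, or building a sparse matrix from it (for example with sparseMatrix() 
from the Matrix package), may be the safer choice.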


