[R] Assigning week numbers

David Winsemius dwinsemius at comcast.net
Wed Apr 21 21:21:58 CEST 2010


On Apr 21, 2010, at 1:15 PM, David Winsemius wrote:

>
> On Apr 21, 2010, at 12:50 PM, Michael Hosack wrote:
>
>>
>>
>> I provided a minimized version of my dataframe at the bottom of  
>> this message containing the results of David's code in variable  
>> ('wkoffset') and Jeff Hallman's code in ('WEEK'). Jeff's code  
>> produced the correct results (thank you Jeff) though I have been  
>> unable to understand it. David, as you can see your code begins  
>> week 2 for year 2011 on a Wednesday, rather than on a Saturday, as  
>> it should. Your adjustment seems not to correct the problem, but I  
>> concede I may be using it incorrectly. If you are obtaining the  
>> correct results please let me know what I am doing wrong.
>
> ???
>
> require(chron)
> SCHEDULE4$wkoffset <- 2 + as.integer(SCHEDULE4$DATE. -
>          as.Date("2011-05-01")- day.of.week(2011, 5, 1) ) %/% 7

Bad code. I thought the order of arguments to day.of.week would match  
the POSIX convention, but it is really month, day, year. Even if I  
were on target with the arguments, I would have needed to have a  
different adjustment to get the modulo arithmetic working correctly:

 > dseq <- as.Date("2011-05-01") + (-4:14)
 > dseq
  [1] "2011-04-27" "2011-04-28" "2011-04-29" "2011-04-30" "2011-05-01"  
"2011-05-02"
  [7] "2011-05-03" "2011-05-04" "2011-05-05" "2011-05-06" "2011-05-07"  
"2011-05-08"
[13] "2011-05-09" "2011-05-10" "2011-05-11" "2011-05-12" "2011-05-13"  
"2011-05-14"
[19] "2011-05-15"

 > data.frame(dseq, wk= 1+ as.integer(dseq - as.Date("2011-05-01")-  
day.of.week(5, 1, 2011) +1) %/% 7)
          dseq wk
1  2011-04-27  0
2  2011-04-28  0
3  2011-04-29  0
4  2011-04-30  1
5  2011-05-01  1
6  2011-05-02  1
7  2011-05-03  1
8  2011-05-04  1
9  2011-05-05  1
10 2011-05-06  1
11 2011-05-07  2
12 2011-05-08  2
13 2011-05-09  2
14 2011-05-10  2
15 2011-05-11  2
16 2011-05-12  2
17 2011-05-13  2
18 2011-05-14  3
19 2011-05-15  3

So the week counter does now change on a Saturday.

-- 
David



>
> > tail(SCHEDULE4)
>          DATE. YEAR MONTH DAY DOW SITE  TOD DOW. wkoffset WEEK
> 374  2011-05-06 2011     5   6 Fri  103 MORN    7        2    1
> 1110 2011-05-06 2011     5   6 Fri  103 AFTN    7        2    1
> 558  2011-05-06 2011     5   6 Fri  104 MORN    7        2    1
> 1294 2011-05-06 2011     5   6 Fri  104 AFTN    7        2    1
> 7    2011-05-07 2011     5   7 Sat  101 MORN    1        2    2
> 743  2011-05-07 2011     5   7 Sat  101 AFTN    1        2    2
>
> I don't know how you are using it, but the "registration" with a  
> Saturday==1 seems to be working for me.

Only because I was looking a the wrong column. Arrrgghh.

> Perhaps you did not adjust the year within day.of.week()?
>
>
>>
>> Thanks,
>>
>> Mike
>>
>> Jeff Hallman's code:
>>
>> weeknumber <- function(aDate){
>>   aTi <- ti(aDate, tif = "wfriday")
>>   may1ymd <- 10000*year(aTi) + 501
>>   baseWeek <- ti(may1ymd, tif = "wfriday")
>>   return(aTi - baseWeek + 1)
>> }
>>
>> SCHEDULE3$WEEK<-weeknumber(SCHEDULE3$DATE.)
>>
>>
>>
>>
>> Thank you David, this approach is a start in the right direction but
>>> it does
>>> not yield the needed results. I need new week numbers to only begin
>>> on Saturdays. The only exception will be for the first date (May 01)
>>> which will start week 1 on a different day of the week depending on
>>> the year. The proceeding Saturday will begin Week #2. The approach  
>>> you
>>> provided does not end Week 1 on a Friday and does not end subsequent
>>> week #'s on Fridays.
>>
>> My solution should be adjustable for any day-of-week ending number.
>> Why don't you provide a minimal example for testing and show how  
>> apply
>> my solution fails? Your original version was definitely not  
>> "minimal."
>> -- 
>> David.
>>>
>>> Thanks again,
>>>
>>> Mike
>>>
>>> ----------------------------------------
>>>> CC: mhosack9 at hotmail.com; r-help at r-project.org
>>>> From: dwinsemius at comcast.net
>>>> To: dwinsemius at comcast.net
>>>> Subject: Re: [R] Assigning Week Numbers
>>>> Date: Tue, 20 Apr 2010 16:03:09 -0400
>>>>
>>>>
>>>> On Apr 20, 2010, at 2:55 PM, David Winsemius wrote:
>>>>
>>>>>
>>>>> On Apr 20, 2010, at 1:59 PM, Michael Hosack wrote:
>>>>>
>>>>>>
>>>>>> R experts,
>>>>>>
>>>>>> How could I extract the week number from a date vector (in Date
>>>>>> class)
>>>>>> such that week numbering (week 1...2...) begins (May 01) and ends
>>>>>> (October 31) on the same specific dates each year? Week numbering
>>>>>> must conform to the following day numbering format
>>>>>> (Sat=1,Sun=2,Mon=3.....Fri=7).
>>>>>> This means that new weeks must begin on Saturdays, and end on
>>>>>> Fridays
>>>>>> (except for the first date of May 01, which always begins week 1;
>>>>>> week 2
>>>>>> begins on the proceeding Saturday). This needs to be applicable
>>>>>> across years
>>>>>> to work effectively. I have tried using both vectorized and loop
>>>>>> approaches with
>>>>>> no success.
>>>>>>
>>>>>
>>>>> Modulo arithmetic will work if you first convert the difftime  
>>>>> object
>>>>> to integer:
>>>>
>>>> And you can use teh chron function day.of.week to return an offset
>>>> for
>>>> the first week
>>>>
>>>>> weekdays(as.Date("2013-05-01") )
>>>> [1] "Wednesday"
>>>>> day.of.week(2013, 5, 1)
>>>> [1] 5
>>>>> SCHEDULE3$wkoffset <- as.integer(SCHEDULE3$DATE. -
>>>> as.Date("2013-05-01")- day.of.week(2013, 5, 1) ) %/% 7
>>>>> range(SCHEDULE3$wkoffset)
>>>> [1] -1 25
>>>> # So you need a further adjustment of 2 to set the starting week
>>>> number to "1":
>>>>
>>>>> SCHEDULE3$wkoffset <- 2+as.integer(SCHEDULE3$DATE. -
>>>> as.Date("2013-05-01")- day.of.week(2013, 5, 1) ) %/% 7
>>>>> range(SCHEDULE3$wkoffset)
>>>> [1] 1 27
>>>>
>>>>>
>>>>>> SCHEDULE3$wkoffset <- as.integer(SCHEDULE3$DATE. -
>>>>> as.Date("2010-05-01") ) %/% 7
>>>>>> str(head(SCHEDULE3))
>>>>> 'data.frame': 6 obs. of 9 variables:
>>>>> $ DATE. :Class 'Date' int [1:6] 15826 15826 15826 15826 15826  
>>>>> 15826
>>>>> $ YEAR : num 2013 2013 2013 2013 2013 ...
>>>>> $ MONTH : num 5 5 5 5 5 5
>>>>> $ DAY : num 1 1 1 1 1 1
>>>>> $ DOW : chr "Wed" "Wed" "Wed" "Wed" ...
>>>>> $ SITE : num 101 101 102 102 103 103
>>>>> $ TOD : Factor w/ 2 levels "MORN","AFTN": 1 2 1 2 1 2
>>>>> $ DOW. : num 5 5 5 5 5 5
>>>>> $ wkoffset: num 156 156 156 156 156 156
>>>>>> range(SCHEDULE3$DATE.)
>>>>> [1] "2013-05-01" "2013-10-31"
>>>>>> range(SCHEDULE3$wkoffset)
>>>>> [1] 156 182
>>>>>
>>>>>> I am including a bit of old Systat code that does the trick  
>>>>>> simply
>>>>>> and concisely.
>>>>>> If anyone knows an analogous method in R please let me know. My R
>>>>>> dataframe contains
>>>>>> all the variables and data in the Systat temp file.
>>>>>>
>>>>>> Use sched3.t
>>>>>> Save sched4.t
>>>>>> Hold
>>>>>> By mm dd
>>>>>> If bof then let week=1
>>>>>> Else if bog and DOW$="SAT" then let week = week + 1
>>>>>> Run
>>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>>
>>>>>> SCHEDULE3 <-
>>>>>> structure(list(DATE. = structure(c(15826L, 15826L, 15826L,  
>>>>>> 15826L,
>>>
>> snipped the rather larger than minimal example.
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>>
>>
>> This is about as minimal as I could get it.
>>
>> SCHEDULE4 <-
>> structure(list(DATE. = structure(c(15095L, 15095L, 15095L, 15095L,
>> 15095L, 15095L, 15095L, 15095L, 15096L, 15096L, 15096L, 15096L,
>> 15096L, 15096L, 15096L, 15096L, 15097L, 15097L, 15097L, 15097L,
>> 15097L, 15097L, 15097L, 15097L, 15098L, 15098L, 15098L, 15098L,
>> 15098L, 15098L, 15098L, 15098L, 15099L, 15099L, 15099L, 15099L,
>> 15099L, 15099L, 15099L, 15099L, 15100L, 15100L, 15100L, 15100L,
>> 15100L, 15100L, 15100L, 15100L, 15101L, 15101L), class = "Date"),
>>   YEAR = c(2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011,
>>   2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011,
>>   2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011,
>>   2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011,
>>   2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011,
>>   2011, 2011), MONTH = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
>>   5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
>>   5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
>>   DAY = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3,
>>   3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,
>>   5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7), DOW = c("Sun",
>>   "Sun", "Sun", "Sun", "Sun", "Sun", "Sun", "Sun", "Mon", "Mon",
>>   "Mon", "Mon", "Mon", "Mon", "Mon", "Mon", "Tue", "Tue", "Tue",
>>   "Tue", "Tue", "Tue", "Tue", "Tue", "Wed", "Wed", "Wed", "Wed",
>>   "Wed", "Wed", "Wed", "Wed", "Thu", "Thu", "Thu", "Thu", "Thu",
>>   "Thu", "Thu", "Thu", "Fri", "Fri", "Fri", "Fri", "Fri", "Fri",
>>   "Fri", "Fri", "Sat", "Sat"), SITE = c(101, 101, 102, 102,
>>   103, 103, 104, 104, 101, 101, 102, 102, 103, 103, 104, 104,
>>   101, 101, 102, 102, 103, 103, 104, 104, 101, 101, 102, 102,
>>   103, 103, 104, 104, 101, 101, 102, 102, 103, 103, 104, 104,
>>   101, 101, 102, 102, 103, 103, 104, 104, 101, 101), TOD =  
>> structure(c(1L,
>>   2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
>>   1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>   2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
>>   1L, 2L, 1L, 2L), .Label = c("MORN", "AFTN"), class = "factor"),
>>   DOW. = c(2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
>>   4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6,
>>   6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1), wkoffset = c(1,
>>   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
>>   1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
>>   2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), WEEK = c(1, 1, 1, 1, 1,
>>   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
>>   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
>>   1, 1, 1, 1, 1, 2, 2)), .Names = c("DATE.", "YEAR", "MONTH",
>> "DAY", "DOW", "SITE", "TOD", "DOW.", "wkoffset", "WEEK"), row.names  
>> = c(1L,
>> 737L, 185L, 921L, 369L, 1105L, 553L, 1289L, 2L, 738L, 186L, 922L,
>> 370L, 1106L, 554L, 1290L, 3L, 739L, 187L, 923L, 371L, 1107L,
>> 555L, 1291L, 4L, 740L, 188L, 924L, 372L, 1108L, 556L, 1292L,
>> 5L, 741L, 189L, 925L, 373L, 1109L, 557L, 1293L, 6L, 742L, 190L,
>> 926L, 374L, 1110L, 558L, 1294L, 7L, 743L), class = "data.frame") 		  
>> 	   		
>> _________________________________________________________________
>> Hotmail has tools for the New Busy. Search, chat and e-mail from  
>> your inbox.
>>
>> N:WL:en-US:WM_HMP:042010_1
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT



More information about the R-help mailing list