[R] Help with Binning Data
Shouro Dasgupta
shouro at gmail.com
Fri Sep 11 00:28:19 CEST 2015
Dear all,
I have 3-hourly temperature data from 1970-2010 for 122 cities in the US. I
would like to bin this data by city-year-week. My idea is that if the
temperature for a particular city in a given week falls within a given
range, (-17.78, -12.22), (-12.22, -6.67), ..., (37.78, 43.33), then the
corresponding bin would have a value of 1, and 0 otherwise.
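To make the binning idea concrete, here is a minimal sketch using base R's cut(). It assumes the ranges are 10-degree Fahrenheit steps (0, 10, ..., 110) converted to Celsius, which matches the endpoints listed above but is only an inference. The `temps` vector is made up for illustration.

```r
# Assumed bin edges: 0, 10, ..., 110 degrees Fahrenheit expressed in Celsius.
# Rounded to 2 digits they match the endpoints quoted in the post.
breaks <- (seq(0, 110, by = 10) - 32) * 5 / 9
round(breaks, 2)

# Hypothetical temperatures, for illustration only
temps <- c(-16.06, -7.81, 2.90, 15.08)

# cut() assigns each value to an interval; right = FALSE gives [lower, upper)
bins <- cut(temps, breaks = breaks, right = FALSE, dig.lab = 4)

# One 0/1 column per bin: rows are observations, columns are temperature ranges
dummies <- sapply(levels(bins), function(lev) as.integer(bins == lev))
dummies
```

Whether the intervals should be closed on the left or on the right is a design choice; right = FALSE is used here only so that a value exactly on an edge goes into the warmer bin.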
The data look like this (first 10 rows below). Basically, I need to generate
a dummy variable for each temperature range. Any help will be greatly
appreciated.
tmp2<- dput(head(tmp1,10))
> structure(list(yearday = c(1970001L, 1970001L, 1970001L, 1970001L,
> 1970001L, 1970001L, 1970001L, 1970001L, 1970001L, 1970001L),
> City = structure(1:10, .Label = c("AKRON", "ALBANY", "ALBUQUERQUE",
> "ALLENTOWN", "ATLANTA", "AUSTIN", "BALTIMORE", "BATON ROUGE",
> "BERKELEY", "BIRMINGHAM", "BOISE", "BOSTON", "BRIDGEPORT",
> "BUFFALO", "CAMBRIDGE", "CAMDEN", "CANTON", "CHARLOTTE",
> "CHATTANOOGA", "CHICAGO", "CINCINNATI", "CLEVELAND", "COLORADO SPRINGS",
> "COLUMBUS", "CORPUS CHRISTI", "DALLAS", "DAYTON", "DENVER",
> "DES MOINES", "DETROIT", "DULUTH", "EL PASO", "ELIZABETH",
> "ERIE", "EVANSVILLE", "FALL RIVER", "FLINT", "FORT WAYNE",
> "FRESNO", "FT WORTH", "GARY", "GLENDALE", "GRAND RAPIDS",
> "HARTFORD", "HONOLULU", "HOUSTON", "INDIANAPOLIS", "JACKSONVILLE",
> "JERSEY CITY", "KANSAS CITY", "KANSAS ITY", "KNOXVILLE",
> "Lansing ", "LAS VEGAS", "LEXINGTON", "LINCOLN", "LITTLE ROCK",
> "LONG BEACH", "LOS ANGELES", "LOUISVILLE", "LOWELL", "LYNN",
> "MADISON", "MEMPHIS", "MIAMI", "MILWAUKEE", "MINNEAPOLIS",
> "MOBILE", "MONTGOMERY", "NASHVILLE", "NEW BEDFORD", "NEW HAVEN",
> "NEW ORLEANS", "NEW YORK CITY", "NEWARK", "NORFOLK", "OAKLAND",
> "OGDEN", "OKLAHOMA CITY", "OMAHA", "PASADENA", "PATERSON",
> "PEORIA", "PHILADELPHIA", "PHOENIX", "PITTSBURG", "PORTLAND",
> "PROVIDENCE", "PUEBLO", "READING", "RICHMOND", "ROCHESTER",
> "ROCKFORD", "SACRAMENTO", "SALT LAKE CITY", "SAN ANTONIO",
> "SAN CRUZ", "SAN DIEGO", "SAN FRANCISCO", "SAN JOSE", "SAVANNAH",
> "SCHENECTADY", "SCRANTON", "SEATTLE", "SHREVEPORT", "SOMERVILLE",
> "SOUTH BEND", "SPOKANE", "SPRINGFIELD", "ST LOUIS", "ST PAUL",
> "ST PETERSBURG", "SYRACUSE", "TACOMA", "TAMPA", "TOLEDO",
> "TRENTON", "TUCSON", "TULSA", "UTICA", "WASHINGTON", "WATERBURY",
> "WICHITA", "WILMINGTON", "WORCESTER", "YONKERS", "YOUNGSTOWN"
> ), class = "factor"), cell_number = c(17379L, 17027L, 19514L,
> 17745L, 20256L, 21323L, 18104L, 21329L, 18779L, 20254L),
> longitude = c(-81.519005, -73.756232, -106.609991, -75.490183,
> -84.387982, -97.743061, -76.612189, -91.14032, -121.635963,
> -86.80249), latitude = c(41.081445, 42.652579, 35.110703,
> 40.608431, 33.748995, 30.267153, 39.290385, 30.458283, 37.871744,
> 33.520661), State = structure(c(29L, 28L, 27L, 32L, 10L,
> 35L, 19L, 17L, 4L, 1L), .Label = c(" ALA", " ARIZ", " ARK",
> " CAL", " COLO", " CONN", " DC", " DEL", " FLA", " GA", " HAWAII",
> " ILL", " IND", " IOWA", " KANS", " KY", " LA", " MASS",
> " MD", " MICH", " MINN", " MO", " NC", " NEBR", " NEV", " NJ",
> " NM", " NY", " OHIO", " OKLA", " ORE", " PA", " RI", " TENN",
> " TEX", " UTAH", " VA", " WASH", " WIS", "CAL", "CONN", "IDAH",
> "KY", "MASS"), class = "factor"), avsft = c(-7.81, -16.06,
> -7.71999999999997, -1.88999999999999, 2.90000000000003, 5.12,
> -5.02999999999997, 9.33000000000004, 15.08, 2.89000000000004
> ), year = c(1970L, 1970L, 1970L, 1970L, 1970L, 1970L, 1970L,
> 1970L, 1970L, 1970L), day = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L), hour = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L), yearweek = c(197001L, 197001L, 197001L, 197001L, 197001L,
> 197001L, 197001L, 197001L, 197001L, 197001L), week = c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("yearday",
> "City", "cell_number", "longitude", "latitude", "State", "avsft",
> "year", "day", "hour", "yearweek", "week"), row.names = c(NA,
> 10L), class = "data.frame")
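One possible way to get a dummy per city-year-week and temperature range is sketched below. It reuses the column names City, yearweek and avsft from the dput() output, but the toy values are made up. It also assumes that a bin should be 1 when at least one 3-hourly reading in that city-week falls in the range; if the weekly mean is meant instead, the temperatures would need to be averaged per city-week before cutting. The bin edges are the same assumed Fahrenheit-based edges in Celsius.

```r
# Toy data with the same column names as tmp1; values are illustrative only
tmp1 <- data.frame(
  City     = c("AKRON", "AKRON", "AKRON", "ALBANY", "ALBANY"),
  yearweek = c(197001L, 197001L, 197002L, 197001L, 197001L),
  avsft    = c(-7.81, -13.50, 2.90, -16.06, -15.20)
)

# Assumed bin edges: 0, 10, ..., 110 degrees Fahrenheit expressed in Celsius
breaks <- (seq(0, 110, by = 10) - 32) * 5 / 9
tmp1$bin <- cut(tmp1$avsft, breaks = breaks, right = FALSE, dig.lab = 4)

# One key per observed city-week combination (unused combinations dropped)
key <- interaction(tmp1$City, tmp1$yearweek, drop = TRUE)

# Number of readings per city-week (rows) and temperature bin (columns)
counts <- unclass(table(key, bin = tmp1$bin))

# Reduce the counts to 0/1 indicators
dummies <- (counts > 0) * 1L
dummies
```

The `counts` matrix may be useful on its own, since it records how many readings in each city-week fell into each range rather than only whether any did.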
Sincerely,
Shouro