Is there a function in R that will sum values based on Date of Year?
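I have a data table (Precip15) consisting of precipitation, day of year (DOY), and a Date_Time column in POSIXct format. I need to be able to see the total precipitation (Rain_cm) recorded for each day. Any suggestions?
An example of the table format is below: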
```
DOY   Rain   Rain_cm   Date_Time
179   6      0.6       2019-06-28 15:00:00
179   0      NA        2019-06-28 15:15:00
179   2      0.2       2019-06-28 16:45:00
180   0      NA        2019-06-29 10:00:00
180   10.2   1.2       2019-06-29 10:15:00
180   2      0.2       2019-06-29 13:00:00
```
I need it to look like this:
```
DOY   Rain_cm
179   0.8
180   1.4
```
Or possibly:
```
Date         Rain_cm
2019-06-28   0.8
2019-06-29   1.4
```
Thanks in advance for your help!
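Here are some base R solutions. They use the data frame DF defined reproducibly in the Note at the end.
1) aggregate. This produces either of the two requested layouts: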
```r
# Total per day of year
aggregate(Rain_cm ~ DOY, DF, sum)
##   DOY Rain_cm
## 1 179     0.8
## 2 180     1.4

# Total per calendar date, using a Date column derived from Date_Time
DF2 <- transform(DF, Date = as.Date(Date_Time))
aggregate(Rain_cm ~ Date, DF2, sum)
##         Date Rain_cm
## 1 2019-06-28     0.8
## 2 2019-06-29     1.4
```
2) rowsum. Another base R solution is rowsum:
```r
with(na.omit(DF), rowsum(Rain_cm, DOY))
##     [,1]
## 179  0.8
## 180  1.4

with(na.omit(DF2), rowsum(Rain_cm, Date))
##            [,1]
## 2019-06-28  0.8
## 2019-06-29  1.4
```
3) tapply. Another base R approach is tapply:
```r
with(DF, tapply(Rain_cm, DOY, sum, na.rm = TRUE))
## 179 180
## 0.8 1.4

with(DF2, tapply(Rain_cm, Date, sum, na.rm = TRUE))
## 2019-06-28 2019-06-29
##        0.8        1.4
```
4)xtabs
```r
xtabs(Rain_cm ~ DOY, DF)
## DOY
## 179 180
## 0.8 1.4

xtabs(Rain_cm ~ Date, DF2)
## Date
## 2019-06-28 2019-06-29
##        0.8        1.4
```
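One difference between these approaches is how they treat a day whose readings are all NA. The following is a minimal sketch of that, using a hypothetical data frame `toy` (not part of the original data) in which day 181 has only a missing value. By default the formula method of `aggregate` drops NA rows before grouping, so such a day vanishes from the result, while `tapply` with `na.rm = TRUE` keeps it with a total of 0.

```r
# Hypothetical toy data: day 181 has only an NA reading
toy <- data.frame(DOY = c(179, 179, 180, 181),
                  Rain_cm = c(0.6, 0.2, 1.2, NA))

# Default na.action = na.omit removes NA rows first, so day 181 is absent
agg_default <- aggregate(Rain_cm ~ DOY, toy, sum)

# Pass NAs through and drop them inside sum(), so day 181 is kept with total 0
agg_keep <- aggregate(Rain_cm ~ DOY, toy, sum,
                      na.rm = TRUE, na.action = na.pass)

# tapply with na.rm = TRUE also keeps every group
tap <- with(toy, tapply(Rain_cm, DOY, sum, na.rm = TRUE))
```

Whether 0 is the right total for such a day depends on what NA means in the data (no rain versus a missing reading), so this is a choice to make deliberately rather than a recommendation.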
Note
The data in reproducible form is assumed to be:
```r
# Columns are separated by two or more spaces, so the single space
# inside Date_Time is preserved when the separators are replaced by commas.
Lines <- "DOY  Rain  Rain_cm  Date_Time
179  6     0.6      2019-06-28 15:00:00
179  0     NA       2019-06-28 15:15:00
179  2     0.2      2019-06-28 16:45:00
180  0     NA       2019-06-29 10:00:00
180  10.2  1.2      2019-06-29 10:15:00
180  2     0.2      2019-06-29 13:00:00"
L <- readLines(textConnection(Lines))
DF <- read.csv(text = gsub("  +", ",", L))
```
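If the DOY column were missing or needed checking, it could be derived from the timestamp. This is a small base R sketch of that, assuming the timestamps are character strings in the format shown above and that UTC is an acceptable time zone (the question does not state the zone, so that part is an assumption).

```r
# Timestamps copied from the sample data
stamps <- c("2019-06-28 15:00:00", "2019-06-29 10:15:00")

# Parse to POSIXct; the time zone is an assumption
dt <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# "%j" formats the day of the year as a three-digit string
doy <- as.integer(format(dt, "%j"))

# Calendar date without the time of day
day <- as.Date(dt, tz = "UTC")
```

For these two timestamps `doy` matches the 179 and 180 in the DOY column. Grouping by `day` instead of `doy` avoids mixing the same day of year from different years if the table ever spans more than one.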
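You can use aggregate together with cut:
```r
# Date_Time must be POSIXct (or Date) for cut(..., breaks = "day") to work
precipTotals <- aggregate(Rain_cm ~ cut(Date_Time, breaks = "day"),
                          data = df, FUN = sum, na.rm = TRUE)
```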
Make sure your precip column is numeric.
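Since `sum()` only works on numeric (or logical) input, it can help to check the column type before aggregating. Here is a quick sketch with a hypothetical data frame `raw` in which the values were read in as text:

```r
# Hypothetical example: precipitation read in as character strings
raw <- data.frame(Rain_cm = c("0.6", NA, "0.2"), stringsAsFactors = FALSE)

is.numeric(raw$Rain_cm)   # FALSE, so sum() would fail on this column

# Convert to numeric; entries that cannot be parsed become NA (with a warning)
raw$Rain_cm <- as.numeric(raw$Rain_cm)

total <- sum(raw$Rain_cm, na.rm = TRUE)
```

If the column turns out to be a factor instead, convert it with `as.numeric(as.character(x))`, because `as.numeric()` on a factor returns the internal level codes rather than the values.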
```r
library(tibble)     # tribble()
library(dplyr)      # mutate(), group_by(), summarise(), %>%
library(lubridate)  # ymd_hms()

df <- tribble(
  ~DOY, ~Rain, ~Rain_cm, ~Date_Time,
  179,  6,     0.6,      "2019-06-28 15:00:00",
  179,  0,     NA,       "2019-06-28 15:15:00",
  179,  2,     0.2,      "2019-06-28 16:45:00",
  180,  0,     NA,       "2019-06-29 10:00:00",
  180,  10.2,  1.2,      "2019-06-29 10:15:00",
  180,  2,     0.2,      "2019-06-29 13:00:00"
)

# Parse the timestamp, derive the calendar date, and sum per date
df %>%
  mutate(Date_Time = ymd_hms(Date_Time)) %>%
  mutate(Date = as.Date(Date_Time)) %>%
  group_by(Date) %>%
  summarise(perDate = sum(Rain_cm, na.rm = TRUE))
#>   Date       perDate
#>   <date>       <dbl>
#> 1 2019-06-28     0.8
#> 2 2019-06-29     1.4
```