R: Produce a nice linear regression plot (fitted line, confidence / prediction bands, etc.)

I have this sample regression that predicts 10 years into the future.

date <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)
# fit linear regression
model <- lm(value ~ date)
# build the data frame of future dates to predict
dfuture <- data.frame(date = seq(as.Date("2016-12-31"), by = "1 year", length.out = 10))
# predict the future
predict(model, dfuture, interval = "prediction")

How can I add confidence bands?

Related discussion

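The following code will generate a nice regression plot for you. My comments in the code should explain everything clearly. The code uses value and model as defined in your question.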

## all dates you are interested in: 4 years with observations, 10 years for prediction
all_date <- seq(as.Date("2012-12-31"), by="1 year", length.out = 14)

## compute confidence bands (for all data)
pred.c <- predict(model, data.frame(date=all_date), interval="confidence")

## compute prediction bands (for new data only)
pred.p <- predict(model, data.frame(date=all_date[5:14]), interval="prediction")

## set up regression plot (plot nothing here; only set up range, axis)
ylim <- range(range(pred.c[,-1]), range(pred.p[,-1]))
plot(1:nrow(pred.c), numeric(nrow(pred.c)), col = "white", ylim = ylim,
     xaxt = "n", xlab = "Date", ylab = "prediction",
     main = "Regression Plot")
axis(1, at = 1:nrow(pred.c), labels = all_date)

## shade 95%-level confidence region
polygon(c(1:nrow(pred.c), nrow(pred.c):1), c(pred.c[, 2], rev(pred.c[, 3])),
        col = "grey", border = NA)

## plot fitted values / lines
lines(1:nrow(pred.c), pred.c[, 1], lwd = 2, col = 4)

## add 95%-level confidence bands
lines(1:nrow(pred.c), pred.c[, 2], col = 2, lty = 2, lwd = 2)
lines(1:nrow(pred.c), pred.c[, 3], col = 2, lty = 2, lwd = 2)

## add 95%-level prediction bands
lines(4 + 1:nrow(pred.p), pred.p[, 2], col = 3, lty = 3, lwd = 2)
lines(4 + 1:nrow(pred.p), pred.p[, 3], col = 3, lty = 3, lwd = 2)

## add original observations on the plot
points(1:4, rev(value), pch = 20)

## finally, we add legend
legend(x = "topleft", legend = c("Obs", "Fitted", "95%-CI", "95%-PI"),
       pch = c(20, NA, NA, NA), lty = c(NA, 1, 2, 3), col = c(1, 4, 2, 3),
       text.col = c(1, 4, 2, 3), bty = "n")

[Figure: regression plot]

The JPEG was generated by the following code:

jpeg("regression.jpeg", height = 500, width = 600, quality = 100)
## the above code
dev.off()
## check your working directory for this JPEG
## use getwd() to see this directory if you don't know it

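From the plot you can see that:

the confidence band gets wider the further you predict away from the observed data;
the prediction interval is wider than the confidence interval.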
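
Both observations can also be checked numerically rather than read off the plot. The snippet below is a minimal sketch that refits the model from the question; the names future, pred_ci, pred_pi, ci_width and pi_width are introduced here for illustration only and do not appear in the original code.

```r
## data and model from the question
date  <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)
model <- lm(value ~ date)

## the 10 future dates from the question (illustrative name: future)
future <- data.frame(date = seq(as.Date("2016-12-31"), by = "1 year", length.out = 10))

## 95% confidence and prediction intervals at the same dates
pred_ci <- predict(model, future, interval = "confidence")
pred_pi <- predict(model, future, interval = "prediction")

## interval widths: upper limit minus lower limit
ci_width <- pred_ci[, "upr"] - pred_ci[, "lwr"]
pi_width <- pred_pi[, "upr"] - pred_pi[, "lwr"]

## TRUE if the confidence band widens at every step away from the data
all(diff(ci_width) > 0)

## TRUE if the prediction interval is wider than the confidence interval at every date
all(pi_width > ci_width)
```

The first check relies on every future date lying on the same side of the observed dates; for dates on both sides of the data the width would shrink towards the mean date and then grow again.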

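If you want to know more about how predict.lm() computes confidence / prediction intervals internally, read How does predict.lm() compute confidence interval and prediction interval?, and find my answer there.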
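
As a rough illustration of that calculation, the intervals can be rebuilt by hand from the standard errors that predict() returns with se.fit = TRUE. This is a sketch of the usual textbook formulas for an unweighted linear model at the default 95% level, not a copy of the internal code of predict.lm(); the names p, tcrit, se_pred, ci_manual and pi_manual are made up for this example.

```r
## data and model from the question
date  <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)
model <- lm(value ~ date)
future <- data.frame(date = seq(as.Date("2016-12-31"), by = "1 year", length.out = 10))

## fitted values, their standard errors, residual df and residual standard error
p <- predict(model, future, se.fit = TRUE)

## two-sided 95% critical value of the t distribution on the residual df
tcrit <- qt(0.975, df = p$df)

## confidence interval: uncertainty of the estimated mean only
ci_manual <- cbind(fit = p$fit,
                   lwr = p$fit - tcrit * p$se.fit,
                   upr = p$fit + tcrit * p$se.fit)

## prediction interval: add the residual variance of a new observation
se_pred <- sqrt(p$se.fit^2 + p$residual.scale^2)
pi_manual <- cbind(fit = p$fit,
                   lwr = p$fit - tcrit * se_pred,
                   upr = p$fit + tcrit * se_pred)

## compare with what predict() reports directly
all.equal(ci_manual, predict(model, future, interval = "confidence"),
          check.attributes = FALSE)
all.equal(pi_manual, predict(model, future, interval = "prediction"),
          check.attributes = FALSE)
```

The extra residual.scale^2 term under the square root is why the prediction interval is always the wider of the two.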

Thanks to Alex for demonstrating the simple usage of the visreg package; but I still prefer to use base R.
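
Staying with base R, a more compact variant is to plot against the dates themselves instead of the positions 1 to 14, so that no manual axis labelling is needed. The following is only a sketch under that assumption; the data frame bands and its column names, as well as is_new, are invented for this example. The grey shading is left out to keep it short.

```r
## data and model from the question
date  <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)
model <- lm(value ~ date)

## 4 observed years plus 10 future years
all_date <- seq(as.Date("2012-12-31"), by = "1 year", length.out = 14)
newdata  <- data.frame(date = all_date)

## collect fit, confidence limits and prediction limits, one row per date
pred_ci <- predict(model, newdata, interval = "confidence")
pred_pi <- predict(model, newdata, interval = "prediction")
bands <- data.frame(date   = all_date,
                    fit    = pred_ci[, "fit"],
                    ci_lwr = pred_ci[, "lwr"], ci_upr = pred_ci[, "upr"],
                    pi_lwr = pred_pi[, "lwr"], pi_upr = pred_pi[, "upr"])

## rows that lie after the last observation
is_new <- bands$date > max(date)

## fitted line on a date axis; the y range covers the widest band
plot(bands$date, bands$fit, type = "l", lwd = 2, col = 4,
     ylim = range(bands$pi_lwr, bands$pi_upr),
     xlab = "Date", ylab = "prediction", main = "Regression Plot")

## 95% confidence bands for all dates
lines(bands$date, bands$ci_lwr, col = 2, lty = 2, lwd = 2)
lines(bands$date, bands$ci_upr, col = 2, lty = 2, lwd = 2)

## 95% prediction bands for the future dates only
lines(bands$date[is_new], bands$pi_lwr[is_new], col = 3, lty = 3, lwd = 2)
lines(bands$date[is_new], bands$pi_upr[is_new], col = 3, lty = 3, lwd = 2)

## original observations; no rev() needed because x is the date itself
points(date, value, pch = 20)
```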