Why does my lm model not show a linear relationship but does in geom_smooth?
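I am trying to build a linear model to explain the relationship between particle concentration and fluorescence. For some reason, I cannot get the model to use ...
Below is a plot of log fluorescence against log particle concentration...
I built a model with the following code: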
    Calicurve.M1 <- lm(Fluorescence~Particle.conc, na.action = na.exclude, data = Calicurve)
However, when I use this model to predict values and add them to my plot (in ggplot2), it does not look right:
    ### predict values and put into dataframe
    pdat <- expand.grid(Particle.conc = c(5, 50, 500, 5000, 50000, 500000, 5000000, 50000000, 500000000, 5000000000, 50000000000, 500000000000))
    pred <- predict(Calicurve.M1, newdata = pdat, na.rm=T, type ="response", se.fit = TRUE)
    predframe <- data.frame (pdat, preds=pred$fit, se=pred$se.fit)
    predframe$upperse <- (predframe$preds+predframe$se)
    predframe$lowerse <- (predframe$preds-predframe$se)

    ### plot calibration curve ###
    plot <- ggplot(data=Calicurve, aes(x=Particle.conc, y=Fluorescence)) +
      geom_point()+
      scale_y_log10(name ="Fluorecence (AFU)", limits = c(1,1200))+
      scale_x_log10(name ="Particle concentration (particles/mL)")+
      theme_bw() +
      theme(panel.grid.major = element_blank(),
            panel.grid.minor = element_blank(),
            strip.text = element_text(face ="italic"),
            legend.position = c(0.6, 0.75),
            legend.justification = c(1, 0)) +
      geom_line(data= predframe, aes(x=Particle.conc,y=preds), linetype=1) +
      geom_line(data= predframe, aes(x=Particle.conc,y=upperse), linetype=2) +
      geom_line(data= predframe, aes(x=Particle.conc,y=lowerse), linetype=2)
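Finally, when I use ...
I would like to understand why the first model does not work and how to fix it, so that I can use the model to predict fluorescence values.
Sample data, as requested: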
    structure(list(
      Particle.conc = c(50, 50, 50, 500, 500, 500, 5000, 5000, 5000,
        50000, 50000, 50000, 5e+05, 5e+05, 5e+05, 5e+06, 5e+06, 5e+06,
        5e+07, 5e+07, 5e+07, 5e+08, 5e+08, 5e+08, 5e+09, 5e+09, 5e+09,
        5e+10, 5e+10, 5e+10, 5e+11, 5e+11, 5e+11),
      Fluorecence = c(2.649, 2.671, 2.502, 3.926, 3.965, 4, 6.674, 6.337, 6.56,
        12.204, 12.168, 12.209, 24.91, 25.54, 25.384, 38.232, 37.845, 37.979,
        80.547, 80.343, 79.891, 168.693, 168.008, 168.826, 349.318, 351.304,
        355.288, 556.081, 555.348, 554.112, 1105.749, 1103.063, 1097.552),
      Average.FL = c(2.607333333, NA, NA, 3.963666667, NA, NA, 6.523666667, NA, NA,
        12.19366667, NA, NA, 25.278, NA, NA, 38.01866667, NA, NA,
        80.26033333, NA, NA, 168.509, NA, NA, 351.97, NA, NA,
        555.1803333, NA, NA, 1102.121333, NA, NA),
      Fl.Bl = c(1.463, NA, NA, 2.819333333, NA, NA, 5.379333333, NA, NA,
        11.04933333, NA, NA, 24.13366667, NA, NA, 36.87433333, NA, NA,
        79.116, NA, NA, 167.3646667, NA, NA, 350.8256667, NA, NA,
        554.036, NA, NA, 1100.977, NA, NA)),
      .Names = c("Particle.conc", "Fluorecence","Average.FL","Fl.Bl"),
      class ="data.frame", row.names = c(NA, -33L))
You have to transform the variables before you pass them to the model:
    library(dplyr)  # provides tibble(), %>% and mutate()

    df = tibble(x=10^(1:100), y=10^((1:100)*2) + rnorm(100))
    lm(log(y, 10)~log(x, 10), df) %>% summary()

    # Call:
    # lm(formula = log(y, 10) ~ log(x, 10), data = df)
    #
    # Residuals:
    #        Min         1Q     Median         3Q        Max
    # -6.000e-05 -3.698e-05 -1.480e-05  8.320e-06  1.486e-03
    #
    # Coefficients:
    #              Estimate Std. Error   t value Pr(>|t|)
    # (Intercept) 6.291e-05  3.086e-05 2.038e+00   0.0442 *
    # log(x, 10)  2.000e+00  5.305e-07 3.770e+06   <2e-16 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    #
    # Residual standard error: 0.0001531 on 98 degrees of freedom
    # Multiple R-squared:  1,  Adjusted R-squared:  1
    # F-statistic: 1.421e+13 on 1 and 98 DF,  p-value: < 2.2e-16
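As for why the untransformed fit looks wrong on log axes, the following is a minimal sketch on made-up data, not the measurements from the question. The exponent 0.27 and the prefactor 2 are assumptions chosen only for illustration. A straight line on the raw scale is dominated by the largest concentrations, so at low concentrations it predicts roughly the intercept, which is far from the observed values once the axes are logarithmic. It may also help to know that ggplot2 applies scale transformations before computing statistics, so a smoother drawn on log10 axes is fitted to the log-transformed values, which would explain a straight line there.

```r
# Hypothetical power-law data (an assumption for illustration): y = 2 * x^0.27
x <- 5 * 10^(1:11)
y <- 2 * x^0.27

raw_fit <- lm(y ~ x)                 # straight line on the raw scale
log_fit <- lm(log10(y) ~ log10(x))   # straight line on the log-log scale

# Fitted values of both models, expressed on the original y scale
raw_pred <- predict(raw_fit)
log_pred <- 10^predict(log_fit)

# Side-by-side comparison: raw_pred is far off at small x, log_pred is not
comparison <- data.frame(x, y, raw_pred, log_pred)
print(comparison)

# Slope and intercept of the log-log fit recover the exponent and log10(2)
print(coef(log_fit))
```

In the printed comparison, the raw-scale predictions at the smallest `x` values sit well above `y`, while the back-transformed log-log predictions follow it.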
Then, to predict, you have to invert the transformation like this:
    df %>%
      mutate(y_hat = 10^predict(lm(log(y, 10)~log(x,10))))

    # # A tibble: 100 x 3
    #              x        y   y_hat
    #          <dbl>    <dbl>   <dbl>
    #  1          10  1.00e 2 1.00e 2
    #  2         100  1.00e 4 1.00e 4
    #  3        1000  1.00e 6 1.00e 6
    #  4       10000 10.00e 7 1.00e 8
    #  5      100000  1.00e10 1.00e10
    #  6     1000000 10.00e11 1.00e12
    #  7    10000000  1.00e14 1.00e14
    #  8   100000000  1.00e16 1.00e16
    #  9  1000000000  1.00e18 1.00e18
    # 10 10000000000  1.00e20 1.00e20
    # # ... with 90 more rows
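Applied to the calibration data from the question, the same idea might look like the sketch below. It makes a few assumptions: the response column is spelled `Fluorecence`, as in the sample data (the `lm` call in the question uses `Fluorescence`, so use whichever name your real data frame has); the standard-error band is back-transformed by exponentiating `fit ± se`, which gives an approximate, asymmetric band on the original scale; and the plot is only drawn if ggplot2 is installed.

```r
# Sample data from the question, rebuilt compactly (three replicates per level)
Calicurve <- data.frame(
  Particle.conc = rep(5 * 10^(1:11), each = 3),
  Fluorecence = c(2.649, 2.671, 2.502, 3.926, 3.965, 4, 6.674, 6.337, 6.56,
                  12.204, 12.168, 12.209, 24.91, 25.54, 25.384,
                  38.232, 37.845, 37.979, 80.547, 80.343, 79.891,
                  168.693, 168.008, 168.826, 349.318, 351.304, 355.288,
                  556.081, 555.348, 554.112, 1105.749, 1103.063, 1097.552)
)

# Fit the linear model on the log10-log10 scale
Calicurve.M2 <- lm(log10(Fluorecence) ~ log10(Particle.conc), data = Calicurve)

# Predict on the log scale, then invert the transformation with 10^
pdat <- data.frame(Particle.conc = 5 * 10^(1:11))
pred <- predict(Calicurve.M2, newdata = pdat, se.fit = TRUE)
predframe <- data.frame(
  pdat,
  preds   = 10^pred$fit,
  upperse = 10^(pred$fit + pred$se.fit),
  lowerse = 10^(pred$fit - pred$se.fit)
)
print(predframe)

# Optional plot on log axes; skipped when ggplot2 is not available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(Calicurve, aes(x = Particle.conc, y = Fluorecence)) +
    geom_point() +
    scale_x_log10(name = "Particle concentration (particles/mL)") +
    scale_y_log10(name = "Fluorescence (AFU)") +
    geom_line(data = predframe, aes(y = preds), linetype = 1) +
    geom_line(data = predframe, aes(y = upperse), linetype = 2) +
    geom_line(data = predframe, aes(y = lowerse), linetype = 2) +
    theme_bw()
  print(p)
}
```

Because the model is fitted on the log scale, new predictions for any concentration follow the same pattern: call `predict()` with the untransformed `Particle.conc` in `newdata` and raise 10 to the result.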