
How do I use predict.gam with a GAM that has an offset?

I have a dataset that looks like this:

structure(list(effort = c(2633, 7871, 10273, 
5202, 8550, 4698, 7357, 3670, 8933, 8301, 4416, 5355, 443, 8946, 
11168, 14572, 15552, 13947, 7969, 7541, 27478, 8698, 9044, 10803, 
29567, 9261, 1892, 8258, 9744, 5937, 11277, 7260, 6600, 1385, 
6959, 13788, 11792, 10363, 27837, 12622, 20954, 11912, 14986, 
14331, 14612, 7230, 25266, 25518, 8293, 6637, 9049, 6053, 6195, 
9957, 5039, 4840, 9757, 7760, 5836, 5741, 203, 5857, 4584, 5022, 
17794, 3499, 17010, 14025, 12059, 21645, 7174, 16150, 11445, 
12035, 24534, 6379, 11183, 6072, 10104, 6675, 14265, 9222, 9099, 
14397, 14097, 15684, 19315, 8753, 13876, 22169, 15724, 4688, 
21923, 16051, 8415, 6117, 11456, 10134, 5044, 19750, 10624, 9225, 
3935, 5995, 26458, 15806, 10188, 1641, 11402, 54, 7203, 9196, 
22643, 13905, 561, 7675, 6913, 7765, 11046, 9639, 10833, 16405, 
26188, 14262, 10092, 9834, 33753, 28133, 7095, 12020, 14248, 
10619, 8587, 11951, 8739, 10862, 4872, 6351, 2243, 5272, 2870, 
963, 18789, 20216, 17339, 20585, 16121, 8203, 11968, 7082, 12494, 
4731, 9975, 8863, 14946, 7321, 11694, 3228, 3375, 5607, 6223, 
10922, 5594, 604, 13512, 715, 16321, 5429, 15807, 17313, 3273, 
18884, 22627, 21474, 7898, 11273, 10482, 15778, 9962, 10997, 
12926, 8386, 11580, 10621, 3296, 8579, 14194, 9817, 7873, 8868, 
8093, 9366, 11594, 6801, 15844, 3426, 342, 13291, 7239, 6943, 
11958, 20140, 11373, 36384, 9897, 12543, 4293, 6691, 3176, 9847, 
1750, 794, 554, 6591, 14309, 2740, 6856, 8444, 3242, 2640, 8481, 
3197, 2332, 9287, 15318, 6410, 20876, 23016, 6741, 16704, 15311, 
7531, 8648, 2784, 7355, 8113, 13470, 11159, 14903, 8367, 7075, 
7312, 7496, 14094, 15349, 7191, 12474, 11323, 6793, 21977, 11888, 
17712, 4310, 6308, 16487, 19514, 9420, 6320, 7026, 1655, 7041, 
3070, 3533, 11043, 3843, 7483, 7150, 4463, 4319, 10384, 7579, 
8298, 2502, 4803, 8676, 16523, 10248, 5342, 4780, 3936, 17412, 
31632, 10323, 19263, 12757, 13171, 11301, 4273, 8657, 7512, 9319, 
9483, 3695, 4496, 7407, 26571, 5176, 2454, 9207, 9075, 16222, 
14280, 9963, 9426, 10864, 10627, 6665, 17141, 18597, 6093, 8094, 
4238), landings = c(116, 31, 0, 
0, 0, 0, 0, 0, 0, 120, 0, 241, 9, 0, 64, 326, 142, 605, 139, 
410, 212, 470, 416, 309, 1269, 474, 22, 135, 395, 464, 451, 32, 
2537, 210, 299, 1522, 184, 550, 666, 429, 1372, 184, 147, 1208, 
159, 951, 1000, 1100, 301, 144, 244, 0, 0, 281, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 42, 594, 26, 747, 436, 0, 914, 182, 8, 275, 175, 
766, 130, 930, 31, 177, 123, 895, 88, 107, 0, 4, 481, 909, 511, 
877, 402, 295, 336, 645, 310, 301, 398, 411, 0, 205, 293, 49, 
454, 162, 138, 1171, 0, 138, 0, 111, 0, 0, 36, 78, 114, 0, 0, 
134, 44, 549, 0, 378, 716, 739, 393, 203, 839, 70, 454, 132, 
651, 63, 1850, 217, 403, 55, 0, 408, 43, 17, 12, 26, 2, 811, 
581, 1216, 154, 1059, 89, 1862, 1310, 297, 29, 680, 0, 0, 29, 
0, 0, 0, 0, 0, 0, 17, 6, 0, 0, 0, 44, 909, 0, 0, 0, 194, 0, 212, 
18, 46, 44, 56, 365, 37, 0, 73, 11, 16, 19, 0, 0, 0, 23, 0, 92, 
0, 216, 0, 16, 0, 80, 319, 59, 35, 929, 47, 0, 0, 356, 0, 0, 
33, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 13, 0, 0, 91, 362, 
0, 0, 0, 0, 0, 29, 0, 0, 392, 105, 0, 94, 15, 222, 34, 44, 178, 
1867, 0, 224, 241, 23, 1502, 492, 168, 0, 234, 299, 453, 0, 406, 
149, 0, 39, 57, 86, 0, 28, 23, 265, 0, 0, 0, 168, 31, 20, 0, 
28, 78, 244, 13, 0, 99, 168, 861, 52, 649, 0, 174, 0, 0, 2462, 
64, 178, 0, 61, 0, 321, 391, 33, 17, 227, 241, 248, 294, 1119, 
37, 90, 0, 85, 37, 89, 0, 0, 0),  month = c(1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 
8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 
11L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 
6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 
10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 
5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 
8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 
11L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 
6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 
10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 
8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 
12L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 
6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 
10L, 10L, 10L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L), 
    Date = c(2014, 2014.01916495551, 2014.03832991102, 2014.05749486653, 
    2014.07665982204, 2014.09582477755, 2014.11498973306, 2014.13415468857, 
    2014.15331964408, 2014.17248459959, 2014.1916495551, 2014.21081451061, 
    2014.22997946612, 2014.24914442163, 2014.26830937714, 2014.28747433265, 
    2014.30663928816, 2014.32580424367, 2014.34496919918, 2014.36413415469, 
    2014.3832991102, 2014.40246406571, 2014.42162902122, 2014.44079397673, 
    2014.45995893224, 2014.47912388775, 2014.49828884326, 2014.51745379877, 
    2014.53661875428, 2014.55578370979, 2014.5749486653, 2014.59411362081, 
    2014.61327857632, 2014.63244353183, 2014.65160848734, 2014.67077344285, 
    2014.68993839836, 2014.70910335387, 2014.72826830938, 2014.74743326489, 
    2014.7665982204, 2014.78576317591, 2014.80492813142, 2014.82409308693, 
    2014.84325804244, 2014.86242299795, 2014.88158795346, 2014.90075290897, 
    2014.91991786448, 2014.93908281999, 2014.9582477755, 2014.97741273101, 
    2014.99657768652, 2015.01574264203, 2015.03490759754, 2015.05407255305, 
    2015.07323750856, 2015.09240246407, 2015.11156741958, 2015.13073237509, 
    2015.1498973306, 2015.16906228611, 2015.18822724162, 2015.20739219713, 
    2015.22655715264, 2015.24572210815, 2015.26488706366, 2015.28405201916, 
    2015.30321697467, 2015.32238193018, 2015.34154688569, 2015.3607118412, 
    2015.37987679671, 2015.39904175222, 2015.41820670773, 2015.43737166324, 
    2015.45653661875, 2015.47570157426, 2015.49486652977, 2015.51403148528, 
    2015.53319644079, 2015.5523613963, 2015.57152635181, 2015.59069130732, 
    2015.60985626283, 2015.62902121834, 2015.64818617385, 2015.66735112936, 
    2015.68651608487, 2015.70568104038, 2015.72484599589, 2015.7440109514, 
    2015.76317590691, 2015.78234086242, 2015.80150581793, 2015.82067077344, 
    2015.83983572895, 2015.85900068446, 2015.87816563997, 2015.89733059548, 
    2015.91649555099, 2015.9356605065, 2015.95482546201, 2015.97399041752, 
    2015.99315537303, 2016.01232032854, 2016.03148528405, 2016.05065023956, 
    2016.06981519507, 2016.08898015058, 2016.10814510609, 2016.1273100616, 
    2016.14647501711, 2016.16563997262, 2016.18480492813, 2016.20396988364, 
    2016.22313483915, 2016.24229979466, 2016.26146475017, 2016.28062970568, 
    2016.29979466119, 2016.3189596167, 2016.33812457221, 2016.35728952772, 
    2016.37645448323, 2016.39561943874, 2016.41478439425, 2016.43394934976, 
    2016.45311430527, 2016.47227926078, 2016.49144421629, 2016.5106091718, 
    2016.52977412731, 2016.54893908282, 2016.56810403833, 2016.58726899384, 
    2016.60643394935, 2016.62559890486, 2016.64476386037, 2016.66392881588, 
    2016.68309377139, 2016.7022587269, 2016.72142368241, 2016.74058863792, 
    2016.75975359343, 2016.77891854894, 2016.79808350445, 2016.81724845996, 
    2016.83641341547, 2016.85557837098, 2016.87474332649, 2016.893908282, 
    2016.91307323751, 2016.93223819302, 2016.95140314853, 2016.97056810404, 
    2016.98973305955, 2017.00889801506, 2017.02806297057, 2017.04722792608, 
    2017.06639288159, 2017.0855578371, 2017.10472279261, 2017.12388774812, 
    2017.14305270363, 2017.16221765914, 2017.18138261465, 2017.20054757016, 
    2017.21971252567, 2017.23887748118, 2017.25804243669, 2017.2772073922, 
    2017.29637234771, 2017.31553730322, 2017.33470225873, 2017.35386721424, 
    2017.37303216975, 2017.39219712526, 2017.41136208077, 2017.43052703628, 
    2017.44969199179, 2017.4688569473, 2017.48802190281, 2017.50718685832, 
    2017.52635181383, 2017.54551676934, 2017.56468172485, 2017.58384668036, 
    2017.60301163587, 2017.62217659138, 2017.64134154689, 2017.6605065024, 
    2017.67967145791, 2017.69883641342, 2017.71800136893, 2017.73716632444, 
    2017.75633127995, 2017.77549623546, 2017.79466119097, 2017.81382614648, 
    2017.83299110199, 2017.85215605749, 2017.871321013, 2017.89048596851, 
    2017.90965092402, 2017.92881587953, 2017.94798083504, 2017.96714579055, 
    2017.98631074606, 2018.00547570157, 2018.02464065708, 2018.04380561259, 
    2018.0629705681, 2018.08213552361, 2018.12046543463, 2018.13963039014, 
    2018.15879534565, 2018.17796030116, 2018.19712525667, 2018.21629021218, 
    2018.23545516769, 2018.2546201232, 2018.27378507871, 2018.29295003422, 
    2018.31211498973, 2018.33127994524, 2018.35044490075, 2018.36960985626, 
    2018.38877481177, 2018.40793976728, 2018.42710472279, 2018.4462696783, 
    2018.46543463381, 2018.48459958932, 2018.50376454483, 2018.52292950034, 
    2018.54209445585, 2018.56125941136, 2018.58042436687, 2018.59958932238, 
    2018.61875427789, 2018.6379192334, 2018.65708418891, 2018.67624914442, 
    2018.69541409993, 2018.71457905544, 2018.73374401095, 2018.75290896646, 
    2018.77207392197, 2018.79123887748, 2018.81040383299, 2018.8295687885, 
    2018.84873374401, 2018.86789869952, 2018.88706365503, 2018.90622861054, 
    2018.92539356605, 2018.94455852156, 2018.96372347707, 2018.98288843258, 
    2019.00205338809, 2019.0212183436, 2019.04038329911, 2019.05954825462, 
    2019.07871321013, 2019.09787816564, 2019.11704312115, 2019.13620807666, 
    2019.15537303217, 2019.17453798768, 2019.19370294319, 2019.2128678987, 
    2019.23203285421, 2019.25119780972, 2019.27036276523, 2019.28952772074, 
    2019.30869267625, 2019.32785763176, 2019.34702258727, 2019.36618754278, 
    2019.38535249829, 2019.4045174538, 2019.42368240931, 2019.44284736482, 
    2019.46201232033, 2019.48117727584, 2019.50034223135, 2019.51950718686, 
    2019.53867214237, 2019.55783709788, 2019.57700205339, 2019.5961670089, 
    2019.61533196441, 2019.63449691992, 2019.65366187543, 2019.67282683094, 
    2019.69199178645, 2019.71115674196, 2019.73032169747, 2019.74948665298, 
    2019.76865160849, 2019.787816564, 2019.80698151951, 2019.82614647502, 
    2019.84531143053, 2019.86447638604, 2019.88364134155, 2019.90280629706, 
    2019.92197125257, 2019.94113620808, 2019.96030116359, 2019.9794661191
    ))

I am running a GAM that looks like this:

CSA1.offset.gam.week <- gam(landings ~ s(Date, bs = "tp") + s(month, bs = "cc", k = 12) + offset(log(effort)),
                            data = CSA1.effort.land.week2, family = nb, method = "REML")

I am looking to use predict.gam() to plot my data in ggplot but am having issues due to the presence of an offset.

When I use predict.gam() like this to add a fit and SE to my dataset, it looks like this:

cbind(CSA1.effort.land.week2,
      predict.gam(CSA1.offset.gam.week,
                  se.fit = TRUE,
                  type = "response",
                  terms = "s(Date)"))

When I plot this fit, it shows up as an extremely jagged linear model.

This is what it looks like when I plot the data:

When I remove the offset, I see a GAM that aligns with what I expect my GAM to look like, but I need to include this offset in my model.

This is the GAM without the offset:

CSA1.offset.gam.week <- gam(landings ~ s(Date, bs = "tp") + s(month, bs = "cc", k = 12),
                            data = CSA1.effort.land.week2, family = nb, method = "REML")

This is what that GAM looks like with the same predict.gam call as in the previous example:

How should I be using the predict.gam function with this GAM if I intend to keep the offset?

The jaggedness is coming from the predictions using different (the observed) effort values. The data arose from different efforts, so if you want to compare the model output with the data then you need to provide the observed offsets.
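For example, as a minimal sketch (assuming CSA1.effort.land.week2 is the data frame built from the structure() output above): because offset(log(effort)) is part of the model formula, predict() picks up each row's observed effort when you predict at the observed data:

## Sketch: predictions at the observed covariate values; each row's actual
## effort enters via the offset, so the fits are comparable with the raw landings
fitted_obs <- predict(CSA1.offset.gam.week,
                      newdata = CSA1.effort.land.week2,
                      type = "response")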

If you want to show the model predictions for the same amount of effort per observation, generate some new data over the range of the covariates and provide a constant offset:

newd <- with(CSA1.effort.land.week2,
             data.frame(Date = seq(min(Date), max(Date), length = 1000),
                        effort = 1))
## Date here is a numeric decimal year, not a Date object, so recover the
## month from the fractional part of the year rather than with format()
newd <- transform(newd, month = floor((Date %% 1) * 12) + 1)

(Not tested; I am away from a computer with R right now.)

But don't expect this to go anywhere near the data, because that would be an apples-to-oranges comparison, for the same reason you modelled the data with the offset in the first place.

With effort = 1, the units on the values given by predict() will be landings per unit effort. If you set effort = 10, it would be landings per 10 units of effort. Whatever effort is measured in gives you the final units. If effort was in hours, then for effort = 10 the units would be landings per 10 hours.

When you include an offset in a count model like this, the model becomes a model of a rate per unit effort (in your case). Hence you can get predicted counts out of the model for any amount of effort, but you do need to supply it as a constant value if you want to compare predicted counts on a common basis.
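As a hedged sketch of that in code (assuming the newd data frame above with effort = 1, and an arbitrary illustrative effort of 5000): because the model uses a log link, the value predicted at effort = 1 is a rate, and multiplying it by any effort gives the expected count for that effort.

## With effort = 1, the offset log(1) = 0 drops out, so type = "response"
## returns expected landings per unit effort (a rate)
rate <- predict(CSA1.offset.gam.week, newdata = newd, type = "response")

## expected landings for 5000 units of effort (arbitrary illustrative value);
## under the log link, expected count = rate * effort
expected_landings <- rate * 5000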

I don't know how you created the plot shown, but don't use the standard error when type = "response"; it's not wrong, it's just not useful for creating a confidence interval on the response scale. If you didn't create the confidence interval the proper way (compute it on the link scale, then back-transform to the response scale using the inverse of the link function), then it is likely wrong.
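A minimal sketch of that recipe (assuming the fitted model and newd from above; 1.96 gives an approximate 95% interval):

## predict on the link (log) scale, with standard errors
pr <- predict(CSA1.offset.gam.week, newdata = newd,
              type = "link", se.fit = TRUE)

## inverse link function of the fitted family (exp() for nb's log link)
ilink <- family(CSA1.offset.gam.week)$linkinv

## build the interval on the link scale, then back-transform
newd <- transform(newd,
                  fit   = ilink(pr$fit),
                  lower = ilink(pr$fit - 1.96 * pr$se.fit),
                  upper = ilink(pr$fit + 1.96 * pr$se.fit))

## plot in ggplot, as the question intended
library(ggplot2)
ggplot(newd, aes(x = Date, y = fit)) +
    geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
    geom_line()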
