Circular statistics polar coordinates data in R

I am very new to dealing with this type of data (polar coordinates!) and it would be great if someone could help me out.

My data come from an experiment in which I had 66 different pairs of bacterial strains whose interactions I wanted to study. To study the effect of each strain on the other, I compared the abundance of each strain when grown alone with its abundance when paired with the other strain. So, for example, I calculated the effect of strain A on strain B and the effect of strain B on strain A. For each of my pairs, this gave me a point whose coordinates are the effect of A on B and the effect of B on A. I converted these data into polar coordinates and obtained this type of dataset:

treatment   radius  theta
1   5.346605488 -53.42975695
1   4.781032074 -48.89982034
1   3.408335845 -45.32998294
1   1.594707376 -30.28159102
1   4.995105439 -47.46835867
2   2.182870308 -69.97527886
2   1.376227293 -82.86544789
2   1.996722475 -81.86548945
2   4.087804099 -89.21073708
2   5.665053864 -71.35803445
3   7.655837189 -80.95798067
3   2.689244996 -69.29463991
3   3.286408329 -88.82404786
3   2.372054818 -77.3849227
3   2.401522618 -73.50042193
4   1.957466401 -86.72672854
4   1.094525546 -78.37493516
4   10.39191001 -79.39487844
4   2.81619011  -55.33935439
4   2.768492768 -83.27824524
5   2.960390522 -83.01541004
5   8.667030807 -85.17497452
5   3.171949653 -85.45600376
5   1.261198824 -52.24672527
6   4.339405038 -55.69025966
6   4.66365939  -59.88091407
6   3.030254841 -67.97377372
6   2.353734464 -61.28096828
6   1.046854294 -32.56853164
7   6.588535649 -85.19534077
7   4.198267055 -62.49718747
7   9.515127289 -82.14133253
7   1.261910096 -63.06872102
8   1.233816215 -47.98689163
8   0.855861695 -59.11215779
8   2.397212184 -80.04916414
8   5.404919495 -81.97150648
9   2.688518935 -61.30467223
9   3.966309178 -69.84947341
9   2.432244246 -68.81819762
9   3.43740085  -55.16458675
9   0.997997694 -71.83281473
10  1.683307917 -42.95687293
10  0.820014414 -80.66580717
10  1.290828883 -83.42955371
10  1.465916446 -79.83509581
10  2.302205529 -86.0459686
11  5.308080093 -73.1243189
11  1.520872026 -88.34749575
11  6.454746366 -76.58588688
11  4.78895044  -86.06747421
11  3.257530999 -70.74498431
12  2.900747649 -31.43851989
12  7.20087566  -74.40240034
12  3.506042507 -45.99152964
12  4.185267099 -50.23151617
12  7.050000726 -53.81709571
13  1.384155427 -58.09424224
13  5.053845739 -71.55457806
13  4.735068509 -84.72403735
13  2.680085474 -79.49351393
13  3.14974405  -74.8777932
14  2.948549954 -62.8023809
14  4.127180564 -80.86173441
14  3.262360907 -50.07616196
14  1.696876591 -22.29395164
14  2.729769567 -74.38232362
15  1.2955073   -81.4032846
15  3.527414889 -69.86791141
15  2.784319251 -52.20531863
15  2.612819797 -36.47272335
15  1.842054162 -66.66059134
16  1.840787298 -72.57479843
16  6.8643205   -73.09682469
16  2.654893118 -83.12032406
16  0.966705258 17.92145538
17  2.486480981 -72.75194085
17  1.743896498 -6.866658211
17  1.67501909  -74.00470613
17  0.257437514 9.570402798
18  0.713744723 -89.85444042
18  2.66346159  -69.89457024
18  0.643897066 -46.87018048
18  3.227695962 -82.37095963
18  0.927908178 -81.8102089
19  1.419620687 -27.19633419
19  1.235456006 -48.95104975
19  2.341406093 -59.51153717
19  1.707978572 -42.20335283
20  2.64762226  -62.26528889
20  3.999628573 -80.97346898
20  1.343423811 -50.26800644
21  0.550719617 -21.44166023
21  0.998411135 -53.46021735
21  4.645733848 -89.46562929
21  1.184768725 -25.64563336
22  1.236062405 -48.25907998
22  1.62781082  26.24924794
22  0.482285052 -49.13934417
22  2.456873132 -84.42483449
22  0.633405353 -34.76443981
23  2.501732027 -77.5145514
23  1.553947876 -47.03314351
23  1.904313581 -20.48864195
23  1.417719503 -18.97532658
23  3.359978244 -65.98810342
24  1.841779957 -57.82423336
24  1.944168995 -83.72435556
24  1.723335563 -86.43854809
24  2.245607465 -18.05044439
24  1.71018206  -84.01572549
25  3.160911024 -25.89542425
25  1.884353194 -88.84667861
25  1.657340195 -57.91688887
25  3.244710974 -30.41742685
25  3.047461157 -57.65594863
26  4.142434092 -79.10775556
26  4.70885302  -84.38144988
26  3.871701704 -77.37403595
26  1.815104811 -80.07878221
26  5.756489628 -87.15817329
27  1.760229703 -23.71849938
27  1.619479137 -89.41313301
27  0.949475302 -52.16437553
27  0.566431907 -2.634253126
27  1.67229617  -78.27332119
28  1.327650364 47.90821531
28  1.740336854 -44.01261513
28  1.321542483 -46.98765031
28  1.333688986 11.19965187
28  1.419719047 -69.08867896
29  0.648536009 -48.88086991
29  2.112819841 -84.22410986
29  1.088339926 -60.49238911
29  0.446947519 -1.971477582
29  0.726254374 17.38780438
30  0.610318812 12.73868599
30  1.011102767 -18.69664112
30  2.357970381 83.45729602
30  3.075981632 -86.54599794
30  4.399281053 -79.66361213
31  9.682561002 -73.23143687
31  6.486798742 -51.40872403
31  4.744326098 -57.84898633
31  10.94679131 -49.77486765
31  6.288273977 -53.54395613
32  2.422406181 -66.66946557
32  0.920208692 -50.71386553
32  2.318672106 -30.51639453
32  1.18158908  -65.28441973
33  0.770702488 -34.5071325
33  4.809790703 -87.88054507
33  1.243396123 -57.43726582
33  0.826032874 -63.68191021
33  2.379570873 -78.77666128
34  4.58813844  -57.16272711
34  3.240458513 -74.18252573
34  1.450322312 -46.82466405
34  5.097538168 -88.4063221
34  0.933642832 -20.56162779
35  6.494675784 -85.81998773
35  2.982113314 -83.96232252
35  3.209362461 -88.53565823
36  1.101045576 -63.44748768
36  2.18134314  -79.1625091
36  3.329661735 -88.18964925
36  2.110430927 -87.22933857
36  3.124463519 -81.17945818
37  2.256008327 -88.78732993
37  1.311453668 -74.55941719
37  3.458028215 -58.63737495
37  2.287683009 -63.59969694
37  3.089712989 -50.56807704
38  1.164101757 -84.99593698
38  1.227273765 -74.89875991
38  2.568166667 79.64657422
38  0.717633728 -19.13324987
38  0.466430262 -87.88375091
39  2.619934245 -57.70555911
39  4.505659844 -76.79125763
39  7.912571121 -48.30617156
39  3.936037923 -64.58813369
39  4.71978189  -42.52477656
40  6.26191457  -84.24228913
40  5.672705474 -89.53114846
40  1.791731701 -33.36207675
40  3.379644282 -69.43863361
40  1.563490233 -51.96221695
41  1.120633049 67.59124584
41  3.851234728 -54.95557743
41  3.992934669 -64.04201801
41  3.963263793 -83.83257337
41  6.285734806 -58.94235124
42  1.037265768 -56.39585703
42  0.702067455 -78.28956554
42  1.874208904 -73.25338047
42  2.683350538 -63.40789813
42  0.822052527 -76.00088947
43  2.071759974 -78.17313857
43  2.689560915 -88.73055479
43  0.703831415 -62.30962246
43  2.341558274 -83.86680849
43  1.595247369 -80.6598101
44  1.047233184 -84.42259527
44  2.543651769 -86.51355692
44  1.577552784 -73.91302464
44  1.689553615 -74.79168199
45  0.99484303  -80.19967507
45  1.044169017 -43.04384197
45  1.164074471 -49.08276664
45  0.689804286 -44.37561371
46  1.083862964 -64.15217472
46  2.626979422 -88.64235148
46  2.454247469 -47.05093786
46  2.77983216  -89.26329048
46  2.636957485 -81.03972204
47  0.413704382 -56.55826312
47  1.145326012 -84.64626702
47  2.038399115 -81.85662372
47  2.253731222 -86.41566587
47  1.3133469   64.52243412
48  3.403932556 -80.36071198
48  2.04929866  -70.47795907
48  2.274349863 -77.81310408
48  0.13593279  46.38254256
48  0.702184063 -3.612045051
49  0.906223302 -78.31476515
49  0.554538317 50.33543382
49  0.089680453 23.33095578
49  2.2831634   -67.13247686
49  1.627864676 -72.52132829
50  0.34672496  66.11871934
50  1.160451029 -66.89054777
50  1.760678964 -58.30762395
50  1.254324633 -41.19404821
50  1.730734607 -67.09376641
51  0.842785032 81.82395052
51  1.974954473 -80.682135
51  0.865344327 -47.07894474
51  0.909533784 -76.02511259
51  1.685635709 78.20115372
52  2.392503081 -79.74664961
52  1.946110615 -70.94946049
52  8.574258316 -42.65386143
52  2.538714806 -48.03630574
52  2.529050979 -29.92328407
53  2.866661501 -78.54261642
53  2.590927316 -75.08833379
53  5.90479778  -52.57345606
53  5.799716577 -48.23386105
53  2.07192245  -88.00474074
54  1.680713598 -73.63319735
54  2.557497408 -67.94294466
54  2.393764255 -56.19194117
54  6.026774503 -82.18275762
54  0.788317053 -30.6404756
55  1.859034516 -77.09826262
55  3.817813613 -69.26816285
55  3.42856831  -71.12750351
55  2.454971668 -61.22096633
55  1.618509495 -63.11739719
56  4.582911236 -31.04536129
56  3.997256178 -53.91325239
56  1.219757833 -77.19780486
56  7.377663053 -76.85745566
56  2.881431405 -68.73534505
57  2.004327103 -74.8284809
57  2.748344386 -71.30903296
57  2.460209206 -69.62695585
57  1.775560107 -80.3723268
57  2.645612131 -89.09829133
58  1.54991856  -62.44032153
58  6.534223736 -85.84648469
58  0.422375885 62.16515901
58  0.610233226 -27.29862046
58  3.393479727 -56.14377871
59  1.601596058 -64.70208698
59  1.319995497 -74.36073404
59  0.383925829 -45.27272566
59  0.770918761 -14.94572655
59  2.768653593 -88.55860395
60  3.100129667 -79.77906075
60  0.754369481 -56.41040781
60  1.063078742 -38.06663906
60  4.430727193 -35.69757344
60  1.151375695 -58.31667216
61  0.222873197 56.11287704
61  0.693297704 -10.64168064
61  4.895399027 -80.48905585
61  1.476161129 -79.09424876
61  2.362055223 -82.82717397
62  0.271796128 -77.99977538
62  2.712343601 -83.45960915
62  2.640394011 -79.62228636
62  2.044795801 -65.63167684
62  0.977870455 -79.87283982
63  2.093999943 -34.17579722
63  2.066850888 -58.30139095
63  1.919951867 -72.04915327
63  1.847203007 -53.3080848
63  2.523697396 -86.41516485
64  3.396946374 -76.88253208
64  1.502111371 -56.46177994
64  3.447925893 -83.55087526
64  1.436601103 -78.3262211
65  0.973834501 30.49055
65  1.047860828 22.48877392
65  1.493635682 -5.403700435
65  1.468349931 27.39802239
65  3.062567844 -37.09222845
66  1.172876541 83.48805632
66  1.35911911  -68.34950195
66  0.828509975 3.070100175
66  1.171217644 -37.19274186
66  1.914431983 -35.89376613

The idea is that the radius gives me a measure of the intensity of the interaction between the two strains, while the angle (theta) tells me the type of interaction between them. The type depends on which quadrant of the cartesian plot the angle falls in: for example, if it falls in the quadrant (-,-), the interaction is mutual inhibition between the two strains.
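
The conversion described above can be sketched as follows. This is only an illustration in Python (the arithmetic is the same in R), and it rests on assumptions the question does not state: that the effect of A on B is the x coordinate, that the effect of B on A is the y coordinate, and that the angle is measured with a two-argument arctangent. The function names and the input values are hypothetical.

```python
import math

def to_polar(effect_a_on_b, effect_b_on_a):
    """Return (radius, theta in degrees) for one pair of effects.

    Assumption: effect_a_on_b is the x coordinate, effect_b_on_a the y coordinate.
    """
    radius = math.hypot(effect_a_on_b, effect_b_on_a)
    theta = math.degrees(math.atan2(effect_b_on_a, effect_a_on_b))
    return radius, theta

def sign_pattern(effect_a_on_b, effect_b_on_a):
    """Quadrant label such as '(-,-)' built from the signs of the two effects."""
    def sign(value):
        return "+" if value >= 0 else "-"
    return f"({sign(effect_a_on_b)},{sign(effect_b_on_a)})"

# Hypothetical effect values, not taken from the experiment.
radius, theta = to_polar(-3.0, -4.0)
pattern = sign_pattern(-3.0, -4.0)
print(radius, theta, pattern)
```

Note that `atan2` returns angles over the full circle, whereas every theta in the table lies between -90° and 90°. How the posted angles were computed is not stated, so the sketch may not reproduce them.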

I would like to do statistics to get a measure of the significance of the interactions between each pair of strains. I thought I could determine whether, for each treatment and across replicates, the distance from the centre (radius) and the angle (theta) are significant (i.e. significantly different from 0). I am aware of the package "circular", but I am not sure how to use it or which test would be useful for my case.

Any suggestion (including reading material!) would be very useful.

You may have many more data points, but scrutiny of a quick plot suggests:

  1. The observed range of angles is very limited, which often implies that you don't really need circular statistics. It is not even a given that angle is the best way to record direction: sine or cosine may be closer to the underlying problem. Some kind of regression on angle or sine or cosine may suffice, perhaps parameterised so that $-90^\circ$ defines an intercept. (Perhaps you should just add $90^\circ$.)

  2. There may well be hard limits in practice on what angles are possible or likely, and if so, knowing them is very important to guide what makes scientific and statistical sense.

  3. The question of whether radius is significantly different from zero is hard to understand. First off, whatever is called a radius is usually a positive quantity, and if your definition implies otherwise, please explain. Second off, all reported values in the example are positive, so a significance test appears pointless for that reason. Perhaps you mean something more like "are radius and angle related?", to which the answer appears to be yes. Assuming that radius must be positive, analysis in terms of its logarithm seems indicated to me.

  4. It makes quite a difference to the analysis whether the angle is in some sense given and the radius is the outcome to be explained (which is often true with problems like this), or the opposite, or neither.

  5. Given the small subsamples, I have not attempted an analysis in terms of treatments, but the plot suggests that (e.g.) treatment 1 at least is quite distinctive.

Good statistical advice is very hard to give with no context whatsoever on what the numbers represent (other than something in polar coordinates).
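
As a rough illustration of points 1 and 3, here is a minimal sketch in Python (not R) that adds $90^\circ$ to the angle so that $-90^\circ$ becomes zero, takes the logarithm of the radius, and fits a straight line by ordinary least squares. Treating log radius as the outcome and angle as the predictor is an assumption (point 4 leaves this open), the helper `ols` is hypothetical, and only the ten rows for treatments 1 and 2 from the question are used.

```python
import math

# (radius, theta) rows copied from treatments 1 and 2 of the posted table.
rows = [
    (5.346605488, -53.42975695),
    (4.781032074, -48.89982034),
    (3.408335845, -45.32998294),
    (1.594707376, -30.28159102),
    (4.995105439, -47.46835867),
    (2.182870308, -69.97527886),
    (1.376227293, -82.86544789),
    (1.996722475, -81.86548945),
    (4.087804099, -89.21073708),
    (5.665053864, -71.35803445),
]

# Shift the angle so that -90 degrees maps to 0 and the intercept refers to it.
shifted = [theta + 90.0 for _, theta in rows]
# Radius is positive, so work with its logarithm.
log_r = [math.log(radius) for radius, _ in rows]

def ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = ols(shifted, log_r)
print(f"log(radius) ~ {intercept:.3f} + {slope:.4f} * (theta + 90)")
```

With ten rows from two treatments the fitted line means very little. The sketch only shows the reparameterisation, not a recommended model.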

EDIT Given more information and the full dataset, I can try a little more.

Disclaimer. I don't understand the science here and I don't even understand what kind of statistical problem this is. So why say more? I have some experience with circular data, which many statistical people don't have at all, so perhaps something a little helpful can be said.

The full dataset is 315 observations on 66 treatments. Most treatments are represented by 5 observations, but some by only 3 or 4 (which is why the total is not 330).
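
The group sizes can be checked with a short tally. The sketch below, in Python rather than R, parses an excerpt of the whitespace-separated table from the question (treatments 4 and 5 only) and counts the rows per treatment. Running it on the full table is left to the reader, and the variable names are hypothetical.

```python
from collections import Counter

# Excerpt of the posted table (treatments 4 and 5 only); the full data has 315 rows.
table_text = """\
treatment   radius  theta
4   1.957466401 -86.72672854
4   1.094525546 -78.37493516
4   10.39191001 -79.39487844
4   2.81619011  -55.33935439
4   2.768492768 -83.27824524
5   2.960390522 -83.01541004
5   8.667030807 -85.17497452
5   3.171949653 -85.45600376
5   1.261198824 -52.24672527
"""

rows = []
for line in table_text.splitlines()[1:]:  # skip the header line
    treatment, radius, theta = line.split()
    rows.append((int(treatment), float(radius), float(theta)))

# Number of observations per treatment, and the treatments with fewer than 5.
counts = Counter(treatment for treatment, _, _ in rows)
short = {treatment: n for treatment, n in counts.items() if n < 5}
print(counts, short)
```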

I can readily imagine that this represents a great deal of hard experimental work, but unfortunately, from a statistical point of view, 3, 4 or 5 observations is a very small sub-sample size for saying much reliably about individual treatments.

Circular plots may seem natural given the outcome space, but there can be a chicken-and-egg problem: you have to look at several before you can think easily about any one (unless perhaps you already have some experience of thinking in and about that space, as is often true of compass direction). That aside, I have found that linear plots can be very helpful too, which runs a little contrary to the advice in circular statistics texts and reviews.

A plot of the full data shows that they too seem limited, but now to half the circle. Is there any sense in which $90^\circ = -90^\circ$? That would be an important detail. It affects what kind of test might make sense, but "typical" theta, however measured, is a long way from zero.
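
To illustrate "however measured", the sketch below (Python, not R) computes a typical theta in two ways for treatment 1 of the question: the ordinary arithmetic mean and the circular mean direction obtained by averaging sines and cosines. It assumes angles in degrees on a full circle. The function name `circular_mean_deg` is hypothetical, and the second data set (170° and -170°) is made up purely to show when the two measures disagree.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction in degrees, from the summed sines and cosines."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))

# Theta values for treatment 1, copied from the posted table.
theta_t1 = [-53.42975695, -48.89982034, -45.32998294, -30.28159102, -47.46835867]
arith_t1 = sum(theta_t1) / len(theta_t1)
circ_t1 = circular_mean_deg(theta_t1)
print(arith_t1, circ_t1)  # nearly identical for a narrow range of angles

# Made-up angles straddling the +/-180 degree cut, where the two measures differ.
wrapped = [170.0, -170.0]
arith_wrapped = sum(wrapped) / len(wrapped)
circ_wrapped = circular_mean_deg(wrapped)
print(arith_wrapped, circ_wrapped)
```

For a narrow cluster such as treatment 1 the two summaries agree closely. That is the practical sense in which circular methods add little when the angles span a small range.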

Another plot shows all of the data repeatedly, but a little arbitrarily picks out 4 treatments as extreme in either median or median absolute deviation on radius, theta or both. That's cherry-picking and has nothing to do with any formal testing.
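
The two summaries named above, median and median absolute deviation (MAD), can be computed per treatment as in this Python sketch (the source gives no code, so the layout and the helper `mad` are assumptions). It uses the rows for treatments 1 and 2 from the question and does not scale the MAD by any constant.

```python
from statistics import median

def mad(values):
    """Median absolute deviation from the median (no scaling constant)."""
    centre = median(values)
    return median([abs(v - centre) for v in values])

# (radius, theta) rows for treatments 1 and 2, copied from the posted table.
data = {
    1: [(5.346605488, -53.42975695), (4.781032074, -48.89982034),
        (3.408335845, -45.32998294), (1.594707376, -30.28159102),
        (4.995105439, -47.46835867)],
    2: [(2.182870308, -69.97527886), (1.376227293, -82.86544789),
        (1.996722475, -81.86548945), (4.087804099, -89.21073708),
        (5.665053864, -71.35803445)],
}

summary = {}
for treatment, rows in data.items():
    radii = [radius for radius, _ in rows]
    thetas = [theta for _, theta in rows]
    summary[treatment] = {
        "radius_median": median(radii),
        "radius_mad": mad(radii),
        "theta_median": median(thetas),
        "theta_mad": mad(thetas),
    }

for treatment, stats in summary.items():
    print(treatment, stats)
```

Ranking treatments by these summaries helps decide which ones to highlight in a plot, but, as noted, it is descriptive only.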

As before, a logarithmic scale for radius might help.


It is unclear what sort of statistical analysis you are doing, but you will most likely not need to bother yourself with circular statistics such as circular distributions. This is because you are using an angle to express/illustrate the effects, but the angle is not an underlying mechanism in the distribution of the observations.


In the plot below you have the data in cartesian coordinates. The first six interactions are colored to give an impression of the variation within a single group.

  • You can see that the treatments have a large variation in magnitude but cluster around a particular angle. So this conversion to angle is indeed good and seems a simple way to categorize, quantify and compare the symbiosis.

    Alternatively, you could assume an error distribution for the points that has a correlation between the two axes: if the effect of 'a on b' is low/high, then so is the effect of 'b on a'. But this means that you have to deal with correlated errors. The conversion to polar coordinates seems to deal easily and naturally with this issue.

  • You do not need circular statistics.

    • The nature of the distribution of the angle is not such that the errors wrap around. The variation in the angle for a specific treatment is small.

    • The mechanism that determines the symbiosis is also not related to the angle or to anything circular.

      You have different strains that have some coefficients of interaction/effect, 'a on b' and 'b on a'. These two coefficients (let's call them $\beta_1$ and $\beta_2$) may vary somewhat (due to all kinds of measurement errors or variations in the experiment), and the magnitude of the overall interaction $\beta_m$ may vary (e.g. different times of incubation, treatment, temperature, or other factors that might influence the magnitude).

      This gives you a simplistic model of the two effects, $\beta_1 \cdot \beta_m$ and $\beta_2 \cdot \beta_m$. The errors/variations do not create a circular effect. Random increases in the coefficients (which might add up) never return you to the same point.

      Comparison: something circular is an arrow pointing in some direction on a plane (imagine, for instance, the hands on a clock). If the arrow is randomly turned in some direction, then multiple additions in the same direction might bring the hand back to its original place. You do not have this behavior. If some effects in your experiment cause fluctuations in the model coefficients $\beta_1$ and $\beta_2$, then multiple additions in the same direction will not bring the angle of the effect back to its original position.
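
The simple model above can be sketched numerically. In this Python illustration (not R) the coefficient values and variable names are hypothetical, and the angle is taken with a two-argument arctangent, which is an assumption. The sketch shows that scaling both effects by a positive $\beta_m$ changes the radius but leaves the angle untouched, and contrasts this with a clock hand, where repeated additions wrap around.

```python
import math

# Hypothetical coefficients: effect 'a on b' and effect 'b on a'.
beta_1, beta_2 = -1.0, -3.0

radii, thetas = [], []
for beta_m in (0.5, 1.0, 2.0, 4.0):  # hypothetical overall magnitudes
    x, y = beta_1 * beta_m, beta_2 * beta_m
    radii.append(math.hypot(x, y))
    thetas.append(math.degrees(math.atan2(y, x)))

print(radii)   # grows in proportion to beta_m
print(thetas)  # the same angle every time

# Contrast: a clock-like hand at 350 degrees turned by a further 20 degrees
# wraps around the circle and ends up near where a hand at 10 degrees started.
hand = (350 + 20) % 360
print(hand)
```

In this model the magnitude varies freely along a ray from the origin, while the angle only moves if $\beta_1$ or $\beta_2$ themselves change. That matches the pattern of large variation in radius and small variation in angle described above.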

