
Circular statistics polar coordinates data in R

I am very new to dealing with this type of data (polar coordinates!) and it would be great if someone could help me out.

My data come from an experiment in which I had 66 different pairs of bacterial strains whose interactions I wanted to study. To study the effect of each strain on the other, I compared the abundance of each strain when grown alone with its abundance when paired with the other strain. So, for example, I calculated the effect of strain A on strain B and the effect of strain B on strain A. This gave me, for each pair, a point whose coordinates are the effect of A on B and the effect of B on A. I converted these data into polar coordinates and obtained this type of dataset:

treatment   radius  theta
1   5.346605488 -53.42975695
1   4.781032074 -48.89982034
1   3.408335845 -45.32998294
1   1.594707376 -30.28159102
1   4.995105439 -47.46835867
2   2.182870308 -69.97527886
2   1.376227293 -82.86544789
2   1.996722475 -81.86548945
2   4.087804099 -89.21073708
2   5.665053864 -71.35803445
3   7.655837189 -80.95798067
3   2.689244996 -69.29463991
3   3.286408329 -88.82404786
3   2.372054818 -77.3849227
3   2.401522618 -73.50042193
4   1.957466401 -86.72672854
4   1.094525546 -78.37493516
4   10.39191001 -79.39487844
4   2.81619011  -55.33935439
4   2.768492768 -83.27824524
5   2.960390522 -83.01541004
5   8.667030807 -85.17497452
5   3.171949653 -85.45600376
5   1.261198824 -52.24672527
6   4.339405038 -55.69025966
6   4.66365939  -59.88091407
6   3.030254841 -67.97377372
6   2.353734464 -61.28096828
6   1.046854294 -32.56853164
7   6.588535649 -85.19534077
7   4.198267055 -62.49718747
7   9.515127289 -82.14133253
7   1.261910096 -63.06872102
8   1.233816215 -47.98689163
8   0.855861695 -59.11215779
8   2.397212184 -80.04916414
8   5.404919495 -81.97150648
9   2.688518935 -61.30467223
9   3.966309178 -69.84947341
9   2.432244246 -68.81819762
9   3.43740085  -55.16458675
9   0.997997694 -71.83281473
10  1.683307917 -42.95687293
10  0.820014414 -80.66580717
10  1.290828883 -83.42955371
10  1.465916446 -79.83509581
10  2.302205529 -86.0459686
11  5.308080093 -73.1243189
11  1.520872026 -88.34749575
11  6.454746366 -76.58588688
11  4.78895044  -86.06747421
11  3.257530999 -70.74498431
12  2.900747649 -31.43851989
12  7.20087566  -74.40240034
12  3.506042507 -45.99152964
12  4.185267099 -50.23151617
12  7.050000726 -53.81709571
13  1.384155427 -58.09424224
13  5.053845739 -71.55457806
13  4.735068509 -84.72403735
13  2.680085474 -79.49351393
13  3.14974405  -74.8777932
14  2.948549954 -62.8023809
14  4.127180564 -80.86173441
14  3.262360907 -50.07616196
14  1.696876591 -22.29395164
14  2.729769567 -74.38232362
15  1.2955073   -81.4032846
15  3.527414889 -69.86791141
15  2.784319251 -52.20531863
15  2.612819797 -36.47272335
15  1.842054162 -66.66059134
16  1.840787298 -72.57479843
16  6.8643205   -73.09682469
16  2.654893118 -83.12032406
16  0.966705258 17.92145538
17  2.486480981 -72.75194085
17  1.743896498 -6.866658211
17  1.67501909  -74.00470613
17  0.257437514 9.570402798
18  0.713744723 -89.85444042
18  2.66346159  -69.89457024
18  0.643897066 -46.87018048
18  3.227695962 -82.37095963
18  0.927908178 -81.8102089
19  1.419620687 -27.19633419
19  1.235456006 -48.95104975
19  2.341406093 -59.51153717
19  1.707978572 -42.20335283
20  2.64762226  -62.26528889
20  3.999628573 -80.97346898
20  1.343423811 -50.26800644
21  0.550719617 -21.44166023
21  0.998411135 -53.46021735
21  4.645733848 -89.46562929
21  1.184768725 -25.64563336
22  1.236062405 -48.25907998
22  1.62781082  26.24924794
22  0.482285052 -49.13934417
22  2.456873132 -84.42483449
22  0.633405353 -34.76443981
23  2.501732027 -77.5145514
23  1.553947876 -47.03314351
23  1.904313581 -20.48864195
23  1.417719503 -18.97532658
23  3.359978244 -65.98810342
24  1.841779957 -57.82423336
24  1.944168995 -83.72435556
24  1.723335563 -86.43854809
24  2.245607465 -18.05044439
24  1.71018206  -84.01572549
25  3.160911024 -25.89542425
25  1.884353194 -88.84667861
25  1.657340195 -57.91688887
25  3.244710974 -30.41742685
25  3.047461157 -57.65594863
26  4.142434092 -79.10775556
26  4.70885302  -84.38144988
26  3.871701704 -77.37403595
26  1.815104811 -80.07878221
26  5.756489628 -87.15817329
27  1.760229703 -23.71849938
27  1.619479137 -89.41313301
27  0.949475302 -52.16437553
27  0.566431907 -2.634253126
27  1.67229617  -78.27332119
28  1.327650364 47.90821531
28  1.740336854 -44.01261513
28  1.321542483 -46.98765031
28  1.333688986 11.19965187
28  1.419719047 -69.08867896
29  0.648536009 -48.88086991
29  2.112819841 -84.22410986
29  1.088339926 -60.49238911
29  0.446947519 -1.971477582
29  0.726254374 17.38780438
30  0.610318812 12.73868599
30  1.011102767 -18.69664112
30  2.357970381 83.45729602
30  3.075981632 -86.54599794
30  4.399281053 -79.66361213
31  9.682561002 -73.23143687
31  6.486798742 -51.40872403
31  4.744326098 -57.84898633
31  10.94679131 -49.77486765
31  6.288273977 -53.54395613
32  2.422406181 -66.66946557
32  0.920208692 -50.71386553
32  2.318672106 -30.51639453
32  1.18158908  -65.28441973
33  0.770702488 -34.5071325
33  4.809790703 -87.88054507
33  1.243396123 -57.43726582
33  0.826032874 -63.68191021
33  2.379570873 -78.77666128
34  4.58813844  -57.16272711
34  3.240458513 -74.18252573
34  1.450322312 -46.82466405
34  5.097538168 -88.4063221
34  0.933642832 -20.56162779
35  6.494675784 -85.81998773
35  2.982113314 -83.96232252
35  3.209362461 -88.53565823
36  1.101045576 -63.44748768
36  2.18134314  -79.1625091
36  3.329661735 -88.18964925
36  2.110430927 -87.22933857
36  3.124463519 -81.17945818
37  2.256008327 -88.78732993
37  1.311453668 -74.55941719
37  3.458028215 -58.63737495
37  2.287683009 -63.59969694
37  3.089712989 -50.56807704
38  1.164101757 -84.99593698
38  1.227273765 -74.89875991
38  2.568166667 79.64657422
38  0.717633728 -19.13324987
38  0.466430262 -87.88375091
39  2.619934245 -57.70555911
39  4.505659844 -76.79125763
39  7.912571121 -48.30617156
39  3.936037923 -64.58813369
39  4.71978189  -42.52477656
40  6.26191457  -84.24228913
40  5.672705474 -89.53114846
40  1.791731701 -33.36207675
40  3.379644282 -69.43863361
40  1.563490233 -51.96221695
41  1.120633049 67.59124584
41  3.851234728 -54.95557743
41  3.992934669 -64.04201801
41  3.963263793 -83.83257337
41  6.285734806 -58.94235124
42  1.037265768 -56.39585703
42  0.702067455 -78.28956554
42  1.874208904 -73.25338047
42  2.683350538 -63.40789813
42  0.822052527 -76.00088947
43  2.071759974 -78.17313857
43  2.689560915 -88.73055479
43  0.703831415 -62.30962246
43  2.341558274 -83.86680849
43  1.595247369 -80.6598101
44  1.047233184 -84.42259527
44  2.543651769 -86.51355692
44  1.577552784 -73.91302464
44  1.689553615 -74.79168199
45  0.99484303  -80.19967507
45  1.044169017 -43.04384197
45  1.164074471 -49.08276664
45  0.689804286 -44.37561371
46  1.083862964 -64.15217472
46  2.626979422 -88.64235148
46  2.454247469 -47.05093786
46  2.77983216  -89.26329048
46  2.636957485 -81.03972204
47  0.413704382 -56.55826312
47  1.145326012 -84.64626702
47  2.038399115 -81.85662372
47  2.253731222 -86.41566587
47  1.3133469   64.52243412
48  3.403932556 -80.36071198
48  2.04929866  -70.47795907
48  2.274349863 -77.81310408
48  0.13593279  46.38254256
48  0.702184063 -3.612045051
49  0.906223302 -78.31476515
49  0.554538317 50.33543382
49  0.089680453 23.33095578
49  2.2831634   -67.13247686
49  1.627864676 -72.52132829
50  0.34672496  66.11871934
50  1.160451029 -66.89054777
50  1.760678964 -58.30762395
50  1.254324633 -41.19404821
50  1.730734607 -67.09376641
51  0.842785032 81.82395052
51  1.974954473 -80.682135
51  0.865344327 -47.07894474
51  0.909533784 -76.02511259
51  1.685635709 78.20115372
52  2.392503081 -79.74664961
52  1.946110615 -70.94946049
52  8.574258316 -42.65386143
52  2.538714806 -48.03630574
52  2.529050979 -29.92328407
53  2.866661501 -78.54261642
53  2.590927316 -75.08833379
53  5.90479778  -52.57345606
53  5.799716577 -48.23386105
53  2.07192245  -88.00474074
54  1.680713598 -73.63319735
54  2.557497408 -67.94294466
54  2.393764255 -56.19194117
54  6.026774503 -82.18275762
54  0.788317053 -30.6404756
55  1.859034516 -77.09826262
55  3.817813613 -69.26816285
55  3.42856831  -71.12750351
55  2.454971668 -61.22096633
55  1.618509495 -63.11739719
56  4.582911236 -31.04536129
56  3.997256178 -53.91325239
56  1.219757833 -77.19780486
56  7.377663053 -76.85745566
56  2.881431405 -68.73534505
57  2.004327103 -74.8284809
57  2.748344386 -71.30903296
57  2.460209206 -69.62695585
57  1.775560107 -80.3723268
57  2.645612131 -89.09829133
58  1.54991856  -62.44032153
58  6.534223736 -85.84648469
58  0.422375885 62.16515901
58  0.610233226 -27.29862046
58  3.393479727 -56.14377871
59  1.601596058 -64.70208698
59  1.319995497 -74.36073404
59  0.383925829 -45.27272566
59  0.770918761 -14.94572655
59  2.768653593 -88.55860395
60  3.100129667 -79.77906075
60  0.754369481 -56.41040781
60  1.063078742 -38.06663906
60  4.430727193 -35.69757344
60  1.151375695 -58.31667216
61  0.222873197 56.11287704
61  0.693297704 -10.64168064
61  4.895399027 -80.48905585
61  1.476161129 -79.09424876
61  2.362055223 -82.82717397
62  0.271796128 -77.99977538
62  2.712343601 -83.45960915
62  2.640394011 -79.62228636
62  2.044795801 -65.63167684
62  0.977870455 -79.87283982
63  2.093999943 -34.17579722
63  2.066850888 -58.30139095
63  1.919951867 -72.04915327
63  1.847203007 -53.3080848
63  2.523697396 -86.41516485
64  3.396946374 -76.88253208
64  1.502111371 -56.46177994
64  3.447925893 -83.55087526
64  1.436601103 -78.3262211
65  0.973834501 30.49055
65  1.047860828 22.48877392
65  1.493635682 -5.403700435
65  1.468349931 27.39802239
65  3.062567844 -37.09222845
66  1.172876541 83.48805632
66  1.35911911  -68.34950195
66  0.828509975 3.070100175
66  1.171217644 -37.19274186
66  1.914431983 -35.89376613

The idea is that the radius gives me a measure of the intensity of the interaction between the two strains, while the angle (theta) tells me the type of interaction between them, depending on which quadrant of the Cartesian plot the angle falls in (e.g. if it falls in the (-,-) quadrant, the interaction is mutual inhibition between the two strains).

I would like to do statistics to get a measure of the significance of the interactions between each pair of strains. I thought I could determine whether, for each treatment and across replicates, the distance-from-centre measure (radius) and the angle measure (theta) are significant (i.e. significantly different from 0). I am aware of the package "circular", but I am not sure how to use it or which test would be useful in my case.

Any suggestion (also reading material!) would be very useful.

[Figure not preserved]

You may have many more data points, but scrutiny of a quick plot suggests:

  1. The observed range of angles is very limited, which often implies that you don't really need circular statistics. It is not even a given that angle is the best way to record direction: sine or cosine may be closer to the underlying problem. Some kind of regression on angle or sine or cosine may suffice, perhaps parameterised so that $-90^\circ$ defines an intercept. (Perhaps you should just add $90^\circ$.)

  2. There may well be hard limits in practice on what angles are possible or likely, and if so knowing them is very important to guide what makes scientific and statistical sense.

  3. The question of whether radius is significantly different from zero is hard to understand. First off, whatever is called a radius is usually a positive quantity, and if your definition implies otherwise, please explain. Second off, all reported values in the example are positive, so a significance test appears pointless for that reason. Perhaps you mean something more like "are radius and angle related?" to which the answer appears to be yes. Assuming that radius must be positive, analysis in terms of its logarithm seems indicated to me.

  4. It makes quite a difference to analysis whether the angle is in some sense given and the radius is the outcome to be explained (which with problems like this is often true) -- or the opposite -- or neither.

  5. Given the small subsamples, I have not attempted an analysis in terms of treatments, but the plot suggests that (e.g.) treatment 1 at least is quite distinctive.
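As a rough illustration of points 1 and 3, the sketch below shifts theta by 90° and takes the logarithm of radius, then computes an ordinary Pearson correlation between the two. It is written in Python with the standard library only and uses just the first ten rows of the example data (treatments 1 and 2); the helper name `pearson` is my own, and a correlation is only one of several ways to look at "are radius and angle related?".

```python
import math

# First ten rows of the example data (treatments 1 and 2):
# (radius, theta in degrees)
rows = [
    (5.346605488, -53.42975695),
    (4.781032074, -48.89982034),
    (3.408335845, -45.32998294),
    (1.594707376, -30.28159102),
    (4.995105439, -47.46835867),
    (2.182870308, -69.97527886),
    (1.376227293, -82.86544789),
    (1.996722475, -81.86548945),
    (4.087804099, -89.21073708),
    (5.665053864, -71.35803445),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Shift the angle so that -90 degrees becomes 0 (the suggested intercept).
shifted_theta = [theta + 90.0 for _, theta in rows]

# Radius is strictly positive, so its logarithm is well defined.
log_radius = [math.log(radius) for radius, _ in rows]

r = pearson(shifted_theta, log_radius)
print(f"correlation between theta + 90 and log(radius): {r:.3f}")
```

With so few rows the number itself means little; the point is only that, once the angles sit in a narrow range, ordinary linear tools apply directly.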

Good statistical advice is very hard to give with no context whatsoever on what the numbers represent (other than something in polar coordinates).

EDIT Given more information and the full dataset, I can try a little more.

Disclaimer. I don't understand the science here and I don't even understand what kind of statistical problem this is. So why say more? I have some experience with circular data, which many statistical people don't have at all, so perhaps something a little helpful can be said.

The full dataset is 315 observations on 66 treatments, the latter represented by 5 observations in most instances, but only 3 or 4 in some (which is why the total is not 330).

I can readily imagine that this represents a great deal of hard experimental work, but unfortunately from a statistical point of view 3, 4 or 5 observations is a very small sub-sample size for saying much reliably about individual treatments.

Circular plots may seem natural given the outcome space, but there can be a chicken-and-egg problem: you have to look at several before you can think easily about any one (unless perhaps you already have some experience of thinking in and about that space, as is often true of compass direction). That aside, I have found that linear plots can be very helpful too, which runs a little contrary to the advice in circular statistics texts and reviews.

A plot of the full data shows that they too seem limited, but now to half the circle. Is there any sense in which $90^\circ = -90^\circ$, because that would be an important detail? It affects what kind of test might make sense, but "typical" theta, however measured, is a long way from zero.
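To make the $90^\circ = -90^\circ$ question concrete, here is a minimal Python sketch (standard library only) of the usual circular mean direction and mean resultant length, with an optional angle-doubling step for the case where directions half a turn apart are considered identical. The function name `circular_summary` and its `axial` flag are my own labels, not part of any package.

```python
import math


def circular_summary(angles_deg, axial=False):
    """Return (mean direction in degrees, mean resultant length).

    With axial=True the angles are doubled before averaging, so that
    theta and theta + 180 degrees count as the same direction; the
    resulting mean direction is halved again before it is returned.
    """
    if not angles_deg:
        raise ValueError("need at least one angle")
    k = 2.0 if axial else 1.0
    n = len(angles_deg)
    c = sum(math.cos(math.radians(k * a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(k * a)) for a in angles_deg) / n
    mean_direction = math.degrees(math.atan2(s, c)) / k
    resultant_length = math.hypot(c, s)  # 0 = no concentration, 1 = identical
    return mean_direction, resultant_length


# Two angles on either side of the +/-90 boundary.
pair = [89.0, -89.0]
print(circular_summary(pair))              # treated as nearly opposite
print(circular_summary(pair, axial=True))  # treated as nearly identical
```

If $90^\circ$ and $-90^\circ$ really are the same thing, the second treatment is the appropriate one and the two angles agree closely; if they are not, the first applies and the two angles point almost opposite ways.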

Another plot shows all of the data repeatedly, but a little arbitrarily picks out 4 treatments as extreme in either median or median absolute deviation on radius, theta or both. That's cherry-picking and has nothing to do with any formal testing.
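For anyone wanting to reproduce that kind of per-treatment summary, a minimal Python sketch of median and median absolute deviation (MAD) by treatment follows. It uses only treatments 1 and 65 from the posted data; the names `mad` and `summary` are my own. Treating theta as an ordinary number here assumes the angles do not wrap around, which looks reasonable for these data but is an assumption.

```python
from collections import defaultdict
from statistics import median

# (treatment, radius, theta in degrees) for treatments 1 and 65
rows = [
    (1, 5.346605488, -53.42975695),
    (1, 4.781032074, -48.89982034),
    (1, 3.408335845, -45.32998294),
    (1, 1.594707376, -30.28159102),
    (1, 4.995105439, -47.46835867),
    (65, 0.973834501, 30.49055),
    (65, 1.047860828, 22.48877392),
    (65, 1.493635682, -5.403700435),
    (65, 1.468349931, 27.39802239),
    (65, 3.062567844, -37.09222845),
]


def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


# Collect radius and theta values per treatment.
groups = defaultdict(lambda: {"radius": [], "theta": []})
for treatment, radius, theta in rows:
    groups[treatment]["radius"].append(radius)
    groups[treatment]["theta"].append(theta)

# Median and MAD of each variable within each treatment.
summary = {
    treatment: {
        "radius_median": median(g["radius"]),
        "radius_mad": mad(g["radius"]),
        "theta_median": median(g["theta"]),
        "theta_mad": mad(g["theta"]),
    }
    for treatment, g in groups.items()
}

for treatment, stats in sorted(summary.items()):
    print(treatment, stats)
```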

As before, logarithmic scale for radius might help.

[Figure not preserved]

It is unclear what sort of statistical analysis you are doing, but you will most likely not need to bother with circular statistics such as circular distributions. This is because you are using an angle to express and illustrate the effects, but the angle is not an underlying mechanism in the distribution of the observations.


In the plot below you have the data in cartesian coordinates. The first six interactions are colored to give an impression of the variation within a single group.

  • You can see that the treatments have a large variation in magnitude but cluster around a particular angle. So indeed this conversion to angle is good and seems a simple way to categorize and quantify and compare the symbiosis.

    Alternatively, you could assume an error distribution for the points that has a correlation between the two axes: if the effect of 'a on b' is low/high, then so is the effect of 'b on a'. But this means that you have to deal with correlated errors. The conversion to polar coordinates seems to deal easily and naturally with this issue.

  • You do not need circular statistics.

    • The nature of the distribution of the angle is not such that the errors wrap around. The variation in the angle for a specific treatment is only small.

    • The mechanism that determines the symbiosis is also not related to the angle or to anything circular.

      You have the different strains that have some coefficient of interaction/effect 'a on b' and 'b on a'. These two coefficients (let's call them $\beta_1$ and $\beta_2$) may vary somewhat (due to all kinds of measurement errors or variations in the experiment), and the magnitude of the overall interaction $\beta_m$ may vary (e.g. different times of incubation, treatment, temperature, or other factors that might influence the magnitude).

      This gives you a simplistic model of the two effects $\beta_1 \cdot \beta_m$ and $\beta_2 \cdot \beta_m$. The errors/variations do not create a circular effect. For random increases in the coefficients (which might add up), you do not return to the same point.

      Comparison: Something circular is an arrow pointing in some direction on a plane (imagine for instance the hands on a clock). If the arrow is randomly turned in some direction, then multiple additions in the same direction might bring the hand back to its original place. You do not have this behavior. If there are some effects in your experiment that cause fluctuations in the coefficients of the model, $\beta_1$ and $\beta_2$, then multiple additions in the same direction are not going to bring the angle of the effect back to its original position.
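The simplistic model above can be sketched in a few lines of Python (standard library only). Everything here is hypothetical: the function name `simulate_pair`, the coefficient values, the lognormal choice for $\beta_m$ and the Gaussian noise on $\beta_1$ and $\beta_2$ are illustrative assumptions, not estimates from the data. Note also that `atan2` returns angles in $(-180^\circ, 180^\circ]$, which may differ from the convention used to produce theta in the question.

```python
import math
import random


def simulate_pair(beta_1, beta_2, n, noise_sd=0.1, seed=0):
    """Simulate n replicates of the effects (beta_1 * beta_m, beta_2 * beta_m).

    beta_m is a random positive magnitude shared by both effects; beta_1 and
    beta_2 each receive independent Gaussian noise. Returns a list of
    (radius, theta in degrees) tuples.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        beta_m = rng.lognormvariate(0.0, 0.5)  # positive overall magnitude
        x = (beta_1 + rng.gauss(0.0, noise_sd)) * beta_m  # effect 'a on b'
        y = (beta_2 + rng.gauss(0.0, noise_sd)) * beta_m  # effect 'b on a'
        points.append((math.hypot(x, y), math.degrees(math.atan2(y, x))))
    return points


# Hypothetical coefficients: a is mildly helped by b, b is inhibited by a.
for radius, theta in simulate_pair(0.5, -1.0, n=5):
    print(f"radius = {radius:.3f}, theta = {theta:.2f}")
```

In this sketch the shared magnitude $\beta_m$ changes only the radius, while small perturbations of $\beta_1$ and $\beta_2$ nudge the angle slightly around a fixed value and never carry it around the circle.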

[Figure: plot of the data in Cartesian coordinates; image not preserved]
