How can I calculate the inter-pair correlation of a variable according to id in the whole dataframe?

I have a twin dataset with one column called wpsum and another called family-id, which has the same value for both twins of a pair.

        wpsum    family-id
twin 1     14          220    
twin 2     18          220

I want to calculate the correlation between the wpsum values of twins with the same family-id. Some family-ids occur only once, because one twin did not take part in the re-survey. family-id is a character variable.

There is no correlation to compute within a single family-id group, as you put it: each group holds at most two wpsum values and no second variable to correlate them with. What you can get is the difference in wpsum scores within each group, and maybe that is what you meant by correlation. Here's how to get those differences (I changed and expanded your example):

dat <- data.frame(wpsum = c(14, 18, 20, 5, 10, NA, 1), 
              family_id = c("220","220","221","221","222","222","223"))
dat
  wpsum family_id
1    14       220
2    18       220
3    20       221
4     5       221
5    10       222
6    NA       222
7     1       223

diffs <- by(dat, dat$family_id, function(x) abs(x$wpsum[1] - x$wpsum[2]))
diffs
dat$family_id: 220
[1] 4
------------------------------ 
dat$family_id: 221
[1] 15
------------------------------
dat$family_id: 222
[1] NA
------------------------------
dat$family_id: 223
[1] NA

You can make a data.frame with this new variable of differences like so:

diff.frame <- data.frame(diffs = as.numeric(diffs), family_id = names(diffs))
diff.frame
  diffs family_id
1     4       220
2    15       221
3    NA       222
4    NA       223
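
If a single correlation across all twin pairs is what was meant, one possible reading is to put each family on one row (twin 1 next to twin 2) and correlate the two columns. The following is only a sketch under assumptions that are not in the question: the twin column and the names wide and pair_cor are made up for illustration, and "twin 1" is simply whichever twin comes first in row order. With only two complete pairs in this toy data the result is necessarily +1 or -1, so it says nothing substantive; and since the twin order is arbitrary, an intraclass correlation may be the more appropriate measure on real data.

```r
# Same example data as above
dat <- data.frame(wpsum = c(14, 18, 20, 5, 10, NA, 1),
                  family_id = c("220", "220", "221", "221", "222", "222", "223"),
                  stringsAsFactors = FALSE)

# Number the twins 1, 2 within each family, in row order (an assumption)
dat$twin <- ave(seq_along(dat$family_id), dat$family_id, FUN = seq_along)

# One row per family: wpsum of twin 1 next to wpsum of twin 2
twin1 <- dat[dat$twin == 1, c("family_id", "wpsum")]
twin2 <- dat[dat$twin == 2, c("family_id", "wpsum")]
wide <- merge(twin1, twin2, by = "family_id", all = TRUE,
              suffixes = c("_1", "_2"))

# Pearson correlation using only families where both twins have a score
pair_cor <- cor(wide$wpsum_1, wide$wpsum_2, use = "complete.obs")
```

Families with a missing twin or a missing score stay in wide with NA and are dropped only at the cor() step.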

Note that neither missing values nor single-twin families are a (coding) problem in the by() call: both simply yield NA differences without an error. If a family ID ever had more than two observations, though, this approach would silently use only the first two, so you'd need to do something different.
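
For the case of more than two observations per family, one option (a sketch, not the only reasonable choice) is to replace the first-minus-second difference with the spread between the largest and smallest score. The data frame dat3, the family "230" with three members, and the helper spread are hypothetical and introduced only for illustration.

```r
# Hypothetical data: family "230" has three members
dat3 <- data.frame(wpsum = c(14, 18, 20, 5, 9, 1),
                   family_id = c("220", "220", "230", "230", "230", "223"))

# Largest within-family gap: max minus min.
# Returns NA for a family with fewer than two rows, and also NA
# when any score in the family is missing (max/min propagate NA).
spread <- function(v) if (length(v) < 2) NA_real_ else max(v) - min(v)

diffs3 <- tapply(dat3$wpsum, dat3$family_id, spread)

# Same layout as diff.frame above
diff3.frame <- data.frame(diffs = as.numeric(diffs3), family_id = names(diffs3))
```

For families with exactly two scores this gives the same value as abs(x$wpsum[1] - x$wpsum[2]).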
