I have the following data frame:
> head(casted)
ID nobs sulfate nitrate
1 1 117 3.880701 0.5481368
2 2 1041 4.460811 0.9474492
3 3 243 4.327613 0.6585144
4 4 474 4.214956 0.8701622
5 5 402 4.210072 1.0939005
6 6 228 4.102132 0.5206404
I would like to add a column "cor" that holds the result of cor() on the sulfate and nitrate columns for each ID, but when I use the following code, a single value populates the entire column:
casted$cor <- cor(casted$sulfate, casted$nitrate)
> head(casted)
ID nobs sulfate nitrate cor
1 1 117 3.880701 0.5481368 0.00940941
2 2 1041 4.460811 0.9474492 0.00940941
3 3 243 4.327613 0.6585144 0.00940941
4 4 474 4.214956 0.8701622 0.00940941
5 5 402 4.210072 1.0939005 0.00940941
6 6 228 4.102132 0.5206404 0.00940941
I know I'm doing something wrong, but I still can't figure it out.
Thanks! Meera
First, it helps to recall what a correlation is. Correlation is a statistical measure of the relationship between two samples, so to calculate it you need two series of values rather than two single numbers. For example, we cannot tell the correlation between 1 and 2 because there is not enough information; put another way, we cannot build a covariance matrix from two numbers. Your code is doing exactly what it says: the cor column is the correlation between the sulfate column and the nitrate column. The calculation uses every number in both columns, so it gives only one result, and R then repeats that single value down every row.
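To get one correlation per ID, cor() has to be called on the several observations that belong to each ID, not on a table that has already been reduced to one row per ID. Here is a minimal sketch of that idea in base R. It assumes you still have the raw measurements; the data frames `raw` and `summary_df` and all their values are made up for illustration, with `summary_df` standing in for your `casted`.

```r
# Hypothetical raw data: several observations per ID (values invented for illustration).
raw <- data.frame(
  ID      = c(1, 1, 1, 2, 2, 2),
  sulfate = c(1, 2, 3, 1, 2, 3),
  nitrate = c(2, 4, 6, 3, 2, 1)
)

# One correlation per ID: split the rows by ID, then call cor() on each piece.
# use = "complete.obs" drops rows where either value is NA.
cor_by_id <- sapply(split(raw, raw$ID), function(d) {
  cor(d$sulfate, d$nitrate, use = "complete.obs")
})

# Hypothetical per-ID summary table, standing in for `casted`.
summary_df <- data.frame(ID = c(1, 2), nobs = c(3, 3))

# Look up each ID's correlation by name and attach it as a new column.
summary_df$cor <- unname(cor_by_id[as.character(summary_df$ID)])
print(summary_df)
```

In this toy data, ID 1 has perfectly positively related values and ID 2 perfectly negatively related ones, so the two rows end up with different values in the cor column instead of one value repeated.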