I have multiple different phenotypes and xy coordinates for each cell. What would be the easiest way to calculate distances between each of my cells within the same slide? My dataset has 100,000+ cells so I'm trying to figure out the most efficient way to do this.
An example dataframe would be:
Xposition <- c(1,6,4,7,9,4,8,6,4)
Yposition <- c(6,3,2,6,3,6,1,3,7)
Phenotype <- c("A", "A", "B", "C", "C", "A", "A", "B", "B")
SlideID <- c(111,111,111,111,111,112,112,112,112)
df <- data.frame(Xposition, Yposition, Phenotype, SlideID)
I'm looking for something that could give me a dataframe where the outputs are something like:
CellType1 <- c("A", "A", "A", "A", "A", "A", "A", "B", "B", "C", "A", "A", "A", "A", "A", "B")
Celltype2 <- c("A", "B", "C", "C", "B", "C", "C", "C", "C", "C", "A", "B", "B", "B", "B", "B")
Distance <- c("5.83", "5", "6", "8.54", "2.23", "3.16", "3", "5", "5.09", "3.6", "6.4", "3.6", "1", "2.82", "7.21", "4.47")
SlideID <- c("111", "111", "111", "111", "111", "111", "111", "111", "111", "111", "112", "112", "112", "112", "112", "112")
distancedf <- data.frame(CellType1, Celltype2, Distance, SlideID)
Thanks for your help!
I think there is room for ambiguity here, but as a first pass that ignores SlideID and computes distances between all cells:
res <- as.data.frame.table(as.matrix(dist(df[,1:2])))
res$Var2 <- df$Phenotype[res$Var2]
res$SlideID <- df$SlideID[res$Var1]
res$Var1 <- df$Phenotype[res$Var1]
head(res)
# Var1 Var2 Freq SlideID
# 1 A A 0.000000 111
# 2 A A 5.830952 111
# 3 B A 5.000000 111
# 4 C A 6.000000 111
# 5 C A 8.544004 111
# 6 A A 3.000000 112
From this, you should be able to filter out the 0s fairly easily, but I wanted to keep them here to show what is actually happening. Effectively, as.data.frame.table(...) is going from this:
dist(df[,1:2])
# 1 2 3 4 5 6 7 8
# 2 5.830952
# 3 5.000000 2.236068
# 4 6.000000 3.162278 5.000000
# 5 8.544004 3.000000 5.099020 3.605551
# 6 3.000000 3.605551 4.000000 3.000000 5.830952
# 7 8.602325 2.828427 4.123106 5.099020 2.236068 6.403124
# 8 5.830952 0.000000 2.236068 3.162278 3.000000 3.605551 2.828427
# 9 3.162278 4.472136 5.000000 3.162278 6.403124 1.000000 7.211103 4.472136
through this:
as.matrix(dist(df[,1:2]))
# 1 2 3 4 5 6 7 8 9
# 1 0.000000 5.830952 5.000000 6.000000 8.544004 3.000000 8.602325 5.830952 3.162278
# 2 5.830952 0.000000 2.236068 3.162278 3.000000 3.605551 2.828427 0.000000 4.472136
# 3 5.000000 2.236068 0.000000 5.000000 5.099020 4.000000 4.123106 2.236068 5.000000
# 4 6.000000 3.162278 5.000000 0.000000 3.605551 3.000000 5.099020 3.162278 3.162278
# 5 8.544004 3.000000 5.099020 3.605551 0.000000 5.830952 2.236068 3.000000 6.403124
# 6 3.000000 3.605551 4.000000 3.000000 5.830952 0.000000 6.403124 3.605551 1.000000
# 7 8.602325 2.828427 4.123106 5.099020 2.236068 6.403124 0.000000 2.828427 7.211103
# 8 5.830952 0.000000 2.236068 3.162278 3.000000 3.605551 2.828427 0.000000 4.472136
# 9 3.162278 4.472136 5.000000 3.162278 6.403124 1.000000 7.211103 4.472136 0.000000
ultimately to this
head(as.data.frame.table(as.matrix(dist(df[,1:2]))))
# Var1 Var2 Freq
# 1 1 1 0.000000
# 2 2 1 5.830952
# 3 3 1 5.000000
# 4 4 1 6.000000
# 5 5 1 8.544004
# 6 6 1 3.000000
and the 0.000s are the diagonals of the distance matrix (that are masked in the default representation of dist(...)).
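One caveat when filtering: in the example data, cells 2 and 8 share the coordinates (6, 3), so their distance is also exactly 0 even though they are off the diagonal. A minimal sketch of dropping self-pairs by position rather than by value, which also keeps each unordered pair only once, could look like the following. The column names CellType1, CellType2 and Distance are taken from the desired output in the question; the helper objects m, idx and pairs are names chosen here for illustration.

```r
# Example data from the question
df <- data.frame(
  Xposition = c(1, 6, 4, 7, 9, 4, 8, 6, 4),
  Yposition = c(6, 3, 2, 6, 3, 6, 1, 3, 7),
  Phenotype = c("A", "A", "B", "C", "C", "A", "A", "B", "B"),
  SlideID   = c(111, 111, 111, 111, 111, 112, 112, 112, 112)
)

# Full symmetric distance matrix (all cells, SlideID ignored for now)
m <- as.matrix(dist(df[, c("Xposition", "Yposition")]))

# Row/column positions of the strict upper triangle:
# every unordered pair exactly once, and no diagonal entries
idx <- which(upper.tri(m), arr.ind = TRUE)

pairs <- data.frame(
  CellType1 = df$Phenotype[idx[, "row"]],
  CellType2 = df$Phenotype[idx[, "col"]],
  Distance  = m[idx]
)

nrow(pairs)                # 36, i.e. choose(9, 2)
sum(pairs$Distance == 0)   # 1: the genuine zero distance between cells 2 and 8
```

Filtering with something like res[res$Freq > 0, ] would silently drop that last pair, which may or may not be what you want.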
Per SlideID:
lapply(split(df, df$SlideID), function(x) {
res <- as.data.frame.table(as.matrix(dist(x[,1:2])))
res$Var2 <- x$Phenotype[res$Var2]
res$SlideID <- x$SlideID[res$Var1]
res$Var1 <- x$Phenotype[res$Var1]
res
})
# $`111`
# Var1 Var2 Freq SlideID
# 1 A A 0.000000 111
# 2 A A 5.830952 111
# 3 B A 5.000000 111
# 4 C A 6.000000 111
# 5 C A 8.544004 111
# 6 A A 5.830952 111
# 7 A A 0.000000 111
# 8 B A 2.236068 111
# 9 C A 3.162278 111
# 10 C A 3.000000 111
# 11 A B 5.000000 111
# 12 A B 2.236068 111
# 13 B B 0.000000 111
# 14 C B 5.000000 111
# 15 C B 5.099020 111
# 16 A C 6.000000 111
# 17 A C 3.162278 111
# 18 B C 5.000000 111
# 19 C C 0.000000 111
# 20 C C 3.605551 111
# 21 A C 8.544004 111
# 22 A C 3.000000 111
# 23 B C 5.099020 111
# 24 C C 3.605551 111
# 25 C C 0.000000 111
# $`112`
# Var1 Var2 Freq SlideID
# 1 A A 0.000000 112
# 2 A A 6.403124 112
# 3 B A 3.605551 112
# 4 B A 1.000000 112
# 5 A A 6.403124 112
# 6 A A 0.000000 112
# 7 B A 2.828427 112
# 8 B A 7.211103 112
# 9 A B 3.605551 112
# 10 A B 2.828427 112
# 11 B B 0.000000 112
# 12 B B 4.472136 112
# 13 A B 1.000000 112
# 14 A B 7.211103 112
# 15 B B 4.472136 112
# 16 B B 0.000000 112
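To get closer to the single data frame asked for in the question, the per-slide list can be bound back together. The sketch below is one possible way, assuming each unordered pair should appear once and self-pairs should be dropped; pairs_one_slide is a hypothetical helper name, not something from the code above.

```r
# Example data from the question
df <- data.frame(
  Xposition = c(1, 6, 4, 7, 9, 4, 8, 6, 4),
  Yposition = c(6, 3, 2, 6, 3, 6, 1, 3, 7),
  Phenotype = c("A", "A", "B", "C", "C", "A", "A", "B", "B"),
  SlideID   = c(111, 111, 111, 111, 111, 112, 112, 112, 112)
)

# Hypothetical helper: all unique cell pairs within one slide
pairs_one_slide <- function(x) {
  m   <- as.matrix(dist(x[, c("Xposition", "Yposition")]))
  idx <- which(upper.tri(m), arr.ind = TRUE)  # strict upper triangle only
  data.frame(
    CellType1 = x$Phenotype[idx[, "row"]],
    CellType2 = x$Phenotype[idx[, "col"]],
    Distance  = m[idx],
    SlideID   = x$SlideID[idx[, "row"]]
  )
}

# Apply per slide, then stack the results into one data frame
distancedf <- do.call(rbind, lapply(split(df, df$SlideID), pairs_one_slide))
rownames(distancedf) <- NULL

nrow(distancedf)  # 16: choose(5, 2) + choose(4, 2)
```

On memory: dist() stores n * (n - 1) / 2 values, so a single slide with 100,000 cells would need roughly 5e9 doubles (about 40 GB), and as.matrix() roughly doubles that. Splitting by slide only helps if each slide holds far fewer cells than the whole dataset.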