
How to convert a stem and leaf plot into a data set in R?

The stem and leaf plot that I need to convert is given below:

24|9
23|
22|1
21|7
20|2, 2, 5, 5, 6, 9, 9, 9
19|0, 0, 0, 0, 0, 1, 1, 2, 4, 4, 5, 8
18|0, 1, 1, 2, 2, 2, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 9, 9, 9
17|1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 9
16|0, 0, 1, 1, 1, 1, 2, 4, 5, 5, 6, 6, 8, 8, 8, 8
15|0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9
14|0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 9
13|0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 9
12|1, 1, 1, 2, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 9
11|0, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 9, 9
10|0, 2, 3, 3, 3, 4, 4, 5, 7, 7, 8
9|0, 0, 9
8|6

Here's one possible way. If your data looks like this:

stem <- "24|9
23|
22|1
21|7
20|2, 2, 5, 5, 6, 9, 9, 9
19|0, 0, 0, 0, 0, 1, 1, 2, 4, 4, 5, 8
18|0, 1, 1, 2, 2, 2, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 9, 9, 9
17|1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 9
16|0, 0, 1, 1, 1, 1, 2, 4, 5, 5, 6, 6, 8, 8, 8, 8
15|0, 1, 1, 1, 1, 1, 1, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9
14|0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 9
13|0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 9
12|1, 1, 1, 2, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 9
11|0, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 9, 9
10|0, 2, 3, 3, 3, 4, 4, 5, 7, 7, 8
9|0, 0, 9
8|6"

Then we can split the text into rows and split each row at the pipe. The right-hand side is then split on commas, and each resulting leaf is pasted onto the stem to the left of the pipe.

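# Split the text into one string per row of the plot
rows <- strsplit(stem, "\n")[[1]]
# Split each row at the pipe: x[1] is the stem, x[2] the comma-separated leaves
values <- unlist(lapply(strsplit(rows, "\\|"), function(x) {
  end_digits <- strsplit(x[2], ", ")[[1]]
  if (!all(is.na(end_digits))) {
    paste0(x[1], end_digits)  # prepend the stem to every leaf
  } else {
    NULL  # a stem with no leaves (e.g. "23|") contributes nothing
  }
}))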

This will return character values, but you could convert them to numeric with

as.numeric(values)
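
As a quick sanity check, the same steps can be run on a few rows of the plot where the expected result is easy to read off by eye. This is only a sketch: the names `stem_small`, `parts`, and `nums` are made up for illustration, and `trimws()` is used here instead of splitting on ", " so that inconsistent spacing after the commas does not matter.

```r
# Small illustrative input: a few rows of the plot, including the empty stem.
stem_small <- "24|9
23|
22|1
9|0, 0, 9
8|6"

rows  <- strsplit(stem_small, "\n")[[1]]
parts <- strsplit(rows, "\\|")

values <- unlist(lapply(parts, function(x) {
  # A row such as "23|" splits into the stem only, so there is nothing to add
  if (length(x) < 2 || is.na(x[2])) return(NULL)
  leaves <- trimws(strsplit(x[2], ",")[[1]])  # tolerate "0,0" as well as "0, 0"
  paste0(x[1], leaves)
}))

nums <- as.numeric(values)
print(nums)  # 249 221 90 90 99 86
```

The empty stem 23 contributes no values, and each remaining row contributes one value per leaf.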

Here is a different approach, using @MrFlick's stem and rows objects:

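rows <- strsplit(stem, "\n")[[1]]
rows.lst <- strsplit(rows, "\\|")
# Stems are the tens (and hundreds) part of each value
tens <- as.numeric(sapply(rows.lst, "[", 1)) * 10
# lapply and Map keep one list element per stem; sapply and mapply could
# simplify to a matrix if every stem had the same number of leaves
ones <- lapply(strsplit(sapply(rows.lst, "[", 2), ","), as.numeric)
vals <- unlist(Map("+", tens, ones))
# A stem with no leaves (e.g. "23|") yields NA, which is dropped here
vals <- vals[!is.na(vals)]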
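
The arithmetic approach depends on the leaves staying in a list with one element per stem. `sapply` and `mapply` simplify their results when every element has the same length, so if every stem happened to carry the same number of leaves (two or more), the leaves would come back as a matrix and could be paired with the wrong stems. Below is a minimal sketch that keeps lists throughout with `lapply` and `Map`. The function name `stem_to_numeric` and the two-row inputs are hypothetical, chosen only to exercise that case.

```r
# Hypothetical helper: each stem is multiplied by 10 and each leaf is added to it.
stem_to_numeric <- function(txt) {
  rows  <- strsplit(txt, "\n")[[1]]
  parts <- strsplit(rows, "\\|")
  tens  <- as.numeric(vapply(parts, `[`, character(1), 1)) * 10
  # lapply always returns a list, even when all stems have equally many leaves
  ones <- lapply(parts, function(p) {
    if (length(p) < 2) return(numeric(0))  # stem without leaves, e.g. "23|"
    as.numeric(strsplit(p[2], ",")[[1]])
  })
  # Map pairs each stem with its own leaves and never simplifies the result
  unlist(Map(`+`, tens, ones))
}

# Every stem has exactly two leaves here
print(stem_to_numeric("21|3, 7\n20|2, 5"))  # 213 217 202 205
# An empty stem simply contributes nothing
print(stem_to_numeric("23|\n22|1"))         # 221
```

Because empty stems produce a zero-length vector rather than `NA`, no filtering step is needed afterwards in this variant.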


 