How to generate a frequency table from raw data in R

I am new to R. I want to generate a frequency table from raw data (decimals) like:

x
      V1
1  10.10
2  46.65
3  53.60
4  38.50
5  45.95
6  12.25
7  59.60
8  23.30
9  11.05
10 58.35
11 40.20
12 11.05
13 10.45
14 26.45
15 13.25
16 21.15
17 35.00
18 29.05
19 25.40
20 47.20
21 42.45
22 57.30
23 55.65
24 56.50
25 26.95
26 59.65
27 32.10
28 29.00
29 34.75
30 21.65

into something like this:

Class            Frequency
(10.00 - 19.99)         6
(20.00 - 29.99)         8
(30.00 - 39.99)         4
(40.00 - 49.99)         5
(50.00 - 59.99)         7

I use the code below:

factorx<-factor(cut(x, breaks=nclass.Sturges(x)))

but I get something like this:

Error in cut.default(x, breaks = nclass.Sturges(x)) : 'x' must be numeric

How should I make 'x' numeric?

As requested:

h <- dput(x)
structure(list(V1 = c(10.1, 46.65, 53.6, 38.5, 45.95, 12.25, 59.6, 23.3, 11.05, 58.35, 40.2, 11.05, 10.45, 26.45, 13.25, 21.15, 35, 29.05, 25.4, 47.2, 42.45, 57.3, 55.65, 56.5, 26.95, 59.65, 32.1, 29, 34.75, 21.65)), .Names = "V1", class = "data.frame", row.names = c(NA, -30L))

x is a data frame, which cut does not accept. The column x$V1 is numeric, so pass that instead:

factor(cut(x$V1, breaks=nclass.Sturges(x$V1)))
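Note that nclass.Sturges returns a number of classes (6 for 30 observations), so the call above splits the range into equal-width intervals that need not line up with the 10-wide classes in the question. A minimal sketch of one way to get those classes, assuming explicit breaks at 10, 20, ..., 60 are acceptable; the names brks, classes and freq are illustrative only:

```r
# Data from the question, as a one-column data frame
x <- data.frame(V1 = c(10.1, 46.65, 53.6, 38.5, 45.95, 12.25, 59.6, 23.3,
                       11.05, 58.35, 40.2, 11.05, 10.45, 26.45, 13.25, 21.15,
                       35, 29.05, 25.4, 47.2, 42.45, 57.3, 55.65, 56.5,
                       26.95, 59.65, 32.1, 29, 34.75, 21.65))

# Explicit class boundaries (an assumption, chosen to match the desired table)
brks <- seq(10, 60, by = 10)

# right = FALSE makes the intervals left-closed: [10,20), [20,30), ...
classes <- cut(x$V1, breaks = brks, right = FALSE)

# Count observations per class and present them as a two-column data frame
freq <- as.data.frame(table(classes))
names(freq) <- c("Class", "Frequency")
print(freq)
```

With this data the frequencies come out as 6, 8, 4, 5, 7, the same as in the desired table. Values outside the breaks would become NA and drop out of the counts, so the breaks should span the full range of the data.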

If you know which breakpoints you want, you can just use hist with plot=FALSE.

hist returns an object of class histogram (h in the example below). h$counts gives the frequency for each histogram cell defined by the breaks argument. Here x is a plain numeric vector (the V1 column from the question), not the data frame.

> x
 [1] 10.10 46.65 53.60 38.50 45.95 12.25 59.60 23.30 11.05 58.35 40.20 11.05 10.45 26.45 13.25 21.15 35.00 29.05 25.40 47.20
[21] 42.45 57.30 55.65 56.50 26.95 59.65 32.10 29.00 34.75 21.65
> h <- hist(x, plot=FALSE, breaks = c(10,20,30,40,50,60))
> h
$breaks
[1] 10 20 30 40 50 60

$counts
[1] 6 8 4 5 7

$intensities
[1] 0.02000000 0.02666667 0.01333333 0.01666667 0.02333333

$density
[1] 0.02000000 0.02666667 0.01333333 0.01666667 0.02333333

$mids
[1] 15 25 35 45 55

$xname
[1] "x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"
> h$counts 
[1] 6 8 4 5 7

Even if you don't know the breaks, you can use hist with plot=FALSE and get decent results, since the default for breaks is "Sturges".

> h2 <- hist(x, plot=FALSE)
> h2$breaks
[1] 10 20 30 40 50 60
> h2$counts
[1] 6 8 4 5 7
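To turn the hist result into a table shaped like the one in the question, the breaks and counts can be combined into a data frame. This is a rough sketch, not the only way to do it: the label format assumes data recorded to two decimals, right = FALSE is set so the counted intervals are left-closed and match the "(10.00 - 19.99)" style labels, and the names lower, upper and freq are illustrative.

```r
# The same values as above, as a plain numeric vector
x <- c(10.1, 46.65, 53.6, 38.5, 45.95, 12.25, 59.6, 23.3, 11.05, 58.35,
       40.2, 11.05, 10.45, 26.45, 13.25, 21.15, 35, 29.05, 25.4, 47.2,
       42.45, 57.3, 55.65, 56.5, 26.95, 59.65, 32.1, 29, 34.75, 21.65)

# Compute the histogram without drawing it; right = FALSE counts [a, b) cells
h <- hist(x, plot = FALSE, breaks = seq(10, 60, by = 10), right = FALSE)

# Lower and upper edge of each cell
lower <- head(h$breaks, -1)
upper <- tail(h$breaks, -1)

# Build labels like "(10.00 - 19.99)"; the 0.01 step assumes two-decimal data
freq <- data.frame(
  Class = sprintf("(%.2f - %.2f)", lower, upper - 0.01),
  Frequency = h$counts
)
print(freq)
```

hist keeps the topmost break in the last cell (include.lowest defaults to TRUE), so a value of exactly 60 would still be counted there even though the label reads 59.99.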
