Count independent values in different columns in a data frame in R

I have a data frame with several columns. I want to get the frequency of different variables, check how those frequencies change depending on one or two parameters, and compare the rows with known ids against the rows where the id is missing (NA).

The object id is always known, but there are cases where rq_ind is missing, and those are the interesting ones.

Basically, I need the ratio of the Nielsen area class counts for available objects to the Nielsen area class counts for all objects, missing and not missing. The missing ones are the rows where inq_onr_id is NA but the object id is still available.

    rq_id   rq_object_id  inq_onr_id  inq_id  Nielsen_class  age_class  revenue_class  employee_class
    157467  19750137      19750137    NA      3              3          4              2
    157467  19750137      19750137    NA      3              3          4              2
    423008  19750137      NA          NA      3              3          4              2
    423008  19750137      NA          NA      3              3          4              2
    157467  19750137      NA          NA      3              2          4              2

    B1_fourth3month19short <- data.frame(rq_id,
                                         rq_object_id,
                                         inq_onr_id,
                                         inq_id,
                                         nielsen_area,
                                         Employeeclass)

In principle, all columns are factors.

What I want to find out is how the frequency of rq_object_id per Nielsen area changes between the rows where the onr id is missing and the rows where it is available.

What you need is table():

out <- table(df[,c(2,3,5)],useNA = "ifany")

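where df is your initial data frame. Columns 2, 3 and 5 are rq_object_id, inq_onr_id and Nielsen_class, and useNA = "ifany" makes table() count NA as its own level wherever it occurs.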

Output:

> out
, , Nielsen_class = 3

            inq_onr_id
rq_object_id 19750137 <NA>
    19750137        2    3
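
For anyone who wants to reproduce the count, here is a minimal sketch that rebuilds the five sample rows from the question and runs the same table() call. The column names follow the printed output (Nielsen_class); the numeric column types are an assumption, since the question says the real columns are factors.

```r
# Rebuild the five sample rows from the question (first five columns only).
# Assumption: plain numeric columns; the real data reportedly uses factors.
df <- data.frame(
  rq_id         = c(157467, 157467, 423008, 423008, 157467),
  rq_object_id  = rep(19750137, 5),
  inq_onr_id    = c(19750137, 19750137, NA, NA, NA),
  inq_id        = NA,
  Nielsen_class = rep(3, 5)
)

# Cross-tabulate object id, onr id and Nielsen class.
# useNA = "ifany" adds an <NA> level only where NA values actually occur.
out <- table(df[, c(2, 3, 5)], useNA = "ifany")
print(out)
```

Selecting the columns by name, for example df[, c("rq_object_id", "inq_onr_id", "Nielsen_class")], is usually safer than using positions if the column order might change.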

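To get percentages, wrap the table in prop.table(). Without a margin argument the shares are relative to the grand total, which here equals the row share because the table has only one row: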

out.percent <- prop.table(table(df[,c(2,3,5)],useNA = "ifany"))*100

Output in percent:

> out.percent
, , Nielsen_class = 3

            inq_onr_id
rq_object_id 19750137 <NA>
    19750137       40   60
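
With more than one Nielsen class, the grand-total percentages no longer answer the "available vs. all" question per class. The following is a rough sketch of one way to get per-class shares, assuming the same sample rows; the helper column onr_status and the variable names are made up for illustration and are not part of the original data.

```r
# Sample rows from the question, reduced to the columns needed here.
df <- data.frame(
  rq_object_id  = rep(19750137, 5),
  inq_onr_id    = c(19750137, 19750137, NA, NA, NA),
  Nielsen_class = rep(3, 5)
)

# Hypothetical helper column: is the onr id known for this row?
df$onr_status <- ifelse(is.na(df$inq_onr_id), "missing", "available")

# Counts per Nielsen class (rows) and onr status (columns).
counts <- table(Nielsen_class = df$Nielsen_class, onr_status = df$onr_status)

# margin = 1 normalises within each row, so every Nielsen class sums to 100.
row_percent <- prop.table(counts, margin = 1) * 100
print(row_percent)

# Share of rows with a known onr id among all rows, per Nielsen class:
# the mean of a logical vector is the proportion of TRUE values.
available_share <- tapply(!is.na(df$inq_onr_id), df$Nielsen_class, mean)
print(available_share)
```

On the five sample rows this gives the same 40/60 split as above, since there is only one Nielsen class; on the full data each class would get its own row.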
