
Frequencies and histogram of values in a variable in R

First, I have five variables distributed in columns, each coded 0/1, where a 1 means the condition is TRUE. I need to label each 1, gather them all into a single column, and make a histogram that shows the data with the labels.

I've been working with an "Ethnic identification" variable. There are five possible options: "MAYA", "LADINO", "GARIFUNA", "XINKA", "EXTRANJERO". In my dataset each option is in a different column coded 0/1. I've tried to change those 1s to different values as follows: "MAYA=1", "LADINO=2", "GARIFUNA=3", etc., to differentiate each value, but I got lost on what to do next.

    # ID_ETNICO <- CAPITAL
    str(ID_ETNICO)

    $ IEE_MAYA      : int  1 0 0 0 1 1 0 1
    $ IEE_LADINO    : int  0 1 0 0 0 0 1 0
    $ IEE_GARIFUNA  : int  0 0 1 0 0 0 0 0
    $ IEE_XINKA     : int  0 0 0 1 0 0 0 0
    $ IEE_EXTRANJERO: int  0 0 0 0 0 0 0 0



    # IEE_MAYA keeps its value of 1; the other columns are recoded to 2..5
    ID_ETNICO$IEE_LADINO[ID_ETNICO$IEE_LADINO == 1] <- 2
    ID_ETNICO$IEE_GARIFUNA[ID_ETNICO$IEE_GARIFUNA == 1] <- 3
    ID_ETNICO$IEE_XINKA[ID_ETNICO$IEE_XINKA == 1] <- 4
    ID_ETNICO$IEE_EXTRANJERO[ID_ETNICO$IEE_EXTRANJERO == 1] <- 5


    $ IEE_MAYA      : int  1 0 0 0 1 1 0
    $ IEE_LADINO    : num  0 2 0 0 0 0 2
    $ IEE_GARIFUNA  : num  0 0 3 0 0 0 0
    $ IEE_XINKA     : num  0 0 0 4 0 0 0
    $ IEE_EXTRANJERO: num  0 0 0 0 0 0 0


    table(ID_ETNICO$IEE_MAYA)
    table(ID_ETNICO$IEE_LADINO)
    table(ID_ETNICO$IEE_GARIFUNA)
    table(ID_ETNICO$IEE_XINKA)
    table(ID_ETNICO$IEE_EXTRANJERO)


    table(ID_ETNICO$IEE_MAYA)

        0     1
    27533  5263

    table(ID_ETNICO$IEE_LADINO)

        0     2
     6354 26442

    table(ID_ETNICO$IEE_GARIFUNA)

        0     3
    32593   203

    table(ID_ETNICO$IEE_XINKA)

        0     4
    32649   147

    table(ID_ETNICO$IEE_EXTRANJERO)

        0     5
    32576   220

Now I need to label the values ("1=MAYA", "2=LADINO", "3=GARIFUNA", "4=XINKA", "5=EXTRANJERO"), merge them into a single column, obtain the frequency of each label, and make a histogram.

Assuming each row is coded for exactly one ethnic identification, you can convert the multiple dummy-coded variables into a single factor. Let me know if this is what you had in mind.
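It may be worth checking that assumption before converting. A minimal sketch, using a small made-up data frame (`dummies`, `n_flagged`, and the values are hypothetical, not the asker's data), counts how many 1s each row contains:

```r
# Hypothetical toy data: row 3 has no category flagged, row 4 has two
dummies <- data.frame(
  IEE_MAYA   = c(1, 0, 0, 1),
  IEE_LADINO = c(0, 1, 0, 1),
  IEE_XINKA  = c(0, 0, 0, 0)
)

# Number of categories flagged in each row
n_flagged <- rowSums(dummies == 1)

# Distribution of that count; every row should fall under "1"
# if the data hold exactly one identification per respondent
table(n_flagged)

# Row numbers that break the assumption
which(n_flagged != 1)
```

In the frequency tables from the question, the five non-zero counts add up to 32275 while each column has 32796 rows, so at least 521 rows have no 1 in any column. The assumption therefore does not seem to hold for every row of the full data.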

ID_ETNICO <- data.frame(
  IEE_MAYA = c(1,0,0,0,1,1,0,1),
  IEE_LADINO = c(0,1,0,0,0,0,1,0),
  IEE_GARIFUNA = c(0,0,1,0,0,0,0,0),
  IEE_XINKA = c(0,0,0,1,0,0,0,0),
  IEE_EXTRANJERO = c(0,0,0,0,0,0,0,0)
)

# Remove IEE_ from column names
names(ID_ETNICO) <- substring(names(ID_ETNICO), 5)

# Change dummy variables to factor
TIPO_ETNICO <- factor(names(ID_ETNICO)[max.col(ID_ETNICO)])

# Show frequency table and bar plot
table(TIPO_ETNICO)
barplot(table(TIPO_ETNICO))
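
One caveat: `max.col()` breaks ties at random by default, so a row with no 1 at all (all zeros) would be assigned an arbitrary category. A hedged variant, assuming such rows should be treated as missing, sets them to `NA` and fixes the factor levels so that empty categories such as EXTRANJERO still show up with a count of 0. The names `dummies`, `n_flagged`, `idx`, `tipo`, and `freq` are illustrative, and the data are the example values above plus one all-zero row added for demonstration:

```r
# Example values from above, plus a ninth all-zero row (added for illustration)
dummies <- data.frame(
  MAYA       = c(1, 0, 0, 0, 1, 1, 0, 1, 0),
  LADINO     = c(0, 1, 0, 0, 0, 0, 1, 0, 0),
  GARIFUNA   = c(0, 0, 1, 0, 0, 0, 0, 0, 0),
  XINKA      = c(0, 0, 0, 1, 0, 0, 0, 0, 0),
  EXTRANJERO = c(0, 0, 0, 0, 0, 0, 0, 0, 0)
)

# Number of categories flagged in each row
n_flagged <- rowSums(dummies == 1)

# Column index of the first maximum in each row (deterministic tie-breaking)
idx <- max.col(as.matrix(dummies), ties.method = "first")

# Keep the index only where exactly one category is flagged
idx[n_flagged != 1] <- NA

# Explicit levels keep the column order and retain empty categories
tipo <- factor(names(dummies)[idx], levels = names(dummies))

# Frequency table (NA rows are excluded by default) and bar plot
freq <- table(tipo)
freq
barplot(freq, las = 2, ylab = "Frequency")

# Number of rows that could not be classified
sum(is.na(tipo))
```

Because the resulting variable is categorical, a bar plot of the counts is the natural display here; `hist()` expects numeric data.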


 