Frequencies and histogram of values in a variable in R
First, I have five variables in separate columns, each coded 0/1, where 1 means the condition is TRUE. I need to label each 1, gather the results into a single column, and make a histogram that shows the data with the labels.
I've been working with an "Ethnic identification" variable that has five possible options: "MAYA", "LADINO", "GARIFUNA", "XINKA", "EXTRANJERO". In my database each option is in a different column coded 0/1. I've tried replacing those 1s with a different value per column ("MAYA=1", "LADINO=2", "GARIFUNA=3", etc.) to differentiate each value, but I got lost on what to do next.
#ID_ETNICO<- CAPITAL
str(ID_ETNICO)
$ IEE_MAYA : int 1 0 0 0 1 1 0 1
$ IEE_LADINO : int 0 1 0 0 0 0 1 0
$ IEE_GARIFUNA : int 0 0 1 0 0 0 0 0
$ IEE_XINKA : int 0 0 0 1 0 0 0 0
$ IEE_EXTRANJERO : int 0 0 0 0 0 0 0 0
ID_ETNICO$IEE_LADINO[ID_ETNICO$IEE_LADINO=="1"] <- 2
ID_ETNICO$IEE_GARIFUNA[ID_ETNICO$IEE_GARIFUNA=="1"] <- 3
ID_ETNICO$IEE_XINKA[ID_ETNICO$IEE_XINKA=="1"] <- 4
ID_ETNICO$IEE_EXTRANJERO[ID_ETNICO$IEE_EXTRANJERO=="1"] <- 5
$ IEE_MAYA : int 1 0 0 0 1 1 0
$ IEE_LADINO : num 0 2 0 0 0 0 2
$ IEE_GARIFUNA : num 0 0 3 0 0 0 0
$ IEE_XINKA : num 0 0 0 4 0 0 0
$ IEE_EXTRANJERO : num 0 0 0 0 0 0 0
table(ID_ETNICO$IEE_MAYA)
table(ID_ETNICO$IEE_LADINO)
table(ID_ETNICO$IEE_GARIFUNA)
table(ID_ETNICO$IEE_XINKA)
table(ID_ETNICO$IEE_EXTRANJERO)
table(ID_ETNICO$IEE_MAYA)
    0     1
27533  5263
table(ID_ETNICO$IEE_LADINO)
    0     2
 6354 26442
table(ID_ETNICO$IEE_GARIFUNA)
    0     3
32593   203
table(ID_ETNICO$IEE_XINKA)
    0     4
32649   147
table(ID_ETNICO$IEE_EXTRANJERO)
    0     5
32576   220
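As a side note, the five separate table() calls can probably be condensed into one step. This is only a minimal sketch on toy data shaped like the str() output above (the values are illustrative, not the real dataset), and the name conteos is made up for the example. It counts the non-zero entries per column, so it should work both before and after the recoding.

```r
# Toy data in the same shape as the question (illustrative values only)
ID_ETNICO <- data.frame(
  IEE_MAYA       = c(1, 0, 0, 0, 1, 1, 0, 1),
  IEE_LADINO     = c(0, 2, 0, 0, 0, 0, 2, 0),
  IEE_GARIFUNA   = c(0, 0, 3, 0, 0, 0, 0, 0),
  IEE_XINKA      = c(0, 0, 0, 4, 0, 0, 0, 0),
  IEE_EXTRANJERO = c(0, 0, 0, 0, 0, 0, 0, 0)
)

# ID_ETNICO != 0 is a logical matrix; colSums() counts the TRUEs per column
conteos <- colSums(ID_ETNICO != 0)
conteos

# Rows in which no option is marked at all
sin_marca <- sum(rowSums(ID_ETNICO != 0) == 0)
sin_marca
```

The second count is worth checking on the real data, because rows with no option marked need a decision (drop them or keep them as missing) before everything is merged into one column.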
Now I need to label the values ("1=MAYA", "2=LADINO", "3=GARIFUNA", "4=XINKA", "5=EXTRANJERO"), merge them into a single column, obtain the frequency of each label, and make a histogram.
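One possible way to continue from the recoded columns is sketched here on toy data. It assumes each row has at most one non-zero code, so the row sum equals that code; a row with two options marked would produce a wrong code (for example 1 + 2 = 3). The names codigo and frecuencias are made up for the example.

```r
# Toy data after the recoding step (illustrative values only)
ID_ETNICO <- data.frame(
  IEE_MAYA       = c(1, 0, 0, 0, 1, 1, 0, 1),
  IEE_LADINO     = c(0, 2, 0, 0, 0, 0, 2, 0),
  IEE_GARIFUNA   = c(0, 0, 3, 0, 0, 0, 0, 0),
  IEE_XINKA      = c(0, 0, 0, 4, 0, 0, 0, 0),
  IEE_EXTRANJERO = c(0, 0, 0, 0, 0, 0, 0, 0)
)

# With at most one non-zero code per row, the row sum is that code (0 if none)
codigo <- rowSums(ID_ETNICO)

# Attach the labels; a code outside 1..5 (such as 0) becomes NA
TIPO_ETNICO <- factor(
  codigo,
  levels = 1:5,
  labels = c("MAYA", "LADINO", "GARIFUNA", "XINKA", "EXTRANJERO")
)

# Frequencies per label, then a bar chart of those frequencies
frecuencias <- table(TIPO_ETNICO)
frecuencias
barplot(frecuencias, ylab = "Frequency")
```

Since the merged variable is categorical, barplot() on the frequency table is likely what is wanted here; hist() is meant for numeric data.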
Assuming each row is coded for only one ethnic identification, you can convert the dummy-coded variables into a single factor. Let me know if this is what you had in mind.
ID_ETNICO <- data.frame(
IEE_MAYA = c(1,0,0,0,1,1,0,1),
IEE_LADINO = c(0,1,0,0,0,0,1,0),
IEE_GARIFUNA = c(0,0,1,0,0,0,0,0),
IEE_XINKA = c(0,0,0,1,0,0,0,0),
IEE_EXTRANJERO = c(0,0,0,0,0,0,0,0)
)
# Remove IEE_ from column names
names(ID_ETNICO) <- substring(names(ID_ETNICO), 5)
# Change dummy variables to factor
TIPO_ETNICO <- factor(names(ID_ETNICO)[max.col(ID_ETNICO)])
# Show frequency table and bar plot
table(TIPO_ETNICO)
barplot(table(TIPO_ETNICO))
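One caveat about the real data: in the question's tables the non-zero counts add up to 32275 (5263 + 26442 + 203 + 147 + 220), while each column has 32796 rows, so at least 521 rows appear to have no option marked. max.col() breaks ties at random by default, so an all-zero row would be assigned to an arbitrary column. The sketch below is one possible guard, not the only one; the ninth row and the names etiquetas, marcas and indice are hypothetical additions for illustration.

```r
# The eight rows from above plus one hypothetical row with no option marked
ID_ETNICO <- data.frame(
  IEE_MAYA       = c(1, 0, 0, 0, 1, 1, 0, 1, 0),
  IEE_LADINO     = c(0, 1, 0, 0, 0, 0, 1, 0, 0),
  IEE_GARIFUNA   = c(0, 0, 1, 0, 0, 0, 0, 0, 0),
  IEE_XINKA      = c(0, 0, 0, 1, 0, 0, 0, 0, 0),
  IEE_EXTRANJERO = c(0, 0, 0, 0, 0, 0, 0, 0, 0)
)

# Labels taken from the column names without the IEE_ prefix
etiquetas <- sub("^IEE_", "", names(ID_ETNICO))

# Number of options marked in each row
marcas <- rowSums(ID_ETNICO)

# Column index of the marked option; "first" makes tie-breaking deterministic
indice <- max.col(as.matrix(ID_ETNICO), ties.method = "first")

# Rows with zero or several marks are ambiguous, so set them to NA
indice[marcas != 1] <- NA

# Fixing the levels keeps empty categories (here EXTRANJERO) in the table
TIPO_ETNICO <- factor(etiquetas[indice], levels = etiquetas)

table(TIPO_ETNICO, useNA = "ifany")
barplot(table(TIPO_ETNICO))
```

Whether ambiguous rows should become NA or be dropped depends on how the survey was coded, so treat that line as a placeholder for your own rule.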