Plots overwriting, so at the end of the R loop all have pulled from the last element of the list. What am I doing wrong?
I've been stuck on this loop for a while now (as seen by my question history), but I think I'm getting close to fixing it, thanks in large part to the help I've gotten on Stack Overflow.
I noticed that in my plots, every plot uses data_percentage_list[391], the last element in the list. I've tried a bunch of things to stop that from occurring, but with the code below:
# Create graphs in list
# Create titles for plots
titlenames <- c(harps)
for (i in 1:length(harps)){
  counts <- table(Y[[i]][[5]], Y[[i]][[3]])
  nam <- paste("data_percentage_", i, sep ="")
  assign(nam, apply(counts, 2, function(x){x*100/sum(x,na.rm=T)}))
}
data_percentage_list <- lapply(paste0("data_percentage_",1:length(harps)), get)
# Create pdf of score breakdown
for (i in 1:length(harps)){ for(j in titlenames) {
  # For Hotel Name Subtitle
  hotelname <- hotel_report$`Hotel (Q15 1)`[hotel_report$`Harp Number`==j]
  # Plot the Data
  pdf(file = paste0(j, ".pdf"), paper = "USr", width=8, height=7)
  par(mar = c(5.1, 7, 4.1, 2.1))
  nam <- paste("breakdown_", i, sep ="")
  assign(nam, barplot(data_percentage_list[[i]], main = "Breakdown of Property Score Distribution", sub = hotelname,
                      col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
                      cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8))
  dev.off()
}}
where length(harps) is 391, so there are 391 plots, the plots overwrite each other as the loop runs. When I open one of the PDFs and refresh it, it changes to the latest iteration of the loop, until at the end they all contain the 391st property's data, with the correct hotel name since that's pulled from j.
Does anyone know how I need to alter my code to get each plot to correspond to the correct data? Meaning, breakdown_54 should use data_percentage_list[54] and be saved as a PDF of that data, breakdown_55 should use data_percentage_list[55], and so on?
Thank you!
Edit: Following up after working on it some more.
The code below makes 391 different graphs, but each of the 391 PDFs contains all 391 graphs instead of just its own graph, as it should.
Is it easier to split up these PDFs correctly in this code than to fix the code above?
# Create graphs in list
# Create titles for plots
titlenames <- c(harps)
for (i in 1:length(harps)){
  counts <- table(Y[[i]][[5]], Y[[i]][[3]])
  nam <- paste("data_percentage_", i, sep ="")
  assign(nam, apply(counts, 2, function(x){x*100/sum(x,na.rm=T)}))
}
data_percentage_list <- lapply(paste0("data_percentage_",1:length(harps)), get)
# Create pdf of score breakdown
for (i in 1:length(harps)){ for(j in titlenames) {
  # For Hotel Name Subtitle
  hotelname <- hotel_report$`Hotel (Q15 1)`[hotel_report$`Harp Number`==j]
  # Plot the Data
  pdf(file = paste0(j, ".pdf"), paper = "USr", width=8, height=7)
  par(mar = c(5.1, 7, 4.1, 2.1))
  nam <- paste("breakdown_", i, sep ="")
  breakdown_list <- lapply(1:length(harps), function(i){
    assign(nam, barplot(data_percentage_list[[i]], main = "Breakdown of Property Score Distribution", sub = hotelname,
                        col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
                        cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8))})
  dev.off()
}}
Thanks again!
Edit 2: In an attempt to make this more reproducible:
Y is a list of 391 data frames.
Below is the dput output of one of the 391 data frames in Y.
structure(list(`Hotel (Q15 1)` = c("HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE",
"HILTON, SAN PEDRO, BELIZE"), `Metro Area State (Q10 1)` = c("OCONUS",
"OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS",
"OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS",
"OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS",
"OCONUS"), `Question ID` = c("Room Work Area", "Staff Knowledge",
"Add'tl Item Working Order", "Property Maintenance", "Property Appearance",
"Staff Knowledge", "Property Appearance", "Staff Interaction",
"Safety/Security", "Add'tl Item Working Order", "Room Work Area",
"Bed Quality", "Check In/Out", "Invoice Accuracy", "Staff Interaction",
"Safety/Security", "Bed Quality", "Invoice Accuracy", "Check In/Out",
"Safety/Security", "Invoice Accuracy", "Bed Quality", "Property Maintenance"
), `Question ID (group)` = c("Question 4 Items", "Question 4 Items",
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items",
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items",
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items",
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items",
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items",
"Question 4 Items"), `Score Label` = c("7 Extremely Good", "7 Extremely Good",
"7 Extremely Good", "7 Extremely Good", "7 Extremely Good", "6 Quite Good",
"6 Quite Good", "6 Quite Good", "6 Quite Good", "6 Quite Good",
"6 Quite Good", "6 Quite Good", "6 Quite Good", "7 Extremely Good",
"7 Extremely Good", "5 Slightly Good", "7 Extremely Good", "6 Quite Good",
"7 Extremely Good", "7 Extremely Good", "3 Slightly Poor", "5 Slightly Good",
"6 Quite Good"), `Harp Number` = c("1111", "1111", "1111", "1111",
"1111", "1111", "1111", "1111", "1111", "1111", "1111", "1111",
"1111", "1111", "1111", "1111", "1111", "1111", "1111", "1111",
"1111", "1111", "1111")), row.names = c(9380L, 9381L, 9383L,
9384L, 9385L, 9387L, 9388L, 9389L, 9390L, 9391L, 9392L, 9393L,
9394L, 9395L, 9396L, 9399L, 9402L, 9403L, 9404L, 9405L, 9407L,
9408L, 9411L), class = "data.frame")
And below is dput(harps):
dput(harps)
c("1111", "1696", "3279", "5646", "5724", "5938", "6887", "8859",
"9368", "9508", "11569", "11644", "18661", "21418", "22460",
"23317", "25755", "26076", "26336", "28917", "29497", "29498",
"30465", "30619", "30629", "32784", "35578", "35588", "40390",
"40866", "47493", "47677", "47866", "48064", "48294", "50432",
"50667", "50773", "51857", "52125", "52146", "52383", "52432",
"52451", "52755", "53589", "53620", "56939", "57784", "59571",
"61276", "61283", "62329", "62666", "66058", "66553", "66741",
"66763", "67092", "67169", "67214", "67373", "67840", "69494",
"71343", "73906", "74550", "75285", "76253", "76335", "76361",
"76393", "76396", "76898", "76949", "78501", "78800", "80079",
"81035", "81620", "85043", "87026", "87219", "87304", "88683",
"89650", "92759", "94380", "94427", "95043", "95255", "96061",
"96677", "97269", "100135", "109591", "109743", "109971", "110414",
"110856", "110884", "110899", "110926", "111032", "111384", "111605",
"123136", "123411", "124380", "124753", "124848", "127565", "135185",
"135999", "136005", "138251", "140027", "140074", "140091", "140095",
"140159", "145523", "148284", "149639", "153676", "154790", "157239",
"158213", "158259", "159248", "159343", "159401", "159842", "161219",
"161725", "163154", "163653", "167172", "170199", "171936", "172095",
"172272", "172273", "172340", "172868", "173429", "173816", "175033",
"177012", "177150", "177361", "177383", "177692", "177892", "177965",
"179887", "180495", "182189", "182979", "183174", "183717", "183879",
"184076", "185191", "185341", "185675", "185961", "189276", "190279",
"190896", "192388", "192984", "193387", "193441", "193526", "193534",
"193605", "193613", "193614", "194274", "194794", "196133", "196546",
"197075", "197647", "198115", "200996", "201627", "202124", "202992",
"205802", "206405", "206880", "206990", "207423", "207483", "207723",
"208210", "208943", "209614", "210006", "211605", "211985", "212714",
"213707", "213803", "213842", "215961", "216533", "217963", "218029",
"218348", "218376", "221745", "222179", "222299", "222399", "222736",
"222882", "224539", "224624", "225339", "225346", "225368", "225553",
"225565", "225572", "225573", "226003", "228325", "229582", "229614",
"230871", "231228", "231402", "235196", "235538", "239409", "241353",
"244587", "244654", "245353", "246093", "246311", "247209", "251084",
"253732", "254388", "256996", "258464", "260958", "261655", "262754",
"263192", "263444", "265835", "269872", "270285", "271683", "271687",
"272664", "275922", "276312", "279909", "287731", "291167", "291988",
"296004", "297975", "298318", "298401", "300962", "301940", "302250",
"302702", "304896", "308049", "311490", "312027", "313227", "313603",
"315536", "319957", "320049", "320270", "320352", "327521", "330319",
"331054", "332070", "332426", "334213", "341876", "345820", "346263",
"346723", "347340", "352596", "354486", "396465", "445549", "473263",
"482701", "496665", "503123", "503365", "528259", "538396", "539834",
"540896", "546228", "546290", "546652", "546922", "548916", "550479",
"552466", "709416", "714793", "714861", "716337", "719021", "728913",
"731082", "732346", "733242", "735165", "735348", "735473", "749296",
"757777", "761782", "762104", "770251", "808540", "809896", "809951",
"812527", "816275", "837926", "842678", "843836", "847737", "857277",
"864044", "864495", "865468", "865951", "866108", "866502", "866547",
"867803", "867809", "868374", "868420", "868593", "868793", "869746",
"869748", "870953", "872490", "872579", "875200", "875288", "878016",
"878858", "879328", "879640", "882643", "882781", "883894", "886067",
"886876", "888522", "888560", "888820", "889693", "890261", "890264",
"891171", "894931", "896794", "896840", "899485", "901218", "903465",
"904381", "912517", "913354", "918968", "921083")
Consider the following general tips for R, and perhaps for programming in general:
Variables: Avoid creating too many variables; instead, work directly on existing objects. This keeps the environment easier to maintain. Some examples of redundancy include:
titlenames <- c(harps)
nam <- paste("data_percentage_", i, sep ="")
data_percentage_list <- lapply(paste0("data_percentage_",1:length(harps)), get)
Names: Use more informative names for objects, as Y tells nothing to code readers, or to yourself in the future. It appears to be a list that contains subsets of the larger data frame hotel_report. A more informative name like hotel_reports_df_list quickly conveys its contents and type (i.e., data frames within a list).
Indentation: Always indent code inside for loops (which can be automated in RStudio with Ctrl/Cmd + I) and even inside blocks that act like context managers, such as pdf, with, etc. This enhances readability and maintainability.
Assign/Get: Avoid assign and get, which are usually not recommended in R. Instead, save your objects directly as items in lists. The first loop can bypass the need to assign child items as separate variables:
data_pct_matrix_list <- lapply(seq_along(harps), function(i) {
  counts <- table(Y[[i]][[5]], Y[[i]][[3]])
  pct_matrix <- apply(counts, 2, function(x) { x*100/sum(x, na.rm=TRUE) })
  return(pct_matrix)
})
The last assign, wrapped around barplot, can also be refactored:
plot_list <- lapply(data_pct_matrix_list, function(mat) {
  barplot(mat, main = "Breakdown of Property Score Distribution", sub = hotelname,
          col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
          cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8)
})
Loops: Avoid multiple for loops or nested loops as much as possible. In R, lapply is a hidden loop. Your issue of 391 plots in each of the 391 PDFs is likely due to the lapply nested within a for loop. Consider these steps:
R's apply family includes more than just apply and lapply, such as mapply, which can loop elementwise to flatten your nested iterations, or by (an object-oriented wrapper to tapply), which can subset data frames by factor columns and run operations on each subset.
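To make the overwriting concrete, here is a minimal sketch with made-up stand-ins (ids, vals, nested, and paired are hypothetical names, not objects from the question). It replaces the PDF files with slots in a named list, so it only illustrates the pairing logic, not the plotting itself.

```r
# Toy stand-ins (hypothetical): three ids and one value per id
ids  <- c("a", "b", "c")
vals <- list(1, 2, 3)

# Nested loops: every id is rewritten on every pass of the outer loop,
# so each slot ends up holding the value from the final i
nested <- list()
for (i in seq_along(vals)) {
  for (j in ids) {
    nested[[j]] <- vals[[i]]
  }
}

# Elementwise iteration: the k-th value is paired only with the k-th id
paired <- Map(function(v, id) v, vals, ids)
names(paired) <- ids

print(unlist(nested))  # every slot holds the last value
print(unlist(paired))  # each slot holds its own value
```

The nested version is analogous to writing j.pdf once per value of i, which would explain why every file ends up showing the last element.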
Without seeing sample data, consider the following approaches, which will need to be tested against your data. The code below assumes Y is defined as a list of subsets of the hotel_report data frame, split by Harp Number.
mapply / Map approach
Iterate elementwise between the equal-length objects data_pct_matrix_list and harps.
data_pct_matrix_list <- lapply(seq_along(harps), function(i) {
  counts <- table(Y[[i]]$`Score Label`, Y[[i]]$`Question ID`)
  pct_matrix <- apply(counts, 2, function(x) { x*100/sum(x, na.rm=TRUE) })
  return(pct_matrix)
})
build_pdf <- function(data, harp) {
  # For Hotel Name Subtitle
  hotelname <- hotel_report$`Hotel (Q15 1)`[hotel_report$`Harp Number` == harp]
  # Plot the Data
  pdf(file = paste0(harp, ".pdf"), paper = "USr", width=8, height=7)
  par(mar = c(5.1, 7, 4.1, 2.1))
  hotel_plot <- barplot(data, main = "Breakdown of Property Score Distribution", sub = hotelname,
                        col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
                        cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8)
  dev.off()
  return(hotel_plot)
}
plot_list <- Map(build_pdf, data_pct_matrix_list, harps)
# EQUIVALENTLY:
plot_list <- mapply(build_pdf, data_pct_matrix_list, harps, SIMPLIFY=FALSE)
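For a quick check that the elementwise approach produces one file per id, the sketch below runs the same pattern on made-up data. Everything in it (toy_ids, toy_mats, write_toy_pdf, and writing to tempdir()) is a hypothetical stand-in rather than part of the original code. It also uses on.exit(dev.off()) so the device is closed even if barplot fails partway.

```r
# Hypothetical toy inputs: two small percentage matrices and two ids
toy_ids <- c("1111", "1696")
score_labels <- c("6 Quite Good", "7 Extremely Good")
questions <- c("Bed Quality", "Check In/Out")
toy_mats <- list(
  matrix(c(25, 75, 50, 50), nrow = 2, dimnames = list(score_labels, questions)),
  matrix(c(100, 0, 40, 60), nrow = 2, dimnames = list(score_labels, questions))
)
out_dir <- tempdir()  # write to a temporary folder, not the working directory

write_toy_pdf <- function(mat, id) {
  path <- file.path(out_dir, paste0(id, ".pdf"))
  pdf(file = path, width = 8, height = 7)
  on.exit(dev.off())  # close the device even if barplot() errors
  barplot(mat, horiz = TRUE, las = 1, xlab = "Percentage", main = id)
  path
}

# One call per (matrix, id) pair, so each file gets only its own plot
toy_paths <- Map(write_toy_pdf, toy_mats, toy_ids)
print(file.exists(unlist(toy_paths)))
```

Opening the two files in out_dir should show one bar chart each, titled with its own id.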
by approach
Subset the hotel_report data frame by unique Harp Number and iteratively run on each subset to build pct_matrix and hotel_plot. This approach combines the matrix build and plot steps.
build_pdf <- function(sub_df) {
  # Matrix build
  counts <- table(sub_df$`Score Label`, sub_df$`Question ID`)
  pct_matrix <- apply(counts, 2, function(x) { x*100/sum(x, na.rm=TRUE) })
  # For Hotel Name Subtitle
  hotelname <- sub_df$`Hotel (Q15 1)`[1]
  harp <- sub_df$`Harp Number`[1]
  # Plot the Data
  pdf(file = paste0(harp, ".pdf"), paper = "USr", width=8, height=7)
  par(mar = c(5.1, 7, 4.1, 2.1))
  hotel_plot <- barplot(pct_matrix, main = "Breakdown of Property Score Distribution", sub = hotelname,
                        col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
                        cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8)
  dev.off()
  return(hotel_plot)
}
plot_list <- by(hotel_report, hotel_report$`Harp Number`, build_pdf)
# NEAR EQUIVALENT
plot_list <- lapply(split(hotel_report, hotel_report$`Harp Number`), build_pdf)
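The split step can be tried on its own before adding any plotting. The sketch below uses a made-up data frame (toy_report is hypothetical and only borrows the column names from the question). It also swaps the apply call for prop.table, a base R function that computes the same column proportions; that swap is a substitution for illustration, not what the code above uses.

```r
# Hypothetical toy data that mimics the column names in the question
toy_report <- data.frame(
  `Harp Number` = c("1111", "1111", "1111", "1696", "1696"),
  `Question ID` = c("Bed Quality", "Bed Quality", "Check In/Out",
                    "Bed Quality", "Check In/Out"),
  `Score Label` = c("6 Quite Good", "7 Extremely Good", "7 Extremely Good",
                    "5 Slightly Good", "6 Quite Good"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# One percentage matrix per Harp Number, named by the Harp Number
pct_by_harp <- lapply(
  split(toy_report, toy_report$`Harp Number`),
  function(sub_df) {
    counts <- table(sub_df$`Score Label`, sub_df$`Question ID`)
    prop.table(counts, margin = 2) * 100  # column percentages
  }
)

print(names(pct_by_harp))
print(colSums(pct_by_harp[["1111"]]))  # each question's column sums to 100
```

One difference between the two calls above is the return type: by returns an object of class "by", while lapply over split returns a plain named list, which is often easier to index afterwards.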