Plots overwriting, so at the end of the R loop all have pulled from last element of list. What am I doing wrong?

I've been stuck on this loop for a while now (as seen by my question history), but I think I'm getting close to fixing it, thanks in large part to the help I've gotten on Stack Overflow.

I noticed that every one of my plots uses data_percentage_list[391], the last element in the list. I've tried a number of things to stop that from happening, but it still occurs with the code below:

# Create graphs in list

# Create titles for plots
titlenames <- c(harps)

 for (i in 1:length(harps)){

counts <- table(Y[[i]][[5]], Y[[i]][[3]])
nam <- paste("data_percentage_", i, sep ="")
assign(nam, apply(counts, 2, function(x){x*100/sum(x,na.rm=T)}))
 }

data_percentage_list <- lapply(paste0("data_percentage_",1:length(harps)), get)

# Create pdf of score breakdown
for (i in 1:length(harps)){ for(j in titlenames) {

# For Hotel Name Subtitle
hotelname <- hotel_report$`Hotel (Q15 1)`[hotel_report$`Harp Number`==j]

# Plot the Data 

pdf(file = paste0(j, ".pdf"), paper = "USr", width=8, height=7)
par(mar = c(5.1, 7, 4.1, 2.1))
nam <- paste("breakdown_", i, sep ="")
assign(nam, barplot(data_percentage_list[[i]], main = "Breakdown of Property Score Distribution", sub = hotelname, 
        col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
        cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8))
dev.off()
}}

where length(harps) is 391, so there are 391 plots. The plots overwrite one another as the loop runs: when I open one of the PDFs and refresh it, it changes to the latest iteration of the loop, and by the end every PDF shows the 391st property's data, though with the correct hotel name, since that is pulled from j.

Does anyone know how I need to alter my code so that each plot corresponds to the correct data? That is, breakdown_54 should use data_percentage_list[54] and be saved as a PDF of that data, breakdown_55 should use data_percentage_list[55], and so on.

Thank you!

Edit: Following up after working on it some more.

The code below makes 391 different graphs, but each of the 391 PDFs contains all 391 graphs instead of only its own.

Is it easier to split up these PDFs correctly in this code than to fix the code above?

# Create graphs in list

# Create titles for plots
titlenames <- c(harps)

 for (i in 1:length(harps)){

counts <- table(Y[[i]][[5]], Y[[i]][[3]])
nam <- paste("data_percentage_", i, sep ="")
assign(nam, apply(counts, 2, function(x){x*100/sum(x,na.rm=T)}))
 }

data_percentage_list <- lapply(paste0("data_percentage_",1:length(harps)), get)

# Create pdf of score breakdown
for (i in 1:length(harps)){ for(j in titlenames) {

# For Hotel Name Subtitle
hotelname <- hotel_report$`Hotel (Q15 1)`[hotel_report$`Harp Number`==j]

# Plot the Data 

pdf(file = paste0(j, ".pdf"), paper = "USr", width=8, height=7)
par(mar = c(5.1, 7, 4.1, 2.1))
nam <- paste("breakdown_", i, sep ="")
breakdown_list <- lapply(1:length(harps), function(i){
assign(nam, barplot(data_percentage_list[[i]], main = "Breakdown of Property Score Distribution", sub = hotelname, 
        col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
        cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8))})
dev.off()
}}

Thanks again!

Edit 2: In an attempt to make this more reproducible:

Y is a list of 391 data frames. Below is the dput of one of the 391 data frames in Y.

structure(list(`Hotel (Q15 1)` = c("HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", "HILTON, SAN PEDRO, BELIZE", 
"HILTON, SAN PEDRO, BELIZE"), `Metro Area State (Q10 1)` = c("OCONUS", 
"OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", 
"OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", 
"OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", "OCONUS", 
"OCONUS"), `Question ID` = c("Room Work Area", "Staff Knowledge", 
"Add'tl Item Working Order", "Property Maintenance", "Property Appearance", 
"Staff Knowledge", "Property Appearance", "Staff Interaction", 
"Safety/Security", "Add'tl Item Working Order", "Room Work Area", 
"Bed Quality", "Check In/Out", "Invoice Accuracy", "Staff Interaction", 
"Safety/Security", "Bed Quality", "Invoice Accuracy", "Check In/Out", 
"Safety/Security", "Invoice Accuracy", "Bed Quality", "Property Maintenance"
), `Question ID (group)` = c("Question 4 Items", "Question 4 Items", 
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items", 
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items", 
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items", 
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items", 
"Question 4 Items", "Question 4 Items", "Question 4 Items", "Question 4 Items", 
"Question 4 Items"), `Score Label` = c("7 Extremely Good", "7 Extremely Good", 
"7 Extremely Good", "7 Extremely Good", "7 Extremely Good", "6 Quite Good", 
"6 Quite Good", "6 Quite Good", "6 Quite Good", "6 Quite Good", 
"6 Quite Good", "6 Quite Good", "6 Quite Good", "7 Extremely Good", 
"7 Extremely Good", "5 Slightly Good", "7 Extremely Good", "6 Quite Good", 
"7 Extremely Good", "7 Extremely Good", "3 Slightly Poor", "5 Slightly Good", 
"6 Quite Good"), `Harp Number` = c("1111", "1111", "1111", "1111", 
"1111", "1111", "1111", "1111", "1111", "1111", "1111", "1111", 
"1111", "1111", "1111", "1111", "1111", "1111", "1111", "1111", 
"1111", "1111", "1111")), row.names = c(9380L, 9381L, 9383L, 
9384L, 9385L, 9387L, 9388L, 9389L, 9390L, 9391L, 9392L, 9393L, 
9394L, 9395L, 9396L, 9399L, 9402L, 9403L, 9404L, 9405L, 9407L, 
9408L, 9411L), class = "data.frame")

And below is dput(harps):

dput(harps)
c("1111", "1696", "3279", "5646", "5724", "5938", "6887", "8859", 
"9368", "9508", "11569", "11644", "18661", "21418", "22460", 
"23317", "25755", "26076", "26336", "28917", "29497", "29498", 
"30465", "30619", "30629", "32784", "35578", "35588", "40390", 
"40866", "47493", "47677", "47866", "48064", "48294", "50432", 
"50667", "50773", "51857", "52125", "52146", "52383", "52432", 
"52451", "52755", "53589", "53620", "56939", "57784", "59571", 
"61276", "61283", "62329", "62666", "66058", "66553", "66741", 
"66763", "67092", "67169", "67214", "67373", "67840", "69494", 
"71343", "73906", "74550", "75285", "76253", "76335", "76361", 
"76393", "76396", "76898", "76949", "78501", "78800", "80079", 
"81035", "81620", "85043", "87026", "87219", "87304", "88683", 
"89650", "92759", "94380", "94427", "95043", "95255", "96061", 
"96677", "97269", "100135", "109591", "109743", "109971", "110414", 
"110856", "110884", "110899", "110926", "111032", "111384", "111605", 
"123136", "123411", "124380", "124753", "124848", "127565", "135185", 
"135999", "136005", "138251", "140027", "140074", "140091", "140095", 
"140159", "145523", "148284", "149639", "153676", "154790", "157239", 
"158213", "158259", "159248", "159343", "159401", "159842", "161219", 
"161725", "163154", "163653", "167172", "170199", "171936", "172095", 
"172272", "172273", "172340", "172868", "173429", "173816", "175033", 
"177012", "177150", "177361", "177383", "177692", "177892", "177965", 
"179887", "180495", "182189", "182979", "183174", "183717", "183879", 
"184076", "185191", "185341", "185675", "185961", "189276", "190279", 
"190896", "192388", "192984", "193387", "193441", "193526", "193534", 
"193605", "193613", "193614", "194274", "194794", "196133", "196546", 
"197075", "197647", "198115", "200996", "201627", "202124", "202992", 
"205802", "206405", "206880", "206990", "207423", "207483", "207723", 
"208210", "208943", "209614", "210006", "211605", "211985", "212714", 
"213707", "213803", "213842", "215961", "216533", "217963", "218029", 
"218348", "218376", "221745", "222179", "222299", "222399", "222736", 
"222882", "224539", "224624", "225339", "225346", "225368", "225553", 
"225565", "225572", "225573", "226003", "228325", "229582", "229614", 
"230871", "231228", "231402", "235196", "235538", "239409", "241353", 
"244587", "244654", "245353", "246093", "246311", "247209", "251084", 
"253732", "254388", "256996", "258464", "260958", "261655", "262754", 
"263192", "263444", "265835", "269872", "270285", "271683", "271687", 
"272664", "275922", "276312", "279909", "287731", "291167", "291988", 
"296004", "297975", "298318", "298401", "300962", "301940", "302250", 
"302702", "304896", "308049", "311490", "312027", "313227", "313603", 
"315536", "319957", "320049", "320270", "320352", "327521", "330319", 
"331054", "332070", "332426", "334213", "341876", "345820", "346263", 
"346723", "347340", "352596", "354486", "396465", "445549", "473263", 
"482701", "496665", "503123", "503365", "528259", "538396", "539834", 
"540896", "546228", "546290", "546652", "546922", "548916", "550479", 
"552466", "709416", "714793", "714861", "716337", "719021", "728913", 
"731082", "732346", "733242", "735165", "735348", "735473", "749296", 
"757777", "761782", "762104", "770251", "808540", "809896", "809951", 
"812527", "816275", "837926", "842678", "843836", "847737", "857277", 
"864044", "864495", "865468", "865951", "866108", "866502", "866547", 
"867803", "867809", "868374", "868420", "868593", "868793", "869746", 
"869748", "870953", "872490", "872579", "875200", "875288", "878016", 
"878858", "879328", "879640", "882643", "882781", "883894", "886067", 
"886876", "888522", "888560", "888820", "889693", "890261", "890264", 
"891171", "894931", "896794", "896840", "899485", "901218", "903465", 
"904381", "912517", "913354", "918968", "921083")

Consider the following general tips for R, and arguably for programming in general:

  • Variables : Avoid creating too many variables; work directly on existing objects instead. This keeps the environment easier to maintain. Some examples of redundancy include:

     titlenames <- c(harps)
     nam <- paste("data_percentage_", i, sep ="")
     data_percentage_list <- lapply(paste0("data_percentage_",1:length(harps)), get)

  • Names : Use more informative names for objects, as Y tells neither code readers nor your future self anything. It appears to be a list containing subsets of the larger data frame hotel_report . A more informative name like hotel_reports_df_list quickly conveys its contents and type (i.e., data frames within a list).

  • Indentation : Always indent code inside for loops (RStudio can automate this with Ctrl / Cmd + I ) and even inside context-manager-like blocks such as pdf ... dev.off , with , etc. This enhances readability and maintainability.

  • Assign/Get : Avoid assign and get , which are usually not recommended in R. Instead, save your objects directly as items in a list. The first loop can bypass the need to assign child items as separate variables:

     data_pct_matrix_list <- lapply(seq_along(harps), function(i) {
         counts <- table(Y[[i]][[5]], Y[[i]][[3]])
         pct_matrix <- apply(counts, 2, function(x) { x*100/sum(x, na.rm=TRUE) })
         return(pct_matrix)
     })

    The last assign wrapped around barplot can be refactored as well:

     # Iterate over the list built above (data_pct_matrix_list)
     plot_list <- lapply(data_pct_matrix_list, function(mat) {
         barplot(mat, main = "Breakdown of Property Score Distribution", sub = hotelname,
                 col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs = "i", xlab = "Percentage",
                 cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8)
     })

  • Loops : Avoid multiple or nested for loops as much as possible. In R, lapply is a hidden loop. Your issue of 391 plots in each of the 391 PDFs is likely due to the lapply nested within a for loop. Likewise, in your first version the inner for (j in titlenames) loop rewrites every PDF on each pass of i, so all files end up holding the last element's data. Consider these steps:

    1. First, think about your process on one data frame object, and even generalize it in a separate function.
    2. Then, think about exactly what changes from one iteration to the next.

    R's apply family includes more than just apply and lapply : mapply can loop elementwise over several objects to flatten your nested iterations, and by (an object-oriented wrapper for tapply ) can subset data frames by factor columns and run operations on each subset.
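To see why nesting the loops makes every file end up with the last element's data, here is a minimal sketch with toy stand-ins. The names harps_toy, values_toy, written and paired are hypothetical, and a list assignment stands in for the pdf() / barplot() / dev.off() sequence:

```r
# Toy stand-ins (hypothetical): three harp IDs and one value per harp.
harps_toy  <- c("1111", "1696", "3279")
values_toy <- list(10, 20, 30)

# Nested loops: every "file" j is rewritten once per i,
# so each one ends up holding the value from the last i.
written <- list()
for (i in seq_along(harps_toy)) {
  for (j in harps_toy) {
    written[[j]] <- values_toy[[i]]  # stands in for pdf(j); barplot(...); dev.off()
  }
}
unlist(written)

# Single loop: the same index i selects both the file name and its data.
paired <- list()
for (i in seq_along(harps_toy)) {
  paired[[harps_toy[i]]] <- values_toy[[i]]
}
unlist(paired)
```

In the nested version all three entries hold 30, while the single loop keeps 10, 20 and 30 with their own IDs. The approaches below rely on that same one-to-one pairing.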


Without seeing sample data, consider the following approaches, which will need to be tested against your data. Both assume Y is a list of subsets of the hotel_report data frame, split by Harp Number .

mapply / Map approach

Iterate elementwise over the equal-length objects data_pct_matrix_list and harps .

data_pct_matrix_list <- lapply(seq_along(harps), function(i) {
    counts <- table(Y[[i]]$`Score Label`, Y[[i]]$`Question ID`)
    pct_matrix <- apply(counts, 2, function(x) { x*100/sum(x, na.rm=TRUE) }) 

    return(pct_matrix)
})

build_pdf <- function(data, harp) {
    # For Hotel Name Subtitle
    hotelname <- hotel_report$`Hotel (Q15 1)`[hotel_report$`Harp Number` == harp]

    # Plot the Data 
    pdf(file = paste0(harp, ".pdf"), paper = "USr", width=8, height=7)
        par(mar = c(5.1, 7, 4.1, 2.1))
        
        hotel_plot <- barplot(data, main = "Breakdown of Property Score Distribution", sub = hotelname, 
                              col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
                              cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8)
    dev.off()
    
    return(hotel_plot)
}

plot_list <- Map(build_pdf, data_pct_matrix_list, harps)

# EQUIVALENTLY:
plot_list <- mapply(build_pdf, data_pct_matrix_list, harps, SIMPLIFY=FALSE)  
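
Since the code above depends on objects that are not shown (hotel_report, coul), here is a small self-contained sketch of the same Map pattern that can be run as is. Everything in it (toy_harps, toy_matrices, toy_build_pdf, and writing into tempdir()) is a hypothetical stand-in rather than your real data:

```r
# Hypothetical toy inputs: two percentage matrices and two harp IDs.
score_labels <- c("6 Quite Good", "7 Extremely Good")
questions    <- c("Bed Quality", "Check In/Out")
toy_harps    <- c("1111", "1696")
toy_matrices <- list(
  matrix(c(25, 75, 50, 50), nrow = 2, dimnames = list(score_labels, questions)),
  matrix(c(100, 0, 40, 60), nrow = 2, dimnames = list(score_labels, questions))
)

out_dir <- tempdir()  # assumption: write to a temporary folder

toy_build_pdf <- function(mat, harp) {
  path <- file.path(out_dir, paste0(harp, ".pdf"))
  pdf(file = path, width = 8, height = 7)
  on.exit(dev.off())  # close the device even if barplot() fails
  barplot(mat, horiz = TRUE, las = 1, xlab = "Percentage", sub = harp)
  path
}

# One call per (matrix, harp) pair, so each PDF receives only its own data.
pdf_paths <- Map(toy_build_pdf, toy_matrices, toy_harps)
```

Using on.exit(dev.off()) is a design choice rather than a requirement: it avoids leaving a graphics device open if one of the 391 plots throws an error partway through.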

by approach

Subset the hotel_report data frame by unique Harp Number and run build_pdf on each subset to build pct_matrix and hotel_plot . This approach combines the matrix-build and plot steps.

build_pdf <- function(sub_df) {
    # Matrix build
    counts <- table(sub_df$`Score Label`, sub_df$`Question ID`)
    pct_matrix <- apply(counts, 2, function(x) { x*100/sum(x, na.rm=TRUE) }) 

    # For Hotel Name Subtitle
    hotelname <- sub_df$`Hotel (Q15 1)`[1]
    harp <- sub_df$`Harp Number`[1]

    # Plot the Data 
    pdf(file = paste0(harp, ".pdf"), paper = "USr", width=8, height=7)
        par(mar = c(5.1, 7, 4.1, 2.1))
        
        hotel_plot <- barplot(pct_matrix, main = "Breakdown of Property Score Distribution", sub = hotelname, 
                              col = coul, las = 1, cex.names = .6, horiz = TRUE, yaxs="i", xlab = "Percentage",
                              cex.axis = .8, cex.lab = .8, cex.main = .8, cex.sub = .8)
    dev.off()
    
    return(hotel_plot)
}

plot_list <- by(hotel_report, hotel_report$`Harp Number`, build_pdf)

# NEAR EQUIVALENT
plot_list <- lapply(split(hotel_report, hotel_report$`Harp Number`), build_pdf) 
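
To check the split step on its own before producing any PDFs, a sketch along these lines may help. toy_report is a hypothetical miniature of hotel_report, and prop.table is swapped in for the apply call used above (it also keeps the matrix shape when a property has only one score label):

```r
# Hypothetical miniature of hotel_report with two properties.
toy_report <- data.frame(
  `Harp Number` = c("1111", "1111", "1111", "1696", "1696"),
  `Question ID` = c("Bed Quality", "Bed Quality", "Check In/Out",
                    "Bed Quality", "Check In/Out"),
  `Score Label` = c("6 Quite Good", "7 Extremely Good", "7 Extremely Good",
                    "5 Slightly Good", "5 Slightly Good"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

pct_by_harp <- lapply(
  split(toy_report, toy_report$`Harp Number`),
  function(sub_df) {
    counts <- table(sub_df$`Score Label`, sub_df$`Question ID`)
    # Column proportions scaled to percentages
    prop.table(counts, margin = 2) * 100
  }
)

names(pct_by_harp)               # one element per Harp Number
colSums(pct_by_harp[["1111"]])   # each question's column sums to 100
```

Because split names each element by its Harp Number, a result can be looked up by ID (for example pct_by_harp[["1696"]]) instead of relying on its position matching harps.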
