Appending R output in a single sheet of an xlsx file

How can I append my R outputs to a single sheet of an xlsx file? I am currently working on web crawling, where I need to scrape user reviews from a website and save them on my desktop in xlsx format. Because the reviews are spread over several pages, I have to change the website URL in my code each time and save the output in one sheet of the xlsx file.

Can you please help me with code for appending outputs to a single sheet of an xlsx file? Below is the code I am using. Each time, I change the website URL, run the same code, and need the corresponding output saved in a single sheet of mydata.xlsx:

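library("rvest")
library("xlsx")

# Download one page of reviews (the URL is changed by hand for each page).
# read_html() replaces the deprecated html() in current versions of rvest.
htmlpage <- read_html("http://www.glassdoor.com/GD/Reviews/Symphony-Teleca-Reviews-E28614_P2.htm?sort.sortType=RD&sort.ascending=false&filter.employmentStatus=REGULAR&filter.employmentStatus=PART_TIME&filter.employmentStatus=UNKNOWN")
# Extract the text of every element with the CSS class "pros"
proshtml <- html_nodes(htmlpage, ".pros")
pros <- html_text(proshtml)
pros

data <- data.frame(pros)

write.xlsx(data, "D:/mydata.xlsx", append = TRUE)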

A trivial, but super-slow way:

If you only need to add a few rows to an existing Excel file, and it has only one sheet to which you want to append, you can do a simple read => overwrite step:

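library(xlsx)

file <- "D:/mydata.xlsx"  # path taken from the question
SHEET.NAME <- '...' # fill in with yours
existing.data <- read.xlsx(file, sheetName = SHEET.NAME)
new.data <- rbind(existing.data, data)  # `data` is the newly scraped data frame
write.xlsx(new.data, file, sheetName = SHEET.NAME, row.names = FALSE, append = FALSE)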
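The read => overwrite step fails on the very first run, when the file does not exist yet. A minimal sketch of one way to handle that is to wrap the step in a function; `append_to_sheet` and its arguments are hypothetical names introduced here, and the default sheet name "Sheet1" is an assumption.

```r
# Hypothetical helper (not part of the xlsx package): read the existing sheet
# if the file is there, stack the new rows under it, and overwrite the file.
append_to_sheet <- function(new.rows, file, sheet.name = "Sheet1") {
  if (file.exists(file)) {
    existing <- xlsx::read.xlsx(file, sheetName = sheet.name,
                                stringsAsFactors = FALSE)
    # rbind() requires the same column names in both data frames
    new.rows <- rbind(existing, new.rows)
  }
  xlsx::write.xlsx(new.rows, file, sheetName = sheet.name,
                   row.names = FALSE, append = FALSE)
  invisible(nrow(new.rows))  # total number of rows now in the sheet
}
```

With this in place, each run of the scraping code could end with something like append_to_sheet(data, "D:/mydata.xlsx").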

Note:

  • It's quite slow in general and will only work at small scale.
  • read.xlsx is a slow function. Try read.xlsx2 to make it much faster (see the difference in the docs).
  • If your R process is started once and keeps running for a long time, don't do it this way (re-reading and overwriting the file on every update is wasteful in that case).
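For the long-running case in the last note, one alternative is to collect every page in memory and write the file once at the end. The sketch below assumes a hypothetical `scrape_page()` function standing in for the rvest code from the question, and placeholder URLs; neither comes from the original post.

```r
# Hypothetical stand-in for the scraping code in the question. In real use the
# body would be: html_text(html_nodes(read_html(url), ".pros"))
scrape_page <- function(url) {
  paste("review from", url)
}

# Placeholder URLs; replace with the real review pages.
urls <- c("http://example.com/reviews_P1.htm",
          "http://example.com/reviews_P2.htm")

# Scrape each page into its own data frame, then stack them all in one step.
pages <- lapply(urls, function(u) {
  data.frame(pros = scrape_page(u), stringsAsFactors = FALSE)
})
all.data <- do.call(rbind, pages)

# Single write at the end (requires the xlsx package):
# xlsx::write.xlsx(all.data, "D:/mydata.xlsx", sheetName = "Sheet1",
#                  row.names = FALSE)
```

This way the Excel file is never read back, so the slow read.xlsx call is avoided altogether.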

Look at the package xlsx.

?write.xlsx will show you what you want. append=TRUE is the key.

========= EDIT TO CORRECT =========

As @Jakub pointed out, append=TRUE adds another worksheet to the file rather than adding rows to the existing sheet.
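If the rows really have to land in the same worksheet, the lower-level functions of the xlsx package (loadWorkbook, getSheets, getRows, addDataFrame, saveWorkbook) can probably be combined to do it. The following is only a sketch under assumptions: `append_rows_in_place` is a hypothetical helper, the sheet is assumed to exist already and to have no empty rows between its data rows, and it has not been tested against every version of the package.

```r
# Hypothetical helper: write new rows directly below the last filled row
# of an existing sheet, without reading the old data into a data frame.
append_rows_in_place <- function(new.rows, file, sheet.name = "Sheet1") {
  wb <- xlsx::loadWorkbook(file)
  sheet <- xlsx::getSheets(wb)[[sheet.name]]
  # Assumes the existing rows are contiguous, so the next free row is
  # simply the number of existing rows plus one.
  next.row <- length(xlsx::getRows(sheet)) + 1
  xlsx::addDataFrame(new.rows, sheet, startRow = next.row,
                     col.names = FALSE, row.names = FALSE)
  xlsx::saveWorkbook(wb, file)
}
```

The header is skipped (col.names = FALSE) because the sheet is assumed to contain one already from the first write.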

========= EDIT TO ADD: ANOTHER METHOD ==========

Another method is to save the data to a .csv file, which can easily be opened from Excel. In this case, append=T works as expected (adding to the existing data):

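# Write the header only when the file is first created; otherwise every
# append would repeat the column names (and R would warn about it).
write.table(df, "D:/MyFile.csv", append = TRUE, sep = ",", row.names = FALSE, col.names = !file.exists("D:/MyFile.csv"))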
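To check that repeated appends end up as one clean table, here is a small self-contained sketch using only base R. The helper name `append_csv`, the temporary file, and the sample review texts are all made up for illustration.

```r
# Hypothetical helper: append a data frame to a CSV file, writing the
# header row only when the file is created.
append_csv <- function(df, path) {
  first.write <- !file.exists(path)
  write.table(df, path, sep = ",", append = !first.write,
              col.names = first.write, row.names = FALSE, qmethod = "double")
}

path <- tempfile(fileext = ".csv")  # stand-in for "D:/MyFile.csv"
append_csv(data.frame(pros = "Good team"), path)
append_csv(data.frame(pros = "Flexible hours, nice office"), path)

# Read everything back as a single table
combined <- read.csv(path, stringsAsFactors = FALSE)
```

Character values are quoted by default, so review text that contains commas still ends up in a single column.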
