
Read multiple text files

I am reading the attached .txt file using the R code below. I have 2200 txt files like this, each with a different station ID. I need to output only the years for which peak flow data are available. For example:

Year     Peak 
1929   4050 
1940   7000 
1958   4050 
... 

Can somebody help me modify this code to achieve this?

My R code is shown below.

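rm(list=ls(all=TRUE)) 
iPath <- 'C:/Desktop/flow_raw/Region-03/' 
# Prepend iPath (it already ends in "/") so the file is found from any working directory
mydata <- read.table(paste0(iPath, "02053200-PeakFlow-uptoWY2015.txt"), sep="\t", header=TRUE) 
out <- mydata[c(3,5)]   # keep columns 3 and 5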

I cannot see any attached file.

There are various options to accomplish the task.

library(plyr)    # load plyr before dplyr; both are needed only for Option 1
library(dplyr)   # provides the %>% pipe

# All .txt files in the folder, with full paths
files <- dir("C:/Desktop/flow_raw/Region-03", 
             pattern = "\\.txt$", full.names = TRUE)

# OPT. 1: if you need a Data Frame
# header = TRUE matches the read.table() call in the question
df <- lapply(files, function(x) 
      read.table(x, sep = '\t', header = TRUE)[c(3,5)]) %>% 
      plyr::ldply()    # the '.id' argument might be useful

# OPT. 2: if you need a list
listTxt <- lapply(files, function(x) 
           read.table(x, sep = '\t', header = TRUE)[c(3,5)])
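Since the attachment is not visible, here is a rough, self-contained sketch of how the per-file read could be turned into the Year/Peak output asked for. It assumes column 3 holds a date that starts with a four-digit year and column 5 holds the peak value; the demo files, their column names, the second station ID, and the helper names write_demo and read_peaks are all made up for illustration.

```r
# Write two tiny tab-separated demo files to a temporary folder so the
# sketch runs anywhere. Their layout is an ASSUMPTION, not the real format.
demo_dir <- file.path(tempdir(), "peakflow_demo")
dir.create(demo_dir, showWarnings = FALSE)

write_demo <- function(file_name, dates, peaks) {
  demo <- data.frame(c1 = "x", c2 = "x", date = dates, c4 = "x", peak = peaks)
  write.table(demo, file.path(demo_dir, file_name),
              sep = "\t", row.names = FALSE, quote = FALSE)
}
write_demo("02053200-PeakFlow-uptoWY2015.txt",
           c("1929-03-05", "1940-08-17", "1941-01-01"), c(4050, 7000, NA))
write_demo("00000001-PeakFlow-uptoWY2015.txt",   # hypothetical station ID
           c("1958-04-02", "1959-05-10"), c(4050, NA))

files <- dir(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read one file and return Station / Year / Peak, dropping missing peaks
read_peaks <- function(path) {
  raw <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)
  out <- data.frame(
    Station = sub("-.*$", "", basename(path)),     # file name up to the first "-"
    Year    = as.integer(substr(raw[[3]], 1, 4)),  # assumes dates start with YYYY
    Peak    = raw[[5]],
    stringsAsFactors = FALSE
  )
  out[!is.na(out$Peak), ]                          # keep only years with a peak value
}

peaks <- do.call(rbind, lapply(files, read_peaks))
rownames(peaks) <- NULL
print(peaks)
```

Taking the station ID from the file name keeps it as text, so leading zeros survive. If the real files store the year differently, only the Year line needs to change.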

NB: If you need a FAST reading function, please take a look at

data.table::fread
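As a rough illustration of the fread route, the sketch below reads only columns 3 and 5 from each file and stacks the results with rbindlist. It assumes the data.table package is installed; the demo files, their column names, and the second station ID are made up so that the example is self-contained.

```r
library(data.table)

# Write two small tab-separated demo files (layout is an ASSUMPTION)
demo_dir <- file.path(tempdir(), "fread_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (id in c("02053200", "00000001")) {            # second ID is hypothetical
  demo <- data.frame(c1 = "x", c2 = "x", date = c("1929-03-05", "1940-08-17"),
                     c4 = "x", peak = c(4050, 7000))
  write.table(demo, file.path(demo_dir, paste0(id, "-PeakFlow-uptoWY2015.txt")),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

files <- dir(demo_dir, pattern = "\\.txt$", full.names = TRUE)
names(files) <- basename(files)                    # names are reused by idcol below

# select = c(3, 5) reads only the third and fifth columns of each file
tables <- lapply(files, fread, sep = "\t", header = TRUE, select = c(3, 5))

# idcol adds a column holding the list names (here, the file names)
dt <- rbindlist(tables, idcol = "file")
print(dt)
```

With 2200 files, skipping the unused columns at read time is likely where most of the speed-up over read.table comes from.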


If I am understanding your question correctly, you want to import 2200 text files at once. For some reason I can't see the attachment, but you should be able to read in the data using the function Corpus from the tm package.

In your case (the path should lead to a folder where all the text files live):

library(tm)   # provides Corpus() and DirSource()
TextCorpus <- Corpus(DirSource("C:/Desktop/flow_raw/Region-03"))
TextCorpus$content   # the documents read from the folder

You should be able to subset these documents. I usually make a list of the documents' content, which in your case would give a list of 2200 elements containing the original text.
