Removing NA values from a data frame or time-series object in R
I read in some data via:
it.data <- read.csv("inputData/rstar.data.it.csv", header = T, sep = ",")
The second and fourth columns are inflation and interest, respectively:
inflation.it <- it.data[2]
and
interest.it <- it.data[4]
However, the trouble starts when I try to reshape the data into a time-series object, because there are leading and trailing NA values in the columns. I have tried na.omit(), it.data[complete.cases(it.data),], and na.contiguous, without luck. What happens now is that when I try to transform the data into a ts object,
inflation.ts.it <- ts(inflation.it, frequency = 4, interest.start)
I get very strange values which do not match the original data.
Thanks.
PS. The data (I did not post everything, just enough to give an idea):
gdp.log inflation inflation.expectations interest
1 . 2.4361259655 . .
2 . 2.9997029997 . .
3 . 1.5169194865 . .
4 . 1.5059368664 2.11467132957868 .
5 . 2.0591647331 2.02043102148892 .
6 . 1.9896193771 1.76791011585382 .
7 . 2.6436781609 2.04959978443843 .
8 . 3.3951497860 2.52190301432020 .
9 . 4.5467462347 3.14379838970698 .
10 . 5.0890585241 3.91865817645959 .
11 . 5.7110862262 4.68551019278066 .
12 . 7.7262693156 5.76829007519398 .
13 . 7.5292198967 6.51390849069030 .
14 . 6.9679849340 6.98364009316870 .
15 . 7.6006355932 7.45602743492283 .
16 . 5.6352459016 6.93327158141434 .
17 . 5.4853387259 6.42230128873304 .
18 . 6.6649899396 6.34655254012084 .
19 . 5.8577405857 5.91082878825926 .
20 . 5.5528612997 5.89023263777669 .
21 . 4.9125329499 5.74703119375926 .
22 . 4.2442820089 5.14185421108985 .
Assuming the dots are in the original CSV, you can fix it by listing "." in na.strings upon read-in. (The argument is spelled na.strings; the shorter na.string is accepted only through R's partial argument matching.)
read.csv(text=
"gdp.log,inflation,inflation.expectations,interest
.,2.4361259655377,.,.
.,2.99970299970301,.,.
.,1.5169194865811,.,.
.,1.50593686649291,2.11467132957868,.
.,2.05916473317866,2.02043102148892,.
.,1.9896193771626,1.76791011585382,.
.,2.64367816091953,2.04959978443843,.",
header=TRUE, na.strings=c(".", "NA"))
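One plausible explanation for the strange values in the question, offered as a sketch rather than a confirmed diagnosis: a column that contains "." is not parsed as numeric, and ts() converts a data frame with data.matrix(), which turns non-numeric columns into integer codes. The toy csv_text below is a shortened copy of the posted data, and the names csv_text, raw, clean and codes are made up for illustration.

```r
# Shortened copy of the posted data; "." marks a missing value.
csv_text <- "gdp.log,inflation,inflation.expectations,interest
.,2.4361259655,.,.
.,2.9997029997,.,.
.,1.5169194865,.,.
.,1.5059368664,2.11467132957868,."

# Without na.strings, a column containing "." is read as text, not numbers.
raw <- read.csv(text = csv_text, header = TRUE)
is.numeric(raw$inflation.expectations)    # FALSE

# data.matrix() (which ts() applies to data frames) maps such a column to
# integer codes, so the resulting values no longer match the original data.
codes <- data.matrix(raw["inflation.expectations"])

# With na.strings, the dots become NA and the column is parsed as numeric.
clean <- read.csv(text = csv_text, header = TRUE, na.strings = c(".", "NA"))
is.numeric(clean$inflation.expectations)  # TRUE
```

Checking is.numeric() on each column right after read-in is a cheap way to confirm that every missing-value code was caught before building the ts object.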
na.strings can be a vector of character strings, in case several codes are used for missing values.
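Once the column is numeric, the leading and trailing NA values can be dropped after building the ts object, so that the time index stays aligned. The following is a minimal sketch with made-up values and a hypothetical start date of 1990 Q1 (the question does not state the real one); x, x_ts and x_trim are illustrative names.

```r
# Made-up quarterly series with leading and trailing NAs.
x <- c(NA, NA, 2.1, 2.0, 1.8, 2.0, NA)

# Hypothetical start of the full sample: first quarter of 1990.
x_ts <- ts(x, start = c(1990, 1), frequency = 4)

# na.contiguous() keeps the longest stretch of consecutive non-missing
# observations and adjusts the start and end of the series to match.
x_trim <- na.contiguous(x_ts)

start(x_trim)   # 1990 3: the two leading NAs moved the start to Q3
length(x_trim)  # 4
```

Note that na.contiguous() keeps only the longest run, so a series with NA values in the middle would be cut there as well; it fits the case described in the question, where the gaps are at the ends.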