简体   繁体   English

R-使用日期收集重塑

[英]R - reshape using gather on dates

I have the below in csv file and being read into R. 我在csv文件中具有以下内容,并被读入R。

    1/10/2018   1/11/2018
AUD 42984.41    42984.41
HKD 214412.85   214412.85
INR 1862292.95  1862292.95
KRW 86.00   86.00
NZD 37144.18    37144.18
PHP 409894.32   409894.32
SGD 241245.08   241245.08
THB 148725.77   148725.77
TWD 2169925.00  2169925.00
USD 873297.93   873297.93


> cash_forecast = read.csv("cash_forecast.csv", header=TRUE, stringsAsFactors=TRUE)
> cash_forecast
     X X1.10.2018 X1.11.2018
1  AUD   42984.41   42984.41
2  HKD  214412.85  214412.85
3  INR 1862292.95 1862292.95
4  KRW      86.00      86.00
5  NZD   37144.18   37144.18
6  PHP  409894.32  409894.32
7  SGD  241245.08  241245.08
8  THB  148725.77  148725.77
9  TWD 2169925.00 2169925.00
10 USD  873297.93  873297.93

I do not know why there is a X being added to each column, ie X X1.10.2018 X1.11.2018 我不知道为什么在每列中都添加一个X,即X X1.10.2018 X1.11.2018

I did a reshape using gather and the column, Day has X appended to the dates. 我使用collect和列进行了重塑,Day在日期后附加了X。 Any idea why is this so? 知道为什么会这样吗?

> no_of_col <- ncol(cash_forecast) #no of columns
> long_cash_forecast <- cash_forecast %>% gather(Day, Balance, 2:no_of_col)
> long_cash_forecast
     X        Day    Balance
1  AUD X1.10.2018   42984.41
2  HKD X1.10.2018  214412.85
3  INR X1.10.2018 1862292.95
4  KRW X1.10.2018      86.00
5  NZD X1.10.2018   37144.18
6  PHP X1.10.2018  409894.32
7  SGD X1.10.2018  241245.08
8  THB X1.10.2018  148725.77
9  TWD X1.10.2018 2169925.00
10 USD X1.10.2018  873297.93
11 AUD X1.11.2018   42984.41
12 HKD X1.11.2018  214412.85
13 INR X1.11.2018 1862292.95
14 KRW X1.11.2018      86.00
15 NZD X1.11.2018   37144.18
16 PHP X1.11.2018  409894.32
17 SGD X1.11.2018  241245.08
18 THB X1.11.2018  148725.77
19 TWD X1.11.2018 2169925.00
20 USD X1.11.2018  873297.93

It's recommended that column names follow the same rules as object names - don't start with a number, no special symbols that could be confused with math operations ( - , / , * , + , etc.), no spaces.... 建议列名遵循与对象名相同的规则-不要以数字开头,不要将可能与数学运算( -/*+等)相混淆的特殊符号,不要使用空格。

When you use read.csv (or any base importing functions), by default it runs a function called make.names to fix up the names. 当您使用read.csv (或任何基本导入功能)时,默认情况下它将运行一个名为make.names的函数来修正名称。 You can stop that by setting check.names = FALSE in read.csv . 您可以通过在read.csv设置check.names = FALSE来停止该read.csv See the help page ?read.csv for details. 有关详细信息,请参见帮助页面?read.csv

That said, you don't need to skip the names check. 也就是说,您无需跳过名称检查。 Almost surely, your next step should be to convert your column to a proper Date class, which is as easy to do with the X as without the X : 几乎可以肯定,下一步应该是将列转换为适当的Date类,使用X就像不使用X一样容易:

# with the X
as.Date(long_cash_forecast$Day, format = "X%d.%M.%Y")

# without the X, and with / instead of . 
as.Date(long_cash_forecast$Day, format = "%d/%M/%Y")

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM