Adding a month number to a dataframe in R

Question

I have a dataframe like this:

customer= c('1530','1530','1530','1531','1531','1532')  
month =  c('2021-10-01','2021-11-01','2021-12-01','2021-11-01','2021-12-01','2021-12-01')  
month_number = c(1,2,3,1,2,1)  
df <- data.frame('customer_id'=customer, entry_month=month)  
df

|   | customer_id | entry_month |
| - | ----------- | ----------- |
| 1 | 1530        | 2021-10-01  |
| 2 | 1530        | 2021-11-01  |
| 3 | 1530        | 2021-12-01  |
| 4 | 1531        | 2021-11-01  |
| 5 | 1531        | 2021-12-01  |
| 6 | 1532        | 2021-12-01  |

I need to create a column that indicates the number of months since the customer joined. This is my desired output:

new_df <- data.frame(customer_id = customer, entry_month = month, month_number = month_number)
new_df

|   | customer_id | entry_month | month_number |
| - | ----------- | ----------- | ------------ |
| 1 | 1530        | 2021-10-01  | 1            |
| 2 | 1530        | 2021-11-01  | 2            |
| 3 | 1530        | 2021-12-01  | 3            |
| 4 | 1531        | 2021-11-01  | 1            |
| 5 | 1531        | 2021-12-01  | 2            |
| 6 | 1532        | 2021-12-01  | 1            |

Answer 1

You can convert entry_month to Date format and then simply use first:

library(dplyr)
df %>%
  group_by(customer_id) %>%
  mutate(
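    # parse the strings so that subtraction yields a difference in days
    entry_month = as.Date(entry_month),
    # days since the customer's first entry, converted to approximate months
    nmonth = round(as.numeric(entry_month - first(entry_month)) / 30) + 1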
  )

# A tibble: 6 x 3
# Groups:   customer_id [3]
  customer_id entry_month nmonth
  <chr>       <date>       <dbl>
1 1530        2021-10-01       1
2 1530        2021-11-01       2
3 1530        2021-12-01       3
4 1531        2021-11-01       1
5 1531        2021-12-01       2
6 1532        2021-12-01       1

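Note that this approach works as long as entry_month is always the first day of a month. Otherwise, you will have to define precisely what counts as a month. For example, if the first entry is on 2021-10-20 and the second is on 2021-11-10, what nmonth do you expect for 2021-11-10?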
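One possible answer to that question is to count calendar months, so that any date in the month after the first entry gets 2 regardless of the day. The sketch below is not part of the original answer. It uses base R only and assumes the 2021-10-20 and 2021-11-10 dates from the example above for customer 1530. It also sidesteps the division by 30, which is only an approximation and can drift once spans run to several years.

```r
# Sample data: the first two dates for customer 1530 are mid-month (assumed for illustration)
df <- data.frame(
  customer_id = c("1530", "1530", "1530", "1531", "1531", "1532"),
  entry_month = c("2021-10-20", "2021-11-10", "2021-12-01",
                  "2021-11-01", "2021-12-01", "2021-12-01")
)

# Put every date on a continuous month scale: 12 * year + month
d  <- as.POSIXlt(as.Date(df$entry_month))
ym <- d$year * 12 + d$mon

# Within each customer, count calendar months from the earliest entry (1-based)
df$nmonth <- ave(ym, df$customer_id, FUN = function(x) x - min(x) + 1)

df
```

Under this definition 2021-11-10 gets nmonth 2, because it falls in the calendar month after 2021-10-20, even though only 21 days have passed.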

Answer 2

This takes the year-month part of each date and counts the distinct values.

I extended the example to include a repeated month.

library(dplyr)

df %>% 
  group_by(customer_id) %>% 
  arrange(entry_month, .by_group = TRUE) %>% 
  mutate(month_number = cumsum(
           !duplicated(strftime(entry_month, "%Y-%m")))) %>% 
  ungroup()
# A tibble: 7 × 3
  customer_id entry_month month_number
  <chr>       <chr>              <int>
1 1530        2021-10-01             1
2 1530        2021-10-12             1
3 1530        2021-11-01             2
4 1530        2021-12-01             3
5 1531        2021-11-01             1
6 1531        2021-12-01             2
7 1532        2021-12-01             1

Data

df <- structure(list(customer_id = c("1530", "1530", "1530", "1530",
"1531", "1531", "1532"), entry_month = c("2021-10-01", "2021-10-12",
"2021-11-01", "2021-12-01", "2021-11-01", "2021-12-01", "2021-12-01"
)), row.names = c(NA, -7L), class = "data.frame")
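To see why the cumsum(!duplicated(...)) idiom works, it may help to run it step by step on a single customer's dates. This is a minimal base R sketch; substr is used here in place of strftime purely for brevity, which assumes the dates are "YYYY-MM-DD" strings as in the data above.

```r
# Dates for one customer, already sorted, with a repeated month
entry_month <- c("2021-10-01", "2021-10-12", "2021-11-01", "2021-12-01")

# Year-month part of each date: "2021-10" "2021-10" "2021-11" "2021-12"
ym <- substr(entry_month, 1, 7)

# TRUE the first time a year-month appears, FALSE for repeats
is_new <- !duplicated(ym)

# The running count of first appearances gives the month number
month_number <- cumsum(is_new)
month_number

# Caveat: this counts months that are present, not months elapsed
gap <- c("2021-10-01", "2021-12-01")
gap_number <- cumsum(!duplicated(substr(gap, 1, 7)))
gap_number
```

The first vector comes out as 1 1 2 3. The second comes out as 1 2 rather than 1 3, so if a customer can skip a month and the skipped month should still be counted, a calendar-based difference is the safer choice.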

Answer 3

Alternatively, you can use the data.table package:

library(data.table)

dt <- setDT(df) # Convert df to a data.table by reference

dt[, entry_month := as.IDate(entry_month)] # Transform the column to "IDate"

# Number the rows within each customer; this assumes the rows are sorted
# by date and that each customer has exactly one row per month
dt2 <- dt[, seq_along(entry_month), by = customer_id]

dt[, month_number := dt2$V1] # Add the sequence to the data.table

dt

Output:

   customer_id entry_month month_number
1:        1530  2021-10-01            1
2:        1530  2021-11-01            2
3:        1530  2021-12-01            3
4:        1531  2021-11-01            1
5:        1531  2021-12-01            2
6:        1532  2021-12-01            1
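The same row numbering can probably be written more compactly by assigning inside the grouped call, which avoids the intermediate table. The sketch below is a variation on this answer, not the answer's own code. It rebuilds the sample data so it runs on its own, and it keeps the same assumption: one row per customer per consecutive month.

```r
library(data.table)

# Sample data from the question
dt <- data.table(
  customer_id = c("1530", "1530", "1530", "1531", "1531", "1532"),
  entry_month = as.IDate(c("2021-10-01", "2021-11-01", "2021-12-01",
                           "2021-11-01", "2021-12-01", "2021-12-01"))
)

# Sort so that each customer's rows are in date order
setorder(dt, customer_id, entry_month)

# .N is the number of rows in the current group; seq_len(.N) numbers them 1..N
dt[, month_number := seq_len(.N), by = customer_id]

dt
```

Because the assignment with := happens per group, the result lines up with the original rows without relying on dt2 having the same row order as dt.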
