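Reading names with special characters using R
I have an Excel (xlsx) sheet in which, in the "PLAYERS" column, the names of European players are wrapped in asterisks while those of South American players are not. Like this: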
PLAYERS
Neymar
*Bale*
Messi
*Ronaldo*
*Benzema*
*Iniesta*
DiMaria
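Is there a way, using R (or Excel itself), to split this dataset into one for Europeans (with asterisks) and another for South Americans? Of course, the dataset contains other columns, such as "SALARY", "SCORED GOALS", "OFFSITE", "AGE", and so on.
Thanks, Diego.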
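You can check whether the player name contains a "*" and write "European" or "South American" into a new column. Then, if needed, you can split the data frame into a list of two data.frames, one for Europeans and one for South Americans: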
df <- data.frame(PLAYERS = c("Neymar", "*Ronaldo*", "Messi"), SALARY = 5:7)
df
# PLAYERS SALARY
#1 Neymar 5
#2 *Ronaldo* 6
#3 Messi 7
# check if there's a * in the PLAYERS column
df$Location <- ifelse(grepl("\\*", df$PLAYERS), "European", "South American")
df
# PLAYERS SALARY Location
#1 Neymar 5 South American
#2 *Ronaldo* 6 European
#3 Messi 7 South American
#split the data based on location:
dflist <- split(df, df$Location)
dflist
#$European
# PLAYERS SALARY Location
#2 *Ronaldo* 6 European
#
#$`South American`
# PLAYERS SALARY Location
#1 Neymar 5 South American
#3 Messi 7 South American
Now you can access each list element (i.e. each data.frame) by typing:
dflist[["European"]] # or "South American" instead
# PLAYERS SALARY Location
#2 *Ronaldo* 6 European
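Once the origin is stored in its own column, the asterisks are no longer needed. As a minimal sketch (assuming the same toy `df` as above, and assuming you want clean names in the output), the marker can be detected and stripped with `fixed = TRUE`, which treats `*` as a literal character rather than a regex quantifier:

```r
# Rebuild the small example data frame from the answer above.
df <- data.frame(PLAYERS = c("Neymar", "*Ronaldo*", "Messi"), SALARY = 5:7,
                 stringsAsFactors = FALSE)

# fixed = TRUE matches "*" literally, so no regex escaping is required.
is_european <- grepl("*", df$PLAYERS, fixed = TRUE)
df$Location <- ifelse(is_european, "European", "South American")

# Remove the marker now that the information lives in the Location column.
df$PLAYERS <- gsub("*", "", df$PLAYERS, fixed = TRUE)

# Split as before; each element is a data.frame with cleaned names.
dflist <- split(df, df$Location)
dflist[["European"]]
```

The order matters here: the names are cleaned only after `Location` has been computed, because the asterisk is the only thing that distinguishes the two groups.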
You can split on this particular column and name the resulting list using split and setNames:
dat <- structure(list(PLAYERS = structure(c(6L, 1L, 5L, 7L, 2L, 4L, 3L),
    .Label = c("*Bale*", "*Benzema*", "DiMaria", "*Iniesta*",
    "Messi", "Neymar", "*Ronaldo*"), class = "factor")),
    .Names = "PLAYERS", class = "data.frame", row.names = c(NA,-7L))
# grepl() gives FALSE for names without "*" and TRUE for names with it;
# split() orders the groups FALSE, TRUE, so the names go in that order
setNames(split(dat, grepl("[*]", dat$PLAYERS)), nm = c("SoAm", "Euro"))
#$SoAm
# PLAYERS
# 1 Neymar
# 3 Messi
# 7 DiMaria
#
#$Euro
# PLAYERS
# 2 *Bale*
# 4 *Ronaldo*
# 5 *Benzema*
# 6 *Iniesta*
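Because split() orders a logical grouping as FALSE then TRUE, names assigned by position are easy to get backwards. A sketch of one way to avoid that, assuming the same seven player names (the variables `has_star` and `region` are introduced here for illustration only), is to turn the logical vector into a labelled factor before splitting:

```r
# Same seven players as above, stored as plain character strings.
dat <- data.frame(PLAYERS = c("Neymar", "*Bale*", "Messi", "*Ronaldo*",
                              "*Benzema*", "*Iniesta*", "DiMaria"),
                  stringsAsFactors = FALSE)

# TRUE where the name contains a literal asterisk.
has_star <- grepl("[*]", dat$PLAYERS)

# Tie each label to its logical value explicitly instead of by position.
region <- factor(has_star, levels = c(TRUE, FALSE), labels = c("Euro", "SoAm"))

# The list names now come from the factor labels.
res <- split(dat, region)
names(res)
```

With this approach the label for each group is stated next to the value it stands for, so reordering the groups cannot silently swap the names.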
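In Excel, create a pivot table from the source data with PLAYERS for ROWS. Filter with Label Filters, Contains... ~* (the ~ escapes the * wildcard), then click Grand Total. Return to the PT, select "Does Not Contain...", then click Grand Total again.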