Reading names with special characters using R

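I have an Excel (xlsx) sheet where, in the "PLAYERS" column, the names of European players are wrapped in asterisks while those of South American players are not. Like this: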

  PLAYERS
   Neymar
   *Bale*
    Messi
*Ronaldo*
*Benzema*
*Iniesta*
  DiMaria  

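Is there a way, using R (or Excel itself), to split this data set into one for the Europeans (with asterisks) and another for the South Americans? The data set of course contains other columns, such as "SALARY", "SCORED GOALS", "OFFSITE", "AGE", and so on.

Thanks, Diego.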

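You can check whether a player's name contains a "*" and write "European" or "South American" into a new column. Then, if needed, you can split the data frame into a list of two data.frames, one for the Europeans and one for the South Americans: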

df <- data.frame(PLAYERS = c("Neymar", "*Ronaldo*", "Messi"), SALARY = 5:7)
df
#    PLAYERS SALARY
#1    Neymar      5
#2 *Ronaldo*      6
#3     Messi      7

# check if there's a * in the PLAYERS column
df$Location <- ifelse(grepl("\\*", df$PLAYERS), "European", "South American")
df
#    PLAYERS SALARY       Location
#1    Neymar      5 South American
#2 *Ronaldo*      6       European
#3     Messi      7 South American

#split the data based on location:
dflist <- split(df, df$Location)

dflist
#$European
#    PLAYERS SALARY Location
#2 *Ronaldo*      6 European
#
#$`South American`
#  PLAYERS SALARY       Location
#1  Neymar      5 South American
#3   Messi      7 South American

Now you can access each list element (i.e. each data.frame) by typing:

dflist[["European"]]  # or "South American" instead
#    PLAYERS SALARY Location
#2 *Ronaldo*      6 European
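
The asterisk is a regex metacharacter, which is why the pattern above is written as "\\*". As a small sketch (the vector `players` and the other variable names below are made up for illustration), the following shows three equivalent ways of matching a literal asterisk, and one possible way of stripping the markers from the names once the location has been recorded:

```r
# Hypothetical example vector, not the real data set
players <- c("Neymar", "*Bale*", "Messi", "*Ronaldo*")

# Three equivalent ways to test for a literal asterisk
by_escape <- grepl("\\*", players)              # backslash-escaped regex
by_class  <- grepl("[*]", players)              # asterisk inside a character class
by_fixed  <- grepl("*", players, fixed = TRUE)  # plain string match, no regex

is_european <- by_fixed

# Remove the asterisks after the flag has been stored
clean_names <- gsub("*", "", players, fixed = TRUE)
clean_names
```

Stripping the asterisks only makes sense after the location column exists, since the marker is the only thing that distinguishes the two groups.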

You can also split on this particular column and name the resulting list, using split and setNames:

dat <- structure(list(PLAYERS = structure(c(6L, 1L, 5L, 7L, 2L, 4L, 3L),
                 .Label = c("*Bale*", "*Benzema*", "DiMaria", "*Iniesta*",
                            "Messi", "Neymar", "*Ronaldo*"), class = "factor")),
                 .Names = "PLAYERS", class = "data.frame", row.names = c(NA,-7L))

# split() orders the groups FALSE, TRUE, so the names without an asterisk
# (South American) come first and the names with an asterisk (European) second
setNames(split(dat, grepl("[*]", dat$PLAYERS)), nm = c("SoAm", "Euro"))
#$SoAm
#  PLAYERS
#1  Neymar
#3   Messi
#7 DiMaria
#
#$Euro
#    PLAYERS
#2    *Bale*
#4 *Ronaldo*
#5 *Benzema*
#6 *Iniesta*
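
Naming the list by position depends on remembering that split() puts the FALSE group before the TRUE group. A sketch of one way to avoid that dependency, assuming the same seven names as above (the variables `has_star` and `groups` are illustrative), is to build the group label from the match itself:

```r
# Same names as in the example above, stored as plain character strings
dat <- data.frame(
  PLAYERS = c("Neymar", "*Bale*", "Messi", "*Ronaldo*",
              "*Benzema*", "*Iniesta*", "DiMaria"),
  stringsAsFactors = FALSE
)

# TRUE where the name contains a literal asterisk
has_star <- grepl("*", dat$PLAYERS, fixed = TRUE)

# The label is derived from the match, so it cannot end up on the wrong group
groups <- split(dat, ifelse(has_star, "Euro", "SoAm"))

groups$Euro$PLAYERS
groups$SoAm$PLAYERS
```

Because each row carries its own label into split(), the order of the list elements no longer matters.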

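In Excel, create a PivotTable from the source data with PLAYERS for ROWS. Filter with Label Filters, "Contains...", entering ~* (the tilde makes Excel treat the asterisk as a literal character rather than a wildcard), then click Grand Total. Return to the PivotTable, select "Does Not Contain...", and click Grand Total again.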
