I have a data frame that contains 6181 rows (one for each player in a large fantasy football contest). One of the columns in this data frame lists the 9 football players that make up each player's roster.
I want R to give me all the different football player names that show up in this column (there are hundreds) and count how many times each individual name shows up.
Here is an example of a cell in the column:
QB Dane Evans QB Jaquez Johnson RB Zack Langer RB Greg Howell WR Keyarris Garrett WR Jenson Stoshak WR Keevan Lucas FLEX Jordan Howard FLEX Sony Michel
For this cell, I would like the following output (if I were working with just 1 row instead of 6181):
QB Dane Evans - 1
QB Jaquez Johnson - 1
RB Zack Langer - 1
RB Greg Howell - 1
WR Keyarris Garrett - 1
WR Jenson Stoshak - 1
WR Keevan Lucas - 1
FLEX Jordan Howard - 1
FLEX Sony Michel - 1
Or 100% instead of 1.
Most of the answers I have found seem to show how to count how many times a specific combination of 9 players, listed in a specific order, shows up, rather than how to count individual names across all rows.
My humble solution:
# Data Frame
my.players <- data.frame( name = "QB Dane Evans QB Jaquez Johnson RB Zack Langer RB Greg Howell WR Keyarris Garrett WR Jenson Stoshak WR Keevan Lucas FLEX Jordan Howard FLEX Sony Miche")
# Position dictionary. Add every position label here in the same format.
# Each pattern also swallows the spaces around the label.
pos.dic <- c( " *QB *"
, " *RB *"
, " *WR *"
, " *FLEX *"
)
# Regex for positions
pos.regex <- paste( pos.dic, collapse = "|" )
# Remove Positions
play.names <- gsub( pattern = pos.regex
, replacement = ","
, x = my.players$name
)
# Split
play.names <- strsplit( x = play.names, split = ",")
# Unlist
play.names <- unlist( x = play.names )
# Remove the empty strings left where a roster starts with a position label
play.names <- play.names[ play.names != "" ]
# Result
[1] "Dane Evans" "Jaquez Johnson" "Zack Langer" "Greg Howell" "Keyarris Garrett" "Jenson Stoshak" "Keevan Lucas" "Jordan Howard"
[9] "Sony Miche"
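The steps above run on a single row. As a minimal sketch of the same idea applied to several rows at once, the block below assumes a hypothetical data frame called rosters with a character column called name; the second roster is a shortened, made-up row built from the same names purely for illustration. It also adds word boundaries and perl = TRUE to the pattern so that a position label only matches as a whole word, which is a choice of this sketch rather than part of the original answer.

```r
# Hypothetical stand-in for the real 6181-row data frame
rosters <- data.frame(
  name = c(
    "QB Dane Evans QB Jaquez Johnson RB Zack Langer RB Greg Howell WR Keyarris Garrett WR Jenson Stoshak WR Keevan Lucas FLEX Jordan Howard FLEX Sony Michel",
    "QB Dane Evans RB Zack Langer WR Keevan Lucas FLEX Sony Michel"
  ),
  stringsAsFactors = FALSE
)

# Position labels; extend this vector if the contest uses more of them
positions <- c("QB", "RB", "WR", "FLEX")
pos.regex <- paste0("\\s*\\b(", paste(positions, collapse = "|"), ")\\b\\s*")

# Replace every label with a comma, then split each roster on the commas
split.rows <- strsplit(gsub(pos.regex, ",", rosters$name, perl = TRUE), ",")

# Flatten all rosters into one vector and drop the empty leading pieces
all.names <- unlist(split.rows)
all.names <- all.names[all.names != ""]

# Count how often each individual name appears across all rows
name.counts <- table(all.names)
print(name.counts)
```

Because gsub and strsplit are vectorised over the column, no explicit loop over the rows is needed.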
Then use the table function, which returns a frequency table. From its documentation:
‘table’ uses the cross-classifying factors to build a contingency
table of the counts at each combination of factor levels.
Example:
freq.table <- table(x = play.names )
Dane Evans Greg Howell Jaquez Johnson Jenson Stoshak Jordan Howard Keevan Lucas Keyarris Garrett Sony Miche Zack Langer
         1           1              1              1             1            1                1          1           1
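With hundreds of distinct names, the table is easier to read once it is sorted. The following is a small sketch, using a made-up play.names vector in place of the real one, that sorts the counts in decreasing order and turns them into a data frame; the column names player and count are chosen here for illustration.

```r
# Made-up vector standing in for the names extracted from all rosters
play.names <- c("Dane Evans", "Sony Michel", "Dane Evans",
                "Zack Langer", "Dane Evans", "Sony Michel")

freq.table <- table(play.names)

# Most frequently picked players first
sorted.counts <- sort(freq.table, decreasing = TRUE)

# A data frame is often handier for filtering or exporting
counts.df <- as.data.frame(sorted.counts, stringsAsFactors = FALSE)
names(counts.df) <- c("player", "count")

# Show the ten most common names (fewer if there are fewer names)
print(head(counts.df, 10))
```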
Then, if you prefer percentages, use prop.table :) :
# Store the result under a name that does not mask the prop.table function
pct.table <- prop.table( x = freq.table )
pct.table <- round( x = pct.table * 100
, digits = 2
)
Dane Evans Greg Howell Jaquez Johnson Jenson Stoshak Jordan Howard Keevan Lucas Keyarris Garrett Sony Miche Zack Langer
     11.11       11.11          11.11          11.11         11.11        11.11            11.11      11.11       11.11
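Note that prop.table divides each count by the total number of names, which is why every player above gets 11.11 (1 out of 9 picks). If the "100%" in the question means the share of rosters that contain a player, one possible approach is to divide the counts by the number of rows instead. This is only a sketch: the rosters data frame and its two shortened rows are hypothetical, and each name is counted at most once per roster.

```r
# Hypothetical two-row example with shortened rosters
rosters <- data.frame(
  name = c("QB Dane Evans RB Zack Langer",
           "QB Dane Evans RB Greg Howell"),
  stringsAsFactors = FALSE
)

# Whole-word position labels plus the surrounding spaces
pos.regex <- "\\s*\\b(QB|RB|WR|FLEX)\\b\\s*"

# One character vector of names per roster
per.roster <- strsplit(gsub(pos.regex, ",", rosters$name, perl = TRUE), ",")

# Drop empty pieces and count each name only once within a roster
per.roster <- lapply(per.roster, function(x) unique(x[x != ""]))

# Number of rosters containing each player
roster.counts <- table(unlist(per.roster))

# Share of rosters, in percent
roster.share <- round(100 * roster.counts / nrow(rosters), 2)
print(roster.share)
```

Here a player who appears in every roster gets 100, and a player who appears in half of them gets 50.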