
How can I find the min/max item for a set of otherwise unique values within a dataframe in R?

I have a data set in R of Billboard top hits. I am able to count the number of unique hits for a given artist (see code below), but I am having trouble figuring out how to find the highest chart position each song reached. The only approach I can think of is to run a loop over each song and calculate the minimum position before I filter down to the unique values. I am new to R, so I am unaware of easier ways.

mydata=read.csv("Hot100.csv")
mydata <- mydata[order(mydata$artist, mydata$song, mydata$date),]
head(mydata)

#              date position        song  artist
# 218482 2000-07-01       40 Bye Bye Bye 'N Sync
# 226912 2002-02-09       70  Girlfriend 'N Sync
# 226997 2002-02-16       55  Girlfriend 'N Sync
# 227072 2002-02-23       30  Girlfriend 'N Sync
# 227164 2002-03-02       22  Girlfriend 'N Sync
# 227260 2002-03-09       18  Girlfriend 'N Sync

# to remove some cols - leaves artist and song.  Has duplicates
mysub = subset(mydata, select = -c(date, position))

# now to make unique
mysub_u = unique(mysub[,c(1,2)])
View(mysub_u)

# put into table form
mytable = table(mysub_u$artist)

# but this is table form , not df
df=as.data.frame(mytable)

head(df)

#                       Var1 Freq
# 1                  'N Sync    7
# 2 'N Sync & Gloria Estefan    1
# 3  'N Sync Featuring Nelly    1
# 4             'Til Tuesday    1
# 5      "Weird Al" Yankovic    2
# 6                    (+44)    1

How could I create a table that lists the artist, the song, and the highest position the song reached, with 1 being the highest?

It would have been nice to have a bigger dataset (or any usable data) to play with. However, here is one way to do it, demonstrated on the small sample you provided.

library(readr)
library(dplyr)

mydata <- read_table2("index date position song artist
218482 2000-07-01 40 Bye_Bye_Bye 'N_Sync
226912 2002-02-09 70  Girlfriend 'N_Sync
226997 2002-02-16 55  Girlfriend 'N_Sync
227072 2002-02-23 30  Girlfriend 'N_Sync
227164 2002-03-02 22  Girlfriend 'N_Sync
227260 2002-03-09 18  Girlfriend 'N_Sync")

out <- mydata %>% 
  group_by(artist,song) %>% 
  mutate(highest_position = min(position)) %>% 
  select(-index,-date,-position) %>% 
  unique(.)

Output:

> out
# A tibble: 2 x 3
# Groups:   artist, song [2]
  song        artist  highest_position
  <chr>       <chr>              <dbl>
1 Bye_Bye_Bye 'N_Sync               40
2 Girlfriend  'N_Sync               18
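
Because position 1 is the top of the chart, the "highest" position of a song is the minimum of position within each artist/song group. As a possible alternative to the mutate/select/unique chain above, the same result can be sketched with summarise(), which collapses each group to a single row directly. This is only a sketch: the data frame name hits and the result name peaks are made up for illustration, and the sample rows are typed in by hand from the question rather than read from Hot100.csv.

```r
library(dplyr)

# Hand-typed sample mirroring the rows shown in the question.
# "hits" is a hypothetical name; the real data would come from read.csv("Hot100.csv").
hits <- data.frame(
  date     = as.Date(c("2000-07-01", "2002-02-09", "2002-02-16",
                       "2002-02-23", "2002-03-02", "2002-03-09")),
  position = c(40, 70, 55, 30, 22, 18),
  song     = c("Bye Bye Bye", rep("Girlfriend", 5)),
  artist   = "'N Sync",
  stringsAsFactors = FALSE
)

# One row per artist/song pair; the best chart position is the smallest number.
peaks <- hits %>%
  group_by(artist, song) %>%
  summarise(highest_position = min(position), .groups = "drop")

print(peaks)
```

With .groups = "drop" the result is an ungrouped tibble, so later steps such as arrange(highest_position) operate on the whole table rather than per group.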
