R: Select Rows from Data Frame based on condition

I have the following data frames:

User_Details:

+-------------+-----------+-----------+
|   Name      |  Address  |   Phone   |
+-------------+-----------+-----------+
| John Doe    | Somewhere | 123456789 |
| Jane Doe    | Somewhere | 234567891 |
| Jack Russel | Somewhere | 234567891 |
+-------------+-----------+-----------+

User_Transaction_Count:

+-------------+-----------+
|   Name      | Frequency |
+-------------+-----------+
| John Doe    | 2         |
| Jane Doe    | 5         |
| Jack Russel | 2         |
+-------------+-----------+

What I want to do is get the details of the user with the most transactions. In the above case, Jane Doe has the most transactions, so I need to fetch her details into a data frame.

I tried the following code:

User_details[which(user_details$Name = User_Transaction_Count[(which.max(User_Transaction_Count$Frequency)),]$Name)]

But I get this error:

Error: unexpected '=' in "ad_maxState <- accidental_deaths[which(accidental_deaths$State ="

I made a few changes to T.Ciffréo's answer and found a solution:

User_Details[User_Details$Name == as.character(User_Transaction_Count[which.max(User_Transaction_Count$Frequency), ]$Name), ]

To determine the user with the maximum Frequency, we can use:

with(User_Transaction_Count,Name[[which.max(Frequency)]])

However, if the Name column is a factor (the default for data.frame() and read.csv() before R 4.0.0), we need to convert it to a character string before using it for the lookup. Otherwise the internal integer code for "John Doe" in one data.frame may not match the code for "John Doe" in the other.

maxUser <- as.character(with(User_Transaction_Count,Name[[which.max(Frequency)]]))
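To illustrate why the conversion matters, here is a minimal sketch using two made-up factors (a and b are hypothetical names, not from the question). Each factor numbers its own levels independently, so the same name can carry a different integer code in each:

```r
# Hypothetical factors with different level sets (illustration only)
a <- factor(c("John Doe", "Jane Doe", "Jack Russel"))
b <- factor(c("Jane Doe", "John Doe"))

# Integer codes are positions in each factor's own sorted levels
codes_a <- as.integer(a)
codes_b <- as.integer(b)

# "John Doe" is element 1 of a and element 2 of b
print(codes_a[1])
print(codes_b[2])

# After conversion the values compare as plain strings
print(as.character(a)[1] == as.character(b)[2])
```

The two printed codes differ even though both refer to "John Doe", while the as.character() comparison treats them as the same name.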

Then we can perform the lookup in the other data.frame.

result <- User_Details[User_Details$Name == maxUser,]
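Putting the steps so far together, the following is a self-contained sketch that rebuilds the two tables from the question and runs the lookup. The stringsAsFactors = TRUE argument is an assumption, added only to mimic the factor columns discussed above:

```r
# Rebuild the question's tables; stringsAsFactors = TRUE is an assumption
# that mimics the pre-R-4.0.0 default of storing strings as factors
User_Details <- data.frame(
  Name = c("John Doe", "Jane Doe", "Jack Russel"),
  Address = "Somewhere",
  Phone = c(123456789, 234567891, 234567891),
  stringsAsFactors = TRUE
)
User_Transaction_Count <- data.frame(
  Name = c("John Doe", "Jane Doe", "Jack Russel"),
  Frequency = c(2, 5, 2),
  stringsAsFactors = TRUE
)

# Name of the user with the highest Frequency, as a plain string
maxUser <- as.character(with(User_Transaction_Count, Name[[which.max(Frequency)]]))

# Keep only the rows of User_Details whose Name matches
result <- User_Details[User_Details$Name == maxUser, ]
print(result)
```

Note that which.max() returns only the first position of the maximum, so if several users tie for the most transactions this sketch picks the first of them.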

The == lookup compares every row, which may take a long time if the table is very large, so it may be best to create an index:

#build index
library(hash)
userIdx <- hash(as.character(User_Details$Name),1:nrow(User_Details))

#use index
maxUser <- as.character(with(User_Transaction_Count,Name[[which.max(Frequency)]]))
result <- User_Details[userIdx[[maxUser]],]

Output:

> result
      Name   Address     Phone
2 Jane Doe Somewhere 234567891
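If installing the hash package is not an option, a base R environment can serve as a similar keyed index. This is only a sketch under assumptions: it rebuilds the question's data, assumes the names in User_Details are unique, and uses list2env() in place of the hash() call shown above:

```r
# Rebuild the question's tables (same assumption as before: factor columns)
User_Details <- data.frame(
  Name = c("John Doe", "Jane Doe", "Jack Russel"),
  Address = "Somewhere",
  Phone = c(123456789, 234567891, 234567891),
  stringsAsFactors = TRUE
)
User_Transaction_Count <- data.frame(
  Name = c("John Doe", "Jane Doe", "Jack Russel"),
  Frequency = c(2, 5, 2),
  stringsAsFactors = TRUE
)

# Build the index: name -> row number (assumes names are unique)
rows <- as.list(seq_len(nrow(User_Details)))
names(rows) <- as.character(User_Details$Name)
userIdx <- list2env(rows, hash = TRUE)

# Use the index
maxUser <- as.character(with(User_Transaction_Count, Name[[which.max(Frequency)]]))
result <- User_Details[userIdx[[maxUser]], ]
print(result)
```

The index is built once and can then be reused for many lookups; for a single lookup the plain == comparison is simpler.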

Code:

User_Details[User_Details$Name == User_Transaction_Count[which.max(User_Transaction_Count$Frequency), ]$Name, ]
