
How to get the first rows in an R dataframe that meet a specific condition?

I have a dataframe with many thousands of rows. Every row is a hospitalization record; it contains the ID of the patient and a lot of health information (diagnosis, date of admission, date of discharge, and so on).

Every patient can have more than one hospitalization record, but I need only the first hospitalization of every patient, i.e. the first record for each patient ID according to the date of admission. How can I get this result in R?

Thank you in advance.

I think I have a solution, but there's probably a smoother way to do this.

Try this using dplyr. Note, I assume that when you say 'first' record you mean the oldest record. If you want the most recent record, use max() instead of min().

## Run once if dplyr is not installed yet
install.packages('dplyr')
library(dplyr)

your_data <- group_by(your_data, patientID)
## This gives you one row per patient: the ID and the date of the first visit.
## Naming the summary column admit_date lets the matching step below find it.
first_records <- summarise(your_data, admit_date = min(admit_date))

## Create ID to match 
first_records$matchID <- paste(first_records$patientID, first_records$admit_date)
your_data$matchID <- paste(your_data$patientID, your_data$admit_date)

## Get complete records
first_records <- your_data[your_data$matchID %in% first_records$matchID, ]
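To check the steps above on a small case, here is a minimal sketch with made-up data. The toy values and the diagnosis column are assumptions for illustration only; patientID and admit_date follow the names used above. Note that it swaps the pasted matchID for an inner_join() on both columns, which is a different technique that gives the same rows without building string keys.

```r
library(dplyr)

# Toy data (hypothetical values, for illustration only)
your_data <- data.frame(
  patientID  = c(1, 1, 2, 2, 3),
  admit_date = as.Date(c("2015-03-01", "2014-07-15", "2016-01-10",
                         "2016-05-02", "2013-11-30")),
  diagnosis  = c("A", "B", "C", "D", "E")
)

# Earliest admission date per patient (one row per patientID)
first_dates <- your_data %>%
  group_by(patientID) %>%
  summarise(admit_date = min(admit_date))

# Keep only the full records that match a (patientID, admit_date) pair
first_records <- inner_join(your_data, first_dates,
                            by = c("patientID", "admit_date"))
first_records
```

If a patient has several records on the same first date, all of them are kept, just as with the matchID approach.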

Let me know how this goes.

EDIT: @alistaire posted what definitely looks like an easier solution:

your_data <- group_by(your_data, patientID)
## Within each patient group, keep the rows whose date equals the earliest date
first_records <- filter(your_data, admit_date == min(admit_date))
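One thing worth knowing about the grouped filter() is how it handles ties. The sketch below uses made-up data in which one patient has two records on the same first date; the toy values and the diagnosis column are assumptions, and the date column is assumed to be named admit_date as in the first answer. It also shows slice_min(), which I believe requires dplyr 1.0.0 or later, as one possible way to force exactly one row per patient.

```r
library(dplyr)

# Toy data (hypothetical values); patient 2 has two records on the same first date
your_data <- data.frame(
  patientID  = c(1, 1, 2, 2, 2),
  admit_date = as.Date(c("2015-03-01", "2014-07-15", "2016-01-10",
                         "2016-01-10", "2016-05-02")),
  diagnosis  = c("A", "B", "C", "D", "E")
)

# filter() keeps every row tied for the earliest date within each patient
all_earliest <- your_data %>%
  group_by(patientID) %>%
  filter(admit_date == min(admit_date)) %>%
  ungroup()

# slice_min() with with_ties = FALSE keeps exactly one row per patient
one_each <- your_data %>%
  group_by(patientID) %>%
  slice_min(admit_date, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(all_earliest)  # both tied rows for patient 2 are kept
nrow(one_each)      # one row per patient
```

Which behaviour is right depends on the data: if same-day readmissions are possible, some other column is probably needed to decide which record counts as the first.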
