
Removing duplicated column characters of dataset in R

I am new to R and am having trouble removing duplicated character values.

Here is my code:

library(RCurl)
x <- getURL("https://raw.githubusercontent.com/eparker12/nCoV_tracker/master/input_data/coronavirus.csv")
y <- read.csv(text = x)
z <- duplicated(y$jhuID)

I tried something like z <-... but it did not work. The column jhuID in the data frame is of class character, but many country names are repeated multiple times. My goal is to delete the duplicated country names so that each one remains only once, with the column still of class character.

For example, if I view the data with y$jhuID, I see every country name appearing multiple times. I want a new data frame, for example z, such that when I view z$jhuID each country name appears only once.

Any help with this would be much appreciated! Thanks in advance.

An option with distinct and arrange:

library(dplyr)
y %>%
     distinct(jhu_ID, .keep_all = TRUE) %>%
     arrange(jhu_ID)
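
For comparison, the same result can be sketched in base R without dplyr. This is only an illustration: the small data frame below is made up so the example runs offline, and it assumes the column is named jhu_ID as in the answer above (the question writes it as jhuID). The point is that duplicated() returns a logical vector, not a data frame, so it has to be negated and used to subset the rows.

```r
# Toy stand-in for the downloaded CSV; the values are invented for illustration.
y <- data.frame(
  jhu_ID = c("Italy", "France", "Italy", "Spain", "France"),
  cases  = c(10, 5, 20, 7, 9),
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat after the first occurrence with TRUE.
dup <- duplicated(y$jhu_ID)

# Keep only the first row for each jhu_ID, then sort the rows by jhu_ID.
z <- y[!dup, ]
z <- z[order(z$jhu_ID), ]

class(z$jhu_ID)  # the column is still of class character
z$jhu_ID         # each name now appears once
```

If only the names are needed rather than whole rows, unique(y$jhu_ID) returns the distinct values as a character vector.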
