Removing duplicated character values from a dataset column in R
I am new to R and I am having trouble removing duplicated character values.
Here is my code:
library(RCurl)
x <- getURL("https://raw.githubusercontent.com/eparker12/nCoV_tracker/master/input_data/coronavirus.csv")
y <- read.csv(text = x)
z <- duplicated(y$jhuID)
I tried something like z <- ... but it did not work. The column jhuID in the data frame has class character, but many country names repeat multiple times. My goal is to delete the duplicated country names so that each one remains only once, with the column still of class character.
For example, if I view the data with y$jhuID, I see every country name appear multiple times. I want a new data frame, for example z, so that when I view z$jhuID each country name appears only once.
Any help with this would be much appreciated! Thanks in advance.
An option with distinct and arrange from dplyr:
library(dplyr)
y %>%
  distinct(jhu_ID, .keep_all = TRUE) %>%
  arrange(jhu_ID)
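For comparison, the same result can probably be reached in base R with the duplicated() call from the question. duplicated() returns a logical vector rather than a data frame, which may be why assigning it straight to z did not give the expected result. The sketch below uses a small made-up data frame in place of the downloaded file; the values and the cases column are hypothetical, and the column name jhu_ID follows the answer above, so check names(y) for the actual name in your data.

```r
# Toy stand-in for the downloaded data (hypothetical values and columns).
y <- data.frame(
  jhu_ID = c("Italy", "France", "Italy", "Spain", "France"),
  cases  = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# duplicated() marks every repeat after the first occurrence as TRUE.
dup <- duplicated(y$jhu_ID)

# Negate it to keep only the first row seen for each country.
z <- y[!dup, ]

# Sort by country name, mirroring arrange() in the dplyr version.
z <- z[order(z$jhu_ID), ]

# If only the names are needed, unique() gives a character vector directly.
countries <- sort(unique(y$jhu_ID))
```

Like distinct(..., .keep_all = TRUE), this keeps the first row encountered for each country and leaves the column as class character.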