Removing duplicated character values from a dataset column in R

Question

I am new to R and am having trouble removing duplicated character values.

Here is my code:

library(RCurl)
x <- getURL("https://raw.githubusercontent.com/eparker12/nCoV_tracker/master/input_data/coronavirus.csv")
y <- read.csv(text = x)
z <- duplicated(y$jhuID)

I tried something like z <- ..., but it did not work. The jhuID column in the data frame is of class character, but many country names repeat multiple times. My goal is to delete the duplicated country names so that each one remains only once, with the column still of class character.

For example, if I view the data with y$jhuID, I see every country name appear multiple times. I want a new data frame, say z, so that when I view z$jhuID I see each country name only once.

Any help with this would be much appreciated! Thanks in advance.

Answer 1

An option with distinct and arrange:

library(dplyr)
y %>%
     distinct(jhu_ID, .keep_all = TRUE) %>%
     arrange(jhu_ID)
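
For comparison, here is a minimal base R sketch of the same idea. It also shows why the attempt in the question did not give a data frame: duplicated() returns a logical vector marking repeats, so it has to be used to subset the rows. The small data frame below is a made-up stand-in for the downloaded file (the values are illustrative only), and the column name jhu_ID is assumed to match the spelling used in the answer above.

```r
# Toy stand-in for the downloaded data; the values are made up for illustration.
# The column name jhu_ID is an assumption taken from the answer above.
y <- data.frame(
  jhu_ID = c("Italy", "France", "Italy", "Spain", "France"),
  cases  = c(10, 5, 20, 7, 9),
  stringsAsFactors = FALSE
)

# duplicated() returns a logical vector: TRUE marks a repeat of an earlier value.
dup <- duplicated(y$jhu_ID)

# Keep only the first row for each country, then sort the rows by country name.
z <- y[!dup, ]
z <- z[order(z$jhu_ID), ]

# If only the names are needed, unique() gives a character vector directly.
countries <- sort(unique(y$jhu_ID))
```

Viewing z$jhu_ID then shows each country once, still as a character column. Note that this keeps the first row found for each country, which may not be the row you want if the other columns differ between repeats.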

solution1
1 ACCEPTED 2020-05-14 21:17:05
