I have a CSV file like this:
LocationList,Identity,Category
"New York,New York,United States","42","S"
"NA,California,United States","89","lyt"
"Hartford,Connecticut,United States","879","polo"
"San Diego,California,United States","45454","utyr"
"Seattle,Washington,United States","uytr","69"
"NA,NA,United States","87","tree"
I want to remove all 'NA' entries from the 'LocationList' column.
The desired result:
LocationList,Identity,Category
"New York,New York,United States","42","S"
"California,United States","89","lyt"
"Hartford,Connecticut,United States","879","polo"
"San Diego,California,United States","45454","utyr"
"Seattle,Washington,United States","uytr","69"
"United States","87","tree"
The number of columns is not fixed; it may increase or decrease. I also want to write the CSV file without quotes and without escaping for the 'LocationList' column.
How can I achieve this in R? I am new to R, so any help is appreciated.
In this case, you just want to replace the "NA," with nothing. However, this is not the standard way to remove NA values, because here NA is literal text inside a string rather than a missing value.
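To illustrate the distinction, here is a minimal sketch with a made-up vector x (not from the question's data). It contrasts a real missing value, which is.na() detects, with the text "NA" embedded in a string, which it does not:

```r
# A real missing value (NA) versus the text "NA" embedded in a string.
x <- c("a", NA, "NA,b")

flags <- is.na(x)   # TRUE only for the second element
kept <- x[!flags]   # drops the missing value, keeps the string "NA,b"
kept
```

This is why a string replacement is needed for the LocationList column.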
Assuming dat is your data, use:
dat$LocationList <- gsub("^(NA,)+", "", dat$LocationList)
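Note that the anchored pattern only strips NA entries at the start of the string, which is enough for the sample data. If NA could also appear in the middle or at the end of the list, one option is to split each value on commas, drop the NA tokens, and paste the rest back together. A minimal sketch, assuming a hypothetical helper named drop_na_tokens (not part of base R) and a made-up second row for illustration:

```r
# Hypothetical helper: remove literal "NA" entries from comma-separated strings.
drop_na_tokens <- function(x) {
  parts <- strsplit(as.character(x), ",", fixed = TRUE)
  vapply(parts, function(p) paste(p[p != "NA"], collapse = ","), character(1))
}

dat <- data.frame(
  LocationList = c("NA,California,United States",
                   "Boston,NA,United States",   # made-up row: NA in the middle
                   "NA,NA,United States"),
  stringsAsFactors = FALSE
)
dat$LocationList <- drop_na_tokens(dat$LocationList)
dat$LocationList
```

Because the comparison is done on whole tokens, a place name that merely ends in "NA" is left untouched.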
Try:
my.data <- read.table(text='LocationList,Identity,Category
"New York,New York,United States","42","S"
"NA,California,United States","89","lyt"
"Hartford,Connecticut,United States","879","polo"
"San Diego,California,United States","45454","utyr"
"Seattle,Washington,United States","uytr","69"
"NA,NA,United States","87","tree"', header=TRUE, sep=",", stringsAsFactors=FALSE)
# Remove every literal "NA," from the LocationList strings
my.data$LocationList <- gsub("NA,", "", my.data$LocationList)
my.data
#                         LocationList Identity Category
# 1    New York,New York,United States       42        S
# 2           California,United States       89      lyt
# 3 Hartford,Connecticut,United States      879     polo
# 4 San Diego,California,United States    45454     utyr
# 5   Seattle,Washington,United States     uytr       69
# 6                      United States       87     tree
If you drop the quotes when you write to a conventional CSV file, you will have trouble reading the data back in later. This is because the values in the LocationList variable already contain commas, so commas would appear both inside fields and as the separator between fields. You might try using write.csv2() instead, which separates fields with a semicolon (;). It also uses a comma as the decimal mark, which only matters for numeric columns. You could use:
write.csv2(my.data, file="myFile.csv", quote=FALSE, row.names=FALSE)
This yields the following file:
LocationList;Identity;Category
New York,New York,United States;42;S
California,United States;89;lyt
Hartford,Connecticut,United States;879;polo
San Diego,California,United States;45454;utyr
Seattle,Washington,United States;uytr;69
United States;87;tree
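To check that a semicolon-separated file like this can be read back without the embedded commas splitting the field, here is a minimal round-trip sketch. It uses a shortened copy of the data and a temporary file path (both chosen for this illustration only), and reads the file with read.csv2(), the counterpart of write.csv2():

```r
# Write a small data frame with semicolon separators and read it back.
my.data <- data.frame(
  LocationList = c("New York,New York,United States", "United States"),
  Identity = c("42", "87"),
  Category = c("S", "tree"),
  stringsAsFactors = FALSE
)
tmp <- tempfile(fileext = ".csv")  # temporary path, used only for this demo
write.csv2(my.data, file = tmp, quote = FALSE, row.names = FALSE)

# read.csv2() splits on ";" so the embedded commas stay inside the field.
# colClasses = "character" keeps every column as text.
back <- read.csv2(tmp, colClasses = "character")
identical(back$LocationList, my.data$LocationList)
```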
(I now notice that the values for Identity and Category in row 5 are presumably swapped. You may want to switch them before writing to file.)
# Swap the Identity (column 2) and Category (column 3) values in row 5
x <- my.data[5, 2]
my.data[5, 2] <- my.data[5, 3]
my.data[5, 3] <- x
rm(x)
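Coming back to the question's request to leave only the LocationList column unquoted in a comma-separated file: the quote argument of write.csv() also accepts a numeric vector of column indices to quote. A minimal sketch, assuming a shortened copy of the data and a temporary output path chosen for this illustration; the indices are computed from the column names, so it does not depend on how many columns there are:

```r
my.data <- data.frame(
  LocationList = c("California,United States", "United States"),
  Identity = c("89", "87"),
  Category = c("lyt", "tree"),
  stringsAsFactors = FALSE
)

# Indices of every column except LocationList; works for any number of columns.
quote_cols <- which(names(my.data) != "LocationList")

out <- tempfile(fileext = ".csv")  # temporary path, used only for this demo
write.csv(my.data, file = out, quote = quote_cols, row.names = FALSE)
lines <- readLines(out)
lines
```

Keep in mind the caveat above: with LocationList unquoted, a standard CSV reader will likely split it at its embedded commas, so this format suits files that are only written, not read back as CSV.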