
Using utf8ToInt on a data frame

I'm very new to R. I'm trying to use utf8ToInt on a data frame.
I want to scan the data frame for the integer value 101 (e) and replace it with 69 (E).

My issue here is how to run through each value in the data frame and execute the utf8ToInt function.

# Build out the UTF function here

a <- c("red", "blue", "yellow", "black")
b <- c("blue", "yellow", "red", "pink")
c <- c("white", "black", "red", "blue")

df <- data.frame(a, b, c)

df
       a      b     c
1    red   blue white
2   blue yellow black
3 yellow    red   red
4  black   pink  blue

When trying to run it against a single value:

utf8ToInt(df[1])
Error in utf8ToInt(df[1]) : argument must be a character vector of length 1

Avoid using c as a variable name, because c() is also a base R function.

a <- c("red", "blue", "yellow", "black")
b <- c("blue", "yellow", "red", "pink")
c1 <- c("white", "black", "red", "blue")

Note the stringsAsFactors = FALSE argument, which keeps the columns as character vectors:

df <- data.frame(a, b, c1, stringsAsFactors = FALSE)

lapply(df[[1]],utf8ToInt)

Result:

# [[1]]
# [1] 114 101 100
# 
# [[2]]
# [1]  98 108 117 101
# 
# [[3]]
# [1] 121 101 108 108 111 119
# 
# [[4]]
# [1]  98 108  97  99 107

Side note: one reason it did not work is that data.frame() converted strings to factors by default before R 4.0.0, and factors are stored internally as integer codes, which utf8ToInt() rejects:

utf8ToInt("red")          # works
utf8ToInt(factor("red"))  # does not work
utf8ToInt(1)              # does not work
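
Independently of factors, the call in the question passes df[1], which is still a data frame rather than a single string. The sketch below illustrates the difference between the three ways of indexing; it uses a cut-down one-column copy of the data, and the variable name codes is only for illustration.

```r
# One-column copy of the example data, kept as character (not factor).
df <- data.frame(a = c("red", "blue", "yellow", "black"),
                 stringsAsFactors = FALSE)

class(df[1])      # "data.frame": single brackets keep the data frame wrapper
class(df[[1]])    # "character": double brackets extract the column vector
length(df[[1]])   # 4, so the column is still too long for utf8ToInt()

# A single cell is a character vector of length 1, which utf8ToInt() accepts.
codes <- utf8ToInt(df[1, 1])
codes             # 114 101 100
```

This is why the answer above wraps utf8ToInt in lapply(): the function handles one string at a time.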

To transform the whole data set, you could convert it to a matrix:

lapply(as.matrix(df),utf8ToInt)
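
If the goal is really to work on the integer codes, as the question describes, one possible approach is to convert each string to code points, swap 101 for 69, and convert back with intToUtf8(). The following is only a sketch under that assumption; swap_code is a hypothetical helper name, not something from the original post.

```r
df <- data.frame(a  = c("red", "blue", "yellow", "black"),
                 b  = c("blue", "yellow", "red", "pink"),
                 c1 = c("white", "black", "red", "blue"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: swap one code point for another in a single string.
swap_code <- function(s, from = 101L, to = 69L) {
  codes <- utf8ToInt(s)        # string -> integer code points
  codes[codes == from] <- to   # 101 ("e") becomes 69 ("E")
  intToUtf8(codes)             # code points -> single string
}

# Apply the helper to every cell of every column.
# Assigning into df[] keeps the data frame structure and column names.
df[] <- lapply(df, function(col) {
  vapply(col, swap_code, character(1), USE.NAMES = FALSE)
})

df
```

Unlike lapply(as.matrix(df), utf8ToInt), which returns a flat list of integer vectors, this keeps the result as a data frame of strings.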

Reading your question: if you want to replace "e" with "E", why not simply use a regular expression?

df

       a      b    c1
1    red   blue white
2   blue yellow black
3 yellow    red   red
4  black   pink  blue

Then use:

df[] <- sapply(as.matrix(df),gsub,pattern="e",replacement="E")

       a      b    c1
1    rEd   bluE whitE
2   bluE yEllow black
3 yEllow    rEd   rEd
4  black   pink  bluE
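
A slightly more direct variant, sketched here as an alternative rather than as the answer's own method, applies gsub() column by column with lapply() and skips the matrix conversion. The fixed = TRUE argument treats "e" as a literal string instead of a regular expression. The second data frame, df2, is a hypothetical copy used to show the same replacement with chartr().

```r
df <- data.frame(a  = c("red", "blue", "yellow", "black"),
                 b  = c("blue", "yellow", "red", "pink"),
                 c1 = c("white", "black", "red", "blue"),
                 stringsAsFactors = FALSE)
df2 <- df  # copy for the chartr() variant

# gsub() is vectorised over its x argument, so each column is handled in one call.
df[] <- lapply(df, gsub, pattern = "e", replacement = "E", fixed = TRUE)

# chartr() translates single characters and gives the same result here.
df2[] <- lapply(df2, chartr, old = "e", new = "E")

df
```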


 