R: Frequency table that is case insensitive

Question

Here is one column of my df: [df$City]
(I have other columns, but I'm just showing one column for simplicity.)

City        
Seattle     
San Diego   
Bern       
SEATTLE
SEATTLE
BERN

I want to do a frequency count on the cities. I want both "Seattle" and "SEATTLE" to be considered the same - basically, I want the frequency table calculation to be case insensitive.

If I use table(df), it gives me "Seattle" and "SEATTLE" as two different items. I tried to overcome this by calling toupper(df) before table(df).

However, I get the error: invalid multibyte string.

I checked the encoding of my file and it seems to be UTF-8, but I could be wrong. Is there a way for me to check the encoding?

Does anyone know how I can get a frequency table that is case insensitive? It doesn't have to be using my approach.

Thanks in advance!!

Answer 1

You'll want to look into iconv() to convert the column to valid UTF-8, which should take care of the "invalid multibyte string" error. Once the strings convert cleanly, standardize their case with toupper() or tolower() before calling table(), and maybe use stringr::str_trim() to take care of extra whitespace.
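
The steps above could be sketched roughly as follows. This is a minimal example, not a definitive fix: the sample data frame is rebuilt from the question, the assumption is that the column is meant to be UTF-8 (so bytes that cannot be converted are simply dropped), and base R's trimws() stands in for stringr::str_trim() to avoid an extra package. Note that it works on the column df$City rather than on the whole data frame.

```r
# Sample data mirroring the City column from the question
df <- data.frame(
  City = c("Seattle", "San Diego", "Bern", "SEATTLE", "SEATTLE", "BERN"),
  stringsAsFactors = FALSE
)

# Inspect the declared encoding of each string
# (plain ASCII strings are reported as "unknown")
Encoding(df$City)

# Re-encode to UTF-8. Assumption: the data is supposed to be UTF-8,
# so any bytes that cannot be converted are dropped (sub = "").
city <- iconv(as.character(df$City), from = "UTF-8", to = "UTF-8", sub = "")

# Trim surrounding whitespace and unify the case
city <- toupper(trimws(city))

# Case-insensitive frequency table
freq <- table(city)
print(freq)
```

With the sample data this counts SEATTLE 3 times, BERN twice, and SAN DIEGO once. If the file turns out to be in a different encoding (Latin-1, for instance), changing the from argument, e.g. from = "latin1", would be the thing to try.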

solution1
3 2015-06-01 16:59:33

R: Frequency table that is case insensitive

Question

1 answers

solution1 3 2015-06-01 16:59:33

solution1
3 2015-06-01 16:59:33