I have a data.frame that contains state names and I would like to create a new variable called "region" in which a value is assigned based on the state that is found under the "state" variable.
For example, if the state variable has "Alabama" or "Georgia", I would like to have "Region" assigned as "South". If state is "Washington" or "California", I would like it assigned to "West". I have to do this for each of the 48 contiguous US states, and I'm having difficulty figuring out the best way to do this. Any help in this (I'm sure simple) procedure would be great. What I am looking for is something like this in the end:
State       Region
Wyoming     West
Michigan    Midwest
Alabama     South
Georgia     South
California  West
Texas       Central
And to be clear, I don't have the regions in a separate file; I have to create this as a new variable and define the region names myself. I'm just looking for a way for the code to go through all 3000 rows that I have and automatically assign the region name once I tell it how to do so.
Rather than typing the region for every state, you can use the built-in `state.name` and `state.region` variables from the 'datasets' package (as Jon Spring suggests in his comment), e.g.:
library(tidyverse)
library(datasets)
state_lookup_table <- data.frame(name = state.name,
                                 region = state.region)
my_df <- data.frame(place = c("Washington", "California"),
                    value = c(1000, 2000))
my_df
#>        place value
#> 1 Washington  1000
#> 2 California  2000
my_df %>%
  left_join(state_lookup_table, by = c("place" = "name"))
#>        place value region
#> 1 Washington  1000   West
#> 2 California  2000   West
Created on 2022-09-02 by the reprex package (v2.0.1)
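If you would rather not load the tidyverse, the same lookup can be sketched in base R with `match()`. Note that the built-in `state.region` uses the labels Northeast, South, North Central and West, so it will not produce "Midwest" or "Central" by itself; the override at the end is one possible way to apply your own scheme. The data frame and column names below are made up for illustration.

```r
# Hypothetical data; the column name "state" is an assumption.
my_df <- data.frame(state = c("Wyoming", "Michigan", "Alabama", "Texas"),
                    stringsAsFactors = FALSE)

# match() gives the position of each state in state.name (NA if not found).
idx <- match(my_df$state, state.name)

# Use those positions to pull the region; convert the factor to character
# so that custom labels can be assigned afterwards.
my_df$region <- as.character(state.region[idx])

# Override the built-in value wherever your own scheme differs.
my_df$region[my_df$state == "Texas"] <- "Central"
my_df
```

Any state that is misspelled or missing from `state.name` ends up with `NA` in `region`, which makes such rows easy to find.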
I would go this way:
df <- data.frame(name = c("john", "will", "thomas", "Ali"),
                 state = c("California", "Alabama", "Washington", "Georgia"))
# Lookup table: one row per state with the region you want to assign
region_df <- data.frame(state = c("Alabama", "Georgia", "Washington"),
                        region = c("south", "south", "west"))
# Left join on "state"; merge() takes `by`, not `on`
merged.df <- merge(df, region_df, by = "state", all.x = TRUE)
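With a hand-built lookup table it is easy to leave a state out (California has no entry in the example above). A small check along these lines, reusing the same example data, lists the states that did not get a region. Keep in mind that `merge()` sorts the result by the key column by default, so the row order will differ from the input.

```r
df <- data.frame(name = c("john", "will", "thomas", "Ali"),
                 state = c("California", "Alabama", "Washington", "Georgia"),
                 stringsAsFactors = FALSE)
region_df <- data.frame(state = c("Alabama", "Georgia", "Washington"),
                        region = c("south", "south", "west"),
                        stringsAsFactors = FALSE)

# Left join: keep every row of df, even if its state has no region yet.
merged.df <- merge(df, region_df, by = "state", all.x = TRUE)

# States without an entry in the lookup table come back with region NA.
missing_states <- unique(merged.df$state[is.na(merged.df$region)])
missing_states
```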
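I think you need a lookup reference to do this. R has no built-in dict type, but for your specific question a named vector, which behaves like one, would be the best solution.
ref_ge <- character()  # empty lookup: names are states, values are regions
ref_ge["Georgia"] <- "South"
ref_ge["Alabama"] <- "South"
ref_ge["California"] <- "West"
ref_ge["Georgia"]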
# Or, if you can read the state -> region information from an Excel file into a data frame:
df <- data.frame(state = c("Georgia", "Alabama", "California"),
                 region = c("South", "South", "West"))
ref2 <- df$region
names(ref2) <- df$state
ref2["Georgia"]
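To fill in the whole column rather than look up one state at a time, the named vector can be indexed with the entire state column. This is a minimal sketch; `my_df` and its contents are made up for illustration.

```r
# Named vector acting as the lookup: names are states, values are regions.
ref <- c(Georgia = "South", Alabama = "South", California = "West")

# Hypothetical data frame standing in for the real 3000-row one.
my_df <- data.frame(state = c("Alabama", "California", "Georgia", "Alabama"),
                    stringsAsFactors = FALSE)

# Index by name; as.character() guards against a factor column, which would
# otherwise be used as integer positions. unname() drops the state names
# that the lookup carries over.
my_df$region <- unname(ref[as.character(my_df$state)])
my_df
```

States that are not present in the lookup vector get `NA`, so `my_df[is.na(my_df$region), ]` shows which rows still need a mapping.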