I have a large Python list of strings in the following format:
list = ['state1', 'town1','town2','town3', 'state2', 'town4', 'state3', 'town5','town6']
There are a variable number of towns for each state.
How can I get this to be nested so it looks like:
list = [['state1', 'town1','town2','town3'], ['state2', 'town4'],['state3', 'town5','town6']]
And then from there, how can I turn this list into a DataFrame with the states as the index and the towns as a single column?
Let's take the list as:
lst = [['state', 'town','town','town'], ['state', 'town'],['state', 'town','town']]
To convert it into a DataFrame with the state as the index:
import pandas as pd
df = pd.DataFrame(lst).set_index(0, drop=True)
Output:
          1     2     3
0
state  town  town  town
state  town  None  None
state  town  town  None
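The answer above starts from an already nested list, so here is a minimal sketch of the nesting step the question asks about. It assumes the state names are known in advance; the `nest_by_state` helper and the `states` set are hypothetical names introduced for illustration, not part of the original answer.

```python
def nest_by_state(flat, states):
    """Group a flat [state, town, town, state, ...] list into one sublist per state."""
    nested = []
    for item in flat:
        if item in states:
            # A state name opens a new sublist.
            nested.append([item])
        elif nested:
            # Any other item is a town of the most recently seen state.
            nested[-1].append(item)
        else:
            raise ValueError(f"town {item!r} appears before any state")
    return nested


flat = ['state1', 'town1', 'town2', 'town3', 'state2', 'town4',
        'state3', 'town5', 'town6']
states = {'state1', 'state2', 'state3'}  # assumed to be known up front
nested = nest_by_state(flat, states)
print(nested)
```

Using a set for `states` keeps each membership check cheap, which may matter for a large list.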
So let's first look at some list examples:
state_lst = ['California', 'New Mexico', 'Arizona', 'etc.']
state_town_lst = ['California', 'San Francisco', 'Los Angeles', 'San Diego', 'New Mexico', 'Albuquerque', 'Santa Fe', 'Arizona', 'Tucson']
town_lst =[]
As you can see, there should be three cities for California, two for New Mexico and one for Arizona. So we go through state_town_lst and check whether each item appears in state_lst. If it does, it becomes the current state; otherwise it is a town that belongs to the current state.
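import pandas as pd

for item in state_town_lst:
    if item in state_lst:
        # A state name starts a new group; remember it for the towns that follow.
        state = item
    else:
        # Anything else is a town belonging to the most recent state.
        town_lst.append((state, item))

df = pd.DataFrame(town_lst, columns=["State", "Town"])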
This gives you:
        State           Town
0  California  San Francisco
1  California    Los Angeles
2  California      San Diego
3  New Mexico    Albuquerque
4  New Mexico       Santa Fe
5     Arizona         Tucson
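The question also asks for the states to be the index with the towns in a single column. A possible way to get there from the (state, town) pairs is `set_index`; the sketch below rebuilds the pairs inline so that it runs on its own, and assumes pandas is installed.

```python
import pandas as pd

# (state, town) pairs as produced by the loop in this answer.
town_lst = [
    ("California", "San Francisco"),
    ("California", "Los Angeles"),
    ("California", "San Diego"),
    ("New Mexico", "Albuquerque"),
    ("New Mexico", "Santa Fe"),
    ("Arizona", "Tucson"),
]

# Move the "State" column into the index, leaving "Town" as the only column.
df = pd.DataFrame(town_lst, columns=["State", "Town"]).set_index("State")

# The index now repeats each state once per town.
california_towns = df.loc["California", "Town"].tolist()
print(df)
print(california_towns)
```

Because the index contains duplicates, selecting a state with several towns returns a Series, while a state with a single town returns a plain value.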