
Convert Python List into Dataframe with Index

I have a large Python list of strings in the following format:

list = ['state1', 'town1','town2','town3', 'state2', 'town4', 'state3', 'town5','town6']

There are a variable number of towns for each state.

How can I get this to be nested so it looks like:

list = [['state1', 'town1','town2','town3'], ['state2', 'town4'], ['state3', 'town5','town6']]

And then from there make this list into a dataframe with the states as the indexes and the towns as a single column?

Let's take the list as:

lst = [['state', 'town','town','town'], ['state', 'town'],['state', 'town','town']]

To convert it into a DataFrame with the state as the index:

import pandas as pd
df = pd.DataFrame(lst).set_index(0, drop=True)

Output:

          1     2     3
0
state  town  town  town
state  town  None  None
state  town  town  None
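
The answer above starts from a list that is already nested. To get there from the flat list in the question, something along these lines might work. It is only a sketch: the helper name nest_by_state and the is_state test are made up for illustration, and it assumes you have some reliable way to tell a state from a town (here, the placeholder names from the question happen to start with 'state').

```python
flat = ['state1', 'town1', 'town2', 'town3',
        'state2', 'town4',
        'state3', 'town5', 'town6']

def nest_by_state(items, is_state):
    """Group a flat list into sublists, each starting with a state.

    is_state is a caller-supplied test (an assumption of this sketch).
    """
    nested = []
    for item in items:
        if is_state(item):
            # A state opens a new sublist.
            nested.append([item])
        elif nested:
            # A town is appended to the most recent state's sublist.
            nested[-1].append(item)
        # Towns that appear before any state are skipped.
    return nested

nested = nest_by_state(flat, lambda s: s.startswith('state'))
print(nested)
```

With real data you would more likely pass something like `lambda s: s in known_states`, where known_states is a set of state names you supply yourself.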

Let's first look at some example lists, where state_lst holds the known state names and state_town_lst is the flat list to split:

state_lst = ['California', 'New Mexico', 'Arizona', 'etc.']
state_town_lst = ['California', 'San Francisco', 'Los Angeles', 'San Diego', 'New Mexico', 'Albuquerque', 'Santa Fe', 'Arizona', 'Tucson']
town_lst = []  # will collect (state, town) pairs

As you can see, there should be three towns for California, two for New Mexico and one for Arizona. We loop through state_town_lst and check whether each item appears in state_lst. If it does, it becomes the current state; otherwise it is a town belonging to the current state.

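import pandas as pd

for item in state_town_lst:
    if item in state_lst:
        # A state name starts a new group; remember it and move on.
        state = item
        continue
    # Any other item is a town belonging to the most recent state.
    town_lst.append((state, item))

df = pd.DataFrame(town_lst, columns=["State", "Town"])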

This gives you:

    State       Town
0   California  San Francisco
1   California  Los Angeles
2   California  San Diego
3   New Mexico  Albuquerque
4   New Mexico  Santa Fe
5   Arizona     Tucson
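
Since the question asks for the states as the index and the towns as a single column, one possible follow-up is to call set_index on the State column. The snippet below is a small self-contained sketch that rebuilds the same (state, town) pairs; the names indexed and ca_towns are just illustrative.

```python
import pandas as pd

town_lst = [
    ("California", "San Francisco"),
    ("California", "Los Angeles"),
    ("California", "San Diego"),
    ("New Mexico", "Albuquerque"),
    ("New Mexico", "Santa Fe"),
    ("Arizona", "Tucson"),
]
df = pd.DataFrame(town_lst, columns=["State", "Town"])

# Move "State" into the index; "Town" remains the only column.
# The index is not unique: each state repeats once per town.
indexed = df.set_index("State")

# Selecting with a list of labels always returns a Series here,
# even for a state that has only one town.
ca_towns = indexed.loc[["California"], "Town"].tolist()
print(indexed)
print(ca_towns)
```

Passing the label inside a list avoids getting a bare string back for states such as Arizona that have a single row.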


 