Convert Python List into Dataframe with Index
I have a large Python list containing many strings in the following format:
list = ['state1', 'town1','town2','town3', 'state2', 'town4', 'state3', 'town5','town6']
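Each state has a different number of towns.
How can I nest it so that it looks like this:
list = [['state1', 'town1','town2','town3'], ['state2', 'town4'],['state3', 'town5','town6']]
And then, from there, turn this list into a dataframe with the states as the index and the towns as a single column?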
Let's take the list to be:
lst = [['state', 'town','town','town'], ['state', 'town'],['state', 'town','town']]
To convert it to a dataframe with the state as the index:
import pandas as pd
df = pd.DataFrame(lst).set_index(0, drop=True)
Output:
          1     2     3
0                      
state  town  town  town
state  town  None  None
state  town  town  None
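The answer above starts from a list that is already nested. A minimal sketch of the nesting step itself follows. It assumes there is some way to tell a state from a town; here that is a hypothetical `is_state` predicate that checks for the `state` prefix used in the question's sample data. The function name `nest_by_state` is also made up for illustration.

```python
def nest_by_state(flat, is_state):
    """Group a flat [state, town, town, state, town, ...] list into sublists.

    `is_state` is a caller-supplied test (an assumption of this sketch);
    each sublist starts with a state and is followed by its towns.
    """
    nested = []
    for item in flat:
        if is_state(item):
            # Start a new group headed by this state.
            nested.append([item])
        elif nested:
            # Attach the town to the most recent state.
            nested[-1].append(item)
        else:
            raise ValueError(f"town {item!r} appears before any state")
    return nested


flat = ['state1', 'town1', 'town2', 'town3',
        'state2', 'town4',
        'state3', 'town5', 'town6']

# Assumption: in the sample data, states are the strings starting with "state".
nested = nest_by_state(flat, lambda s: s.startswith('state'))
print(nested)
```

With real place names the prefix test would not work; a membership test against a set of known state names could be passed as `is_state` instead.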
So let's first look at some example lists:
state_lst = ['California', 'New Mexico', 'Arizona', 'etc.']
state_town_lst = ['California', 'San Francisco', 'Los Angeles', 'San Diego', 'New Mexico', 'Albuquerque', 'Santa Fe', 'Arizona', 'Tucson']
town_lst =[]
As you can see, California should have three cities, New Mexico two, and Arizona one. So we loop over state_town_lst and check whether each item appears in state_lst.
import pandas as pd

for item in state_town_lst:
    if item in state_lst:
        # A state name: remember it for the towns that follow.
        state = item
    else:
        # A town: pair it with the most recent state.
        town_lst.append((state, item))

df = pd.DataFrame(town_lst, columns=["State", "Town"])
This gives you:
        State           Town
0  California  San Francisco
1  California    Los Angeles
2  California      San Diego
3  New Mexico    Albuquerque
4  New Mexico       Santa Fe
5     Arizona         Tucson
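The question asked for the states as the index and the towns as a single column, while the frame above keeps a default integer index. One possible way to get there is to call `set_index("State")` on the same (state, town) pairs. The sketch below repeats the pairing loop so it runs on its own, and it uses a set for the state lookup, which is a choice of this sketch rather than part of the answer.

```python
import pandas as pd

state_lst = ['California', 'New Mexico', 'Arizona']
state_town_lst = ['California', 'San Francisco', 'Los Angeles', 'San Diego',
                  'New Mexico', 'Albuquerque', 'Santa Fe',
                  'Arizona', 'Tucson']

states = set(state_lst)  # set membership is cheaper than scanning a list
pairs = []
state = None
for item in state_town_lst:
    if item in states:
        state = item                 # remember the current state
    else:
        pairs.append((state, item))  # pair each town with its state

# One row per town, with the state name as the (repeated) index label.
df = pd.DataFrame(pairs, columns=["State", "Town"]).set_index("State")
print(df)
```

Because the index labels repeat, `df.loc["California"]` returns all three California rows.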