How can I manipulate a DataFrame so that I access every element of a list inside a cell and group the elements according to another column?
This might be confusing, so here is a copy of the first 10 rows of the data frame.
number cap words
0 ['Ages', 'Online', 'Python', 'Coding', 'CoursesAdwwwcodetodaycoukLearn', 'Python', 'Live', 'Taught', 'Experts', 'Making', 'Coding', 'Fun', 'Courses', 'Summer', 'Weekly', 'EthosSimple', 'Low', 'Cost', 'PricingFAQAccess', 'Free', 'Content']
1 ['Become', 'Python', 'Programmer', 'Study', 'Python', 'Online', 'FreeAdwwwpythoninstituteorgLearn', 'Python', 'Become', 'Python', 'Certified', 'Take', 'Your', 'Career', 'Next', 'Level', 'Kostenfreie', 'Lernplattform', 'Tausende', 'Studenten', 'Lass', 'Dich', 'Highlights', 'Offering', 'SelfStudy', 'Courses', 'Free', 'Courses', 'Available', 'Flexible', 'DeadlinesResources', 'Free', 'Education', 'Platform', 'Get', 'Certification', 'About']
2 ['Python', 'For', 'Beginners', 'Pythonorgwwwpythonorg', 'Python', 'Its', 'NonProgrammers', 'Python', 'Programmers', 'Python', 'Frequently', 'Asked', 'Books']
3 ['People', 'Python', 'I', 'PythonIs', 'Python', 'Python']
4 ['PythonHighlevel', 'Created', 'Guido', 'Rossum', 'Pythons', 'WikipediaTyping', 'Duck', 'July', 'August', 'Guido', 'RossumOS', 'Linux', 'Windows', 'Vista', 'IDEsIDLEPyCharmMicrosoft', 'Visual', 'StudioSpyderEclipsePyDevPeople']
5 ['Welcome', 'PythonorgwwwpythonorgThe', 'Python', 'Programming', 'Language', 'Python', 'For', 'Beginners', 'Beginners', 'Guide', 'Python', 'Docs', 'Python', 'Books']
6 ['BeginnersGuide', 'Python', 'Wikiwikipythonorg', 'BeginnersGuide4', 'Jul', 'New', 'Python', 'This', 'Chinese']
7 ['Learn', 'Python', 'Codecademywwwcodecademycom', 'Python', 'By']
8 ['Python', 'Wikipediaenwikipediaorg', 'PythonprogramminglanguagePython', 'Created', 'Guido', 'Rossum', 'Pythons', 'History', 'Features', 'Syntax', 'Python', 'Developer', 'Python', 'Software', 'Foundation', 'Paradigm', 'Multiparadigm', 'Designed', 'Guido', 'Rossum', 'Typing', 'Duck']
9 ['Related']
I am trying to unpack every word from the list inside each cell and group the words under their shared index number.
Like this:
0
0 Ages
0 Online
0 Python
0 Coding
0 CoursesAdwwwcodetodaycoukLearn
0 Python
0 Live
0 Taught
0 Experts
0 Making
0 Coding
0 Fun
0 Courses
0 Summer
0 Weekly
0 EthosSimple
0 Low
0 Cost
0 PricingFAQAccess
0 Free
0 Content
Below that would follow 1 for the words 'Become', 'Python', 'Programmer', 'Study', 'Python', 'Online', and so on.
I hope this is clear.
Thanks
You can use DataFrame.explode (available in pandas 0.25 and later):
import numpy as np
import pandas as pd

x = [['Ages', 'Online', 'Python', 'Coding', 'CoursesAdwwwcodetodaycoukLearn', 'Python', 'Live', 'Taught', 'Experts', 'Making', 'Coding', 'Fun', 'Courses', 'Summer', 'Weekly', 'EthosSimple', 'Low', 'Cost', 'PricingFAQAccess', 'Free', 'Content']
,['Become', 'Python', 'Programmer', 'Study', 'Python', 'Online', 'FreeAdwwwpythoninstituteorgLearn', 'Python', 'Become', 'Python', 'Certified', 'Take', 'Your', 'Career', 'Next', 'Level', 'Kostenfreie', 'Lernplattform', 'Tausende', 'Studenten', 'Lass', 'Dich', 'Highlights', 'Offering', 'SelfStudy', 'Courses', 'Free', 'Courses', 'Available', 'Flexible', 'DeadlinesResources', 'Free', 'Education', 'Platform', 'Get', 'Certification', 'About']
,['Python', 'For', 'Beginners', 'Pythonorgwwwpythonorg', 'Python', 'Its', 'NonProgrammers', 'Python', 'Programmers', 'Python', 'Frequently', 'Asked', 'Books']
,['People', 'Python', 'I', 'PythonIs', 'Python', 'Python']
,['PythonHighlevel', 'Created', 'Guido', 'Rossum', 'Pythons', 'WikipediaTyping', 'Duck', 'July', 'August', 'Guido', 'RossumOS', 'Linux', 'Windows', 'Vista', 'IDEsIDLEPyCharmMicrosoft', 'Visual', 'StudioSpyderEclipsePyDevPeople']
,['Welcome', 'PythonorgwwwpythonorgThe', 'Python', 'Programming', 'Language', 'Python', 'For', 'Beginners', 'Beginners', 'Guide', 'Python', 'Docs', 'Python', 'Books']
,['BeginnersGuide', 'Python', 'Wikiwikipythonorg', 'BeginnersGuide4', 'Jul', 'New', 'Python', 'This', 'Chinese']
,['Learn', 'Python', 'Codecademywwwcodecademycom', 'Python', 'By']
,['Python', 'Wikipediaenwikipediaorg', 'PythonprogramminglanguagePython', 'Created', 'Guido', 'Rossum', 'Pythons', 'History', 'Features', 'Syntax', 'Python', 'Developer', 'Python', 'Software', 'Foundation', 'Paradigm', 'Multiparadigm', 'Designed', 'Guido', 'Rossum', 'Typing', 'Duck']
,['Related']]
df = pd.DataFrame({
    'number': np.arange(10),
    'cap words': pd.Series(x)
})
# One row per word; 'number' is repeated for every word of its list.
df.explode('cap words').reset_index(drop=True)
Out:
number cap words
0 0 Ages
1 0 Online
2 0 Python
3 0 Coding
4 0 CoursesAdwwwcodetodaycoukLearn
.. ... ...
140 8 Guido
141 8 Rossum
142 8 Typing
143 8 Duck
144 9 Related
[145 rows x 2 columns]
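Once the lists are exploded, the words can be grouped by 'number' with ordinary pandas operations. The following is only a minimal sketch on a hypothetical two-row miniature of the frame above (the data and the names long_df, counts and words_for_0 are made up for illustration); it assumes the goal is to count or collect the words per number, which the question does not state explicitly.

```python
import pandas as pd

# Hypothetical two-row miniature of the question's frame (not the real data).
df = pd.DataFrame({
    "number": [0, 1],
    "cap words": [["Ages", "Online", "Python", "Python"], ["Become", "Python"]],
})

# One row per word; the original 'number' is repeated for each word.
long_df = df.explode("cap words").reset_index(drop=True)

# Count how often each word occurs within each 'number' group.
counts = long_df.groupby("number")["cap words"].value_counts()

# Collect the words that belong to a single group as a plain list.
words_for_0 = long_df.loc[long_df["number"] == 0, "cap words"].tolist()
```

Here counts is a Series indexed by (number, word), so counts[(0, "Python")] looks up how many times 'Python' appears in group 0.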