How do I manipulate a data frame so that I can access every element of the list inside a cell and group them by another column?

Question

This might be confusing, so here is a copy of the first 10 rows of the data frame.

number  cap words
0   ['Ages', 'Online', 'Python', 'Coding', 'CoursesAdwwwcodetodaycoukLearn', 'Python', 'Live', 'Taught', 'Experts', 'Making', 'Coding', 'Fun', 'Courses', 'Summer', 'Weekly', 'EthosSimple', 'Low', 'Cost', 'PricingFAQAccess', 'Free', 'Content']
1   ['Become', 'Python', 'Programmer', 'Study', 'Python', 'Online', 'FreeAdwwwpythoninstituteorgLearn', 'Python', 'Become', 'Python', 'Certified', 'Take', 'Your', 'Career', 'Next', 'Level', 'Kostenfreie', 'Lernplattform', 'Tausende', 'Studenten', 'Lass', 'Dich', 'Highlights', 'Offering', 'SelfStudy', 'Courses', 'Free', 'Courses', 'Available', 'Flexible', 'DeadlinesResources', 'Free', 'Education', 'Platform', 'Get', 'Certification', 'About']
2   ['Python', 'For', 'Beginners', 'Pythonorgwwwpythonorg', 'Python', 'Its', 'NonProgrammers', 'Python', 'Programmers', 'Python', 'Frequently', 'Asked', 'Books']
3   ['People', 'Python', 'I', 'PythonIs', 'Python', 'Python']
4   ['PythonHighlevel', 'Created', 'Guido', 'Rossum', 'Pythons', 'WikipediaTyping', 'Duck', 'July', 'August', 'Guido', 'RossumOS', 'Linux', 'Windows', 'Vista', 'IDEsIDLEPyCharmMicrosoft', 'Visual', 'StudioSpyderEclipsePyDevPeople']
5   ['Welcome', 'PythonorgwwwpythonorgThe', 'Python', 'Programming', 'Language', 'Python', 'For', 'Beginners', 'Beginners', 'Guide', 'Python', 'Docs', 'Python', 'Books']
6   ['BeginnersGuide', 'Python', 'Wikiwikipythonorg', 'BeginnersGuide4', 'Jul', 'New', 'Python', 'This', 'Chinese']
7   ['Learn', 'Python', 'Codecademywwwcodecademycom', 'Python', 'By']
8   ['Python', 'Wikipediaenwikipediaorg', 'PythonprogramminglanguagePython', 'Created', 'Guido', 'Rossum', 'Pythons', 'History', 'Features', 'Syntax', 'Python', 'Developer', 'Python', 'Software', 'Foundation', 'Paradigm', 'Multiparadigm', 'Designed', 'Guido', 'Rossum', 'Typing', 'Duck']
9   ['Related']

I am trying to unpack every word from the list inside each cell and group them all under their shared index number.

Like this:

    0
0   Ages
0   Online
0   Python
0   Coding
0   CoursesAdwwwcodetodaycoukLearn
0   Python
0   Live
0   Taught
0   Experts
0   Making
0   Coding
0   Fun
0   Courses
0   Summer
0   Weekly
0   EthosSimple
0   Low
0   Cost
0   PricingFAQAccess
0   Free
0   Content

Below that would follow the rows for 1, with the words 'Become', 'Python', 'Programmer', 'Study', 'Python', 'Online', etc.

I hope this is clear.

Thanks

Answer 1

You can use explode:
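As a quick illustration of what explode does before the full example, here is a minimal sketch on a toy frame. The frame `toy` and its two short lists are hypothetical stand-ins, not the data from the question.

```python
import pandas as pd

# Hypothetical two-row frame, much smaller than the one in the question.
toy = pd.DataFrame({
    "number": [0, 1],
    "cap words": [["Ages", "Online"], ["Become", "Python", "Programmer"]],
})

# explode() gives each list element its own row and repeats the other
# columns; the original index label is repeated as well.
exploded = toy.explode("cap words")
print(exploded)
```

Because the index labels are repeated, the answer's code calls reset_index(drop=True) afterwards to get a fresh 0..n-1 index.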

x = [['Ages', 'Online', 'Python', 'Coding', 'CoursesAdwwwcodetodaycoukLearn', 'Python', 'Live', 'Taught', 'Experts', 'Making', 'Coding', 'Fun', 'Courses', 'Summer', 'Weekly', 'EthosSimple', 'Low', 'Cost', 'PricingFAQAccess', 'Free', 'Content']
,['Become', 'Python', 'Programmer', 'Study', 'Python', 'Online', 'FreeAdwwwpythoninstituteorgLearn', 'Python', 'Become', 'Python', 'Certified', 'Take', 'Your', 'Career', 'Next', 'Level', 'Kostenfreie', 'Lernplattform', 'Tausende', 'Studenten', 'Lass', 'Dich', 'Highlights', 'Offering', 'SelfStudy', 'Courses', 'Free', 'Courses', 'Available', 'Flexible', 'DeadlinesResources', 'Free', 'Education', 'Platform', 'Get', 'Certification', 'About']
,['Python', 'For', 'Beginners', 'Pythonorgwwwpythonorg', 'Python', 'Its', 'NonProgrammers', 'Python', 'Programmers', 'Python', 'Frequently', 'Asked', 'Books']
,['People', 'Python', 'I', 'PythonIs', 'Python', 'Python']
,['PythonHighlevel', 'Created', 'Guido', 'Rossum', 'Pythons', 'WikipediaTyping', 'Duck', 'July', 'August', 'Guido', 'RossumOS', 'Linux', 'Windows', 'Vista', 'IDEsIDLEPyCharmMicrosoft', 'Visual', 'StudioSpyderEclipsePyDevPeople']
,['Welcome', 'PythonorgwwwpythonorgThe', 'Python', 'Programming', 'Language', 'Python', 'For', 'Beginners', 'Beginners', 'Guide', 'Python', 'Docs', 'Python', 'Books']
,['BeginnersGuide', 'Python', 'Wikiwikipythonorg', 'BeginnersGuide4', 'Jul', 'New', 'Python', 'This', 'Chinese']
,['Learn', 'Python', 'Codecademywwwcodecademycom', 'Python', 'By']
,['Python', 'Wikipediaenwikipediaorg', 'PythonprogramminglanguagePython', 'Created', 'Guido', 'Rossum', 'Pythons', 'History', 'Features', 'Syntax', 'Python', 'Developer', 'Python', 'Software', 'Foundation', 'Paradigm', 'Multiparadigm', 'Designed', 'Guido', 'Rossum', 'Typing', 'Duck']
,['Related']]

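import numpy as np
import pandas as pd

df = pd.DataFrame({
    'number': np.arange(10),
    'cap words': pd.Series(x)
})

# One row per word; 'number' is repeated for each word from the same list
df.explode('cap words').reset_index(drop=True)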

Out:

     number                       cap words
0         0                            Ages
1         0                          Online
2         0                          Python
3         0                          Coding
4         0  CoursesAdwwwcodetodaycoukLearn
..      ...                             ...
140       8                           Guido
141       8                          Rossum
142       8                          Typing
143       8                            Duck
144       9                         Related

[145 rows x 2 columns]
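The layout sketched in the question has the number as the row label rather than as a separate column. One possible way to get that shape, shown here as a sketch on a shortened, hypothetical stand-in for the frame, is to move `number` into the index before exploding. The names `words` and `words_per_number` are illustrative only.

```python
import pandas as pd

# Hypothetical shortened stand-in for the question's frame.
df = pd.DataFrame({
    "number": [0, 1, 2],
    "cap words": [
        ["Ages", "Online", "Python"],
        ["Become", "Python"],
        ["Python", "For", "Beginners", "Python"],
    ],
})

# Use 'number' as the index, then explode the Series of lists:
# every word ends up on its own row, labelled with its number.
words = df.set_index("number")["cap words"].explode()
print(words)

# All words that share a number can be looked up by that label.
print(words.loc[0].tolist())

# Once unpacked, ordinary groupby operations apply,
# for example counting the words under each number.
words_per_number = words.groupby(level="number").size()
print(words_per_number)
```

Whether to keep `number` as a column (as in the answer above) or as the index is mostly a matter of what you do next; the index form makes label lookups such as `words.loc[0]` convenient.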

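One caveat, which is an assumption about the question's data rather than something it states: if the frame was loaded from a CSV file, each cell may hold the text of a list rather than an actual list, and explode leaves plain strings unchanged. In that case the strings can be parsed first, for example with `ast.literal_eval` from the standard library. The frame below is hypothetical.

```python
import ast

import pandas as pd

# Hypothetical frame where each cell is the *text* of a list,
# as can happen after a round trip through CSV.
df = pd.DataFrame({
    "number": [0, 1],
    "cap words": ["['Ages', 'Online']", "['Related']"],
})

# Turn each string into a real Python list before exploding.
df["cap words"] = df["cap words"].apply(ast.literal_eval)

result = df.explode("cap words").reset_index(drop=True)
print(result)
```

If the cells already contain real lists, this parsing step is unnecessary and should be skipped, since `ast.literal_eval` expects a string.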
Answer 1 (2020-09-26 15:16:58)
