How do I combine the lists in a DataFrame column into a single list?

Question

Some context: I'm doing text analysis on some data. I have just tokenized it, and I want to combine all the lists in a DataFrame column for further processing.

My df looks like this:

import pandas as pd
df = pd.DataFrame({'title': ['issue regarding app', 'graphics should be better'], 'text': [["'app'", "'load'", "'slowly'"], ["'interface'", "'need'", "'to'", "'look'", "'nicer'"]]})

I want to merge all the lists in the 'text' column into one list, and also remove the opening and closing single quotes from each token.

Something like this:

lst = ['app', 'load', 'slowly', 'interface', 'need', 'to', 'look', 'nicer']

Thank you for all your help!

Answer 1

You can do that with apply and a lambda.

apply runs a function on each element of the 'text' column (here, stripping the quotes from every token), while sum with an empty list as its start value concatenates all the resulting lists:

lst = sum(df["text"].apply(lambda x: [i.replace("'", "") for i in x]), [])

Output:

['app', 'load', 'slowly', 'interface', 'need', 'to', 'look', 'nicer']

If you want to remove several characters at once, such as "'" and "a", translate is more efficient than chaining replace calls:

trans = str.maketrans("", "", "'a")
lst = sum(df["text"].apply(lambda x: [i.translate(trans) for i in x]), [])
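
To make the effect of the translation table concrete, here is a minimal, self-contained sketch using the df from the question. Note that removing "a" is only an illustration: it deletes every "a" inside the tokens as well, which is probably not what you want on real data.

```python
import pandas as pd

df = pd.DataFrame({
    'title': ['issue regarding app', 'graphics should be better'],
    'text': [["'app'", "'load'", "'slowly'"],
             ["'interface'", "'need'", "'to'", "'look'", "'nicer'"]],
})

# The third argument of str.maketrans lists the characters to delete.
trans = str.maketrans("", "", "'a")

# Clean each token, then concatenate the per-row lists with sum(..., []).
lst = sum(df["text"].apply(lambda x: [i.translate(trans) for i in x]), [])
print(lst)
# ['pp', 'lod', 'slowly', 'interfce', 'need', 'to', 'look', 'nicer']
```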

Answer 2

Use a simple list comprehension:

out = [x.strip("'") for l in df['text'] for x in l]

Output:

['app', 'load', 'slowly', 'interface', 'need', 'to', 'look', 'nicer']
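
One difference from the replace approach is worth keeping in mind: strip("'") only removes quotes at the start and end of a token, while replace("'", "") removes every quote. A small sketch with a made-up token (not from the question's data) that contains an inner apostrophe:

```python
# Hypothetical tokens, chosen only to show the difference.
tokens = ["'don't'", "'app'"]

# strip removes only leading/trailing quotes, so the inner apostrophe survives.
stripped = [t.strip("'") for t in tokens]

# replace removes every quote, including the one inside the word.
replaced = [t.replace("'", "") for t in tokens]

print(stripped)  # ["don't", 'app']
print(replaced)  # ['dont', 'app']
```

For the data in the question both approaches give the same result.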

Answer 3

We can also iterate through each list in the Series, collect the pieces with append(), and then use concat() to join them before converting the result to a list. This yields the same output as above.
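
The code for this answer is not shown, so the following is only one possible reading of it, not necessarily what the author wrote. It assumes append() means list.append on a plain Python list of Series objects (Series.append was removed in pandas 2.0) and that concat() means pd.concat. The name pieces is made up for the sketch.

```python
import pandas as pd

df = pd.DataFrame({
    'title': ['issue regarding app', 'graphics should be better'],
    'text': [["'app'", "'load'", "'slowly'"],
             ["'interface'", "'need'", "'to'", "'look'", "'nicer'"]],
})

# Collect one Series per row; "pieces" is a hypothetical helper name.
pieces = []
for tokens in df['text']:
    pieces.append(pd.Series(tokens))

# Join the pieces into one Series, strip the quotes, and convert to a list.
combined = pd.concat(pieces, ignore_index=True)
lst = combined.str.strip("'").tolist()
print(lst)
# ['app', 'load', 'slowly', 'interface', 'need', 'to', 'look', 'nicer']
```

This is more verbose than the list comprehension in Answer 2 and builds intermediate Series objects, so it is mainly useful if you want to keep working with a Series afterwards.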
