Compare each string element in a dataframe to a list and assign it to a column, python pandas

How can I rearrange my dataframe according to column names while searching for specific strings in cells?

My dataframe:

0              1                2             3    4
apple pie      banana bread     orange juice  nan  nan
apple cookies  orange lemonade  nan           nan  nan
banana muffin  orange ice       berry candy   nan  nan
berry juice    nan              nan           nan  nan

I want to arrange the rows according to a list of column names, each of which looks for a specific string of text.

apple          banana         orange           berry        lemon
apple pie      banana bread   orange juice     nan          nan
apple cookies  nan            orange lemonade  nan          nan
nan            banana muffin  orange ice       berry candy  nan
nan            nan            nan              berry juice  nan

I have tried to create a column/list for each fruit, searching for the right string and adding the cell if it matches. However, I do not know how to iterate through the dataframe and assign values; I just get a column of NaNs.

col_names = ['apple', 'banana', 'orange', 'berry', 'lemonade']
apples = np.where(df_fruits.str.contains("apple", case=False, na=False), df_fruits, np.nan)
bananas = np.where(df_fruits.str.contains("banana", case=False, na=False), df_fruits, np.nan)
etc...

Edit: I got the dataframe from a CSV file, so the original data format is rows of strings: "apple pie, banana bread, orange juice, nan, nan", etc.
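For anyone who wants to reproduce this, here is a minimal sketch of how rows like the ones described in the edit could be loaded. The inline text stands in for the real CSV file, which is not shown in the question, so the exact file layout (no header row, ", " as separator) is an assumption.

```python
import io

import pandas as pd

# Assumed stand-in for the CSV file: no header row, fields separated by ", "
raw = """apple pie, banana bread, orange juice, nan, nan
apple cookies, orange lemonade, nan, nan, nan
banana muffin, orange ice, berry candy, nan, nan
berry juice, nan, nan, nan, nan
"""

# skipinitialspace drops the blank after each comma, so the literal text
# "nan" is recognised as a missing value by read_csv's default NA handling
df_fruits = pd.read_csv(io.StringIO(raw), header=None, skipinitialspace=True)

print(df_fruits)
```

With a real file, the `io.StringIO(raw)` argument would be replaced by the file path.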

We can do some reshaping using .stack, .str.extractall and .unstack:

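# Build one regex alternation from the target column names
pat = '|'.join(col_names)

# Stack into a Series indexed by (row, original column)
s = df.stack()

# For each cell, collect every name it matches into a list (stored in column 0)
s1 = s.to_frame('vals').join(
      s.str.extractall(f'({pat})').groupby(level=[0,1]).agg(list))

# One row per (cell, matched name) pair, then pivot the matched names into columns
out = s1.explode(0).set_index(0,append=True).reset_index(1,drop=True).unstack(-1)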

print(out)

            vals
0          apple         banana        berry         lemonade           orange
0      apple pie   banana bread          NaN              NaN     orange juice
1  apple cookies            NaN          NaN  orange lemonade  orange lemonade
2            NaN  banana muffin  berry candy              NaN       orange ice
3            NaN            NaN  berry juice              NaN              NaN

# Optionally drop the outer 'vals' level of the column MultiIndex.
# That level is unnamed, so it is selected by its name, None.
out.columns = out.columns.droplevel(None)

0          apple         banana        berry         lemonade           orange
0      apple pie   banana bread          NaN              NaN     orange juice
1  apple cookies            NaN          NaN  orange lemonade  orange lemonade
2            NaN  banana muffin  berry candy              NaN       orange ice
3            NaN            NaN  berry juice              NaN              NaN
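To see why "orange lemonade" ends up under both `lemonade` and `orange` in the output above, it may help to look at what `.str.extractall` returns on its own. This is only a small illustration on two made-up sample cells, not part of the answer's code; the names `matches` and `per_cell` are chosen here for the example.

```python
import pandas as pd

col_names = ['apple', 'banana', 'orange', 'berry', 'lemonade']
pat = '|'.join(col_names)

# Two sample cells; the second one contains two of the search terms
s = pd.Series(['apple pie', 'orange lemonade'])

# One output row per match; the index is (original index, match number)
matches = s.str.extractall(f'({pat})')
print(matches)

# Collect the matches per cell, similar to the groupby(...).agg(list) step
per_cell = matches.groupby(level=0)[0].agg(list)
print(per_cell.tolist())
```

Because the second cell matches two names, exploding the list later produces two rows for it, one per column it belongs to.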

Try this:

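import pandas as pd

# Flatten every cell of the dataframe into one list, row by row
list_values = [item for row in df_fruits.values for item in row]
list_series = []
for col in col_names:
    # str * True keeps the text, str * False gives an empty string
    series = pd.Series([x * (col in str(x)) for x in list_values], name=col)
    list_series.append(series)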

The first line collects all the values of your dataframe into a single list. Next, we create a pandas Series for every fruit type and append it to a list. Finally, we create a new dataframe from those Series:

new_df=pd.concat(list_series,axis=1)
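Note that flattening every cell into one list gives one long column per fruit (one entry per cell) rather than one row per original row. If the original rows should be kept, a plain loop is one option. The sketch below is an alternative, not the code from this answer: `arrange_by_keyword` is a hypothetical helper, the sample frame is rebuilt from the question, and when several cells in a row match the same name only the first is kept (an assumption the question does not address).

```python
import numpy as np
import pandas as pd

col_names = ['apple', 'banana', 'orange', 'berry', 'lemonade']

# Sample data rebuilt from the question
df_fruits = pd.DataFrame([
    ['apple pie', 'banana bread', 'orange juice', np.nan, np.nan],
    ['apple cookies', 'orange lemonade', np.nan, np.nan, np.nan],
    ['banana muffin', 'orange ice', 'berry candy', np.nan, np.nan],
    ['berry juice', np.nan, np.nan, np.nan, np.nan],
])


def arrange_by_keyword(df, keywords):
    """Hypothetical helper: per row, place the first cell containing each keyword."""
    rows = []
    for _, row in df.iterrows():
        # Ignore missing values; only real strings can contain a keyword
        cells = [c for c in row if isinstance(c, str)]
        rows.append({
            key: next((c for c in cells if key.lower() in c.lower()), np.nan)
            for key in keywords
        })
    return pd.DataFrame(rows, columns=keywords, index=df.index)


arranged = arrange_by_keyword(df_fruits, col_names)
print(arranged)
```

This keeps the columns in the order of `col_names` and, like the regex approach, puts "orange lemonade" under both `orange` and `lemonade`.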
