Modify row values in a new column based on a string value using a loop in Python

Question

I'd like to recode row values in a different column based on a string match in pandas, using a loop. I found a way to do it by creating an entirely new column each time, but that doesn't work when I need to modify selected rows in multiple columns at different points in the analysis.

Here is the solution I used, with an example dataframe:

import seaborn as sns

iris = sns.load_dataset('iris')
iris.head()
iris.species.value_counts()

pattern = ['setosa', 'virginica']
iris['new_column'] = 0
lis = []

for index, row in iris.iterrows():
  #print (row['species'])
  if any(ele in row.species for ele in pattern):
    lis.append('matched')
  else:
    lis.append("notmatched")

iris['new_column'] = lis

I know there may be other ways to do this, such as list comprehensions or lambda/apply methods, but I'm asking for a solution that uses loops. (I don't have the full dataset here, but it has some complications, and I believe a loop may be the most flexible approach.)

Any suggestions on how to use a loop and a string match to modify rows in a different column? Thank you, and let me know if I can make this question better!

Answer 1

One of the simplest loop solutions is to iterate over each value of the column iris['species'] and append to the list lis depending on the result of an in test:

pattern = ['setosa', 'virginica']
lis = []
for val in iris['species']:
  if val in pattern:
    lis.append('matched')
  else:
    lis.append("notmatched")

iris['new_column'] = lis
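
Note that val in pattern checks whether the whole value equals one of the list entries, whereas the question's ele in row.species checks for substrings. If substring matching is what is needed, the loop could look roughly like this. It is only a sketch: the small frame below is a made-up stand-in for the iris data, and its species values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the iris data; the values are made up for illustration.
df = pd.DataFrame({"species": ["setosa", "versicolor", "virginica-large", "Setosa"]})

pattern = ["setosa", "virginica"]
labels = []
for val in df["species"]:
    # Substring test: True if any pattern occurs anywhere inside the value.
    # The comparison is case-sensitive, so "Setosa" does not match "setosa".
    if any(p in val for p in pattern):
        labels.append("matched")
    else:
        labels.append("notmatched")

df["new_column"] = labels
print(df["new_column"].tolist())
```

With exact membership (val in pattern), "virginica-large" would be labelled notmatched instead.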

A vectorized pandas solution is also possible with numpy.where and Series.isin:

import numpy as np

iris['new_column'] = np.where(iris['species'].isin(pattern), 'matched', 'notmatched')
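
Since the question is about changing only selected rows of a column that already exists, a boolean mask with .loc may be closer to what is wanted than rebuilding the column. The sketch below assumes a tiny made-up frame instead of the iris data; the column name partial is hypothetical. It also shows a substring variant using Series.str.contains with an escaped regular expression.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical stand-in for the iris data.
df = pd.DataFrame({"species": ["setosa", "versicolor", "virginica"]})
pattern = ["setosa", "virginica"]

# Start from an existing column, then overwrite only the rows that match.
df["new_column"] = "notmatched"
exact = df["species"].isin(pattern)
df.loc[exact, "new_column"] = "matched"

# Substring variant: join the escaped patterns into one regex and
# test each value against it.
regex = "|".join(re.escape(p) for p in pattern)
partial = df["species"].str.contains(regex)
df["partial"] = np.where(partial, "matched", "notmatched")

print(df)
```

The same mask can be reused later in the analysis to update other columns for the same rows.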

Answer 2

I ended up finding an answer through a few different threads.

Here's how I did it:

import re

import seaborn as sns

iris = sns.load_dataset('iris')
iris.head()
print (iris.species.value_counts())

pattern = ['setosa', 'virginica']
iris['new_column'] = 'no match'  # string default, so the column never mixes int and str values

for index, row in iris.iterrows():
  match = re.match('|'.join(pattern), row.species)
  if match:
    iris.loc[index, "new_column"] = match.group(0)
  else:
    iris.loc[index, "new_column"] = 'no match'


print (iris.new_column.value_counts())

I imagine there's a more efficient way to do this, and I also have to specify the column, which isn't ideal. Feel free to comment!
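
One way to avoid hard-coding the column is to wrap the loop in a function that takes the column names as arguments. The following is only a sketch under some assumptions: recode_rows is a hypothetical helper name, the frame is a made-up stand-in for the iris data, and it uses re.search (match anywhere in the string) rather than re.match (match at the start only). It writes with DataFrame.at, which sets a single cell per call.

```python
import re

import pandas as pd


def recode_rows(df, source, target, patterns, default="no match"):
    """Write the matched text into `target` for rows whose `source` value matches.

    If `target` does not exist yet it is created and filled with `default`;
    if it already exists, rows without a match are left unchanged.
    """
    regex = re.compile("|".join(re.escape(p) for p in patterns))
    if target not in df.columns:
        df[target] = default  # string default keeps the column dtype consistent
    for idx, val in df[source].items():
        m = regex.search(str(val))
        if m:
            df.at[idx, target] = m.group(0)
    return df


# Hypothetical stand-in for the iris data.
df = pd.DataFrame({"species": ["setosa", "versicolor", "virginica"]})
recode_rows(df, "species", "new_column", ["setosa", "virginica"])
print(df["new_column"].tolist())
```

Because existing values are only overwritten where a pattern matches, the function can be called again later with a different source column or pattern list without resetting earlier results.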

2 answers

Answer 1
0 2020-07-08 06:17:55

Answer 2
0 2020-07-13 00:19:45