How to write values to another column of a dataframe based on the Row_id column and the values present in the Matches column?

Question

I have a dataframe with ROW_ID and Matches columns. Based on the value in each row of the Matches column, I need to write to a Result column. For example, the first row contains "; ALL MATCH -3", so in the new Result column "; ALL MATCH" should appear at ROW_ID 3. At ROW_ID 8 we have "; ALL MATCH -9; Diff in# -10", so in the Result column "; ALL MATCH" should appear at ROW_ID 9 and "; Diff in#" should appear at ROW_ID 10.

ROW_ID	Matches
1	; ALL MATCH -3
2
3
4
5	; ALL MATCH -6
6
7
8	; ALL MATCH -9; Diff in# -10
9
10

That means the final dataframe should look like this.

ROW_ID	Result
1
2
3	; ALL MATCH
4
5
6	; ALL MATCH
7
8
9	; ALL MATCH
10	; Diff in#

I tried a lot: using dataframe.iterrows(), I extracted the integer value and the other parts separately for each row. But I am not able to write that value to a particular position. The df.at[] method did not work. I also tried loc and iloc, but could not figure out how to write the string to a particular row of that column.

Answer 1

Try:

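# Extract (label, target ROW_ID) pairs, index the labels by target ROW_ID,
# then look up each ROW_ID in that mapping.
df['Result'] = df['ROW_ID'].map(
    df['Matches'].str.extractall(r'(; [^-]+) -(\d+)')
                 .astype({1: int}).set_index(1).squeeze()
).fillna('')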

Output:

>>> df
   ROW_ID                       Matches       Result
0       1                ; ALL MATCH -3             
1       2                                           
2       3                                ; ALL MATCH
3       4                                           
4       5                ; ALL MATCH -6             
5       6                                ; ALL MATCH
6       7                                           
7       8  ; ALL MATCH -9; Diff in# -10             
8       9                                ; ALL MATCH
9      10                                 ; Diff in#

# Details about extractall
>>> df['Matches'].str.extractall(r'(; [^-]+) -(\d+)')
                   0   1
  match                 
0 0      ; ALL MATCH   3
4 0      ; ALL MATCH   6
7 0      ; ALL MATCH   9
  1       ; Diff in#  10
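
For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is reconstructed from the question, with empty strings standing in for the blank cells, and the names pairs and lookup are only illustrative. It selects column 0 explicitly instead of calling squeeze(), which should behave the same here.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: blank cells are empty strings).
df = pd.DataFrame({
    "ROW_ID": range(1, 11),
    "Matches": ["; ALL MATCH -3", "", "", "", "; ALL MATCH -6", "", "",
                "; ALL MATCH -9; Diff in# -10", "", ""],
})

# Group 0 captures the label, group 1 captures the target ROW_ID.
pairs = df["Matches"].str.extractall(r"(; [^-]+) -(\d+)")

# Build a lookup Series: target ROW_ID (as int) -> label.
lookup = pairs.astype({1: int}).set_index(1)[0]

# Each ROW_ID picks up its label if one targets it, otherwise an empty string.
df["Result"] = df["ROW_ID"].map(lookup).fillna("")
print(df[["ROW_ID", "Result"]])
```

One caveat: map needs the lookup index to be unique, so this sketch assumes no two entries in Matches point at the same ROW_ID.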

Answer 2

Create a temporary DataFrame as:

wrk = df.Matches.str.extractall(r'(?P<Result>;\D+)-(?P<id>\d+)')

Then strip the trailing spaces from the Result column:

wrk.Result = wrk.Result.str.strip()

The next step is to change the type of the id column to int, since so far it is of object type (actually a string):

wrk.id = wrk.id.astype('int64')

and set it as the index:

wrk.set_index('id', inplace=True)

Now wrk is a single-column DataFrame indexed by id, containing:

         Result
id             
3   ; ALL MATCH
6   ; ALL MATCH
9   ; ALL MATCH
10   ; Diff in#
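
The steps so far can also be chained into a single expression. This is only a sketch on a shortened, made-up sample Series named matches, to show what the named groups produce. Note that set_index returns a DataFrame with one column; selecting wrk["Result"] gives a Series if one is needed.

```python
import pandas as pd

# Shortened sample (assumption: three rows are enough to show the pattern).
matches = pd.Series(["; ALL MATCH -3", "", "; ALL MATCH -9; Diff in# -10"])

wrk = (
    matches.str.extractall(r"(?P<Result>;\D+)-(?P<id>\d+)")
    .assign(
        Result=lambda d: d["Result"].str.strip(),   # drop the trailing space
        id=lambda d: d["id"].astype("int64"),       # string -> integer
    )
    .set_index("id")
)

print(type(wrk).__name__)
print(wrk["Result"].to_dict())
```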

Then, to generate the result, run:

res = df.merge(wrk, how='left', left_on='ROW_ID', right_index=True)

The result is:

   ROW_ID                       Matches       Result
0       1                ; ALL MATCH -3          NaN
1       2                           NaN          NaN
2       3                           NaN  ; ALL MATCH
3       4                           NaN          NaN
4       5                ; ALL MATCH -6          NaN
5       6                           NaN  ; ALL MATCH
6       7                           NaN          NaN
7       8  ; ALL MATCH -9; Diff in# -10          NaN
8       9                           NaN  ; ALL MATCH
9      10                           NaN   ; Diff in#

If you don't want NaN in the unfilled fields, append .fillna('') to the last instruction.
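
Keep in mind that calling .fillna('') on the whole merged frame also replaces the NaN values in Matches. If only Result should be filled, a sketch like the following may be closer to what you want. The sample frame is reconstructed from the question, with None standing in for the blank cells.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: blank cells are missing values).
df = pd.DataFrame({
    "ROW_ID": range(1, 11),
    "Matches": ["; ALL MATCH -3", None, None, None, "; ALL MATCH -6", None, None,
                "; ALL MATCH -9; Diff in# -10", None, None],
})

# Build the temporary frame as described above.
wrk = df["Matches"].str.extractall(r"(?P<Result>;\D+)-(?P<id>\d+)")
wrk["Result"] = wrk["Result"].str.strip()
wrk["id"] = wrk["id"].astype("int64")
wrk = wrk.set_index("id")

# Left-join on ROW_ID, then fill only the Result column.
res = df.merge(wrk, how="left", left_on="ROW_ID", right_index=True)
res["Result"] = res["Result"].fillna("")
print(res)
```

This leaves the missing values in Matches untouched, so they can still be detected with isna() later.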
