Replace numbers in a list-valued object column in pandas

Question

I have a dataframe df that looks as follows:

id cited_ids        dummy_paper   d      
2  [4]                  NaN        NaN 
4  [9,18,6]             NaN        NaN
6  []                   9          0
7  [2]                  NaN        NaN
9  [4]                   7        0
14 [18,6]                3        0
18 [7]                   1        0

What I would like to do is: (i) in df['cited_ids'], replace a cited id with 0 whenever the row for that id has d=0, and (ii) set d=1 for a row if its df['cited_ids'] list contains at least one 0 and its previous d was not 0. In other words, the first step (i) would result in:

id cited_ids        dummy_paper   d      
2  [4]                  NaN       NaN 
4  [0,0,6]             NaN        NaN
6  []                   9         0
7  [2]                  NaN       NaN
9  [4]                   7        0
14 [0,6]                 3        0
18 [0]                   1        0

The second step (ii) would then result in:

id cited_ids        dummy_paper   d      
2  [4]                  NaN       NaN 
4  [0,0,6]             NaN        1
6  []                   9         0
7  [2]                  NaN       NaN
9  [4]                   7        0
14 [0,6]                 3        0
18 [0]                   1        0

Please also note that df['cited_ids'] has dtype object (each cell holds a Python list).

df.to_dict() gives:

{'docdb': {0: 2, 1: 4, 2: 6, 3: 7, 4: 9, 5: 14, 6: 18},
 'cited_docdb': {0: [4],
  1: [9, 18, 6],
  2: [],
  3: [2],
  4: [4],
  5: [18, 6],
  6: [7]},
 'fronteer': {0: nan, 1: nan, 2: 9.0, 3: nan, 4: 7.0, 5: 3.0, 6: 1.0},
 'distance': {0: nan, 1: nan, 2: 0.0, 3: nan, 4: 0.0, 5: 0.0, 6: 0.0}}
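For anyone who wants to reproduce the frame, here is a minimal sketch that rebuilds it from the dictionary above. Note that the dictionary uses different column names than the tables; the mapping docdb -> id, cited_docdb -> cited_ids, fronteer -> dummy_paper, distance -> d is an assumption inferred from the matching values, not something stated in the question.

```python
import numpy as np
import pandas as pd

nan = np.nan  # the printed dict uses a bare `nan`

data = {
    'docdb': {0: 2, 1: 4, 2: 6, 3: 7, 4: 9, 5: 14, 6: 18},
    'cited_docdb': {0: [4], 1: [9, 18, 6], 2: [], 3: [2],
                    4: [4], 5: [18, 6], 6: [7]},
    'fronteer': {0: nan, 1: nan, 2: 9.0, 3: nan, 4: 7.0, 5: 3.0, 6: 1.0},
    'distance': {0: nan, 1: nan, 2: 0.0, 3: nan, 4: 0.0, 5: 0.0, 6: 0.0},
}

# Assumed mapping between the dict's column names and the names in the tables
df = pd.DataFrame(data).rename(columns={
    'docdb': 'id',
    'cited_docdb': 'cited_ids',
    'fronteer': 'dummy_paper',
    'distance': 'd',
})

print(df.dtypes)  # cited_ids shows up as object because each cell is a list
```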

Thank you

Answer 1

The exact logic is unclear and your output doesn't seem to match the description, but IIUC:

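# map each id that has a known d to that value (here ids 6, 9, 14, 18 -> 0)
s = df.set_index('id')['d'].dropna().convert_dtypes()

# (i) replace every cited id found in s with its d value, keep the others
df['cited_ids'] = [[s.get(i, i) for i in x]
                   for x in df['cited_ids']]

# (ii) boolean Series: does the list now contain a 0?
m = df['cited_ids'].apply(lambda x: 0 in x)

# set d = 1 only where a 0 is present and d is still missing
df.loc[m & df['d'].isna(), 'd'] = 1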

Output:

   id  cited_ids  dummy_paper    d
0   2        [4]          NaN  NaN
1   4  [0, 0, 0]          NaN  1.0
2   6         []          9.0  0.0
3   7        [2]          NaN  NaN
4   9        [4]          7.0  0.0
5  14     [0, 0]          3.0  0.0
6  18        [7]          1.0  0.0
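To make the result easy to check, here is a self-contained sketch of the same two steps wrapped in a function. The function name `apply_zero_rules` is made up for illustration, and the lookup uses a plain dict instead of `Series.get`, which assumes that the non-missing values of d are integer-valued (true for the sample data). It follows this answer's reading of the rules, not necessarily the result the question expects.

```python
import numpy as np
import pandas as pd

def apply_zero_rules(df):
    """Hypothetical helper: returns a modified copy, leaves df untouched."""
    out = df.copy()
    # id -> d for rows where d is known (assumes integer-valued d)
    known = out.set_index('id')['d'].dropna().astype(int).to_dict()
    # (i) swap each cited id for its known d, keep ids without a known d
    out['cited_ids'] = [[known.get(i, i) for i in ids]
                        for ids in out['cited_ids']]
    # (ii) d = 1 where the list contains a 0 and d is still missing
    has_zero = out['cited_ids'].apply(lambda ids: 0 in ids)
    out.loc[has_zero & out['d'].isna(), 'd'] = 1
    return out

df = pd.DataFrame({
    'id': [2, 4, 6, 7, 9, 14, 18],
    'cited_ids': [[4], [9, 18, 6], [], [2], [4], [18, 6], [7]],
    'dummy_paper': [np.nan, np.nan, 9, np.nan, 7, 3, 1],
    'd': [np.nan, np.nan, 0, np.nan, 0, 0, 0],
})

result = apply_zero_rules(df)
print(result)
```

Only the row with id 4 ends up with d = 1, because it is the only row that both cites an id with d = 0 and has a missing d of its own.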
