Indexing a column in a Pandas DataFrame returns NaN

Question

I am running into a problem when trying to index my dataframe. As shown in the attached picture, I have a column in the dataframe called 'identifiers' that contains a lot of redundant information ({'print_isbn_canonical': '). I only want the ISBN that comes after it.

    #Option 1 I tried
    testdf2 = testdf2[testdf2['identifiers'].str[26:39]]
    
    #Option 2 I tried
    testdf2['identifiers_test'] = testdf2['identifiers'].str.replace("{'print_isbn_canonical': '","")

Unfortunately, both of these options turn the dataframe column into a column containing only NaN values.

Please help out! I have tried several things and cannot seem to find the solution. Thank you all in advance!

Example image of the dataframe

Answer 1

If the contents of your identifiers column are real dicts (for example, parsed JSON), you can use the string accessor str[] to look up the dict value by key, as follows:

testdf2['identifiers_test'] = testdf2['identifiers'].str['print_isbn_canonical']

Demo

import pandas as pd

data = {'identifiers': [{'print_isbn_canonical': '9780721682167', 'eis': '1234'}]}
df = pd.DataFrame(data)

df['isbn'] = df['identifiers'].str['print_isbn_canonical']

print(df)

                                                identifiers           isbn
0  {'print_isbn_canonical': '9780721682167', 'eis': '1234'}  9780721682167
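A likely reason the attempts in the question produce NaN is that the cells hold dict objects rather than strings, since pandas string methods return NaN for non-string elements. The sketch below is one way to check this; the sample data is copied from the demo above, and the names cell_types, replaced and isbn are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"identifiers": [{"print_isbn_canonical": "9780721682167", "eis": "1234"}]}
)

# Collect the Python types actually stored in the column.
cell_types = set(df["identifiers"].map(type))
print(cell_types)

# A string method such as .str.replace cannot operate on dict cells,
# so those cells come back as NaN.
replaced = df["identifiers"].str.replace(
    "{'print_isbn_canonical': '", "", regex=False
)
print(replaced.isna().all())

# Key lookup through .str[...] does work on dict cells.
isbn = df["identifiers"].str["print_isbn_canonical"]
print(isbn.iloc[0])
```

If the type check reports str instead of dict, the cells are text and need a string-based approach instead.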

Answer 2

Try this out:

testdf2['new_column'] = testdf2.apply(lambda r: r.identifiers[26:39], axis=1)

Here I assume that the identifiers column is of string type.
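Under the same assumption that the cells are strings, here is a minimal sketch of two alternatives. The first assigns the vectorized slice to a new column instead of using it as an indexer, and the second parses each string with ast.literal_eval so that the result does not depend on fixed character offsets. The sample string and the column names isbn_slice and isbn_parsed are illustrative, not taken from the question's data.

```python
import ast

import pandas as pd

# Assumption: each cell is the text representation of a dict.
df = pd.DataFrame(
    {"identifiers": ["{'print_isbn_canonical': '9780721682167', 'eis': '1234'}"]}
)

# Fixed-offset slice, assigned to a new column. This only works when
# every cell starts with exactly the same 26-character prefix.
df["isbn_slice"] = df["identifiers"].str[26:39]

# Parse each string into a dict, then look the value up by key.
df["isbn_parsed"] = [
    ast.literal_eval(text)["print_isbn_canonical"] for text in df["identifiers"]
]

print(df[["isbn_slice", "isbn_parsed"]])
```

The parsing variant is slower on large frames but tolerates keys appearing in a different order or ISBNs of a different length.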

Answer 1
1, accepted, 2021-06-30 10:32:59

Answer 2
0, 2021-06-30 09:43:35
