How to create a new pandas dataframe column from files containing json?

Question

I have a dataset with metadata about voice calls.

It looks like this:

Data columns (total 4 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   phone        100 non-null    string
 1   group_id     100 non-null    int64 
 2   question_id  100 non-null    int64 
 3   result       100 non-null    bool

I want to create column #4 that will contain some dialogue statistics, e.g. total_words (int64). The data must be taken from an external json file containing speech-to-text recognition results.

Is there any built-in pandas way to do that?

I've tested pandas.read_json but I'm getting errors (ValueError: DataFrame constructor not properly called! and TypeError: argument of type 'method' is not iterable).

I'm looking for something like:

df['total_words'] = pd.read_json('file://localhost:8888/auido/' + df['phone'] + '.mp3.wstat.json')

I would be glad if someone could provide a working code example that solves a similar issue.

UPD: the content of each json file looks like {"total_words": 74}

Answer 1

Try this:

df['total_words'] = df['phone'].apply(lambda x: pd.read_json('file://localhost:8888/auido/' + x + '.mp3.wstat.json'))

Now each cell of the total_words column contains another dataframe, which you can access using:

# for the first row
df.iloc[0]["total_words"].head()
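If a plain integer column is wanted rather than a dataframe in every cell, one possible approach is to read each file with the standard json module and pull out the total_words value. The following is only a minimal sketch under some assumptions: the files are reachable as local paths (not through the file://localhost:8888 URL from the question), each file has the {"total_words": 74} shape from the UPD, and the names load_total_words and stats_dir as well as the sample phone values are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_total_words(phone: str, stats_dir: Path) -> int:
    """Read <phone>.mp3.wstat.json from stats_dir and return its total_words value.

    Hypothetical helper: assumes the file exists and has a "total_words" key.
    """
    path = stats_dir / f"{phone}.mp3.wstat.json"
    with path.open(encoding="utf-8") as fh:
        return int(json.load(fh)["total_words"])


# Demo with throwaway files; in practice stats_dir would point at the real folder.
with tempfile.TemporaryDirectory() as tmp:
    stats_dir = Path(tmp)
    sample = {"5550001": 74, "5550002": 12}  # made-up phones and word counts
    for phone, words in sample.items():
        stats_file = stats_dir / f"{phone}.mp3.wstat.json"
        stats_file.write_text(json.dumps({"total_words": words}), encoding="utf-8")

    df = pd.DataFrame({"phone": ["5550001", "5550002"], "result": [True, False]})
    # One scalar per row, so the new column ends up as an integer column.
    df["total_words"] = df["phone"].map(lambda p: load_total_words(p, stats_dir))

print(df)
```

Since a file is opened for every row, this is fine for the 100 rows in the question; a missing file would raise FileNotFoundError, so you may want to handle that case inside the helper.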


Solution 1
0 2022-12-27 18:51:22
