How to create a new pandas DataFrame column from a file that contains JSON?
I have a dataset with metadata about voice calls.
It looks like this:
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 phone 100 non-null string
1 group_id 100 non-null int64
2 question_id 100 non-null int64
3 result 100 non-null bool
I want to create column #4 that will contain some dialogue statistics, e.g. total_words (int64). The data must be taken from an external JSON file containing speech-to-text recognition results.
Is there a built-in pandas way to do that?
I've tested with pandas.read_json but I'm getting errors (ValueError: DataFrame constructor not properly called! and TypeError: argument of type 'method' is not iterable).
I'm looking for something like:
df['total_words'] = pd.read_json('file://localhost:8888/auido/' + df['phone'] + '.mp3.wstat.json')
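For reference, the argument built in the line above is not a single path. Adding strings to a Series is element-wise, so the result is a Series holding one path per row, which is why the file has to be read row by row. A minimal sketch, assuming made-up phone values and a hypothetical `/audio/` prefix:

```python
import pandas as pd

# Hypothetical sample data; the phone values are made up for illustration.
df = pd.DataFrame({"phone": ["15550001", "15550002"]})

# String + Series concatenation is element-wise, so `paths` is a Series
# of path strings (one per row), not one path that read_json could open.
paths = "/audio/" + df["phone"] + ".mp3.wstat.json"

print(type(paths).__name__)
print(paths.tolist())
```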
I would be glad if someone could provide a working code example that solves a similar issue.
UPD: the content of the JSON file looks like {"total_words": 74}
Try this:
df['total_words']=df['phone'].apply(lambda x: pd.read_json('file://localhost:8888/auido/' + x + '.mp3.wstat.json'))
Now each cell of the total_words column contains another DataFrame. You can access it using:
# for the first row
df.iloc[0]["total_words"].head()
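If the goal is a plain integer column rather than a nested object in each cell, one alternative is to parse each file with the standard json module and keep only the total_words value. The following is a sketch under assumptions, not a definitive implementation: it assumes the files sit in a local directory (here a temporary `stats_dir`), the helper name `read_total_words` is hypothetical, and the phone values and word counts are made up.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def read_total_words(phone, stats_dir):
    """Hypothetical helper: return total_words from
    <stats_dir>/<phone>.mp3.wstat.json, or None if the file is missing."""
    path = Path(stats_dir) / f"{phone}.mp3.wstat.json"
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh).get("total_words")


# Demo with made-up data written to a temporary directory.
with tempfile.TemporaryDirectory() as stats_dir:
    df = pd.DataFrame({"phone": ["15550001", "15550002"]})
    for phone, words in zip(df["phone"], [74, 120]):
        Path(stats_dir, f"{phone}.mp3.wstat.json").write_text(
            json.dumps({"total_words": words}), encoding="utf-8"
        )

    # One file is read per row. The nullable Int64 dtype keeps the
    # values as integers even when some files are missing.
    df["total_words"] = (
        df["phone"]
        .apply(lambda phone: read_total_words(phone, stats_dir))
        .astype("Int64")
    )

print(df)
```

Reading the files one at a time is simple but slow for large frames. If that becomes a problem, loading all statistics into a dict once and using `df['phone'].map(...)` is a common variation.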