Using df.apply on a list of strings to create a new column

Question

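I am very new to programming in Python and have a question I hope some of you can help with. I am working with a JSON file that contains many different kinds of data, which I have loaded into a dataframe. The JSON file has 27 keys, one of which is named ratings. Under the ratings key there are several entries, including a list of strings called "review". I need to pull the review list out as a new column in the dataframe so that I can do some analysis on it. I have tried the following: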

df['review'] = df.ratings.apply(lambda x: x['review'])

df['review'] = df.ratings.apply(lambda row: (row['review'])

df['review'] = df['ratings'].apply(lambda x: re.sub('#\d+', '', x['review']))

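None of these work. Can any of you see where the error is, or do you know a better way to do this?

Thank you very much for your help and time :)

A sample of the JSON file: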

{"ratings": [{"date": "2018-09-04", 
"name": "MK", 
"url": "", 
"title": "employee", 
"photo": ".jpg", "team": 5, 
"vision": 5, 
"product": 5, 
"profile": 0, 
"review": "The team is building the product with full force and constantly updates the community, if the team can acquire more users it will be a superb blockchain-based product", 
"weight": "22%", 
"agree": 0}]}

Answer 1

I modified your sample JSON to make it valid and to demonstrate the approach:
simple use of explode() on the list
apply(pd.Series) to expand the embedded dicts

import pandas as pd

js = {"movie":"best movie", "ratings": [{"date": "2017-09-29", "name": "Gerard Paul S. Netro",
"url": "gerard-paul-s-netro", 
"title": "Blockchain Entrepreneur & ICO Expert", 
"photo": "https://icobench.com/images/users/gerard-paul-s-netro-1506943906.jpg", 
"team": 4, "vision": 5, "product": 5, "profile": 0, 
"review": "", "weight": "3%", "agree": 0}]}

df = pd.json_normalize(js)
# explode the list
df = df.explode("ratings")
# expand embedded dict
df.join(df.ratings.apply(pd.Series))

output

	movie	ratings	date	name	url	title	photo	team	vision	product	profile	review	weight	agree
0	best movie	{'date': '2017-09-29', 'name': 'Gerard Paul S. Netro', 'url': 'gerard-paul-s-netro', 'title': 'Blockchain Entrepreneur & ICO Expert', 'photo': 'https://icobench.com/images/users/gerard-paul-s-netro-1506943906.jpg', 'team': 4, 'vision': 5, 'product': 5, 'profile': 0, 'review': '', 'weight': '3%', 'agree': 0}	2017-09-29	Gerard Paul S. Netro	gerard-paul-s-netro	Blockchain Entrepreneur & ICO Expert	https://icobench.com/images/users/gerard-paul-s-netro-1506943906.jpg	4	5	5	0		3%	0
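
If only the review text is needed as its own column, as the question asks, one option is to explode the list and then look up the "review" key in each dict. The sketch below also shows why the attempts in the question likely fail: it assumes each cell of the ratings column holds a list of dicts, as in the sample. The data here is a trimmed-down, made-up stand-in for the real file.

```python
import pandas as pd

# Hypothetical, trimmed-down sample; the real file has more keys per rating.
js = {
    "movie": "best movie",
    "ratings": [
        {"name": "MK", "team": 5, "review": "The team is building the product with full force"},
        {"name": "Gerard Paul S. Netro", "team": 4, "review": ""},
    ],
}

df = pd.json_normalize(js)

# Each cell of df["ratings"] is a list of dicts, so x["review"] inside
# apply() indexes a list with a string and raises a TypeError.
try:
    df["ratings"].apply(lambda x: x["review"])
    failed = False
except TypeError:
    failed = True

# explode() yields one row per dict; after that the key lookup works.
exploded = df.explode("ratings").reset_index(drop=True)
exploded["review"] = exploded["ratings"].apply(lambda d: d["review"])
print(exploded[["movie", "review"]])
```

The reset_index(drop=True) call is optional; it just replaces the repeated index values that explode() leaves behind.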

Answer 2

Another, simpler one-line approach is:

df = pd.json_normalize(<dictionary to flatten>, record_path = ['ratings'])
df

Note: you need pandas 1.0 or later to run pd.json_normalize()
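
The placeholder in the one-liner stands for the parsed JSON dict. Here is a runnable sketch with a made-up sample; the `meta` argument is an addition not used in the answer, shown here as one way to carry a top-level field (the hypothetical "movie" key) along with each rating.

```python
import pandas as pd

# Hypothetical sample data standing in for the parsed JSON file.
data = {
    "movie": "best movie",
    "ratings": [
        {"name": "MK", "team": 5, "review": "Solid team"},
        {"name": "GP", "team": 4, "review": ""},
    ],
}

# record_path picks the list to turn into rows (one row per rating);
# meta repeats the named top-level field on every row.
flat = pd.json_normalize(data, record_path=["ratings"], meta=["movie"])

# The review text is now an ordinary column.
reviews = flat["review"]
print(flat)
```

With this layout no apply() call is needed at all, since every key of each rating becomes a column directly.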

Answer 1
0 2021-03-19 07:16:59

Answer 2
0 2021-03-19 07:19:43
