Extracting str from a pandas dataframe
I read a CSV file into a dataframe named df.
Each row contains a string like the one below.
{"name":"Daniel Gimness","id":10551043...}
I would like to extract "name" and "id" from each row and build a new dataframe to store them.
I tried several ways to do it, but all failed. Below is the outcome of one of my attempts. Please let me know if you have any suggestions on how to solve this problem. Thanks.
pd.DataFrame.from_records(df.creator.tolist())
0 1 2 3 4 5 6 7 8 9 ... 934 935 936 937 938 939 940 941 942 943
0 { " u r l s " : { " ... None None None None None None None None None None
1 { " u r l s " : { " ... None None None None None None None None None None
2 { " u r l s " : { " ... None None None None None None None None None None
3 { " u r l s " : { " ... None None None None None None None None None None
4 { " u r l s " : { " ... None None None None None None None None None None
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
195609 { " u r l s " : { " ... None None None None None None None None None None
195610 { " u r l s " : { " ... None None None None None None None None None None
195611 { " u r l s " : { " ... None None None None None None None None None None
195612 { " u r l s " : { " ... None None None None None None None None None None
195613 { " u r l s " : { " ... None None None None None None None None None None
Use a regular expression with pandas.Series.str.extract().
Something like:
df["id"] = df["creator"].str.extract(r'"id":\s*"?([0-9]+)', expand=False)
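To pull both fields in one pass, named capture groups can be used, since str.extract() turns each named group into a column. The following is a minimal sketch, not a definitive solution: the sample rows and the second id are hypothetical, and the pattern assumes that "name" appears before "id" in every string and that names contain no escaped quotes.

```python
import pandas as pd

# Hypothetical sample rows mimicking the "creator" column from the question.
df = pd.DataFrame(
    {
        "creator": [
            '{"name":"Daniel Gimness","id":10551043}',
            '{"name":"Redmond Entwistle","id":10551044}',
        ]
    }
)

# Named groups become the column names of the extracted DataFrame.
# Assumption: "name" precedes "id" and the id may or may not be quoted.
pattern = r'"name":\s*"(?P<name>[^"]*)".*?"id":\s*"?(?P<id>[0-9]+)'
extracted = df["creator"].str.extract(pattern)
print(extracted)
```

Note that the extracted ids are strings; convert them with astype(int) if numbers are needed. A regex is fragile against reordered keys or escaped characters, so parsing the JSON is usually safer when the strings are valid JSON.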
It seems that you have JSON data in the "creator" column. You can try:
import json

import pandas as pd

# Parse each JSON string and keep only the "name" and "id" fields.
x = df["creator"].apply(
    lambda x: {"name": (m := json.loads(x))["name"], "id": m["id"]}
)
print(pd.DataFrame(x.to_list()))
Prints:
name id
0 Daniel Gimness 10551043
1 Redmond Entwistle 10551043
...
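With roughly 195,000 rows, a few strings may be missing or malformed, in which case json.loads() raises and the whole apply() call fails. Below is a more defensive sketch under stated assumptions: the sample rows, the second id, and the helper name parse_creator are hypothetical. It also hints at why the attempt in the question did not work: from_records() received plain strings rather than parsed dicts, which would explain the one-character-per-column output.

```python
import json

import pandas as pd

# Hypothetical sample rows; the last one is deliberately not valid JSON.
df = pd.DataFrame(
    {
        "creator": [
            '{"name":"Daniel Gimness","id":10551043,"urls":{}}',
            '{"name":"Redmond Entwistle","id":10551044}',
            "not json",
        ]
    }
)


def parse_creator(raw):
    """Return the name and id from one JSON string, or None values on failure."""
    empty = {"name": None, "id": None}
    try:
        record = json.loads(raw)
    except (TypeError, ValueError):
        # TypeError covers NaN cells; ValueError covers json.JSONDecodeError.
        return empty
    if not isinstance(record, dict):
        return empty
    return {"name": record.get("name"), "id": record.get("id")}


# Build the new dataframe from the parsed dicts, keeping the original index.
result = pd.DataFrame(df["creator"].map(parse_creator).tolist(), index=df.index)
print(result)
```

Rows that cannot be parsed end up with missing values, so they can be inspected or dropped afterwards with result.dropna().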