How to get a value by name from a dictionary in Python
I have a CSV file with a column of company names. I need to find the domain name of each company and store it in the same CSV file, next to the company name.
The code I have used so far:
import pandas as pd
import clearbit
import json
clearbit.key = 'your secret key'
df = pd.read_csv("/home/vipul/Desktop/new.csv", sep=',', encoding="utf-8")
saved_column = df['Company']
i=0
for data in saved_column:
    n = saved_column[i]
    i = i+1
    domain = clearbit.NameToDomain.find(name=n)
    print(domain)
    l = json.loads(domain)
    print(l['domain'])
This code returns the domain name, logo, and company name as JSON, but how do I take only the domain?
These two lines give an error:
l = json.loads(domain)
print(l['domain'])
Error:
TypeError: the JSON object must be str, not 'NameToDomain'
The CSV file looks like this:
Company
Accenture
AND Digital
Accenture
Kite Consulting Group
Capgemini
Expected output:
Company                Domain
Accenture              accenture.com
AND Digital            and.digital
Accenture              accenture.com
Kite Consulting Group  None
Capgemini              capgemini.com
The JSON looks like this:
Name: Company, dtype: object
{'name': 'Accenture', 'logo': 'https://logo.clearbit.com/accenture.com', 'domain': 'accenture.com'}
{'name': 'AND Digital', 'logo': 'https://logo.clearbit.com/and.digital', 'domain': 'and.digital'}
{'name': 'Accenture', 'logo': 'https://logo.clearbit.com/accenture.com', 'domain': 'accenture.com'}
None
{'name': 'Capgemini', 'logo': 'https://logo.clearbit.com/capgemini.com', 'domain': 'capgemini.com'}
According to the documentation, clearbit.NameToDomain.find(name=n) returns a dictionary, so you can access its values just like with any other Python dictionary. You don't need to care that it came from JSON; that is handled for you. (Also, this question has nothing to do with CSV.)
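To illustrate why the json.loads call in the question fails, here is a minimal sketch that needs no API key. The raw string is a made-up sample shaped like the question's output, not a real clearbit response. json.loads only accepts text; once the data is already a dictionary, passing it to json.loads again raises a TypeError.

```python
import json

# Hypothetical JSON text, shaped like the result shown in the question
raw = '{"name": "Accenture", "domain": "accenture.com"}'

# Text goes in, a dictionary comes out
parsed = json.loads(raw)
print(parsed['domain'])

# Passing the already-parsed dictionary back in is the mistake from the question
error_message = None
try:
    json.loads(parsed)
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The exact wording of the error message depends on the Python version, but the exception type is TypeError in each case.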
Two other points:
Based on the question, there are two things:
Like this:
data = clearbit.NameToDomain.find(name=n)
print(data) # Dictionary
print(data['domain']) # Domain value
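Note that the question's output contains None for a company that was not found, and indexing None raises a TypeError. A small helper can guard against that. This is only a sketch: extract_domain is a hypothetical name, and the sample dictionary is copied from the output printed in the question rather than fetched from the API.

```python
def extract_domain(result, default=None):
    """Return the 'domain' value of a lookup result, or default if there is none."""
    if result is None:
        # No match was found for this company
        return default
    return result.get('domain', default)

# Sample values copied from the output printed in the question
found = {'name': 'Accenture',
         'logo': 'https://logo.clearbit.com/accenture.com',
         'domain': 'accenture.com'}
missing = None

print(extract_domain(found))    # accenture.com
print(extract_domain(missing))  # None
```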
Use apply:
import pandas as pd
from urllib.parse import urlparse

def parse_url(x):
    return 'unknown' if pd.isnull(x) else urlparse(x)[1]

df = pd.read_csv("./new.csv")
df['domain'] = df['Profile URL'].apply(parse_url)
df_new = df.loc[:, ['Company', 'domain']]
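For readers unfamiliar with urlparse, index 1 of its result is the network location (also available as .netloc). The sketch below shows the helper on its own, without pandas; it checks for None instead of using pd.isnull, and the URLs are made-up samples. One caveat worth knowing: a value without a scheme such as "https://" is treated as a path, so the network location comes back empty.

```python
from urllib.parse import urlparse

def parse_url(x):
    """Return the host part of a URL, or 'unknown' when the value is missing."""
    return 'unknown' if x is None else urlparse(x).netloc

# Hypothetical sample values for illustration
with_scheme = parse_url('https://www.accenture.com/us-en')
without_scheme = parse_url('accenture.com')
missing = parse_url(None)

print(with_scheme)           # www.accenture.com
print(repr(without_scheme))  # ''
print(missing)               # unknown
```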
A parser for clearbit could be implemented like this (I have not tried this code, but it should work):
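import clearbit
import pandas as pd

def parse_url(x):
    if pd.isnull(x):
        return 'unknown'
    data = clearbit.NameToDomain.find(name=x)
    # find() returns None when there is no match, as in the question's output
    return data.get('domain', 'Default value') if data else 'Default value'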
This code imports data from the CSV provided. You may instead call the clearbit API in the parse_url method and return an appropriate value.
This solution works on Python 3. Please take it as a starting point and not as a copy-paste solution.
Since the result is a dictionary, we can assign a default value when nothing is returned, store it in the CSV file, and remove it later. That does the job :)
The edited code:
import pandas as pd
import clearbit

clearbit.key = 'your key'
df = pd.read_csv("/home/vipul/Desktop/new.csv", sep=',', encoding="utf-8")
# Drop rows without a company name so the result list lines up with the frame
df = df.dropna(subset=['Company'])
res = []
for n in df['Company']:
    print(n)
    data = clearbit.NameToDomain.find(name=n)
    if data is not None:
        res.append(data['domain'])
    else:
        # Placeholder for companies with no match; remove it later
        res.append('domain.com')
print(res)
df['Domain'] = res
df.to_csv("/home/vipul/Desktop/new.csv", index=False)
print("File saved to desktop as new.csv")
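The same flow can be tried without an API key or pandas. The sketch below is an illustration under stated assumptions, not the clearbit client: fake_find is a hypothetical stand-in for clearbit.NameToDomain.find, backed by the sample results printed in the question, and the CSV content is the sample from the question held in a string.

```python
import csv
import io

# Hypothetical stand-in data, copied from the output printed in the question
SAMPLE_RESULTS = {
    'Accenture': {'name': 'Accenture', 'domain': 'accenture.com'},
    'AND Digital': {'name': 'AND Digital', 'domain': 'and.digital'},
    'Capgemini': {'name': 'Capgemini', 'domain': 'capgemini.com'},
}

def fake_find(name):
    """Return a result dictionary, or None when the company is unknown."""
    return SAMPLE_RESULTS.get(name)

def add_domains(csv_text, find=fake_find):
    """Read CSV text with a Company column and return rows with a Domain added."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        result = find(row['Company'])
        # Keep None for companies without a match instead of a placeholder
        row['Domain'] = result['domain'] if result is not None else None
        rows.append(row)
    return rows

source = ("Company\nAccenture\nAND Digital\nAccenture\n"
          "Kite Consulting Group\nCapgemini\n")
rows = add_domains(source)
for row in rows:
    print(row['Company'], row['Domain'])
```

Passing the lookup function as a parameter makes it possible to swap in the real API call later without changing the rest of the code.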