How can I use a JSON file as an input to MS Azure text analytics using Python?
I've looked through many answers to variants of this problem but still can't get my code working.
I am trying to use the MS Azure text analytics service, and when I paste the example code (including its 2 or 3 sample sentences) it works as you might expect. However, my use case requires the same analysis to be performed on hundreds of free-text survey responses, so rather than pasting in each and every sentence, I would like to use a JSON file containing these responses as input, pass that to Azure for analysis, and receive JSON output back.
The code I am using and the response it yields are shown below (note that the last bit of the ID 2 response was chopped off before the error message).
key = "xxxxxxxxxxx"
endpoint = "https://blablabla.cognitiveservices.azure.com/"
import json
with open(r'example.json', encoding='Latin-1') as f:
    data = json.load(f)
print(data)
import os
from azure.cognitiveservices.language.textanalytics import TextAnalyticsClient
from msrest.authentication import CognitiveServicesCredentials
def authenticateClient():
    credentials = CognitiveServicesCredentials(key)
    text_analytics_client = TextAnalyticsClient(
        endpoint=endpoint, credentials=credentials)
    return text_analytics_client
import requests
# pprint is used to format the JSON response
from pprint import pprint
import os
subscription_key = "xxxxxxxxxxxxx"
endpoint = "https://blablabla.cognitiveservices.azure.com/"
entities_url = "https://blablabla.cognitiveservices.azure.com/text/analytics/v2.1/entities/"
documents = data
headers = {"Ocp-Apim-Subscription-Key": subscription_key}
response = requests.post(entities_url, headers=headers, json=documents)
entities = response.json()
pprint(entities)
[{'ID': 1, 'text': 'dog ate my homework', {'ID': 2, 'text': 'cat sat on the
{'code': 'BadRequest',
 'innerError': {'code': 'InvalidRequestBodyFormat',
                'message': 'Request body format is wrong. Make sure the json '
                           'request is serialized correctly and there are no '
                           'null members.'},
 'message': 'Invalid request'}
According to my research, when we call the Azure Text Analytics REST API to identify entities, the request body should look like this:
{
  "documents": [
    {
      "id": "1",
      "text": "."
    },
    {
      "id": "2",
      "text": ""
    }
  ]
}
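To connect this to the question: the file there holds a bare list with an integer `ID` key, while the body above is an object with a `documents` list whose `id` values are strings. The following is only a minimal sketch of one way to convert between the two. The helper name `build_request_body` is hypothetical, and skipping empty responses is an assumption based on the "no null members" hint in the error message, not something the service documentation in this thread confirms.

```python
import json


def build_request_body(records):
    """Wrap a list of survey records in a {"documents": [...]} object."""
    documents = []
    for position, record in enumerate(records, start=1):
        # Accept either "id" or "ID"; fall back to the record's position.
        raw_id = record.get("id", record.get("ID", position))
        text = record.get("text")
        if text is None or not str(text).strip():
            # Assumption: drop empty responses instead of sending null members.
            continue
        # The body shown above uses string ids, so convert them here.
        documents.append({"id": str(raw_id), "text": str(text)})
    return {"documents": documents}


records = [
    {"ID": 1, "text": "dog ate my homework"},
    {"ID": 2, "text": "cat sat on the sofa"},
    {"ID": 3, "text": None},
]
body = build_request_body(records)
print(json.dumps(body, indent=2))
```

The resulting `body` can be passed as the `json=` argument of `requests.post`.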
For example:
My JSON file:
[{
"id": "1",
"text": "dog ate my homework"
}, {
"id": "2",
"text": "cat sat on the sofa"
}
]
My code:
import json
from pprint import pprint

import requests

key = ''
endpoint = "https://<>.cognitiveservices.azure.com/"

# Load the list of {"id": ..., "text": ...} records from the file.
with open(r'd:\data.json', encoding='Latin-1') as f:
    data = json.load(f)
pprint(data)

# rstrip avoids a double slash, since endpoint already ends with "/".
entities_url = endpoint.rstrip("/") + "/text/analytics/v2.1/entities?showStats=true"
headers = {"Ocp-Apim-Subscription-Key": key}

# First request: post the bare list, as in the question.
documents = data
response = requests.post(entities_url, headers=headers, json=documents)
entities = response.json()
pprint(entities)
pprint("--------------------------------")

# Second request: wrap the list under a "documents" key,
# which matches the request body shown above.
documents = {}
documents["documents"] = data
response = requests.post(entities_url, headers=headers, json=documents)
entities = response.json()
pprint(entities)
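Since the question mentions hundreds of survey responses, it may help to send them in several smaller requests. The sketch below is one possible approach, not the service's prescribed method. The names `chunked`, `analyze_in_batches`, and `make_sender` are hypothetical, `batch_size=100` is an arbitrary placeholder (the per-request limit is not stated in this thread, so check the service limits), and the code assumes each response carries `documents` and `errors` lists.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def analyze_in_batches(documents, send, batch_size=100):
    """Send documents in batches and merge the per-batch results.

    `send` is any callable that takes a request body dict and returns
    the parsed JSON response as a dict.
    """
    merged = {"documents": [], "errors": []}
    for batch in chunked(documents, batch_size):
        reply = send({"documents": batch})
        # Assumption: replies carry "documents" and "errors" lists.
        merged["documents"].extend(reply.get("documents", []))
        merged["errors"].extend(reply.get("errors", []))
    return merged


def make_sender(entities_url, key):
    """Build a `send` callable that posts to the entities endpoint."""
    import requests  # imported lazily so the helpers above need no network library

    headers = {"Ocp-Apim-Subscription-Key": key}

    def send(body):
        response = requests.post(entities_url, headers=headers, json=body)
        response.raise_for_status()
        return response.json()

    return send
```

With the variables from the code above, `analyze_in_batches(data, make_sender(entities_url, key))` would return one merged dict, which can then be written to a file with `json.dump` to get the JSON output the question asks for.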