
How can I use a JSON file as an input to MS Azure text analytics using Python?

I've looked through many answers to variants of this problem but still can't get my code working.

I am trying to use the MS Azure text analytics service, and when I paste the example code (including its two or three sample sentences) it works as expected. However, my use case requires the same analysis on hundreds of free-text survey responses, so rather than pasting in every sentence, I would like to use a JSON file containing these responses as the input, pass it to Azure for analysis, and receive JSON output back.

The code I am using and the response it yields are shown below (note that the end of the ID 2 record has been chopped off before the error message).

import json
from pprint import pprint  # pprint is used to format the JSON response

import requests
from azure.cognitiveservices.language.textanalytics import TextAnalyticsClient
from msrest.authentication import CognitiveServicesCredentials

key = "xxxxxxxxxxx"
endpoint = "https://blablabla.cognitiveservices.azure.com/"

with open(r'example.json', encoding='Latin-1') as f:
    data = json.load(f)

print(data)

def authenticateClient():
    credentials = CognitiveServicesCredentials(key)
    text_analytics_client = TextAnalyticsClient(
        endpoint=endpoint, credentials=credentials)
    return text_analytics_client

subscription_key = "xxxxxxxxxxxxx"
entities_url = "https://blablabla.cognitiveservices.azure.com/text/analytics/v2.1/entities/"

documents = data

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
response = requests.post(entities_url, headers=headers, json=documents)
entities = response.json()
pprint(entities)

[{'ID': 1, 'text': 'dog ate my homework', {'ID': 2, 'text': 'cat sat on the

{'code': 'BadRequest',
 'innerError': {'code': 'InvalidRequestBodyFormat',
                'message': 'Request body format is wrong. Make sure the json '
                           'request is serialized correctly and there are no '
                           'null members.'},
 'message': 'Invalid request'}

According to my research, when we call the Azure text analytics REST API to identify entities, the request body should look like this:

{
  "documents": [
    {
      "id": "1",
      "text": "."
    },
    {
      "id": "2",
      "text": ""
    }
  ]
}
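
A minimal sketch of building that envelope in Python, assuming the file holds a plain list of records. The helper name `to_request_body` is hypothetical. Turning an `ID` key into a lowercase string `id` is also an assumption, based on the `ID` keys printed in the question and the string ids in the body above.

```python
import json


def to_request_body(records):
    """Wrap a list of records in the {"documents": [...]} envelope.

    Hypothetical helper: accepts records keyed by "id" or "ID" and
    emits a lowercase "id" as a string, matching the body shown above.
    """
    documents = []
    for record in records:
        doc_id = record.get("id", record.get("ID"))
        if doc_id is None or record.get("text") is None:
            # The service error mentions null members, so reject them early.
            raise ValueError(f"record needs an id and text: {record!r}")
        documents.append({"id": str(doc_id), "text": record["text"]})
    return {"documents": documents}


records = [
    {"ID": 1, "text": "dog ate my homework"},
    {"ID": 2, "text": "cat sat on the sofa"},
]
body = to_request_body(records)
print(json.dumps(body, indent=2))
```

The resulting `body` is what would be passed as `json=body` to `requests.post`.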

For example:

My JSON file:

[
    {
        "id": "1",
        "text": "dog ate my homework"
    },
    {
        "id": "2",
        "text": "cat sat on the sofa"
    }
]

My code:

import json
from pprint import pprint

import requests

key = ''
endpoint = "https://<>.cognitiveservices.azure.com/"

# Load the list of {"id": ..., "text": ...} records from the file.
with open(r'd:\data.json', encoding='Latin-1') as f:
    data = json.load(f)
pprint(data)

# rstrip avoids a double slash, because endpoint already ends with "/".
entities_url = endpoint.rstrip("/") + "/text/analytics/v2.1/entities?showStats=true"
headers = {"Ocp-Apim-Subscription-Key": key}

# 1) Bare list, as in the question: this is the body the service rejects.
response = requests.post(entities_url, headers=headers, json=data)
entities = response.json()
pprint(entities)
pprint("--------------------------------")

# 2) The same list wrapped under a "documents" key, as the API expects.
documents = {"documents": data}
response = requests.post(entities_url, headers=headers, json=documents)
entities = response.json()
pprint(entities)
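
For the hundreds of survey responses mentioned in the question, the same wrapped request can be sent in batches and the combined result written back out as JSON. The following is only a sketch under stated assumptions. The names `chunked`, `analyze_in_batches`, `make_sender` and `run` are hypothetical. The default `batch_size=100` is an arbitrary placeholder to check against the service's documented per-request limits. The code assumes the response carries `documents` and `errors` lists, and it reads the input as UTF-8 rather than Latin-1.

```python
import json


def chunked(items, size):
    """Yield successive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def analyze_in_batches(records, send, batch_size=100):
    """Post records batch by batch and merge the per-batch responses.

    `send` is any callable that takes a request body (a dict) and
    returns the parsed JSON response (a dict).
    """
    merged = {"documents": [], "errors": []}
    for batch in chunked(records, batch_size):
        payload = send({"documents": batch})
        # Assumption: results and per-document errors arrive under these keys.
        merged["documents"].extend(payload.get("documents", []))
        merged["errors"].extend(payload.get("errors", []))
    return merged


def make_sender(entities_url, key):
    """Build a `send` callable backed by requests.post."""
    import requests  # imported lazily so the helpers above need no third-party package

    headers = {"Ocp-Apim-Subscription-Key": key}

    def send(body):
        response = requests.post(entities_url, headers=headers, json=body)
        response.raise_for_status()  # raise on HTTP 4xx/5xx responses
        return response.json()

    return send


def run(in_path, out_path, entities_url, key):
    """Read records from in_path, analyze them, and write JSON to out_path."""
    with open(in_path, encoding="utf-8") as f:  # assumes a UTF-8 encoded file
        records = json.load(f)
    results = analyze_in_batches(records, make_sender(entities_url, key))
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)
    return results
```

A call such as `run(r'd:\data.json', 'entities.json', entities_url, key)` would then produce one JSON output file. Passing `send` in as a parameter keeps the batching logic separate from the HTTP call, so it can be exercised with a stand-in function before any real request is made.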

