
How can I use batch embeddings using OpenAI's API?

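I am using the OpenAI API to get embeddings for a bunch of sentences. By a bunch, I mean a few thousand. Is there a way to make this faster, or to have the embeddings computed concurrently?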

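I tried looping over the sentences and sending one request per sentence, but that was very slow. Sending the sentences as a single list was just as slow. In both cases I used this code: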

import requests

# `key` is assumed to hold your OpenAI API key.
response = requests.post(
    "https://api.openai.com/v1/embeddings",
    json={
        "model": "text-embedding-ada-002",
        "input": ["text:This is a test", "text:This is another test", "text:This is a third test", "text:This is a fourth test", "text:This is a fifth test", "text:This is a sixth test", "text:This is a seventh test", "text:This is a eighth test", "text:This is a ninth test", "text:This is a tenth test", "text:This is a eleventh test", "text:This is a twelfth test", "text:This is a thirteenth test", "text:This is a fourteenth test", "text:This is a fifteenth test", "text:This is a sixteenth test", "text:This is a seventeenth test", "text:This is a eighteenth test", "text:This is a nineteenth test", "text:This is a twentieth test", "text:This is a twenty first test", "text:This is a twenty second test", "text:This is a twenty third test", "text:This is a twenty fourth test", "text:This is a twenty fifth test", "text:This is a twenty sixth test", "text:This is a twenty seventh test", "text:This is a twenty eighth test", "text:This is a twenty ninth test", "text:This is a thirtieth test", "text:This is a thirty first test", "text:This is a thirty second test", "text:This is a thirty third test", "text:This is a thirty fourth test", "text:This is a thirty fifth test", "text:This is a thirty sixth test", "text:This is a thirty seventh test", "text:This is a thirty eighth test", "text:This is a thirty ninth test", "text:This is a fourtieth test", "text:This is a forty first test", "text:This is a forty second test", "text:This is a forty third test", "text:This is a forty fourth test", "text:This is a forty fifth test", "text:This is a forty sixth test", "text:This is a forty seventh test", "text:This is a forty eighth test", "text:This is a forty ninth test", "text:This is a fiftieth test", "text:This is a fifty first test", "text:This is a fifty second test", "text:This is a fifty third test"],
    },
    headers={
        "Authorization": f"Bearer {key}"
    }
)

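For the first test I sent a bunch of these requests one by one; for the second I sent a single list. Should I send the individual requests in parallel? Would that help? Thanks!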

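According to OpenAI's Create Embeddings API, you should be able to do this:

To get embeddings for multiple inputs in a single request, pass an array of strings or an array of token arrays. Each input must not exceed 8192 tokens in length.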

https://beta.openai.com/docs/api-reference/embeddings/create
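
As a rough illustration of combining both ideas (a list of inputs per request, plus a few requests in flight at once), here is a minimal sketch. The helper names `chunked`, `post_batch`, and `embed_all` are hypothetical, and `batch_size=100` and `max_workers=4` are arbitrary starting values rather than documented limits. Whether parallel requests help at all depends on your account's rate limits, so treat this as something to experiment with, not a guaranteed speed-up.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

API_URL = "https://api.openai.com/v1/embeddings"


def chunked(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def post_batch(batch, key, model="text-embedding-ada-002"):
    """Send one request carrying a list of inputs; return embeddings in input order."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps({"model": model, "input": batch}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Each item in "data" carries the index of the input it belongs to.
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed_all(sentences, embed_batch, batch_size=100, max_workers=4):
    """Embed every sentence batch by batch, preserving the original order.

    `embed_batch` is any callable that maps a list of strings to a list of
    vectors, so the network call can be swapped out (for example in tests).
    """
    batches = chunked(sentences, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input batches.
        results = list(pool.map(embed_batch, batches))
    return [vector for batch_result in results for vector in batch_result]
```

With a few thousand sentences in a list named `sentences`, a call might look like `vectors = embed_all(sentences, lambda batch: post_batch(batch, key))`. Passing the batch function in as an argument keeps the batching logic separate from the HTTP call, which makes it easy to add retries or try the logic without hitting the API.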

