So I made a web service (based on Starlette) with an endpoint that accepts a binary body. I want to feed this binary body to fastavro.
The Starlette docs say I can access the raw data as an async stream with request.stream():

async for chunk in request.stream():
    # do something with chunk...
Now I want to feed the stream to fastavro. The thing is, the fastavro reader needs a file-like input stream:

with open('some-file.avro', 'rb') as fo:
    avro_reader = reader(fo)
My question is, is there a clean way to transform this async stream into a file-like one?
I guess I could implement an object with a read() method that awaits and returns the data yielded by request.stream(). But if the caller passes a size, I need a memory buffer, don't I? Could something based on BufferedRWPair work?
Or is the only way to store the whole stream first to the disk or memory, before feeding it to fastavro?
Thanks in advance!
I ended up using a SpooledTemporaryFile:

from tempfile import SpooledTemporaryFile

data_file = SpooledTemporaryFile(mode='w+b',
                                 max_size=MAX_RECEIVED_DATA_MEMORY_SIZE)
async for chunk in request.stream():
    data_file.write(chunk)
data_file.seek(0)
avro_reader = reader(data_file)
It's not the ideal solution I envisioned (somehow streaming the data directly from input to output), but it's still good enough...
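For context, SpooledTemporaryFile keeps the written data in memory until max_size is exceeded, then rolls it over to a real temporary file on disk, so small request bodies never touch the filesystem. A minimal stdlib-only sketch of that behavior (note: _rolled is a private CPython attribute, inspected here purely for illustration):

```python
from tempfile import SpooledTemporaryFile

# max_size is deliberately tiny so the rollover is easy to observe.
with SpooledTemporaryFile(mode='w+b', max_size=10) as f:
    f.write(b"12345")
    rolled_before = f._rolled   # private CPython flag, shown only to illustrate
    f.write(b"6789012345")      # total now exceeds max_size -> spills to disk
    rolled_after = f._rolled
    f.seek(0)
    data = f.read()

print(rolled_before, rolled_after, data)
# -> False True b'123456789012345'
```

Reading back after the rollover is transparent: the consumer never needs to know whether the bytes live in memory or on disk.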
I encountered the same problem and wrote a compact class, StreamingBody. It does exactly what I need.
from typing import AsyncIterator
import asyncio


class AsyncGen:
    def __init__(self, block_count, block_size) -> None:
        self.bc = block_count
        self.bs = block_size

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.bc == 0:
            raise StopAsyncIteration()
        self.bc -= 1
        return b"A" * self.bs


class StreamingBody:
    _chunks: AsyncIterator[bytes]
    _backlog: bytes

    def __init__(self, chunks: AsyncIterator[bytes]):
        self._chunks = chunks
        self._backlog = b""

    async def _read_until_end(self):
        content = self._backlog
        self._backlog = b""
        while True:
            try:
                content += await self._chunks.__anext__()
            except StopAsyncIteration:
                break
        return content

    async def _read_chunk(self, size: int):
        content = self._backlog
        bytes_read = len(self._backlog)
        while bytes_read < size:
            try:
                chunk = await self._chunks.__anext__()
            except StopAsyncIteration:
                break
            content += chunk
            bytes_read += len(chunk)
        self._backlog = content[size:]
        content = content[:size]
        return content

    async def read(self, size: int = -1):
        if size > 0:
            return await self._read_chunk(size)
        elif size == -1:
            return await self._read_until_end()
        else:
            return b""


async def main():
    async_gen = AsyncGen(11, 3)
    body = StreamingBody(async_gen)

    res = await body.read(11)
    print(f"[{len(res)}]: {res}")

    res = await body.read()
    print(f"[{len(res)}]: {res}")

    res = await body.read()
    print(f"[{len(res)}]: {res}")


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
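One caveat: fastavro's reader calls read() synchronously, so it cannot await an async read() like StreamingBody's directly. If buffering the whole body is acceptable, a small stdlib-only helper (a sketch; drain_to_file and fake_stream are names invented here) drains any async byte iterator into a seekable BytesIO that fastavro can consume:

```python
import asyncio
import io
from typing import AsyncIterator


async def drain_to_file(chunks: AsyncIterator[bytes]) -> io.BytesIO:
    """Collect every chunk of an async byte stream into a seekable BytesIO."""
    buf = io.BytesIO()
    async for chunk in chunks:
        buf.write(chunk)
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf


# Demo with a stand-in for request.stream():
async def fake_stream():
    for part in (b"Obj\x01", b"rest-of-avro-container"):
        yield part


buf = asyncio.run(drain_to_file(fake_stream()))
print(buf.read())  # the reassembled body; pass buf to fastavro's reader instead
```

This is essentially the in-memory variant of the SpooledTemporaryFile answer above; swap BytesIO for SpooledTemporaryFile if bodies may be large.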