
Recording Audio on website: Red5 stream or posting the audio data?

Original post: 2011-09-06 20:55:21 · Tags: php, flash, audio, ffmpeg, red5

Let me first establish what I want to do:

My users can record voice notes on my website and add a title and tags (for indexing) to each note. When a note is saved, I store its file path along with the other info in my DB.

Now, I have 2 choices for doing the recording; both involve a .swf embedded in my site:

1) I could use a Red5 server to stream the audio to my server, save the file, and return the path to that file to my app so it can do the DB saving. This seems rather complicated, since I would have to convert the audio and move it to the appropriate user's folder inside a server-side Red5 app, which I don't really know how to build.

2) I could simply record the audio, grab its byte array, Base64-encode it, and send it to PHP along with the rest of the necessary data (be it by a simple POST or an AJAX call). On the server I would decode it and create the file with the appropriate extension; audio conversion would also happen there, using ffmpeg. This option seems simpler, but I do not know how viable it is.

Which option would you say is more viable and easier to develop? Thanks in advance.

1 Answer

Depending on the planned duration of the recording, you may very well be able to use option number two. I recently used a similar approach successfully for a project, but the recordings were only up to 30 seconds or so. Here's what I did differently from what you're suggesting, though, and why I think it's better:

To capture the sound from the microphone and store it in a ByteArray, use the SAMPLE_DATA event, which is dispatched whenever more sound data comes in from the microphone. There's an example in the documentation that should explain this well enough.
Because most users would be on normal home computers without any special recording equipment, it was safe to assume that the full fidelity of the recording was not necessary. I used just 2 bytes per sample, and only mono, instead of keeping each sample as a full 64-bit float (AS3 Number), which is what you end up with when you read the microphone data in the SAMPLE_DATA event. Simply read the Number and do myFloatSample * 0x7fff to convert it to a 16-bit signed integer.
Don't use the native 44.1kHz sampling rate if you're just recording speech or something else in that frequency range. You will likely get away just fine with 22.05kHz, which cuts the amount of data in half straight away. Just set the Microphone.rate property accordingly.
Don't use Base64 to encode your data. Send it as binary data, which will be significantly smaller. You can send it as raw POST data, or using something like AMF. Also, before you send it, use the native compress() or deflate() methods on the ByteArray to compress it. On the server, decompress using the ZLIB or raw DEFLATE (inflate) algorithms respectively, both of which PHP supports.
Once decompressed on the server, what you have is essentially what is called a raw 16-bit mono PCM stream. Incidentally, that should be one of the input formats that ffmpeg (or lame) supports, so you should be able to encode it to mp3 without having to do any manual decoding first.
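
The steps above can be sketched end to end. This is only an illustration in Python, not the AS3/PHP code from the project: the function names are made up for this example, the clamping step is an added safeguard, and it assumes the samples are written big-endian (the AS3 ByteArray default) and compressed in ZLIB format (what compress() produces by default). In PHP, the matching decompression calls would be gzuncompress() for ZLIB data and gzinflate() for raw DEFLATE data.

```python
import struct
import zlib


def floats_to_pcm16(samples, big_endian=True):
    """Scale float samples in [-1.0, 1.0] to 16-bit signed ints and pack them."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))    # clamp (assumption: guards against overflow)
        ints.append(int(s * 0x7FFF))  # same scaling as myFloatSample * 0x7fff
    fmt = (">" if big_endian else "<") + "%dh" % len(ints)
    return struct.pack(fmt, *ints)


def client_payload(samples):
    """What the client would POST: ZLIB-compressed 16-bit mono PCM."""
    return zlib.compress(floats_to_pcm16(samples))


def server_decode(payload):
    """Server side: undo the ZLIB compression to recover the raw PCM bytes."""
    return zlib.decompress(payload)


def ffmpeg_args(raw_path, mp3_path, rate=22050, big_endian=True):
    """Build an ffmpeg command line that reads headerless 16-bit mono PCM."""
    fmt = "s16be" if big_endian else "s16le"
    return ["ffmpeg", "-f", fmt, "-ar", str(rate), "-ac", "1",
            "-i", raw_path, mp3_path]


if __name__ == "__main__":
    pcm = server_decode(client_payload([0.0, 1.0, -1.0, 0.5]))
    print(len(pcm), "bytes of PCM")
    print(" ".join(ffmpeg_args("note.raw", "note.mp3")))
```

Because raw PCM has no header, the sample format, rate, and channel count have to be passed to ffmpeg explicitly, and they must match what the client actually recorded. If the ByteArray's endian property was switched to little-endian before writing, the format would be s16le instead.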

Obviously the Red5 solution will likely be better, because it's more tailored to the task. But if you don't have the resources to set up a Red5 server, or don't want to use Java, the above solution is proven to work well as long as you avoid overly long recordings.

To take a simple example, a 30-second recording at 22,050 samples per second and 2 bytes per sample will be ~1.3MB. Even once deflated, the transfer to the server will likely still be almost a megabyte for 30 seconds of audio. This may or may not be acceptable for your application.
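
The arithmetic behind that estimate is easy to check. The following is a small Python sketch (the helper names are hypothetical) that computes the uncompressed PCM size and, for comparison, the size the same data would have if it were Base64-encoded instead of sent as binary. The compressed size is not computed, since it depends on the audio itself.

```python
import math


def raw_pcm_bytes(seconds, sample_rate=22050, bytes_per_sample=2, channels=1):
    """Size of an uncompressed PCM recording in bytes."""
    return seconds * sample_rate * bytes_per_sample * channels


def base64_bytes(n):
    """Length of n bytes after Base64 encoding (4 output chars per 3 input bytes)."""
    return 4 * math.ceil(n / 3)


if __name__ == "__main__":
    raw = raw_pcm_bytes(30)
    print(raw)                                   # 1323000 (~1.3MB)
    print(base64_bytes(raw))                     # 1764000 (about a third larger)
    print(raw_pcm_bytes(30, sample_rate=44100))  # 2646000 (double, at 44.1kHz)
```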