
Get info from streaming radio

Is there a standard way to ask a streaming radio service about the currently playing song? I currently do it in a different way for each station, e.g. (SomaFM):

  $wg=join("\n",`wget -q -O - https://somafm.com/secretagent/songhistory.html`);
  $wg=~/\(Now\).*>([^<]*)<\/a><\/td><td>([^<]*)/s;  
  print "Secret Agent\n$1\n$2\n"

or (Radio Svizzera Classica):

$wg=join("\n",`wget -q -O - http://www.radioswissclassic.ch/en`);
$wg=~/On Air.*?titletag">([^<]*).*?artist">([^<]*)/s;
print "Radio Svizzera Classic\n$1\n$2\n"

... but I wonder if there is a more standard way to do it, one that doesn't rely on downloading HTML pages that are bound to change sooner or later.

For SHOUTcast/Icecast style stations with ICY metadata (which make up the bulk of internet radio stations), the best thing to do is get this data from the stream itself.

First, you need a URL to the actual stream. If you go to SomaFM's Secret Agent page at http://somafm.com/secretagent/ , you'll see links to listen in other players. As an example, let's use the 128k AAC link, which points at http://somafm.com/secretagent130.pls . This isn't the actual stream; it's a playlist file that contains links to the actual stream. Open it in your favorite text or code editor to see what I mean:

[playlist]
numberofentries=2
File1=http://ice1.somafm.com/secretagent-128-aac
Title1=SomaFM: Secret Agent (#1  ): The soundtrack for your stylish, mysterious, dangerous life. For Spies and PIs too!
Length1=-1
File2=http://ice2.somafm.com/secretagent-128-aac
Title2=SomaFM: Secret Agent (#2  ): The soundtrack for your stylish, mysterious, dangerous life. For Spies and PIs too!
Length2=-1
Version=2

Internet radio stations typically include multiple servers here for failover. If the listener gets disconnected from one, the player will usually roll to the next item. This is also useful when one server reaches its listener limit: the player will (hopefully) eventually hit another server that's active.
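If you want to pull the stream URLs out of a playlist like this programmatically, here is a minimal sketch in Python. It assumes the file follows the INI-like layout shown above, with a `[playlist]` section and `FileN=` keys; the function name `stream_urls_from_pls` is made up for this example, and real playlists may vary in section-name casing or contain quirks this does not handle.

```python
import configparser


def stream_urls_from_pls(pls_text: str) -> list[str]:
    """Return the FileN entries of a .pls playlist, ordered by N."""
    # interpolation=None so a literal '%' in a title is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pls_text)

    entries = []
    # configparser lowercases keys by default, so File1 becomes "file1".
    for key, value in parser["playlist"].items():
        if key.startswith("file") and key[4:].isdigit():
            entries.append((int(key[4:]), value))
    return [url for _, url in sorted(entries)]


if __name__ == "__main__":
    sample = """[playlist]
numberofentries=2
File1=http://ice1.somafm.com/secretagent-128-aac
Length1=-1
File2=http://ice2.somafm.com/secretagent-128-aac
Length2=-1
Version=2
"""
    for url in stream_urls_from_pls(sample):
        print(url)
```

A client could then try each URL in order and move on to the next one when a connection fails, which mirrors the failover behavior described above.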

Anyway, fire up a copy of Wireshark or some other packet sniffer. Hit one of the URLs in your audio player, and inspect the traffic. The first thing we'll look at is the request and response.

GET /secretagent-128-aac HTTP/1.1
Host: ice1.somafm.com
User-Agent: VLC/2.2.4 LibVLC/2.2.4
Range: bytes=0-
Connection: close
Icy-MetaData: 1

HTTP/1.0 200 OK
Content-Type: audio/aacp
Date: Sat, 20 May 2017 20:43:56 GMT
icy-br:128
icy-genre:Various
icy-name:Secret Agent from SomaFM [SomaFM]
icy-notice1:<BR>This stream requires <a href="http://www.winamp.com/">Winamp</a><BR>
icy-notice2:SHOUTcast Distributed Network Audio Server/Linux v1.9.5<BR>
icy-pub:0
icy-url:http://SomaFM.com
Server: Icecast 2.4.0-kh3
Cache-Control: no-cache, no-store
Pragma: no-cache
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Origin, Accept, X-Requested-With, Content-Type
Access-Control-Allow-Methods: GET, OPTIONS, HEAD
Connection: Close
Expires: Mon, 26 Jul 1997 05:00:00 GMT
icy-metaint:45000

These internet radio servers are either HTTP (in the case of Icecast and others) or really close to it (legacy SHOUTcast), and accept normal GET requests. In this case, my player (VLC) makes a GET request for /secretagent-128-aac , which is the path to the actual stream.

My player also includes one key request header:

Icy-MetaData: 1

This Icy-MetaData header asks the server to mux metadata with the audio stream data. That is, the "now playing" track information is going to be periodically injected into the stream.
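As a rough illustration, a request carrying that header can be built with Python's standard library. This is only a sketch: `build_icy_request` and `open_stream` are names invented for this example, and `urllib` expects a regular HTTP status line, so it will likely fail against legacy SHOUTcast servers that answer with `ICY 200 OK` instead.

```python
import urllib.request


def build_icy_request(stream_url: str) -> urllib.request.Request:
    """Build a GET request that asks the server to include ICY metadata."""
    # HTTP header names are case-insensitive; urllib stores this one as
    # "Icy-metadata" internally.
    return urllib.request.Request(stream_url, headers={"Icy-MetaData": "1"})


def open_stream(stream_url: str, timeout: float = 10.0):
    """Open the stream; the caller is responsible for closing the response."""
    return urllib.request.urlopen(build_icy_request(stream_url), timeout=timeout)


if __name__ == "__main__":
    request = build_icy_request("http://ice1.somafm.com/secretagent-128-aac")
    print(request.get_method(), request.full_url)
    print(request.header_items())
```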

In the server response headers, there's another key header:

icy-metaint:45000

This tells us two things. The first is that the server agreed to send metadata. The second is that the metadata interval is 45,000 bytes: after every 45,000 bytes of audio data, the server will inject a chunk of metadata. Let's go back to our packet sniffer, and see what this looks like:

[Image: ICY metadata hex dump]

The very first byte of the metadata chunk, 0x06 , tells us how long the metadata chunk is. Take the value of that byte, multiply it by 16, and you'll have the length of the metadata chunk in bytes. That is, 0x06 for the first metadata chunk byte tells us that the next 96 bytes will be metadata, before returning to regular stream data. Note that this means the entire metadata block is 97 bytes: 1 byte for the length indicator, and then 96 bytes (in this case) for the rest.
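The length arithmetic above can be sketched as follows. This is an illustrative reader, not a reference implementation: the function names are invented for this example, `stream` is any file-like object opened in binary mode, and the code assumes the metadata is padded with NUL bytes up to the 16-byte multiple and that a length byte of 0 means no metadata in that interval.

```python
import io


def read_exactly(stream, count: int) -> bytes:
    """Read exactly `count` bytes, or raise EOFError if the stream ends."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_metadata_chunk(stream, metaint: int) -> bytes:
    """Skip one interval of audio and return the metadata that follows it."""
    read_exactly(stream, metaint)                # audio data, discarded here
    length = read_exactly(stream, 1)[0] * 16     # length byte times 16
    payload = read_exactly(stream, length)
    return payload.rstrip(b"\x00")               # drop assumed NUL padding


if __name__ == "__main__":
    # Synthetic stream: 10 bytes of fake audio, then a 0x01 length byte,
    # then 16 bytes of metadata (13 bytes of text plus 3 bytes of padding).
    meta = b"StreamTitle='" + b"\x00" * 3
    fake = io.BytesIO(b"\xaa" * 10 + b"\x01" + meta)
    print(read_metadata_chunk(fake, 10))
```

With a real connection, `metaint` would come from the `icy-metaint` response header (45000 in the capture above), and the function would be called in a loop for as long as you want to follow the stream.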

Now, let's get into the actual text metadata format:

StreamTitle='Buscemi - First Flight To London';StreamUrl='http://SomaFM.com/secretagent/';

It looks pretty straightforward: key='value' pairs, delimited by semicolons ( ; ). There are some big catches with this though. For example, there's no truly standard method for escaping the single quote. If the metadata value needs to contain a single quote, sometimes it's \\' , sometimes it's ''' . Sometimes it's not escaped at all!
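Because quoting is inconsistent, any parser for this format ends up being a heuristic. One possible approach, sketched below, is to treat `';` as the end of a value only when it is followed by another `key='` or by the end of the text. The name `parse_icy_fields` is made up for this example, and the function deliberately does not try to undo any escaping, since there is no way to know which convention a given server uses.

```python
import re

# A value ends at "';" only if what follows is another key=' or the end.
_FIELD = re.compile(r"(\w+)='(.*?)';(?=\w+='|\s*$)", re.DOTALL)


def parse_icy_fields(text: str) -> dict[str, str]:
    """Split ICY metadata text into a dict, tolerating unescaped quotes."""
    return {m.group(1): m.group(2) for m in _FIELD.finditer(text)}


if __name__ == "__main__":
    sample = (
        "StreamTitle='Buscemi - First Flight To London';"
        "StreamUrl='http://SomaFM.com/secretagent/';"
    )
    print(parse_icy_fields(sample))
```

This still misparses a value that itself contains `';` followed by something that looks like a key, so treat the result as best effort.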

Additionally, not all servers use the same character encoding. You can probably safely assume UTF-8, but do expect that some servers might be different, or simply broken in their own metadata encoding.
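A defensive way to handle this is to try UTF-8 first and fall back to something that cannot fail. The sketch below falls back to Latin-1, which is purely an assumption on my part (it maps every byte to a character, so it never raises); the right fallback for a given station may be a different encoding, and `decode_metadata` is a name invented for this example.

```python
def decode_metadata(raw: bytes) -> str:
    """Decode metadata bytes as UTF-8, falling back to Latin-1 (a guess)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 accepts any byte sequence, so this never raises, but the
        # resulting text may be wrong if the server used another encoding.
        return raw.decode("latin-1")


if __name__ == "__main__":
    print(decode_metadata(b"StreamTitle='Caf\xc3\xa9 del Mar';"))  # UTF-8 input
    print(decode_metadata(b"StreamTitle='Caf\xe9 del Mar';"))      # Latin-1 input
```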

Anyway, now that you know how all of this works, you can implement it yourself. If you'd like, I have some code you can license. One is a Node.js API server which, when given a stream URL, will return the metadata for you, doing all the buffering and parsing server-side. The other is a client-side player based on MSE. Note though that this only works with servers that support CORS, and as far as I know, only my own servers (AudioPump CDN) do that today. If you're interested in any of this code, feel free to e-mail me at brad@audiopump.co. If you have questions about my answer here on Stack Overflow, post a comment here.


 