Parsing HTML content to use with iPhone app

I'm not even sure the title of this question is appropriate, since I'm really lost and need some advice: a starting point for what I need to accomplish.

My iPhone app plays audio streamed from the Internet with my custom-made player. Some links are live streams from Akamai and others are audio files stored on a website. I'm OK with the live streams, but my problem is with the audio files.

I have many stored audio files that the user can choose from, in different languages, and I don't want to hardcode all of them in my application. So I need a clever way for the user to browse within the app (fetching the information from the Internet) until he reaches the desired file to play.

The website is organized like this:

First there is a list of all available programs. The user chooses the desired program, then another page shows up and he has to choose a day of the week to play.

My question is: how can I parse this content, with programs and days of the week to choose from? Should I look into HTML parsing? Is there a better/simpler way, like making XML files on the website?

If this helps, all the webpages end with the .aspx extension.

Please, any advice from a more experienced programmer will greatly help me. Thank you!

I don't think parsing HTML would be the best implementation here. Go for a structured source that has no presentational markup to parse out or ignore (this also means fewer resources spent on parsing, because you will only be parsing what matters).

I'd suggest consuming an XML or JSON source that can be converted to an NSDictionary or other data structure for app use. Here's a neat little class that converts an XML source to an NSDictionary: http://troybrant.net/blog/2010/09/simple-xml-to-nsdictionary-converter/

TBXML is another lightweight XML parser for Objective-C that leaves implementing a custom data object up to you: http://www.tbxml.co.uk/

If you'd rather use JSON, there are a number of helpers out there. A good place to start looking would be here: http://cocoaobjects.com/?s=json

If I have understood your question correctly, whatever source you choose, you're likely to want to wind up with a dictionary object that looks something like this:

programs = (
  {
    program_name = "Foo";
    tracks = (
      { day = Monday;
        track = "audio_file1.mp3";
      },
      { day = Tuesday;
        track = "audio_file2.mp3";
      },
      { day = Wednesday;
        track = "audio_file3.mp3";
      }
    );
  },
  {
    program_name = "Bar";
    tracks = (
      { day = Monday;
        track = "audio_file4.mp3";
      },
      { day = Tuesday;
        track = "audio_file5.mp3";
      },
      { day = Wednesday;
        track = "audio_file6.mp3";
      }
    );
  },
  {
    program_name = "Baz";
    tracks = (
      { day = Monday;
        track = "audio_file7.mp3";
      },
      { day = Tuesday;
        track = "audio_file8.mp3";
      },
      { day = Wednesday;
        track = "audio_file9.mp3";
      }
    );
  }
);

Once you've worked out your data source and converted it to a native data object for working with in Obj-C, you should be able to proceed with coding up a UI that iterates through the dictionary to provide a list of programs and, in turn, a list of days for each program, with accompanying audio files to select and play.
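As a rough illustration of that last step, here is a minimal sketch of walking such a structure once it has been parsed. It is written in Python only to keep it short and runnable outside Xcode; the JSON payload is made up, and the helper names (`program_names`, `tracks_for`) are hypothetical. The same two lookups would back the two screens in the app.

```python
import json

# Hypothetical JSON payload mirroring the dictionary sketched above.
# The key names (program_name, tracks, day, track) come from that sketch;
# the server producing this JSON is an assumption.
PAYLOAD = """
{"programs": [
  {"program_name": "Foo",
   "tracks": [{"day": "Monday", "track": "audio_file1.mp3"},
              {"day": "Tuesday", "track": "audio_file2.mp3"}]},
  {"program_name": "Bar",
   "tracks": [{"day": "Monday", "track": "audio_file4.mp3"}]}
]}
"""


def program_names(data):
    """First screen: the list of program names, in server order."""
    return [p["program_name"] for p in data["programs"]]


def tracks_for(data, program_name):
    """Second screen: map each day to its audio file for one program."""
    for p in data["programs"]:
        if p["program_name"] == program_name:
            return {t["day"]: t["track"] for t in p["tracks"]}
    return {}  # unknown program: nothing to show


data = json.loads(PAYLOAD)
```

Calling `program_names(data)` would fill the first list, and `tracks_for(data, "Foo")` the second once the user has picked a program.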

I had a similar need: consuming data from an ASP.NET site. In the end I generated JSON on the .NET side and returned it. Then I used the json-framework from Google Code to convert the returned JSON to an NSDictionary. From there, the rest is history.

If you are using .NET MVC, then returning JSON results is super simple in a controller. Since you have aspx extensions, I assume that is not the case. There are tons of JSON parsers for C# listed at the bottom of the json.org homepage.

If the website content is static, I would hard-code the file names and appropriate URLs to your server within the app and let the user scroll through the list of available items.

If the website content changes, then I would create an XML file on a server, which your app downloads on launch (or whenever you see fit) and parses within the app, then continue as per static content.
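To make the XML option a bit more concrete, here is a small sketch of what such a file and its parsing could look like. The element and attribute names (`program`, `track`, `name`, `day`, `file`) are invented for the example, and Python is used just for brevity; on the phone an XML parser for Objective-C would do the equivalent work.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog file that the website would serve; the element and
# attribute names are assumptions made for this sketch.
CATALOG_XML = """
<programs>
  <program name="Foo">
    <track day="Monday" file="audio_file1.mp3"/>
    <track day="Tuesday" file="audio_file2.mp3"/>
  </program>
  <program name="Bar">
    <track day="Monday" file="audio_file4.mp3"/>
  </program>
</programs>
"""


def parse_catalog(xml_text):
    """Turn the XML into {program name: {day: file name}}."""
    root = ET.fromstring(xml_text.strip())
    catalog = {}
    for program in root.findall("program"):
        catalog[program.get("name")] = {
            track.get("day"): track.get("file")
            for track in program.findall("track")
        }
    return catalog


catalog = parse_catalog(CATALOG_XML)
```

Once parsed, the app only ever deals with the resulting dictionary, so the static and dynamic cases can share the same UI code.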

Hope this starts you off in the right direction.

If it were me, and assuming I have some clue as to what you're talking about, I would have a database that records the relationship between the audio content and the date. Then your spinner for the content would just be updated by a query...

So, for instance, assume a table:

+----------------------------------------------------------------------+
| Filename                        | Language          | Date           |
+----------------------------------------------------------------------+
| kjslfiewofksalfjslfakj          | Swahili           | 2011-11-01     |
| shfaahflajfewifhlanfww          | Guyanese          | 2011-10-08     |
| weijalfjlajfljalsfjewn          | French            | 2011-11-01     |
| fiwojancanlsjfhkwehwlk          | Swahili           | 2011-11-01     |
| fhalksflwiehlfnaksflhw          | Swahili           | 2011-11-03     |
+----------------------------------------------------------------------+

Okay, so if Joe Schmo reaches the page for the show dated 2011-11-01 and his language is Swahili, two rows will be returned:

+----------------------------------------------------------------------+
| Filename                        | Language          | Date           |
+----------------------------------------------------------------------+
| kjslfiewofksalfjslfakj          | Swahili           | 2011-11-01     |
| fiwojancanlsjfhkwehwlk          | Swahili           | 2011-11-01     |
+----------------------------------------------------------------------+
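The lookup above could be sketched roughly as follows. This uses an in-memory SQLite database purely for illustration; the table name `audio`, the column names, and the helper `files_for` are assumptions, and the real site would presumably run the equivalent query on whatever database sits behind the ASP.NET pages.

```python
import sqlite3

# In-memory database standing in for the server-side one (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio (filename TEXT, language TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO audio VALUES (?, ?, ?)",
    [
        ("kjslfiewofksalfjslfakj", "Swahili", "2011-11-01"),
        ("shfaahflajfewifhlanfww", "Guyanese", "2011-10-08"),
        ("weijalfjlajfljalsfjewn", "French", "2011-11-01"),
        ("fiwojancanlsjfhkwehwlk", "Swahili", "2011-11-01"),
        ("fhalksflwiehlfnaksflhw", "Swahili", "2011-11-03"),
    ],
)


def files_for(conn, language, date):
    """Return the file names matching one language and one date."""
    cur = conn.execute(
        "SELECT filename FROM audio "
        "WHERE language = ? AND date = ? ORDER BY rowid",
        (language, date),
    )
    return [row[0] for row in cur]
```

With the sample rows, `files_for(conn, "Swahili", "2011-11-01")` yields the two file names shown in the result table.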

You could also easily add references for the date and language that indicate an Akamai record. It doesn't strike me as terribly complicated, but it may mean significant redesign for you. However, you've been purposefully vague on details, so hopefully at least this points you in the right direction.

Edit:

Alright, so after re-reading, there may be a relatively easy way to organize the content by using directory structures, but it takes a backseat to my proposed table.

As I understand it, there are potentially three categories at work: program, date, and language.

If I create a file structure (assuming root):

/public_html/audio/[date]/[language]/[program_name].mp4

Then, when the user selects a date and language, we might have:

/public_html/audio/2011-11-14/swahili/the_linux_show.mp4

Then all we'd have to do is read the $_POST data from the selectors to provide the show... Unfortunately, this means we have to know the date the show aired, then the language, then the show name. This would be a far worse way than a database, but it could be done. Use ASP to read the directory contents and you can list them using loops. Seems pretty simple, but not at all elegant.
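For what it's worth, the mapping between the three selections and a path is mechanical. A small sketch, assuming the layout shown above; the helper names `audio_path` and `split_audio_path` are made up for the example:

```python
import posixpath

AUDIO_ROOT = "/public_html/audio"  # root of the layout shown above


def audio_path(date, language, program_name):
    """Build the file path from the three user selections."""
    return posixpath.join(AUDIO_ROOT, date, language, program_name + ".mp4")


def split_audio_path(path):
    """Recover (date, language, program_name) from a file path."""
    rel = posixpath.relpath(path, AUDIO_ROOT)
    date, language, filename = rel.split("/")
    program_name, _ext = posixpath.splitext(filename)
    return date, language, program_name
```

The second helper is what a directory-listing loop would use to turn file names back into entries for the selectors.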

Think outside the box: use UIWebView

How about, instead of thinking about how to parse data and then write UI code to display it, we think more of the big picture: we want to present the iPhone user with a sequence of screens to select and play a recording, and this should be coming from a web server. If only there were such a tool... but wait, there is! It's called a web browser, and in the form of UIWebView you can integrate it into your interface, with a little twist.

First, adding a UIWebView is very easy; check http://zpasternack.blogspot.com/2010/09/stupid-uialertview-tricks-part-i.html for an illustration.

So let's say we added a web view and the user can select an audio file from there. What happens then? Turns out you can tell it what should happen; check the question "UIWebView open links in Safari". You can hook your code into the handling of link clicks and do whatever you please (like hide the web view and show the player, etc.).

To give an example, say first in the web view you load
http://foobar.com/somepath/listOfPrograms
which happens to be a web page showing the list of programs (which, thanks to some clever CSS, could look just like a UITableView if you please). The user clicks on a program name, which goes to
http://foobar.com/somepath/programs/CarTalk
a page that presents a list of weekly shows (again formatted iPhone-style), and when a link is clicked, it now points to
http://audio.foobar.com/somesuch/45678913.mp3
at which point your code recognizes that it's an audio URL, takes control and plays it however it pleases.
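The "recognizes that it's an audio URL" part boils down to a small decision function. Here is a sketch of that logic in Python, to keep it short and testable on its own. In the app it would sit inside the UIWebViewDelegate method webView:shouldStartLoadWithRequest:navigationType:, returning NO for audio links and YES otherwise. The set of extensions and the function names are assumptions.

```python
import posixpath
from urllib.parse import urlparse

# Extensions treated as audio; an assumption for this sketch.
AUDIO_EXTENSIONS = {".mp3", ".mp4", ".m4a"}


def is_audio_url(url):
    """True when the URL path ends in a known audio extension."""
    path = urlparse(url).path
    _name, ext = posixpath.splitext(path)
    return ext.lower() in AUDIO_EXTENSIONS


def handle_link(url):
    """Decide what the app does with a clicked link.

    "play": stop the web view from loading and hand the URL to the player.
    "load": let the web view navigate as usual.
    """
    return "play" if is_audio_url(url) else "load"
```

Matching on the path rather than the whole URL means a query string after the file name does not confuse the check.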

How useful is that, you may wonder. The answer is "very" :-). It moves the presentation structure away from the app and onto the web server. The app's entry into the UIWebView is the initial URL and the exit is a click on an audio file link. In a few months someone may decide they want the choices not to be made first by program name and then by day of the week, or to add an additional layer of choice by language or country. No problemo: no need to release a new version of the app, just tweak the web pages on the server a bit and the app will pick it up automagically.

It also makes testing the web server side easy: just point any browser to the initial page URL and click through to see if you make it to a viable audio file. The webmaster can handle that independently of you, the app writer. You don't even have to care what they use on their side to produce those pages, whether it's hard-coded in HTML or comes from a SQL DB, an XML tarpit, whatever.
