简体   繁体   English

iPhone上的HTML解析

[英]HTML Parsing on iPhone

I want to parse this site : 我想解析这个网站:

http://its.wonju.go.kr/movinginfo2/DetailSub/StopDetail.asp?StopID=1959# http://its.wonju.go.kr/movinginfo2/DetailSub/StopDetail.asp?StopID=1959#

So I tried use TFHpple. 所以我尝试使用TFHpple。 Like this : 像这样 :

NSString *htmlWillInsert = [NSString stringWithContentsOfURL:
                            [NSURL URLWithString:@"http://its.wonju.go.kr/movinginfo2/DetailSub/StopDetail.asp?StopID=1959#"]
                                                    encoding:-2147481280
                                                       error:nil];

NSData *htmlData = [htmlWillInsert dataUsingEncoding:NSUnicodeStringEncoding];

TFHpple *xpathParser =[[TFHpple alloc] initWithHTMLData:htmlData];
NSArray *busNumbs = [xpathParser search:@"//td //a[@class='new']"];
NSLog (@"%@",busNumbs);

 for (int i=0;i<[busNumbs count];i++) {
    TFHppleElement *busNumb = [busNumbs objectAtIndex:i];
    NSString *Numb = [busNumb content];
    NSString *st = [[NSString alloc]initWithFormat:@"%@",Numb];
    NSLog(@"%@",st);
}

But it isn't working :(... 但这不起作用:(...
It loads <a href="#" onClick="MapMove(22);" class="new" onFocus='this.blur();'">▷ 22번(순환) 它加载<a href="#" onClick="MapMove(22);" class="new" onFocus='this.blur();'">▷ 22번(순환) <a href="#" onClick="MapMove(22);" class="new" onFocus='this.blur();'">▷ 22번(순환) But It can't loads <font color='red'> 108분후 도착</font> <a href="#" onClick="MapMove(22);" class="new" onFocus='this.blur();'">▷ 22번(순환)但无法加载<font color='red'> 108분후 도착</font>
Like this : 像这样 :

2011-04-09 10:42:07.413 GangwonBus[5461:207] ▷ 1번(순환)
2011-04-09 10:42:07.414 GangwonBus[5461:207] ▷ 2번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 21번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 22번(순환)
2011-04-09 10:42:07.416 GangwonBus[5461:207] ▷ 23번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 24번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 25번(순환)
2011-04-09 10:42:07.422 GangwonBus[5461:207] ▷ 32번(순환)
2011-04-09 10:42:07.423 GangwonBus[5461:207] ▷ 41번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 41-1번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 42번(순환)
2011-04-09 10:42:07.425 GangwonBus[5461:207] ▷ 51번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 52번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 53번(순환)
2011-04-09 10:42:07.427 GangwonBus[5461:207] ▷ 54번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 55번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 56번(순환)
2011-04-09 10:42:07.429 GangwonBus[5461:207] ▷ 57번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 58번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 64번(기점 -> 종점)
2011-04-09 10:42:07.431 GangwonBus[5461:207] ▷ 70번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 71번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 72번(기점 -> 종점)
2011-04-09 10:42:07.433 GangwonBus[5461:207] ▷ 73번(순환)
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 74번(순환)
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 81-1번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 82번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 83번(순환)
2011-04-09 10:42:07.436 GangwonBus[5461:207] ▷ 84번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 90번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 91번(순환)

What's wrong? 怎么了?

I want parse like this : 我想这样解析:

2011-04-09 10:42:07.413 GangwonBus[5461:207] ▷ 1번(순환)
2011-04-09 10:42:07.414 GangwonBus[5461:207] ▷ 2번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 21번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 22번(순환)
2011-04-09 10:42:07.416 GangwonBus[5461:207] ▷ 23번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 24번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 25번(순환)
2011-04-09 10:42:07.422 GangwonBus[5461:207] ▷ 32번(순환)
2011-04-09 10:42:07.423 GangwonBus[5461:207] ▷ 41번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 41-1번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 42번(순환)
2011-04-09 10:42:07.425 GangwonBus[5461:207] ▷ 51번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 52번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 53번(순환)
2011-04-09 10:42:07.427 GangwonBus[5461:207] ▷ 54번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 55번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 56번(순환)
2011-04-09 10:42:07.429 GangwonBus[5461:207] ▷ 57번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 58번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 64번(기점 -> 종점)
2011-04-09 10:42:07.431 GangwonBus[5461:207] ▷ 70번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 71번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 72번(기점 -> 종점)
2011-04-09 10:42:07.433 GangwonBus[5461:207] ▷ 73번(순환) 77분후 도착
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 74번(순환)
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 81-1번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 82번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 83번(순환)
2011-04-09 10:42:07.436 GangwonBus[5461:207] ▷ 84번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 90번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 91번(순환) 108분후 도착

The <font> tag is a child node to the anchor tag. <font>标记是锚标记的子节点。 The description [aTFHppleElement content] is incorrectly described as the innerHTML as it only contains the textual content of the element and not the content of its child nodes. 描述[aTFHppleElement content]被错误地描述为innerHTML,因为它仅包含元素的文本内容,而不包含其子节点的内容。 So you will need to access the child nodes to display that information. 因此,您将需要访问子节点以显示该信息。 As it were, the code in the repository doesn't provide the neccessary API to access the child nodes but one of its forks has the changes. 实际上,存储库中的代码未提供访问子节点所需的API,但其中的一个分支具有更改。

[aTFHppleElement children] provides access to the array of child nodes. [aTFHppleElement children]提供对子节点数组的访问。 It is an array of TFHppleElement 's. 它是TFHppleElement的数组。 Looking at the HTML structure, only few of them have the child nodes so, 从HTML结构来看,只有少数几个具有子节点,因此,

for (int i=0;i<[busNumbs count];i++) {
    TFHppleElement *busNumb = [busNumbs objectAtIndex:i];
    NSString *out = [busNumb content];
    if ( [[busNumb children] count] != 0 ) {
        out = [out stringByAppendingFormat:@" %@", [[busNumb firstChild] content]];
    }
    NSLog(@"%@",out);
}

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM