iOS: Download, unwrap, decode, and parse a large file

A project I'm working on (iPhone/Obj-C) requires me to fetch a large file (via HTTP POST) and process it. The server will return some XML wrapping Base64-encoded, gzipped XML data, i.e.: SERVER -> XML -> Base64 -> GZIP -> XML -> My Model

The amount of data will vary, but I'm told the final XML will be about 5 MB.

I'd like to unwrap, decode, and parse the data as it arrives.

I'm looking for tips and pointers. (Ideally, there's existing published code out there, but I didn't find any "stream friendly" examples in my searching.)

Will I end up subclassing NSStream?

The ideal solution will work on devices running iOS 3.2 and later.

Thanks!

Have the server (Apache?) do the gzip at the HTTP level (Content-Encoding: gzip), and NSURLConnection on iOS will un-gzip the response as it arrives. HTTP can carry binary data, so Base64 is not needed either. You should be able to get the XML delivered to your NSURLConnection delegate as NSData chunks, which you can feed into a SAX-style parser (one that can parse while the download is still in progress).
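
To illustrate the "feed chunks into a SAX-style parser" idea, here is a minimal sketch. It is written in Python purely as a language-neutral illustration, not in the Obj-C the project uses; the `item` tag name and the `parse_chunks` helper are hypothetical. On iOS the same shape would presumably be a push parser (for example libxml2's `xmlParseChunk`) fed from the connection's data callback.

```python
import xml.sax


class ItemCollector(xml.sax.ContentHandler):
    """Collects the text of every <item> element (a hypothetical tag name)."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._buf = None  # None while we are outside an <item>

    def startElement(self, name, attrs):
        if name == "item":
            self._buf = []

    def characters(self, content):
        # Text may be delivered in several pieces, so accumulate it.
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "item" and self._buf is not None:
            self.items.append("".join(self._buf))
            self._buf = None


def parse_chunks(chunks):
    """Feed byte chunks to an incremental SAX parser as they 'arrive'."""
    handler = ItemCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)   # chunks may split tags at arbitrary points
    parser.close()
    return handler.items
```

The handler sees each element as soon as enough bytes have arrived, so the model can be built without ever holding the whole document in one buffer.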

If the server is under your control, is only used by an iOS app, and performance is your main concern, you could try sending your model data encoded as a binary plist. XML or JSON is probably going to be easier to work with, though.
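
As a rough idea of what the server side of the binary plist option could look like, here is a small sketch using Python's standard `plistlib`. The model dictionary and the `encode_model` / `decode_model` helper names are made up for illustration; on the device, `NSPropertyListSerialization` is what would read the resulting bytes.

```python
import plistlib

# Hypothetical model data; the real structure is not given in the thread.
model = {"items": [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]}


def encode_model(obj):
    """Serialize plain dicts/lists/strings/numbers as a binary plist."""
    return plistlib.dumps(obj, fmt=plistlib.FMT_BINARY)


def decode_model(data):
    """Round-trip check; plistlib detects the binary format on its own."""
    return plistlib.loads(data)
```

A binary plist is identifiable by its `bplist00` header, which is a quick sanity check that the server really sent the binary form and not the XML one.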

Well, this is not the answer to the question I asked, but perhaps a "solution".

The data I'm downloading fits comfortably in memory, so there is no pressing need to optimize things to process it as a stream.

  1. I use the fantastic ASIHTTPRequest library (Ben Copsey) to fetch the initial XML and just run it through an NSXMLParser to grab the tag. I highly recommend ASIHTTPRequest to anyone using HTTP on iOS.

  2. Next, I use a slightly tweaked (to get rid of clang warnings) version of Matt Gallagher's Base64 category to decode the Base64 into gzip data.

  3. Then I run the gzip data through ASI's decoder: NSData* xmlData = [ASIDataDecompressor uncompressData:gzippedData error:&error]; to get at the XML that should have been sent all along.

  4. Finally, I run the XML through another NSXMLParser to pick out the bits of data I need.
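
For readers who want to see the four steps above end to end, here is a minimal in-memory sketch. It is Python rather than Obj-C, it uses the standard library in place of ASIHTTPRequest and the Base64 category, and the `payload` tag name and the `wrap` helper are assumptions (the thread does not name the wrapping tag).

```python
import base64
import gzip
import xml.etree.ElementTree as ET


def unwrap(outer_xml, payload_tag="payload"):
    """Outer XML -> Base64 text -> gzip bytes -> parsed inner XML root."""
    outer = ET.fromstring(outer_xml)                      # step 1: outer XML
    node = outer if outer.tag == payload_tag else outer.find(".//" + payload_tag)
    if node is None or node.text is None:
        raise ValueError("no <%s> element with text found" % payload_tag)
    gzipped = base64.b64decode(node.text)                 # step 2: Base64 -> gzip
    inner_xml = gzip.decompress(gzipped)                  # step 3: gzip -> XML bytes
    return ET.fromstring(inner_xml)                       # step 4: parse inner XML


def wrap(inner_xml, payload_tag="payload"):
    """Build a test envelope the way the server presumably does."""
    b64 = base64.b64encode(gzip.compress(inner_xml)).decode("ascii")
    return "<response><%s>%s</%s></response>" % (payload_tag, b64, payload_tag)
```

Every stage holds a complete copy of the data, which is fine at around 5 MB but is exactly what a streaming version would avoid.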

In another part of the project, I'm actually directed to fetch a ZIP archive containing a few hundred tiny .txt files. (Yeah, it's that kind of gig.) To unpack the ZIP file, I'm currently using ZipKit by Karl Moskowski.

I hope the data never grows to the point where I'll need to process it all as a stream. If it does, I know an easy way to shave off 33%. :)
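
If the data ever did outgrow memory, the streaming pipeline the question originally asked about could be sketched roughly as follows. Again this is Python for illustration only, and the `StreamingUnwrapper` class, the `payload` tag, and the callback name are all hypothetical. The two details that matter are that Base64 can only be decoded in groups of 4 characters (so a remainder is held back between chunks) and that the inflater is created in gzip mode (in C zlib terms, `inflateInit2` with `16 + MAX_WBITS`).

```python
import base64
import zlib
import xml.parsers.expat


class StreamingUnwrapper:
    """Outer XML -> Base64 -> gzip -> inner XML, one chunk at a time."""

    def __init__(self, payload_tag, on_inner_start):
        self._payload_tag = payload_tag
        self._in_payload = False
        self._b64_tail = ""  # fewer than 4 leftover Base64 chars between chunks
        self._inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)  # gzip framing
        self._inner = xml.parsers.expat.ParserCreate()
        self._inner.StartElementHandler = on_inner_start
        self._outer = xml.parsers.expat.ParserCreate()
        self._outer.StartElementHandler = self._outer_start
        self._outer.EndElementHandler = self._outer_end
        self._outer.CharacterDataHandler = self._outer_text

    def _outer_start(self, name, attrs):
        if name == self._payload_tag:
            self._in_payload = True

    def _outer_end(self, name):
        if name == self._payload_tag:
            self._in_payload = False

    def _outer_text(self, text):
        if not self._in_payload:
            return
        # Strip whitespace, then decode only whole 4-character groups.
        data = self._b64_tail + "".join(text.split())
        usable = len(data) - len(data) % 4
        self._b64_tail = data[usable:]
        if usable:
            raw = base64.b64decode(data[:usable])
            xml_bytes = self._inflater.decompress(raw)
            if xml_bytes:
                self._inner.Parse(xml_bytes, False)

    def feed(self, chunk):
        """Call with each chunk of the HTTP response body as it arrives."""
        self._outer.Parse(chunk, False)

    def close(self):
        """Finish both parsers; raises ExpatError if either document is incomplete."""
        self._outer.Parse(b"", True)
        self._inner.Parse(self._inflater.flush(), True)
```

Only one network chunk, a few Base64 characters, and the decompressor's window are in memory at any time; the inner handler fires while the download is still running.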
