
ByteArrayOutputStream to String Array

I'm writing an application which has a method that will download a text file from my server. This text file will contain ~1,000 proxy IPs. The download will happen every 10 minutes. I need to find the most efficient way of doing this.

Currently I have a method in a class called Connection which returns the bytes of whatever I want to retrieve. So if I request the text file from the server using this method, I get it back as bytes. My other method creates one very long string from these bytes. Afterwards, I split the long string into an array using System.lineSeparator(). Here is the code:

 public static void fetchProxies(String url) {
    Connection c = new Connection();
    List<Proxy> tempProxy = new ArrayList<Proxy>();
    ByteArrayOutputStream baos = 
            c.requestBytes(url);  
    String line = new String(baos.toByteArray()); 

    String[] split = line.split(System.lineSeparator());
    //more code to come but the above works fine.

}

This currently works, but I know that it isn't the most efficient way.

My Problem
Instead of turning the bytes into a very long string, what is the most efficient way of turning the bytes into my IPs, so that I can add each individual IP to an ArrayList and then return the ArrayList full of IPs?

The most efficient and logical way would be to create a BufferedReader wrapping an InputStreamReader wrapping the InputStream of the URL connection. You would then use readLine() on the BufferedReader until it returns null, and append each line read to the list of IP addresses:

List<String> ipList = new ArrayList<>();
try (BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream(), theAppropriateEncoding))) {
    String line;
    while ((line = reader.readLine()) != null) {
        ipList.add(line);
    }
}

Note that this probably won't change much in the performance of the method, though, because most of the time is spent waiting for bytes coming from the remote host, which is considerably slower than building and splitting a String in memory.
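If the Connection class can only hand back a ByteArrayOutputStream, the same readLine() loop can still be run over the bytes already in memory by wrapping them in a ByteArrayInputStream. The following is a minimal sketch under assumptions: the class name ProxyLineReader, the UTF-8 encoding and the sample addresses are made up for illustration and do not come from the question. One side benefit is that readLine() accepts "\n", "\r" and "\r\n" as line endings, whereas splitting on System.lineSeparator() depends on the platform the client runs on.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original question.
public class ProxyLineReader {

    // Reads one entry per line from bytes that are already in memory.
    // UTF-8 is an assumption; use the encoding the server actually sends.
    public static List<String> readLines(byte[] data) {
        List<String> ipList = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) { // skip blank lines
                    ipList.add(line);
                }
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return ipList;
    }

    public static void main(String[] args) {
        // Stand-in for the ByteArrayOutputStream returned by c.requestBytes(url).
        byte[] sample = "10.0.0.1:8080\r\n10.0.0.2:3128\n\n10.0.0.3:80"
                .getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write(sample, 0, sample.length);

        // prints [10.0.0.1:8080, 10.0.0.2:3128, 10.0.0.3:80]
        System.out.println(readLines(baos.toByteArray()));
    }
}
```

In fetchProxies this would be called as readLines(baos.toByteArray()). Reading straight from the connection's InputStream, as shown above, also avoids buffering the whole file first.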

The split method of String isn't the fastest way to separate all the IPs. There are other libraries that achieve this in a more optimized way. Read this: http://demeranville.com/battle-of-the-tokenizers-delimited-text-parser-performance/

It has a very nice timing comparison of 7 different ways to split a String.

For example, the Splitter class from the Guava library returns an Iterable, and with Guava you could also convert the results to a List:

import com.google.common.base.Splitter;
import com.google.common.collect.Lists;
...
public static void fetchProxies(String url) {
    Connection c = new Connection();
    List<Proxy> tempProxy = new ArrayList<Proxy>();
    ByteArrayOutputStream baos =
            c.requestBytes(url);
    String line = new String(baos.toByteArray());

    // Splitter.split returns an Iterable<String>, one element per line
    Iterable<String> parts =
        Splitter.on(System.getProperty("line.separator")).split(line);
    List<String> myList = Lists.newArrayList(parts);

    // do something with the List...
}
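For comparison, a JDK-only way to split without String.split is to scan for newlines with indexOf. This is only a sketch: the class name LineSplitter is hypothetical, it is not necessarily one of the approaches measured in the linked comparison, and whether it is actually faster for a file of about 1,000 lines would have to be measured.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
public class LineSplitter {

    // Splits text on '\n' and drops a preceding '\r', so both "\n" and "\r\n"
    // line endings are handled. Empty lines are skipped.
    public static List<String> splitLines(String text) {
        List<String> out = new ArrayList<>();
        int start = 0;
        while (start <= text.length()) {
            int end = text.indexOf('\n', start);
            if (end < 0) {
                end = text.length(); // last line without a trailing newline
            }
            int stop = end;
            if (stop > start && text.charAt(stop - 1) == '\r') {
                stop--; // strip the '\r' of a "\r\n" pair
            }
            if (stop > start) {
                out.add(text.substring(start, stop));
            }
            start = end + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [10.0.0.1:8080, 10.0.0.2:3128, 10.0.0.3:80]
        System.out.println(splitLines("10.0.0.1:8080\r\n10.0.0.2:3128\n10.0.0.3:80\n"));
    }
}
```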
