Java: Unable to read full file

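I need some help with a problem. I'm trying to load my list of 2000 proxies from a text file, but my class only fills 1040 array indexes with the contents read from each line.

I don't know what to do. :(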

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class ProxyLoader {

private String[] lineSplit = new String[100000];
private static String[] addresses = new String[100000];
private static int[] ports = new int[100000];
public int i = 0;

public ProxyLoader() {
    readData();
}

public synchronized String getAddr(int i) {
    return this.addresses[i];
}

public synchronized int getPort(int i) {
    return this.ports[i];
}

public synchronized void readData() {
    try {
        BufferedReader br = new BufferedReader(
                new FileReader("./proxy.txt"));
        String line = "";

        try {
            while ((line = br.readLine()) != null) {

                lineSplit = line.split(":");
                i++;

                addresses[i] = lineSplit[0];
                ports[i] = Integer.parseInt(lineSplit[1]);
                System.out.println("Line Number [" + i + "]  Adr: "
                        + addresses[i] + " Port: " + ports[i]);
            }

            for (String s : addresses) {
                if (s == null) {
                    s = "127.0.0.1";
                }
            }

            for (int x : ports) {
                if (x == 0) {
                    x = 8080;
                }
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}

}

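Let's start by tidying up your code; there are a lot of issues that could be causing you trouble. However, without the relevant parts of your proxy file, we can't test or reproduce the behavior you're seeing. Consider creating and posting an SSCCE rather than just a code snippet.

  1. Indent/format your code properly.
  2. These methods do not need to be (and should not be) synchronized. Reading from arrays is safe in a multithreaded environment, and you should never be constructing multiple ProxyLoader instances on different threads, which means the synchronized on readData() is simply wasteful.
  3. Creating massive arrays is a very poor way to store this data. Allocating that much extra memory is wasteful, and your code will fail if the loaded file happens to be larger than the constant you set. Use a growable data structure such as an ArrayList or a Map.
  4. You're storing the addresses and ports in separate arrays; using one object to hold both values will save memory and prevent data inconsistency.
  5. Your public int i variable is dangerous. Presumably you're using it to represent the maximum number of lines loaded, but this should be avoided in favor of a size() method. As a public instance variable, anyone using the class can change its value. Also, i is a poor name for this variable; max would be a better choice.
  6. You probably don't want readData() to be public, since calling it more than once would do very strange things (it would load the file again, starting from i, and fill the arrays with duplicate data). The best idea is to load the data directly in the constructor (or in a private method called by the constructor), so that the file is loaded only once per ProxyLoader instance created.
  7. You're creating a massive empty array lineSplit and then replacing it with the result of String.split(). This is confusing and wasteful; use a local variable to hold the split line.
  8. You're not closing the file after you read it, which could cause resource leaks or other inconsistencies in your data. Using the try-with-resources syntax makes this easy.
  9. After populating the string and port arrays, you loop over both of them in their entirety to fill the remaining slots with what is essentially noise. It's unclear what you're trying to accomplish by doing this, but I'm sure it's a bad plan.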
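
A side note on point 9: as far as I can tell, those two for-each loops never change the arrays at all, because assigning to the loop variable only rebinds a local copy. A minimal sketch of the difference (the class and method names are made up for illustration and are not part of the original code):

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the original code.
class ForEachDefaultsDemo {

    // Mirrors the loop from the question: assigns to the loop variable only.
    static String[] fillWithForEach(String[] addresses) {
        for (String s : addresses) {
            if (s == null) {
                s = "127.0.0.1"; // rebinds the local variable; the array is untouched
            }
        }
        return addresses;
    }

    // Index-based loop that really writes the default into the array.
    static String[] fillWithIndex(String[] addresses) {
        for (int k = 0; k < addresses.length; k++) {
            if (addresses[k] == null) {
                addresses[k] = "127.0.0.1";
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fillWithForEach(new String[] {"10.0.0.1", null})));
        System.out.println(Arrays.toString(fillWithIndex(new String[] {"10.0.0.1", null})));
    }
}
```

With a growable list there are no unused slots to fill, so the question of defaults goes away entirely.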

I'd suggest the following implementation:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;

public class ProxyLoader implements Iterable<ProxyLoader.Proxy> {
  // Remove DEFAULT_PROXY if not needed
  private static final Proxy DEFAULT_PROXY = new Proxy("127.0.0.1", 8080);
  private static final String DATA_FILE = "./proxy.txt";
  private ArrayList<Proxy> proxyList = new ArrayList<>();

  public ProxyLoader() {
    // Try-with-resources ensures file is closed safely and cleanly
    try(BufferedReader br = new BufferedReader(new FileReader(DATA_FILE))) {
      String line;
      while ((line = br.readLine()) != null) {
        String[] lineSplit = line.split(":");
        Proxy p = new Proxy(lineSplit[0], Integer.parseInt(lineSplit[1]));
        proxyList.add(p);
      }
    } catch (IOException e) {
      System.err.println("Failed to open/read "+DATA_FILE);
      e.printStackTrace(System.err);
    }
  }

  // If you request a positive index larger than the size of the file, it will return
  // DEFAULT_PROXY, since that's the behavior your original implementation
  // essentially did.  I'd suggest deleting DEFAULT_PROXY, having this method simply
  // return proxyList.get(i), and letting it fail if you request an invalid index.
  public Proxy getProxy(int i) {
    if(i < proxyList.size()) {
      return proxyList.get(i);
    } else {
      return DEFAULT_PROXY;
    }
  }

  // Lets you safely get the maximum index, without exposing the list directly
  public int getSize() {
    return proxyList.size();
  }

  // lets you run for(Proxy p : proxyLoader) { ... }
  @Override
  public Iterator<Proxy> iterator() {
    return proxyList.iterator();
  }

  // Inner static class just to hold data
  // can be pulled out into its own file if you prefer
  public static class Proxy {
    // note these values are public; since they're final, this is safe.
    // Using getters is more standard, but it adds a lot of boilerplate code
    // somewhat needlessly; for a simple case like this, public final should be fine.
    public final String address;
    public final int port;

    public Proxy(String a, int p) {
      address = a;
      port = p;
    }
  }
}

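I've included some examples that may not fit your use case exactly, but they show some ways of writing code that is easier to maintain and read.

Your code is hard to read, and hard to debug and maintain.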

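  • Objects need to validate their input (constructor args).
  • Reject bad data. Fail sooner rather than later; late failures are harder to debug.
  • Never catch an exception unless you can recover from it. Either soften it (wrap it in a runtime exception and rethrow it) or add it to the throws clause. If you don't know what to do with it, do nothing.
  • Never eat an exception. Rethrow it or handle it.
  • Your code holds on to state that it does not need.
  • A class is more self-describing than two parallel arrays.
  • Avoid public fields, unless they are final.
  • Protect the state of your objects.
  • Think about how your methods will be called, and avoid side effects. Calling readData twice would cause hard-to-debug side effects.
  • Memory is cheap, but it is not free. Don't instantiate large arrays that you don't need.
  • If you open it, you have to close it.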
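
To make "soften it" and "if you open it, you have to close it" concrete, here is a minimal sketch. It assumes Java 8 (for UncheckedIOException), and SoftenDemo is a hypothetical name used only for this illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class SoftenDemo {

    // Reads all lines; an IOException is wrapped and rethrown instead of being swallowed.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // "Soften": keep the cause and let the caller decide what to do
            throw new UncheckedIOException("Failed to read lines", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines(new StringReader("127.0.0.1:8080\n192.55.55.55:9090")));
    }
}
```

The caller still sees the failure and its cause, but does not have to declare a checked exception it cannot handle.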

Java 7 and 8 let you read lines through the FileSystem API, so there is no need to write most of this code in the first place:

 Path thePath = FileSystems.getDefault().getPath(location);
 return Files.readAllLines(thePath, Charset.forName("UTF-8"));
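
For a plain local file the same call can be written a little more compactly with Paths.get and StandardCharsets. A sketch, assuming Java 7 or later; the class name and the temporary file exist only to make the demo self-contained:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, for illustration only.
class ReadAllLinesDemo {

    // Reads every line of a UTF-8 text file into memory.
    static List<String> readLines(String location) throws IOException {
        Path path = Paths.get(location);
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Write a small throwaway file so the example can run anywhere
        Path tmp = Files.createTempFile("proxy", ".txt");
        Files.write(tmp, Arrays.asList("127.0.0.1:8080", "192.55.55.55:9090"),
                StandardCharsets.UTF_8);
        System.out.println(readLines(tmp.toString()));
        Files.delete(tmp);
    }
}
```

Note that readAllLines loads the whole file at once, which is fine for a list of a few thousand proxies.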

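If you have to read lots of small files into lines and don't want to use FileSystem, or you're on Java 6 or Java 5, you would create a utility class like the following (note that the try-with-resources blocks shown here need Java 7; on older versions you would close the reader in a finally block):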

public class IOUtils {

    public final static String CHARSET = "UTF-8";

...

public static List<String> readLines(File file) {
    try (FileReader reader = new FileReader(file)) {
        return readLines(reader);
    } catch (Exception ex) {
        return Exceptions.handle(List.class, ex);
    }
}

That calls the readLines overload that takes a Reader:

public static List<String> readLines(Reader reader) {

    try (BufferedReader bufferedReader = new BufferedReader(reader)) {
          return readLines(bufferedReader);
    } catch (Exception ex) {
        return Exceptions.handle(List.class, ex);
    }
}

That in turn calls the readLines overload that takes a BufferedReader:

public static List<String> readLines(BufferedReader reader) {
    List<String> lines = new ArrayList<>(80);

    try (BufferedReader bufferedReader = reader) {
        String line;
        while ((line = bufferedReader.readLine()) != null) {
            lines.add(line);
        }

    } catch (Exception ex) {

        return Exceptions.handle(List.class, ex);
    }
    return lines;
}

Apache has a set of utilities called Apache Commons (http://commons.apache.org/). It includes Lang, and it includes IO utils (http://commons.apache.org/proper/commons-io/). Either of these would be a good choice if you are on Java 5 or Java 6.

Going back to our example, you can turn any location into a list of lines:

public static List<String> readLines(String location) {
    URI uri =  URI.create(location);

    try {

        if ( uri.getScheme()==null ) {

            Path thePath = FileSystems.getDefault().getPath(location);
            return Files.readAllLines(thePath, Charset.forName("UTF-8"));

        } else if ( uri.getScheme().equals("file") ) {

            Path thePath = FileSystems.getDefault().getPath(uri.getPath());
            return Files.readAllLines(thePath, Charset.forName("UTF-8"));

        } else {
            return readLines(location, uri);
        }

    } catch (Exception ex) {
         return Exceptions.handle(List.class, ex);
    }

}

FileSystem, Path, URI, etc. are all in the JDK.

Continuing the example:

private static List<String> readLines(String location, URI uri) throws Exception {
    try {

        FileSystem fileSystem = FileSystems.getFileSystem(uri);
        Path fsPath = fileSystem.getPath(location);
        return Files.readAllLines(fsPath, Charset.forName("UTF-8"));

    } catch (ProviderNotFoundException ex) {
         return readLines(uri.toURL().openStream());
    }
}

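The above tries to read the URI through a FileSystem, and if no provider can load it, it falls back to looking it up through a URL stream. URL, URI, File, FileSystem, etc. are all part of the JDK.

To convert the URL stream into a Reader, and then into a list of strings, we use: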

public static List<String> readLines(InputStream is) {

    try (Reader reader = new InputStreamReader(is, CHARSET)) {

        return readLines(reader);

    } catch (Exception ex) {

        return Exceptions.handle(List.class, ex);
    }
}

:)

Now let's get back to our example (we can now read lines from anywhere, including files):

public static final class Proxy {
    private final String address;
    private final int port;
    private static final String DATA_FILE = "./files/proxy.txt";

    private static final Pattern addressPattern = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");

    private Proxy(String address, int port) {

        /* Validate address is not null. */
        Objects.requireNonNull(address, "address should not be null");

        /* Validate port is in range. */
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port is not in range port=" + port);
        }

        /* Validate address is of the form 123.12.1.5 .*/
        if (!addressPattern.matcher(address).matches()) {
            throw new IllegalArgumentException("Invalid Inet address");
        }

        /* Now initialize our address and port. */
        this.address = address;
        this.port = port;
    }

    private static Proxy createProxy(String line) {
        String[] lineSplit = line.split(":");
        String address = lineSplit[0];
        int port =  parseInt(lineSplit[1]);
        return new Proxy(address, port);
    }

    public final String getAddress() {
        return address;
    }

    public final int getPort() {
        return port;
    }

    public static List<Proxy> loadProxies() {
        List <String> lines = IOUtils.readLines(DATA_FILE);
        List<Proxy> proxyList  = new ArrayList<>(lines.size());

        for (String line : lines) {
            proxyList.add(createProxy(line));
        }
        return proxyList;
    }

}

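Notice that we don't have any mutable state. This prevents bugs. It makes your code easier to debug and support.

Notice that our IOUtils.readLines reads the lines from the file system.

Notice the extra work in the constructor to make sure that no one initializes a Proxy instance with bad state. Everything used for this is in the JDK: Objects, Pattern, etc.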
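
The same fail-fast idea can be applied to the line parsing itself. createProxy assumes every line is a well-formed address:port pair; a line without a colon would fail with an ArrayIndexOutOfBoundsException that says nothing about the input. A sketch of a stricter parser follows (ParsedProxy and its error messages are hypothetical, not part of the code in this answer):

```java
// Hypothetical value class with a validating parser; for illustration only.
final class ParsedProxy {
    final String address;
    final int port;

    private ParsedProxy(String address, int port) {
        this.address = address;
        this.port = port;
    }

    // Parses "address:port" and rejects anything else, naming the offending line.
    static ParsedProxy parse(String line) {
        if (line == null || line.trim().isEmpty()) {
            throw new IllegalArgumentException("Empty proxy line");
        }
        String[] parts = line.trim().split(":");
        if (parts.length != 2 || parts[0].isEmpty()) {
            throw new IllegalArgumentException("Expected address:port but got: " + line);
        }
        int port;
        try {
            port = Integer.parseInt(parts[1].trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Port is not a number: " + line, e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port is not in range: " + line);
        }
        return new ParsedProxy(parts[0], port);
    }

    public static void main(String[] args) {
        ParsedProxy p = parse("127.0.0.1:8080");
        System.out.println(p.address + " " + p.port);
    }
}
```

When a 2000-line file stops loading partway through, an error message that quotes the bad line is usually the quickest way to find out why.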

If you want a reusable ProxyLoader, it would look something like this:

public static class ProxyLoader {
    private static final String DATA_FILE = "./files/proxy.txt";


    private List<Proxy> proxyList = Collections.emptyList();
    private final String dataFile;

    public ProxyLoader() {
        this.dataFile = DATA_FILE;
        init();
    }

    public ProxyLoader(String dataFile) {
        this.dataFile = dataFile;
        init();
    }

    private void init() {
        List <String> lines = IO.readLines(dataFile);
        proxyList = new ArrayList<>(lines.size());

        for (String line : lines) {
            proxyList.add(Proxy.createProxy(line));
        }
    }

    public String getDataFile() {
        return this.dataFile;
    }

    public static List<Proxy> loadProxies() {
            return new ProxyLoader().getProxyList();
    }

    public List<Proxy> getProxyList() {
        return proxyList;
    }
   ...

}

public static class Proxy {
    private final String address;
    private final int port;

    ...

    public Proxy(String address, int port) {
        ... 
        this.address = address;
        this.port = port;
    }

    public static Proxy createProxy(String line) {
        String[] lineSplit = line.split(":");
        String address = lineSplit[0];
        int port =  parseInt(lineSplit[1]);
        return new Proxy(address, port);
    }

    public String getAddress() {
        return address;
    }

    public int getPort() {
        return port;
    }
}

Coding is great. Testing is divine! Here are some tests for the example.

public static class ProxyLoader {
    private static final String DATA_FILE = "./files/proxy.txt";


    private List<Proxy> proxyList = Collections.emptyList();
    private final String dataFile;

    public ProxyLoader() {
        this.dataFile = DATA_FILE;
        init();
    }

    public ProxyLoader(String dataFile) {
        this.dataFile = dataFile;
        init();
    }

    private void init() {
        List <String> lines = IO.readLines(dataFile);
        proxyList = new ArrayList<>(lines.size());

        for (String line : lines) {
            proxyList.add(Proxy.createProxy(line));
        }
    }

    public String getDataFile() {
        return this.dataFile;
    }

    public static List<Proxy> loadProxies() {
            return new ProxyLoader().getProxyList();
    }

    public List<Proxy> getProxyList() {
        return proxyList;
    }

}

public static class Proxy {
    private final String address;
    private final int port;

    public Proxy(String address, int port) {
        this.address = address;
        this.port = port;
    }

    public static Proxy createProxy(String line) {
        String[] lineSplit = line.split(":");
        String address = lineSplit[0];
        int port =  parseInt(lineSplit[1]);
        return new Proxy(address, port);
    }

    public String getAddress() {
        return address;
    }

    public int getPort() {
        return port;
    }
}

Here is the alternative, all in one class. (I don't see much point in having ProxyLoader.)

public static final class Proxy2 {
    private final String address;
    private final int port;
    private static final String DATA_FILE = "./files/proxy.txt";

    private static final Pattern addressPattern = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");

    private Proxy2(String address, int port) {

        /* Validate address is not null. */
        Objects.requireNonNull(address, "address should not be null");

        /* Validate port is in range. */
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port is not in range port=" + port);
        }

        /* Validate address is of the form 123.12.1.5 .*/
        if (!addressPattern.matcher(address).matches()) {
            throw new IllegalArgumentException("Invalid Inet address");
        }

        /* Now initialize our address and port. */
        this.address = address;
        this.port = port;
    }

    private static Proxy2 createProxy(String line) {
        String[] lineSplit = line.split(":");
        String address = lineSplit[0];
        int port =  parseInt(lineSplit[1]);
        return new Proxy2(address, port);
    }

    public final String getAddress() {
        return address;
    }

    public final int getPort() {
        return port;
    }

    public static List<Proxy2> loadProxies() {
        List <String> lines = IO.readLines(DATA_FILE);
        List<Proxy2> proxyList  = new ArrayList<>(lines.size());

        for (String line : lines) {
            proxyList.add(createProxy(line));
        }
        return proxyList;
    }

}

Now we write the tests (testing and TDD help you work through these issues):

@Test public void proxyTest() {
    List<Proxy> proxyList = ProxyLoader.loadProxies();
    assertEquals(
            5, len(proxyList)
    );


    assertEquals(
            "127.0.0.1", idx(proxyList, 0).getAddress()
    );



    assertEquals(
            8080, idx(proxyList, 0).getPort()
    );


    //192.55.55.57:9091
    assertEquals(
            "192.55.55.57", idx(proxyList, -1).getAddress()
    );



    assertEquals(
            9091, idx(proxyList, -1).getPort()
    );


}

idx and friends are defined in my own helper lib called boon. The idx method works like Python or Ruby slice notation.
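
For readers without boon on the classpath, the negative-index behavior the tests rely on can be approximated as below. This is only a sketch of the idea under my own assumptions, not boon's actual source; NegativeIndexDemo is a made-up name:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; this is not boon's implementation.
class NegativeIndexDemo {

    // Negative indexes count from the end: -1 is the last element.
    static <T> T idx(List<T> list, int index) {
        int actual = index < 0 ? list.size() + index : index;
        // list.get still throws IndexOutOfBoundsException when out of range
        return list.get(actual);
    }

    // Length helper in the same style.
    static int len(List<?> list) {
        return list.size();
    }

    public static void main(String[] args) {
        List<String> proxies = Arrays.asList("127.0.0.1:8080", "192.55.55.57:9091");
        System.out.println(idx(proxies, -1));
        System.out.println(len(proxies));
    }
}
```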

@Test public void proxyTest2() {
    List<Proxy2> proxyList = Proxy2.loadProxies();
    assertEquals(
            5, len(proxyList)
    );


    assertEquals(
            "127.0.0.1", idx(proxyList, 0).getAddress()
    );



    assertEquals(
            8080, idx(proxyList, 0).getPort()
    );


    //192.55.55.57:9091
    assertEquals(
            "192.55.55.57", idx(proxyList, -1).getAddress()
    );



    assertEquals(
            9091, idx(proxyList, -1).getPort()
    );


}

My input file:

127.0.0.1:8080
192.55.55.55:9090
127.0.0.2:8080
192.55.55.56:9090
192.55.55.57:9091

What about my IOUtils (which is actually called IO)?

Here is a test for those who care about the IO utils:

package org.boon.utils;

import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import org.junit.Test;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.*;
import java.util.regex.Pattern;

import static java.lang.Integer.parseInt;
import static org.boon.utils.Lists.idx;
import static org.boon.utils.Lists.len;
import static org.boon.utils.Maps.copy;
import static org.boon.utils.Maps.map;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

...

This gives you an idea of the imports involved.

public class IOTest {

....

Here is a test that reads lines from a file on the file system.

@Test
public void testReadLines() {
    File testDir = new File("src/test/resources");
    File testFile = new File(testDir, "testfile.txt");


    List<String> lines = IO.readLines(testFile);

    assertLines(lines);

}

Here is a helper method that asserts the file was read correctly.

private void assertLines(List<String> lines) {

    assertEquals(
            4, len(lines)
    );


    assertEquals(
            "line 1", idx(lines, 0)
    );



    assertEquals(
            "grapes", idx(lines, 3)
    );
}

Here is a test that shows reading a file from a String path.

@Test
public void testReadLinesFromPath() {


    List<String> lines = IO.readLines("src/test/resources/testfile.txt");

    assertLines(lines);



}

This test shows reading a file from a URI.

@Test
public void testReadLinesURI() {

    File testDir = new File("src/test/resources");
    File testFile = new File(testDir, "testfile.txt");
    URI uri = testFile.toURI();


    //"file:///....src/test/resources/testfile.txt"
    List<String> lines = IO.readLines(uri.toString());
    assertLines(lines);


}

Here is the handler for a test showing that you can read the lines of a file served by an HTTP server:

static class MyHandler implements HttpHandler {
    public void handle(HttpExchange t) throws IOException {

        File testDir = new File("src/test/resources");
        File testFile = new File(testDir, "testfile.txt");
        String body = IO.read(testFile);
        // Content length must be the byte count, not the character count
        byte[] bytes = body.getBytes(IO.CHARSET);
        t.sendResponseHeaders(200, bytes.length);
        OutputStream os = t.getResponseBody();
        os.write(bytes);
        os.close();
    }
}

Here is the HTTP server test (it instantiates an HTTP server).

@Test
public void testReadFromHttp() throws Exception {

    HttpServer server = HttpServer.create(new InetSocketAddress(9666), 0);
    server.createContext("/test", new MyHandler());
    server.setExecutor(null); // creates a default executor
    server.start();

    Thread.sleep(1000);

    List<String> lines = IO.readLines("http://localhost:9666/test");
    assertLines(lines);

}

}

You can see all the source code for this example and this utility class here:

https://github.com/RichardHightower/boon

https://github.com/RichardHightower/boon/blob/master/src/main/java/org/boon/utils/IO.java

Or come see me at:

http://rick-hightower.blogspot.com/
