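Java Unable to read full file
I need some help with a problem. I'm trying to load my list of 2000 proxies from a text file, but my class only fills 1040 array indices with what is read from each line.
I'm not sure what to do. :(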
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class ProxyLoader {
private String[] lineSplit = new String[100000];
private static String[] addresses = new String[100000];
private static int[] ports = new int[100000];
public int i = 0;
public ProxyLoader() {
readData();
}
public synchronized String getAddr(int i) {
return this.addresses[i];
}
public synchronized int getPort(int i) {
return this.ports[i];
}
public synchronized void readData() {
try {
BufferedReader br = new BufferedReader(
new FileReader("./proxy.txt"));
String line = "";
try {
while ((line = br.readLine()) != null) {
lineSplit = line.split(":");
i++;
addresses[i] = lineSplit[0];
ports[i] = Integer.parseInt(lineSplit[1]);
System.out.println("Line Number [" + i + "] Adr: "
+ addresses[i] + " Port: " + ports[i]);
}
for (String s : addresses) {
if (s == null) {
s = "127.0.0.1";
}
}
for (int x : ports) {
if (x == 0) {
x = 8080;
}
}
} catch (IOException e) {
e.printStackTrace();
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
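Let's start by cleaning up your code; there are a number of issues that could be causing you trouble. Without the relevant parts of your proxy file, however, we can't test or reproduce the behavior you're seeing. Consider creating and posting an SSCCE rather than just a code snippet.
- synchronized is unnecessary here: reading from an array is safe in a multi-threaded environment, and you should never be constructing multiple ProxyLoader instances on different threads, so the synchronized on readData() is simply wasteful.
- Rather than pre-allocating huge fixed-size arrays, collect the data in an ArrayList or a Map.
- The public int i variable is dangerous. Presumably you use it to represent the number of lines loaded, but that should be dropped in favor of a size() method. As a public instance variable, anyone using the class can change its value. i is also a poor name for this variable; max would be a better choice.
- readData() should not be public, because calling it more than once does very strange things (it loads the file again, starting at i, and fills the arrays with duplicate data). The best idea is to load the data directly in the constructor (or in a private method called by the constructor), so that the file is loaded only once per ProxyLoader instance created.
- You allocate a huge empty array for lineSplit and then replace it with the result of String.split(). This is confusing and wasteful; use a local variable to hold the split line.
I'd suggest the following implementation: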
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
public class ProxyLoader implements Iterable<ProxyLoader.Proxy> {
// Remove DEFAULT_PROXY if not needed
private static final Proxy DEFAULT_PROXY = new Proxy("127.0.0.1", 8080);
private static final String DATA_FILE = "./proxy.txt";
private ArrayList<Proxy> proxyList = new ArrayList<>();
public ProxyLoader() {
// Try-with-resources ensures file is closed safely and cleanly
try(BufferedReader br = new BufferedReader(new FileReader(DATA_FILE))) {
String line;
while ((line = br.readLine()) != null) {
String[] lineSplit = line.split(":");
Proxy p = new Proxy(lineSplit[0], Integer.parseInt(lineSplit[1]));
proxyList.add(p);
}
} catch (IOException e) {
System.err.println("Failed to open/read "+DATA_FILE);
e.printStackTrace(System.err);
}
}
// If you request a positive index larger than the size of the file, it will return
// DEFAULT_PROXY, since that's the behavior your original implementation
// essentially did. I'd suggest deleting DEFAULT_PROXY, having this method simply
// return proxyList.get(i), and letting it fail if you request an invalid index.
public Proxy getProxy(int i) {
if(i < proxyList.size()) {
return proxyList.get(i);
} else {
return DEFAULT_PROXY;
}
}
// Lets you safely get the maximum index, without exposing the list directly
public int getSize() {
return proxyList.size();
}
// lets you run for(Proxy p : proxyLoader) { ... }
@Override
public Iterator<Proxy> iterator() {
return proxyList.iterator();
}
// Inner static class just to hold data
// can be pulled out into its own file if you prefer
public static class Proxy {
// note these values are public; since they're final, this is safe.
// Using getters is more standard, but it adds a lot of boilerplate code
// somewhat needlessly; for a simple case like this, public final should be fine.
public final String address;
public final int port;
public Proxy(String a, int p) {
address = a;
port = p;
}
}
}
I've included some things that may not exactly fit your use case, but that demonstrate some ways of writing code that is easier to maintain and read.
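Since the proxy file itself isn't shown, the reason loading stops at 1040 entries can't be confirmed. One possibility, and it is only an assumption, is a malformed line near that position: Integer.parseInt throws an unchecked NumberFormatException, and a line without a colon makes lineSplit[1] throw ArrayIndexOutOfBoundsException, neither of which is caught by a catch (IOException e) block. The sketch below (the class name ProxyLineCheck and the method findBadLines are hypothetical) reports which lines are not of the form host:port, so the file can be checked before it is loaded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProxyLineCheck {
    /** Returns one message per line that is not of the form host:port. */
    static List<String> findBadLines(List<String> lines) {
        List<String> problems = new ArrayList<>();
        for (int n = 0; n < lines.size(); n++) {
            String line = lines.get(n).trim();
            String[] parts = line.split(":");
            if (parts.length != 2) {
                // Covers blank lines, missing ports, and extra colons
                problems.add("line " + (n + 1) + ": expected host:port but got '" + line + "'");
                continue;
            }
            try {
                Integer.parseInt(parts[1].trim());
            } catch (NumberFormatException e) {
                problems.add("line " + (n + 1) + ": port is not a number: '" + parts[1] + "'");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Hypothetical sample input; in practice pass the lines read from proxy.txt
        List<String> sample = Arrays.asList("127.0.0.1:8080", "", "10.0.0.1:80a");
        for (String problem : findBadLines(sample)) {
            System.out.println(problem);
        }
    }
}
```

Running a check like this over the real file would show whether the line after the last loaded entry is the one that breaks parsing.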
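Code that is hard to read is hard to debug and maintain.
Java 7 and 8 let you read lines from the FileSystem, so there is no need to write most of this code in the first place: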
Path thePath = FileSystems.getDefault().getPath(location);
return Files.readAllLines(thePath, Charset.forName("UTF-8"));
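To make the two-line fragment above concrete, here is a minimal, self-contained sketch that writes a few lines to a temporary file and reads them back with Files.readAllLines. The class name ReadAllLinesDemo and the method roundTrip are hypothetical, and the temporary file stands in for proxy.txt.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class ReadAllLinesDemo {
    /** Writes the lines to a temp file, then reads them back as a List. */
    static List<String> roundTrip(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("proxy", ".txt");
            try {
                Files.write(tmp, lines, StandardCharsets.UTF_8);
                // The whole file is read into memory in a single call
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> lines = roundTrip(Arrays.asList("127.0.0.1:8080", "192.55.55.55:9090"));
        System.out.println(lines.size() + " lines read");
    }
}
```

Because readAllLines returns a List sized to the file, there is no fixed-size array to outgrow and no counter to keep in step with it.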
If you have to read a lot of small files into lines and don't want to use FileSystem, or you are on Java 6 or Java 5, then you would create a utility class like this:
public class IOUtils {
public final static String CHARSET = "UTF-8";
...
public static List<String> readLines(File file) {
try (FileReader reader = new FileReader(file)) {
return readLines(reader);
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
This calls readLines with a Reader:
public static List<String> readLines(Reader reader) {
try (BufferedReader bufferedReader = new BufferedReader(reader)) {
return readLines(bufferedReader);
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
That in turn calls readLines with a BufferedReader:
public static List<String> readLines(BufferedReader reader) {
List<String> lines = new ArrayList<>(80);
try (BufferedReader bufferedReader = reader) {
String line = null;
while ( (line = bufferedReader.readLine()) != null) {
lines.add(line);
}
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
return lines;
}
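Apache has a set of utilities called Apache Commons ( http://commons.apache.org/ ). It includes Lang and it includes IO utilities ( http://commons.apache.org/proper/commons-io/ ). Either of these would be a good fit if you are on Java 5 or Java 6.
Going back to our example, you can turn any location into a list of lines: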
public static List<String> readLines(String location) {
URI uri = URI.create(location);
try {
if ( uri.getScheme()==null ) {
Path thePath = FileSystems.getDefault().getPath(location);
return Files.readAllLines(thePath, Charset.forName("UTF-8"));
} else if ( uri.getScheme().equals("file") ) {
Path thePath = FileSystems.getDefault().getPath(uri.getPath());
return Files.readAllLines(thePath, Charset.forName("UTF-8"));
} else {
return readLines(location, uri);
}
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
FileSystem, Path, URI, etc. are all in the JDK.
Continuing the example:
private static List<String> readLines(String location, URI uri) throws Exception {
try {
FileSystem fileSystem = FileSystems.getFileSystem(uri);
Path fsPath = fileSystem.getPath(location);
return Files.readAllLines(fsPath, Charset.forName("UTF-8"));
} catch (ProviderNotFoundException ex) {
return readLines(uri.toURL().openStream());
}
}
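The above tries to read the URI from a FileSystem, and if it can't load it, it falls back to looking it up via the URL stream. URL, URI, File, FileSystem, etc. are all part of the JDK.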
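The dispatch in readLines(String location) hinges on what URI.getScheme() returns for a given string. The sketch below (the class name SchemeDemo and the method classify are hypothetical) shows the three branches in isolation. One caveat worth noting: a Windows-style path such as C:/proxies.txt parses as a URI whose scheme is C, so it would take the last branch rather than the plain-path branch.

```java
import java.net.URI;

public class SchemeDemo {
    /** Mirrors the three-way branch on the URI scheme used by readLines(String). */
    static String classify(String location) {
        String scheme = URI.create(location).getScheme();
        if (scheme == null) {
            return "plain path";            // e.g. a relative file path
        } else if (scheme.equals("file")) {
            return "file URI";
        } else {
            return "other scheme: " + scheme;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("src/test/resources/testfile.txt"));
        System.out.println(classify("file:///tmp/testfile.txt"));
        System.out.println(classify("http://localhost:9666/test"));
        System.out.println(classify("C:/proxies.txt")); // drive letter is parsed as a scheme
    }
}
```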
To convert the URL stream into a Reader and then into strings, we use:
public static List<String> readLines(InputStream is) {
try (Reader reader = new InputStreamReader(is, CHARSET)) {
return readLines(reader);
} catch (Exception ex) {
return Exceptions.handle(List.class, ex);
}
}
:)
Now let's get back to our example (we can now read lines from anywhere, including files):
public static final class Proxy {
private final String address;
private final int port;
private static final String DATA_FILE = "./files/proxy.txt";
private static final Pattern addressPattern = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");
private Proxy(String address, int port) {
/* Validate address in not null.*/
Objects.requireNonNull(address, "address should not be null");
/* Validate port is in range. */
if (port < 1 || port > 65535) {
throw new IllegalArgumentException("Port is not in range port=" + port);
}
/* Validate address is of the form 123.12.1.5 .*/
if (!addressPattern.matcher(address).matches()) {
throw new IllegalArgumentException("Invalid Inet address");
}
/* Now initialize our address and port. */
this.address = address;
this.port = port;
}
private static Proxy createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy(address, port);
}
public final String getAddress() {
return address;
}
public final int getPort() {
return port;
}
public static List<Proxy> loadProxies() {
List <String> lines = IOUtils.readLines(DATA_FILE);
List<Proxy> proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(createProxy(line));
}
return proxyList;
}
}
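Notice that we don't have any mutable state. This prevents bugs, and it makes your code easier to debug and support.
Notice that our IOUtils.readLines reads the lines from the file system.
Notice the extra work in the constructor to make sure that nobody initializes a Proxy instance with bad state. These are all in the JDK: Objects, Pattern, etc.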
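One thing to be aware of is that the address pattern used in the constructor checks only the shape of the address (four groups of one to three digits), so a value like 999.999.999.999 passes it. If stricter validation is wanted, a range check per octet can be layered on top. The following is a sketch of that idea and not part of the original class; AddressCheck and isDottedQuad are hypothetical names.

```java
import java.util.regex.Pattern;

public class AddressCheck {
    // Same pattern as in the Proxy class: it validates the shape only
    static final Pattern SHAPE = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");

    /** True only if the address has the dotted-quad shape and every octet is 0..255. */
    static boolean isDottedQuad(String address) {
        if (!SHAPE.matcher(address).matches()) {
            return false;
        }
        for (String octet : address.split("[.]")) {
            // Safe to parse: the pattern guarantees one to three ASCII digits
            if (Integer.parseInt(octet) > 255) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(SHAPE.matcher("999.999.999.999").matches());
        System.out.println(isDottedQuad("999.999.999.999"));
        System.out.println(isDottedQuad("192.55.55.57"));
    }
}
```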
If you want a reusable ProxyLoader, it would look something like this:
public static class ProxyLoader {
private static final String DATA_FILE = "./files/proxy.txt";
private List<Proxy> proxyList = Collections.EMPTY_LIST;
private final String dataFile;
public ProxyLoader() {
this.dataFile = DATA_FILE;
init();
}
public ProxyLoader(String dataFile) {
this.dataFile = dataFile;
init();
}
private void init() {
List <String> lines = IO.readLines(dataFile);
proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(Proxy.createProxy(line));
}
}
public String getDataFile() {
return this.dataFile;
}
public static List<Proxy> loadProxies() {
return new ProxyLoader().getProxyList();
}
public List<Proxy> getProxyList() {
return proxyList;
}
...
}
public static class Proxy {
private final String address;
private final int port;
...
public Proxy(String address, int port) {
...
this.address = address;
this.port = port;
}
public static Proxy createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy(address, port);
}
public String getAddress() {
return address;
}
public int getPort() {
return port;
}
}
Coding is great. Testing is divine! Here are some tests for the example.
public static class ProxyLoader {
private static final String DATA_FILE = "./files/proxy.txt";
private List<Proxy> proxyList = Collections.EMPTY_LIST;
private final String dataFile;
public ProxyLoader() {
this.dataFile = DATA_FILE;
init();
}
public ProxyLoader(String dataFile) {
this.dataFile = dataFile;
init();
}
private void init() {
List <String> lines = IO.readLines(dataFile);
proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(Proxy.createProxy(line));
}
}
public String getDataFile() {
return this.dataFile;
}
public static List<Proxy> loadProxies() {
return new ProxyLoader().getProxyList();
}
public List<Proxy> getProxyList() {
return proxyList;
}
}
public static class Proxy {
private final String address;
private final int port;
public Proxy(String address, int port) {
this.address = address;
this.port = port;
}
public static Proxy createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy(address, port);
}
public String getAddress() {
return address;
}
public int getPort() {
return port;
}
}
Here is an alternative that does it all in one class. (I don't see much point in ProxyLoader.)
public static final class Proxy2 {
private final String address;
private final int port;
private static final String DATA_FILE = "./files/proxy.txt";
private static final Pattern addressPattern = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");
private Proxy2(String address, int port) {
/* Validate address in not null.*/
Objects.requireNonNull(address, "address should not be null");
/* Validate port is in range. */
if (port < 1 || port > 65535) {
throw new IllegalArgumentException("Port is not in range port=" + port);
}
/* Validate address is of the form 123.12.1.5 .*/
if (!addressPattern.matcher(address).matches()) {
throw new IllegalArgumentException("Invalid Inet address");
}
/* Now initialize our address and port. */
this.address = address;
this.port = port;
}
private static Proxy2 createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy2(address, port);
}
public final String getAddress() {
return address;
}
public final int getPort() {
return port;
}
public static List<Proxy2> loadProxies() {
List <String> lines = IO.readLines(DATA_FILE);
List<Proxy2> proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(createProxy(line));
}
return proxyList;
}
}
Now we write the tests (testing and TDD help you work through these problems):
@Test public void proxyTest() {
List<Proxy> proxyList = ProxyLoader.loadProxies();
assertEquals(
5, len(proxyList)
);
assertEquals(
"127.0.0.1", idx(proxyList, 0).getAddress()
);
assertEquals(
8080, idx(proxyList, 0).getPort()
);
//192.55.55.57:9091
assertEquals(
"192.55.55.57", idx(proxyList, -1).getAddress()
);
assertEquals(
9091, idx(proxyList, -1).getPort()
);
}
idx and friends are defined in my own helper lib called boon. The idx method works like Python or Ruby slice notation.
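For readers without boon on the classpath, the negative-index behavior used in the test (idx(proxyList, -1) for the last element) can be approximated as follows. This is a hypothetical stand-in written for illustration, not boon's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

public class IdxSketch {
    /** Returns list[index]; a negative index counts back from the end, as in Python. */
    static <T> T idx(List<T> list, int index) {
        int i = index < 0 ? list.size() + index : index;
        return list.get(i); // still throws IndexOutOfBoundsException when out of range
    }

    /** Returns the number of elements, mirroring Python's len(). */
    static int len(List<?> list) {
        return list.size();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("127.0.0.1:8080", "192.55.55.57:9091");
        System.out.println(idx(lines, 0));
        System.out.println(idx(lines, -1));
    }
}
```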
@Test public void proxyTest2() {
List<Proxy2> proxyList = Proxy2.loadProxies();
assertEquals(
5, len(proxyList)
);
assertEquals(
"127.0.0.1", idx(proxyList, 0).getAddress()
);
assertEquals(
8080, idx(proxyList, 0).getPort()
);
//192.55.55.57:9091
assertEquals(
"192.55.55.57", idx(proxyList, -1).getAddress()
);
assertEquals(
9091, idx(proxyList, -1).getPort()
);
}
My input file:
127.0.0.1:8080
192.55.55.55:9090
127.0.0.2:8080
192.55.55.56:9090
192.55.55.57:9091
What about my IOUtils (actually called IO)?
Here is its test, for those who care about IO (utils):
package org.boon.utils;
import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import org.junit.Test;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.*;
import java.util.regex.Pattern;
import static java.lang.Integer.parseInt;
import static org.boon.utils.Lists.idx;
import static org.boon.utils.Lists.len;
import static org.boon.utils.Maps.copy;
import static org.boon.utils.Maps.map;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
...
That gives you an idea of the imports involved.
public class IOTest {
....
Here is a test that reads lines from a file on the file system.
@Test
public void testReadLines() {
File testDir = new File("src/test/resources");
File testFile = new File(testDir, "testfile.txt");
List<String> lines = IO.readLines(testFile);
assertLines(lines);
}
Here is a helper method that asserts the file was read correctly.
private void assertLines(List<String> lines) {
assertEquals(
4, len(lines)
);
assertEquals(
"line 1", idx(lines, 0)
);
assertEquals(
"grapes", idx(lines, 3)
);
}
Here is a test that shows reading the file from a String path.
@Test
public void testReadLinesFromPath() {
List<String> lines = IO.readLines("src/test/resources/testfile.txt");
assertLines(lines);
}
This test shows reading the file from a URI.
@Test
public void testReadLinesURI() {
File testDir = new File("src/test/resources");
File testFile = new File(testDir, "testfile.txt");
URI uri = testFile.toURI();
//"file:///....src/test/resources/testfile.txt"
List<String> lines = IO.readLines(uri.toString());
assertLines(lines);
}
Here is the handler for a test that shows you can read the lines of a file from an HTTP server:
static class MyHandler implements HttpHandler {
public void handle(HttpExchange t) throws IOException {
File testDir = new File("src/test/resources");
File testFile = new File(testDir, "testfile.txt");
String body = IO.read(testFile);
// Send the byte count, not the character count, as the response length
byte[] bytes = body.getBytes(IO.CHARSET);
t.sendResponseHeaders(200, bytes.length);
OutputStream os = t.getResponseBody();
os.write(bytes);
os.close();
}
}
Here is the HTTP server test (it instantiates an HTTP server).
@Test
public void testReadFromHttp() throws Exception {
HttpServer server = HttpServer.create(new InetSocketAddress(9666), 0);
server.createContext("/test", new MyHandler());
server.setExecutor(null); // creates a default executor
server.start();
Thread.sleep(1000);
List<String> lines = IO.readLines("http://localhost:9666/test");
assertLines(lines);
}
Here is the proxy cache test:
public static class ProxyLoader {
private static final String DATA_FILE = "./files/proxy.txt";
private List<Proxy> proxyList = Collections.EMPTY_LIST;
private final String dataFile;
public ProxyLoader() {
this.dataFile = DATA_FILE;
init();
}
public ProxyLoader(String dataFile) {
this.dataFile = dataFile;
init();
}
private void init() {
List <String> lines = IO.readLines(dataFile);
proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(Proxy.createProxy(line));
}
}
public String getDataFile() {
return this.dataFile;
}
public static List<Proxy> loadProxies() {
return new ProxyLoader().getProxyList();
}
public List<Proxy> getProxyList() {
return proxyList;
}
}
public static class Proxy {
private final String address;
private final int port;
public Proxy(String address, int port) {
this.address = address;
this.port = port;
}
public static Proxy createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy(address, port);
}
public String getAddress() {
return address;
}
public int getPort() {
return port;
}
}
public static final class Proxy2 {
private final String address;
private final int port;
private static final String DATA_FILE = "./files/proxy.txt";
private static final Pattern addressPattern = Pattern.compile("^(\\d{1,3}[.]{1}){3}[0-9]{1,3}$");
private Proxy2(String address, int port) {
/* Validate address in not null.*/
Objects.requireNonNull(address, "address should not be null");
/* Validate port is in range. */
if (port < 1 || port > 65535) {
throw new IllegalArgumentException("Port is not in range port=" + port);
}
/* Validate address is of the form 123.12.1.5 .*/
if (!addressPattern.matcher(address).matches()) {
throw new IllegalArgumentException("Invalid Inet address");
}
/* Now initialize our address and port. */
this.address = address;
this.port = port;
}
private static Proxy2 createProxy(String line) {
String[] lineSplit = line.split(":");
String address = lineSplit[0];
int port = parseInt(lineSplit[1]);
return new Proxy2(address, port);
}
public final String getAddress() {
return address;
}
public final int getPort() {
return port;
}
public static List<Proxy2> loadProxies() {
List <String> lines = IO.readLines(DATA_FILE);
List<Proxy2> proxyList = new ArrayList<>(lines.size());
for (String line : lines) {
proxyList.add(createProxy(line));
}
return proxyList;
}
}
@Test public void proxyTest() {
List<Proxy> proxyList = ProxyLoader.loadProxies();
assertEquals(
5, len(proxyList)
);
assertEquals(
"127.0.0.1", idx(proxyList, 0).getAddress()
);
assertEquals(
8080, idx(proxyList, 0).getPort()
);
//192.55.55.57:9091
assertEquals(
"192.55.55.57", idx(proxyList, -1).getAddress()
);
assertEquals(
9091, idx(proxyList, -1).getPort()
);
}
Here is the actual proxy cache test:
@Test public void proxyTest2() {
List<Proxy2> proxyList = Proxy2.loadProxies();
assertEquals(
5, len(proxyList)
);
assertEquals(
"127.0.0.1", idx(proxyList, 0).getAddress()
);
assertEquals(
8080, idx(proxyList, 0).getPort()
);
//192.55.55.57:9091
assertEquals(
"192.55.55.57", idx(proxyList, -1).getAddress()
);
assertEquals(
9091, idx(proxyList, -1).getPort()
);
}
}
You can see all of the source code for this example and for this utility class here:
https://github.com/RichardHightower/boon
https://github.com/RichardHightower/boon/blob/master/src/main/java/org/boon/utils/IO.java