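How to remove extra information from URL OpenStream scanning?
I wrote a program that pulls surnames from US Census data (link here) and uses binary search to find the user's surname. To be clear, that part of the code works fine. Where I am having trouble is removing the extra information. When I run the code, it returns the entire line. For example: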
ROBINSON 0.233 8.372 20
when what I want is:
ROBINSON
How can I fix this? My code is below:
import java.util.ArrayList;
import java.net.*;
import java.io.*;
import java.util.Scanner;
import java.util.Collections;

public class NameGuesser
{
    public static void main(String[] args) throws IOException, MalformedURLException
    {
        ArrayList<String> lastNames = new ArrayList<String>();
        URL url = new URL("http://www.census.gov/genealogy/www/data/1990surnames/dist.all.last");
        Scanner in = new Scanner(url.openStream());
        while (in.hasNextLine())
        {
            lastNames.add(in.nextLine());
        }
        Collections.sort(lastNames);
        int low = 0;
        int high = lastNames.size();
        int loop = 0;
        int mid = 7;
        while(loop==0)
        {
            String t1 = lastNames.get(mid);
            mid = (low+high)/2;
            System.out.println("This program tries to guess your last name, but you have to give some hints.");
            System.out.println("Does your name come before " + lastNames.get(mid) + " in the dictionary? (Y/N)");
            Scanner input = new Scanner(System.in);
            String answer = input.nextLine();
            String t2 = lastNames.get(mid);
            if(t1.equals(t2))
            {
                System.out.println("Your name is " + t2);
                System.exit(0);
            }
            if(answer.equalsIgnoreCase("Y"))
            {
                high = mid;
            }
            if(answer.equalsIgnoreCase("N"))
            {
                low = mid;
            }
        }
    }
}
The problem is right here:
while (in.hasNextLine())
{
    lastNames.add(in.nextLine());
}
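You are reading the whole line and adding it to lastNames. The data is tab-delimited, so to get rid of the extra fields, split each line and keep only the first field before putting it into lastNames: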
while (in.hasNextLine())
{
    String lastNameLine = in.nextLine();
    if (lastNameLine.trim().isEmpty()) { continue; } // Skip blank lines; there is no name to extract.
    String[] parts = lastNameLine.split("\\t"); // Split by tabs.
    String lastName = parts[0]; // The surname is the first field.
    lastNames.add(lastName);
}
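The example line in the question (ROBINSON 0.233 8.372 20) appears to be separated by spaces rather than tabs, so it may be safer not to depend on the exact delimiter. Below is a minimal sketch, assuming the surname is always the first whitespace-delimited field on each line. The class name SurnameExtractor, the helpers firstField and readSurnames, and the sample rows in main are hypothetical and not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class SurnameExtractor {
    // Returns the first whitespace-delimited field of a line, or "" if the line is blank.
    static String firstField(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        // "\\s+" matches any run of spaces or tabs, so either delimiter works.
        return trimmed.split("\\s+")[0];
    }

    // Reads every remaining line from the scanner and keeps only the surname column.
    static List<String> readSurnames(Scanner in) {
        List<String> names = new ArrayList<>();
        while (in.hasNextLine()) {
            String name = firstField(in.nextLine());
            if (!name.isEmpty()) {
                names.add(name); // Blank lines are skipped.
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows standing in for the downloaded file:
        // one space-separated, one tab-separated, then a blank line.
        String sample = "ROBINSON 0.233 8.372 20\nSMITH\t1.006\t1.006\t1\n\n";
        System.out.println(readSurnames(new Scanner(sample)));
    }
}
```

In the original program, the same idea would amount to replacing lastNames.add(in.nextLine()) with a call that adds only the first field, for example lastNames.add(firstField(in.nextLine())).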