Implementation for Iterator<List<String>> doesn't work correctly

I have to write an implementation of the Iterator interface.

Its constructor should look like the following:

public BlockIterator(Iterator<List<String>> iterator, String regex) {

In short, this implementation has to parse very large files, so the content can't be loaded into memory (for example, by storing it in an array or collection and processing it there); everything has to be processed on the fly.

Also, next() should return the sublist that starts at one occurrence of the pattern and runs up to the next occurrence. The next occurrence itself should not be included.

One more thing: hasNext() should be idempotent. Even after 20 calls, the result should be the same.

Here is my solution with tests:

import com.google.common.collect.Lists;
import org.junit.Test;

import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

class BlockIterator implements Iterator<List<String>> {

    private final Iterator<List<String>> iterator;
    private final Pattern pattern;

    public BlockIterator(Iterator<List<String>> iterator, String regex) {
        this.iterator = iterator;
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public boolean hasNext() {
        while (iterator.hasNext()) {
            List<String> line = iterator.next();
            for (String word : line) {
                Matcher matcher = pattern.matcher(word);
                if (matcher.find()) {
                    return true;
                }
            }
        }
        return false;
    }

    @Override
    public List<String> next() {
        String matchWord = null;
        List<String> result = Lists.newArrayList();

        while (iterator.hasNext()) {
            List<String> line = iterator.next();
            for (String word : line) {
                Matcher matcher = pattern.matcher(word);
                if (matcher.find()) {
                    if (null != matchWord) {
                        return result;
                    } else {
                        matchWord = word;
                    }
                }
                if (null != matchWord) {
                    result.add(word);
                }
            }
        }
        return result;
    }
}

public class BlockIteratorTest {

    public static final List<List<String>> lines = Lists.newArrayList(
            Lists.newArrayList("123"),
            Lists.newArrayList("- test -"),
            Lists.newArrayList("start"),
            Lists.newArrayList("end"),
            Lists.newArrayList("test123"));

    @Test
    public void testNext() throws Exception {
        List<String> expectedFirstNext = Lists.newArrayList("- test -", "start", "end");
        List<String> expectedSecondNext = Lists.newArrayList("test123");

        BlockIterator blockIterator = new BlockIterator(lines.iterator(), "test");

        List<String> actualFirstNext = blockIterator.next();
        assertEquals(expectedFirstNext, actualFirstNext);

        List<String> actualSecondNext = blockIterator.next();
        assertEquals(expectedSecondNext, actualSecondNext);
    }

    @Test
    public void testHasNext() throws Exception {
        BlockIterator blockIterator = new BlockIterator(lines.iterator(), "test");

        for (int i = 0; i < 20; i++) {
            assertTrue(blockIterator.hasNext());
        }
    }
}

It has a few failures:

  • hasNext() isn't idempotent.
  • After the second next() call, we should return only the matching sublist (because there is no more text).
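
The first failure can be reproduced in isolation. The following is a minimal sketch, not part of the original code: the class ConsumingHasNextDemo and its helper methods are hypothetical names, and the helper simply repeats the logic of hasNext() above. It suggests that every call advances the underlying iterator, so the answer changes once the source runs out.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, used only to illustrate the problem.
class ConsumingHasNextDemo {

    // Same logic as hasNext() in the question, extracted into a static helper.
    static boolean consumingHasNext(Iterator<List<String>> iterator, Pattern pattern) {
        while (iterator.hasNext()) {
            // iterator.next() advances the source; the consumed lines are gone for good.
            for (String word : iterator.next()) {
                if (pattern.matcher(word).find()) {
                    return true;
                }
            }
        }
        return false;
    }

    // Counts how many consecutive calls return true on the sample data from the test.
    static int countTrueCalls() {
        Iterator<List<String>> source = Arrays.asList(
                Arrays.asList("123"),
                Arrays.asList("- test -"),
                Arrays.asList("start"),
                Arrays.asList("end"),
                Arrays.asList("test123")).iterator();
        Pattern pattern = Pattern.compile("test");
        int count = 0;
        while (consumingHasNext(source, pattern)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // One true result per matching line, then false forever.
        System.out.println(countTrueCalls()); // prints 2
    }
}
```

With the sample data only two calls return true, one per matching line, so a loop that expects 20 true results fails on the third call.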

I couldn't find an effective solution for this case.

Any suggestions?

I tried playing with this. I'm not sure it is what you mean, but it passes your tests, so it's something! I don't understand your second failure, and I'm not sure what you want to happen when the inner lists contain more than one word, but try this anyway:

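import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IteratorTesting implements Iterator<List<String>> {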

    private final Iterator<List<String>> iterator;
    private final Pattern pattern;

    private boolean hasNext = false;
    private List<String> next = null;
    private String startNext = null;

    public IteratorTesting(Iterator<List<String>> iterator, String regex) {
        this.iterator = iterator;
        this.pattern = Pattern.compile(regex);

        hasNext = checkNext();
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    private boolean checkNext() {
        String matchWord = null;
        List<String> result = new ArrayList<>();
        if(startNext != null)
            result.add(startNext);

        while(iterator.hasNext()) {
            List<String> line = iterator.next();
            for(String word : line) {
                Matcher matcher = pattern.matcher(word);
                if(matcher.find()) {
                    if(null != matchWord || startNext != null) {
                        next = result;
                        startNext = word;
                        return true;
                    } else {
                        matchWord = word;
                    }
                }
                if(null != matchWord || startNext != null) {
                    result.add(word);
                }
            }
        }
        next = result;
        startNext = null;
        return !next.isEmpty();
    }

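    @Override
    public List<String> next() {
        // Honour the Iterator contract when no block is left.
        if (!hasNext) {
            throw new java.util.NoSuchElementException();
        }
        List<String> current = next;
        hasNext = checkNext(); // pre-compute the following block
        return current;
    }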
}

I know this is bad code; even now I can see things that could be refactored right away ( if(null != matchWord || startNext != null) { ...). Don't hate me.

You can store the matched list in a field, compare it with null in hasNext, and return its value in next.
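
One possible reading of that suggestion, as a minimal sketch rather than a definitive implementation: the class name LookAheadBlockIterator and the fields words, pendingStart and nextBlock are hypothetical. It assumes that text before the first match is skipped and that inner lists with several words are processed word by word, which the question does not specify. Only the standard library is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.regex.Pattern;

// Hypothetical look-ahead variant of the BlockIterator from the question.
class LookAheadBlockIterator implements Iterator<List<String>> {

    private final Iterator<List<String>> lines;
    private final Pattern pattern;

    // Words of the line currently being read.
    private Iterator<String> words = Collections.emptyIterator();
    // Matching word that starts the upcoming block, or null if none has been seen yet.
    private String pendingStart;
    // Block that the next call to next() will return, or null if nothing is buffered.
    private List<String> nextBlock;

    LookAheadBlockIterator(Iterator<List<String>> lines, String regex) {
        this.lines = lines;
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public boolean hasNext() {
        // Read from the source only when nothing is buffered, so repeated calls agree.
        if (nextBlock == null) {
            nextBlock = readBlock();
        }
        return nextBlock != null;
    }

    @Override
    public List<String> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        List<String> block = nextBlock;
        nextBlock = null; // hand the block over and drop our reference to it
        return block;
    }

    // Returns the next word from the source, or null once the source is exhausted.
    private String nextWord() {
        while (!words.hasNext()) {
            if (!lines.hasNext()) {
                return null;
            }
            words = lines.next().iterator();
        }
        return words.next();
    }

    // Builds one block: from a matching word up to, but excluding, the next match.
    private List<String> readBlock() {
        String word;
        // Assumption: anything before the first match is skipped.
        while (pendingStart == null && (word = nextWord()) != null) {
            if (pattern.matcher(word).find()) {
                pendingStart = word;
            }
        }
        if (pendingStart == null) {
            return null; // no further match in the source
        }
        List<String> block = new ArrayList<>();
        block.add(pendingStart);
        pendingStart = null;
        while ((word = nextWord()) != null) {
            if (pattern.matcher(word).find()) {
                pendingStart = word; // first word of the following block
                break;
            }
            block.add(word);
        }
        return block;
    }
}

class LookAheadBlockIteratorDemo {
    public static void main(String[] args) {
        List<List<String>> lines = Arrays.asList(
                Arrays.asList("123"), Arrays.asList("- test -"), Arrays.asList("start"),
                Arrays.asList("end"), Arrays.asList("test123"));
        Iterator<List<String>> blocks = new LookAheadBlockIterator(lines.iterator(), "test");
        while (blocks.hasNext()) {
            System.out.println(blocks.next()); // [- test -, start, end] then [test123]
        }
    }
}
```

At most one block and one pending word are held at a time, so memory use should stay proportional to the largest block rather than to the whole file.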
