
How to find all the possible words using adjacent letters in a matrix

I have the following test matrix:

a l i
g t m
j e a

I intend to create an algorithm that helps me find every possible word, from a given minimum length to a given maximum length, using adjacent letters only.

For example:

Minimum: 3 letters

Maximum: 6 letters

Based on the test matrix, I should have the following results:

  • ali
  • alm
  • alg
  • alt
  • ati
  • atm
  • atg
  • ...
  • atmea

etc.

I created some test code (C#) with a custom class that represents the letters.

Each letter knows its neighbors and has a generation counter (for keeping track of them during traversal).

Here is its code:

public class Letter
{
    public int X { get; set; }
    public int Y { get; set; }

    public char Character { get; set; }

    public List<Letter> Neighbors { get; set; }

    public Letter PreviousLetter { get; set; }

    public int Generation { get; set; }

    public Letter(char character)
    {
        Neighbors = new List<Letter>();
        Character = character;
    }

    public void SetGeneration(int generation)
    {
        foreach (var item in Neighbors)
        {
            item.Generation = generation;
        }
    }
}

I figured out that if I want it to be dynamic, it has to be based on recursion.

Unfortunately, the following code creates the first 4 words, then stops. That is no wonder, as the recursion is stopped by the specified generation level.

The main problem is that the recursion returns only one level, when it would be better to return to the starting point.

 private static void GenerateWords(Letter input, int maxLength, StringBuilder sb)
    {
        if (input.Generation >= maxLength)
        {               
            if (sb.Length == maxLength)
            {
                allWords.Add(sb.ToString());
                sb.Remove(sb.Length - 1, 1);
            }                
            return;
        }
        sb.Append(input.Character);
        if (input.Neighbors.Count > 0)
        {
            foreach (var child in input.Neighbors)
            {
                if (input.PreviousLetter == child)
                    continue;
                child.PreviousLetter = input;
                child.Generation = input.Generation + 1;
                GenerateWords(child, maxLength, sb);
            }
        }
    }

So, I feel a little stuck. Any idea how I should proceed?

From here, you can treat this as a graph traversal problem. You start at each given letter and find every path of length min_size to max_size, with 3 and 6 as those values in your example. I suggest a recursive routine that builds the words as paths through the grid. It will look something like the following; replace the types and pseudo-code with your preferences.

<array_of_string> build_word(size, current_node) {
    if (size == 1)  return current_node.letter as an array_of_string;
    result = <empty array_of_string>
    for each next_node in current_node.neighbours {
        solution_list = build_word(size-1, next_node);
        for each word in solution_list {
             // add current_node.letter to front of that word.
             // add this new word to the result array
        }
    }
    return the result array_of_string
}
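
As a rough illustration, the pseudo-code above could be rendered in C# along the following lines. This is only a sketch: Node, Letter, Neighbours, WordBuilder and BuildWords are hypothetical names, and the visited set is an addition that the pseudo-code does not have, included on the assumption that a path should not reuse a cell.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the pseudo-code's node type.
public sealed class Node
{
    public char Letter { get; }
    public List<Node> Neighbours { get; } = new List<Node>();
    public Node(char letter) { Letter = letter; }
}

public static class WordBuilder
{
    // Returns every word of exactly `size` letters that starts at `current`.
    // `visited` (not in the pseudo-code) holds the nodes already on the path.
    public static List<string> BuildWords(int size, Node current, HashSet<Node> visited = null)
    {
        visited = visited ?? new HashSet<Node>();
        var result = new List<string>();
        if (size < 1) return result;
        if (size == 1)
        {
            result.Add(current.Letter.ToString());
            return result;
        }

        visited.Add(current);
        foreach (var next in current.Neighbours)
        {
            if (visited.Contains(next)) continue;
            foreach (var word in BuildWords(size - 1, next, visited))
            {
                // Add the current letter to the front of each shorter word.
                result.Add(current.Letter + word);
            }
        }
        visited.Remove(current); // backtrack so sibling paths can use this node
        return result;
    }
}
```

To cover a range of lengths, call BuildWords once per size from min_size to max_size for each starting node and concatenate the results.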

Does that move you toward a solution?

When solving this kind of problem, I tend to use immutable classes because everything is so much easier to reason about. The following implementation makes use of an ad hoc ImmutableStack because it's pretty straightforward to implement one. In production code I'd probably want to look into System.Collections.Immutable to improve performance (visited would be an ImmutableHashSet<>, to point out the obvious case).

So why do I need an immutable stack? To keep track of the current character path and the visited "locations" inside the matrix. Because the selected tool for the job is immutable, sending it down recursive calls is a no-brainer: we know it can't change, so I don't have to worry about my invariants at every recursion level.

So let's implement an immutable stack.

We'll also implement a helper struct Coordinates that encapsulates our "locations" inside the matrix, gives us value equality semantics, and offers a convenient way to obtain the valid neighbors of any given location. It will obviously come in handy.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class ImmutableStack<T>: IEnumerable<T>
{
    private readonly T head;
    private readonly ImmutableStack<T> tail;

    public static readonly ImmutableStack<T> Empty = new ImmutableStack<T>(default(T), null);
    public int Count => this == Empty ? 0 : tail.Count + 1;

    private ImmutableStack(T head, ImmutableStack<T> tail)
    {
        this.head = head;
        this.tail = tail;
    }

    public T Peek()
    {
        if (this == Empty)
            throw new InvalidOperationException("Can not peek an empty stack.");

        return head;
    }

    public ImmutableStack<T> Pop()
    {
        if (this == Empty)
            throw new InvalidOperationException("Can not pop an empty stack.");

        return tail;
    }

    public ImmutableStack<T> Push(T value) => new ImmutableStack<T>(value, this);

    public IEnumerator<T> GetEnumerator()
    {
        var current = this;

        while (current != Empty)
        {
            yield return current.head;
            current = current.tail;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

// Declared public because the public GetWords method below takes it as a parameter.
public struct Coordinates: IEquatable<Coordinates>
{
    public int Row { get; }
    public int Column { get; }

    public Coordinates(int row, int column)
    {
        Row = row;
        Column = column;
    }

    public bool Equals(Coordinates other) => Column == other.Column && Row == other.Row;
    public override bool Equals(object obj)
    {
        if (obj is Coordinates)
        {
            return Equals((Coordinates)obj);
        }

        return false;
    }

    public override int GetHashCode() => unchecked(27947 ^ Row ^ Column);

    public IEnumerable<Coordinates> GetNeighbors(int rows, int columns)
    {
        var increasedRow = Row + 1;
        var decreasedRow = Row - 1;
        var increasedColumn = Column + 1;
        var decreasedColumn = Column - 1;
        var canIncreaseRow = increasedRow < rows;
        var canIncreaseColumn = increasedColumn < columns;
        var canDecreaseRow = decreasedRow > -1;
        var canDecreaseColumn = decreasedColumn > -1;

        if (canDecreaseRow)
        {
            if (canDecreaseColumn)
            {
                yield return new Coordinates(decreasedRow, decreasedColumn);
            }

            yield return new Coordinates(decreasedRow, Column);

            if (canIncreaseColumn)
            {
                yield return new Coordinates(decreasedRow, increasedColumn);
            }
        }

        if (canIncreaseRow)
        {
            if (canDecreaseColumn)
            {
                yield return new Coordinates(increasedRow, decreasedColumn);
            }

            yield return new Coordinates(increasedRow, Column);

            if (canIncreaseColumn)
            {
                yield return new Coordinates(increasedRow, increasedColumn);
            }
        }

        if (canDecreaseColumn)
        {
            yield return new Coordinates(Row, decreasedColumn);
        }

        if (canIncreaseColumn)
        {
            yield return new Coordinates(Row, increasedColumn);
        }
    }
}
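
For comparison, the same neighbor enumeration can be written more compactly by looping over row and column offsets. The sketch below is not part of the original answer: GridNeighbors and Around are hypothetical names, it assumes C# 7 or later for value tuples, and it yields the neighbors in a different order than GetNeighbors does.

```csharp
using System;
using System.Collections.Generic;

public static class GridNeighbors
{
    // Hypothetical helper: yields the in-bounds (row, column) pairs among the
    // up to eight cells surrounding the given position.
    public static IEnumerable<(int Row, int Column)> Around(int row, int column, int rows, int columns)
    {
        for (var dr = -1; dr <= 1; dr++)
        {
            for (var dc = -1; dc <= 1; dc++)
            {
                if (dr == 0 && dc == 0) continue; // skip the cell itself

                var r = row + dr;
                var c = column + dc;
                if (r >= 0 && r < rows && c >= 0 && c < columns)
                {
                    yield return (r, c);
                }
            }
        }
    }
}
```

In a 3x3 matrix this yields three neighbors for a corner, five for an edge cell and eight for the center.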

OK, now we need a method that traverses the matrix, visiting each position at most once per path, and returns the words that have at least a specified minimum number of characters and don't exceed a specified maximum.

public static IEnumerable<string> GetWords(char[,] matrix,
                                           Coordinates startingPoint,
                                           int minimumLength,
                                           int maximumLength)

That looks about right. Now, when recursing, we need to keep track of the characters collected so far and the positions we've visited. That's easy using our immutable stack, so our recursive method will look like:

static IEnumerable<string> getWords(char[,] matrix,
                                    ImmutableStack<char> path,
                                    ImmutableStack<Coordinates> visited,
                                    Coordinates coordinates,
                                    int minimumLength,
                                    int maximumLength)

Now the rest is just plumbing and connecting the wires:

public static IEnumerable<string> GetWords(char[,] matrix,
                                           Coordinates startingPoint,
                                           int minimumLength,
                                           int maximumLength)
    => getWords(matrix,
                ImmutableStack<char>.Empty,
                ImmutableStack<Coordinates>.Empty,
                startingPoint,
                minimumLength,
                maximumLength);


static IEnumerable<string> getWords(char[,] matrix,
                                    ImmutableStack<char> path,
                                    ImmutableStack<Coordinates> visited,
                                    Coordinates coordinates,
                                    int minimumLength,
                                    int maximumLength)
{
    var newPath = path.Push(matrix[coordinates.Row, coordinates.Column]);
    var newVisited = visited.Push(coordinates);

    if (newPath.Count > maximumLength)
    {
        yield break;
    }
    else if (newPath.Count >= minimumLength)
    {
        yield return new string(newPath.Reverse().ToArray());
    }

    foreach (Coordinates neighbor in coordinates.GetNeighbors(matrix.GetLength(0), matrix.GetLength(1)))
    {
        if (!visited.Contains(neighbor))
        {
            foreach (var word in getWords(matrix,
                                          newPath,
                                          newVisited,
                                          neighbor,
                                          minimumLength,
                                          maximumLength))
            {
                yield return word;
            }
        }
    }
}

And we're done. Is this the most elegant or fastest algorithm? Probably not, but I find it the most understandable and therefore the most maintainable. Hope it helps you out.
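
As a sketch of the production-code direction mentioned earlier, the same traversal can be written with System.Collections.Immutable, using ImmutableHashSet<> for the visited positions. The names ImmutableWordSearch, AllWords and Walk are hypothetical, plain value tuples stand in for Coordinates, and AllWords loops over every starting cell, which is an assumption about how the question's full result list would be produced. On .NET Framework the System.Collections.Immutable package has to be referenced separately.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

public static class ImmutableWordSearch
{
    // Hypothetical entry point: words of minimumLength..maximumLength letters,
    // starting from every cell of the matrix in turn.
    public static IEnumerable<string> AllWords(char[,] matrix, int minimumLength, int maximumLength)
    {
        for (var row = 0; row < matrix.GetLength(0); row++)
            for (var column = 0; column < matrix.GetLength(1); column++)
                foreach (var word in Walk(matrix, ImmutableStack<char>.Empty,
                                          ImmutableHashSet<(int, int)>.Empty,
                                          (row, column), 0, minimumLength, maximumLength))
                    yield return word;
    }

    private static IEnumerable<string> Walk(char[,] matrix,
                                            ImmutableStack<char> path,
                                            ImmutableHashSet<(int, int)> visited,
                                            (int Row, int Column) cell,
                                            int length,
                                            int minimumLength,
                                            int maximumLength)
    {
        if (length >= maximumLength) yield break; // path is already at the maximum

        var newPath = path.Push(matrix[cell.Row, cell.Column]);
        var newVisited = visited.Add(cell);
        var newLength = length + 1;

        if (newLength >= minimumLength)
        {
            // The stack enumerates newest-first, so reverse to get the word.
            var chars = newPath.ToArray();
            Array.Reverse(chars);
            yield return new string(chars);
        }

        for (var dr = -1; dr <= 1; dr++)
            for (var dc = -1; dc <= 1; dc++)
            {
                var next = (Row: cell.Row + dr, Column: cell.Column + dc);
                if (next.Row < 0 || next.Row >= matrix.GetLength(0)) continue;
                if (next.Column < 0 || next.Column >= matrix.GetLength(1)) continue;
                if (newVisited.Contains(next)) continue; // also skips the cell itself

                foreach (var word in Walk(matrix, newPath, newVisited, next,
                                          newLength, minimumLength, maximumLength))
                    yield return word;
            }
    }
}
```

The path length is passed along as an int because the library's ImmutableStack<> does not expose a Count property. Words come out in a different order than with GetNeighbors, since the offsets are scanned row by row.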

UPDATE: Based on the comments below, I've run a few test cases, one of which is:

var matrix = new[,] { {'a', 'l'},
                      {'g', 't'} };
var words = GetWords(matrix, new Coordinates(0,0), 2, 4);
Console.WriteLine(string.Join(Environment.NewLine, words.Select((w,i) => $"{i:00}: {w}")));

And the outcome is as expected:

00: ag
01: agl
02: aglt
03: agt
04: agtl
05: at
06: atl
07: atlg
08: atg
09: atgl
10: al
11: alg
12: algt
13: alt
14: altg
