
Merging and sorting n strings in O(n)

I was recently given a question in a coding challenge where I had to merge n strings of alphanumeric characters and then sort the new merged string, allowing only alphabetical characters in the sorted string. This would be fairly straightforward, except for the added caveat that the algorithm had to be O(n) (it didn't specify whether this was time complexity, space complexity, or both).

My initial approach was to concatenate the strings into a new one, adding only alphabetical characters, and then sort at the end. I wanted to come up with a more efficient solution, but I was given less time than I was initially told. There isn't any sorting algorithm (that I know of) which runs in O(n) time, so the only thing I can think of is to increase the space complexity and use an ordered map (e.g. C++ `std::map`) to store the count of each character, then print the map in sorted order. But as this would possibly require printing n characters n times, I think it would still run in quadratic time. Also, I was using Python, which I don't think has a way to keep a dictionary sorted (maybe it does).

Is there any way this problem could have been solved in O(n) time and/or space complexity?

Your counting sort is the way to go: build a simple count table for the 26 letters, in order. Iterate through your strings, counting letters and ignoring non-letters. This is one O(n) pass, where n is the total number of characters. Now simply go through your table, printing each letter the number of times indicated. This is also O(n), since the sum of the counts cannot exceed n. You're not printing n letters n times each: you're printing a total of n letters.
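A minimal Python sketch of the count table described above. The function name `merge_sorted_letters` is made up for illustration, and it assumes that only ASCII letters count and that case is folded to lowercase; the original challenge may have specified something different.

```python
def merge_sorted_letters(strings):
    """Merge the strings and return their letters in sorted order.

    Assumptions (not from the original question): only ASCII letters
    are kept, and uppercase letters are folded to lowercase.
    """
    counts = [0] * 26  # one counter per letter, 'a' through 'z'

    # First pass: count letters, ignore everything else.
    for s in strings:
        for ch in s:
            if "a" <= ch <= "z" or "A" <= ch <= "Z":
                counts[ord(ch.lower()) - ord("a")] += 1

    # Second pass: emit each letter as many times as it was counted.
    return "".join(chr(ord("a") + i) * c for i, c in enumerate(counts))


print(merge_sorted_letters(["b3a", "c1a2"]))  # prints "aabc"
```

Both passes touch each input character a constant number of times, and the table has a fixed size of 26, so the extra space does not grow with the input.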

  1. Concatenate your strings (not strictly needed; you can also count characters in the individual strings).
  2. Create an array with length equal to the total number of character codes (for example, 128 for ASCII).
  3. Read through your concatenated string and count occurrences in the array made at step 2.
  4. By reading through the character frequency array, build up an output array with the right number of repetitions of each character.

Since each step is O(n), the whole thing is O(n).
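The four steps above could look roughly like this in Python. This is a sketch under assumptions: the name `sort_letters_by_code` is hypothetical, the table size of 128 assumes ASCII input, and case is preserved, so uppercase letters come out before lowercase ones because that is their character-code order.

```python
def sort_letters_by_code(strings):
    # Step 1: concatenate (optional; counting per string works equally well).
    merged = "".join(strings)

    # Step 2: one counter per character code; 128 assumes ASCII input.
    freq = [0] * 128

    # Step 3: count occurrences, skipping anything that is not an ASCII letter.
    for ch in merged:
        code = ord(ch)
        if code < 128 and ch.isalpha():
            freq[code] += 1

    # Step 4: walk the table in code order and repeat each character.
    out = []
    for code, count in enumerate(freq):
        out.append(chr(code) * count)
    return "".join(out)


print(sort_letters_by_code(["bA1", "aB"]))  # prints "ABab"
```

Step 4 visits all 128 table slots regardless of input size, which is a constant cost on top of the n characters written.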

[@patatahooligan: I made this edit before I saw your remark and accidentally duplicated the answer.]

If I've understood the requirement correctly, you're simply sorting the characters in the string?

I.e. ADFSACVB becomes AABCDFSV?

If so, then the trick is to not really "sort". You have a fixed (and small) number of distinct values, so you can simply keep a count of each value and generate your result from that.

E.g. given ABACBA:

In the first pass, increment a counter in an array indexed by character. This produces:

[A] == 3
[B] == 2
[C] == 1

In the second pass, output each character as many times as its counter indicates: AAABBC

In summary, you're told to sort, but thinking outside the box, what you really want is a counting algorithm.
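The ABACBA example can be reproduced with Python's `collections.Counter`. This is only an illustrative sketch that assumes uppercase ASCII input, as in the example. No sorted dictionary is needed: the output order comes from iterating over the fixed alphabet `string.ascii_uppercase`, not from the dictionary's own ordering.

```python
from collections import Counter
import string

# First pass: count each letter, ignoring anything outside A-Z.
counts = Counter(ch for ch in "ABACBA" if ch in string.ascii_uppercase)

# Second pass: walk the fixed alphabet in order and repeat each letter
# by its count (a Counter returns 0 for letters that never appeared).
result = "".join(letter * counts[letter] for letter in string.ascii_uppercase)

print(counts["A"], counts["B"], counts["C"])  # prints "3 2 1"
print(result)  # prints "AAABBC"
```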
