
How to prove that two strings are permutations of each other?

I am working through the book Cracking the Coding Interview, and in the arrays and strings chapter I came across a question that asks you to write a method that proves that two strings given as input are permutations of each other.

The answers in the book are pretty clean and clear. One is to sort both strings and then check whether they're identical, and the other is to check whether the two strings have identical character counts.
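The first of these two approaches, sort and compare, could be sketched in C roughly as follows. The function name `is_permutation_sorted` and the comparator `compare_chars` are made up for illustration, and the sketch sorts copies so the inputs stay untouched.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort: orders bytes by their unsigned value. */
static int compare_chars(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: true if a and b hold the same characters
   with the same multiplicities. */
bool is_permutation_sorted(const char *a, const char *b)
{
    size_t n = strlen(a);
    if (n != strlen(b))
        return false;            /* different lengths: not permutations */

    char *ca = malloc(n + 1);
    char *cb = malloc(n + 1);
    if (ca == NULL || cb == NULL) {
        free(ca);
        free(cb);
        return false;            /* allocation failed; real code should report this */
    }
    memcpy(ca, a, n + 1);
    memcpy(cb, b, n + 1);

    /* After sorting, permutations become byte-for-byte identical. */
    qsort(ca, n, 1, compare_chars);
    qsort(cb, n, 1, compare_chars);
    bool same = memcmp(ca, cb, n) == 0;

    free(ca);
    free(cb);
    return same;
}
```

Sorting costs O(n log n) time, whereas the counting approach needs only one pass over each string.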

However, I had a different approach to this problem, and I wanted to share it to get your opinion.

I am assuming that the characters are ASCII characters. My idea is to first check whether the two strings have equal lengths; if not, we return false immediately, because that contradicts the definition of a permutation.

If the lengths are equal, we proceed with the algorithm. First, we initialize:

int sum = 0;
int sum1 = 0;

Then we go through the characters of each string, adding the ASCII value of each character to that string's sum, and compare the two sums at the end. If they're equal, we have a permutation.

Does this approach work?

No, it doesn't work, because 12 is both the sum of 2 and 10 and the sum of 3 and 9.

With your algorithm, "ad" would be a permutation of "bc".
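To see the collision concretely, here is a minimal C sketch of the length-plus-sum check described in the question. The function name `same_length_and_sum` is invented for this illustration.

```c
#include <stdbool.h>
#include <string.h>

/* The idea from the question: equal lengths and equal sums of
   character codes. This is NOT a correct permutation test. */
bool same_length_and_sum(const char *a, const char *b)
{
    size_t n = strlen(a);
    if (n != strlen(b))
        return false;

    unsigned long sum_a = 0, sum_b = 0;
    for (size_t i = 0; i < n; i++) {
        sum_a += (unsigned char)a[i];
        sum_b += (unsigned char)b[i];
    }
    return sum_a == sum_b;
}
```

In ASCII, 'a' + 'd' = 97 + 100 = 197 and 'b' + 'c' = 98 + 99 = 197, so the check accepts "ad" and "bc" even though they share no characters.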

In the general case, if you allow a reasonable range of characters and string lengths, there's no real shortcut. Which of the two solutions you mention is better depends on the language.

dystroy is right.

To get your approach to work with something like 99.999% correctness, compute:

sum1 = sum (ASCII(i))
sum2 = sum (ASCII(i)^2)
sum3 = sum (ASCII(i)^3)
  • for both strings, and if each of these power sums is the same for both strings,
  • then you most likely have a permuted string.

To be sure, compare histograms (as you mentioned in the question), but that needs more memory.
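The power-sum check above might look like this in C. This is only a sketch: the name `power_sums_match` is invented here, and the length check is borrowed from the question rather than from this answer.

```c
#include <stdbool.h>
#include <string.h>

/* Heuristic: compares the sums of c, c^2 and c^3 over both strings.
   Equal sums make a permutation likely, but do not prove it. */
bool power_sums_match(const char *a, const char *b)
{
    size_t n = strlen(a);
    if (n != strlen(b))
        return false;

    unsigned long long sa[3] = {0}, sb[3] = {0};
    for (size_t i = 0; i < n; i++) {
        unsigned long long x = (unsigned char)a[i];
        unsigned long long y = (unsigned char)b[i];
        sa[0] += x;          sb[0] += y;
        sa[1] += x * x;      sb[1] += y * y;
        sa[2] += x * x * x;  sb[2] += y * y * y;
    }
    return sa[0] == sb[0] && sa[1] == sb[1] && sa[2] == sb[2];
}
```

The squares already separate "ad" from "bc" (19409 versus 19405). Collisions still exist, though: "adfgjkmp" and "bcehilno" have no letter in common, yet all three of their power sums agree, so a histogram comparison remains the way to be certain.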

Your approach won't work because sums produce lots of collisions. You are basically assuming that 5 + 3 = 8 and that no other combination produces 8, but that is wrong: for example, 4 + 4 is also 8.

There are many ad hoc methods to solve this problem; I am going to describe two of them. You can use prime numbers in a method similar to yours, or simply allocate two arrays and keep a count of the characters.

1. Initialize two integer arrays of size 27, say list1[27] and list2[27], with every element set to 0. Read both strings character by character: if you read 'c' from string 1, increment the 3rd element of list1, because 'c' is the third letter, and so on. When you are done reading both strings, scan both arrays for a mismatch; if there is any mismatch, the strings are not permutations of each other.

A possible implementation:

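#include <stdbool.h>
#include <string.h>

/* Assumes both strings contain only lowercase letters 'a'..'z'.
   For example, "permutation" and "importunate" give true. */
bool is_permutation_counts(const char *str1, const char *str2)
{
    /* Index 1 counts 'a', ..., index 26 counts 'z'; index 0 is unused. */
    int list1[27]={0},list2[27]={0};
    size_t len=strlen(str1);

    if(len!=strlen(str2))
        return false;

    /* Count how often each letter occurs in each string. */
    for(size_t i=0;i<len;i++){
        list1[str1[i]-'a'+1]++;
        list2[str2[i]-'a'+1]++;
    }

    /* Any differing count means the strings are not permutations. */
    for(int i=1;i<27;i++){
        if(list1[i]!=list2[i])
            return false;
    }
    return true;
}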

This method can easily be extended to handle spaces, mixed-case characters, and digits.
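One way such an extension might look is to count every possible byte value instead of only 'a' to 'z'. The sketch below is an assumption about how to generalize, not part of the answer: the name `is_permutation_bytes` is invented, and it uses a single array that is incremented for one string and decremented for the other.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Counts all byte values, so spaces, digits and mixed case work.
   Treats the strings as raw bytes, not as multi-byte characters. */
bool is_permutation_bytes(const char *a, const char *b)
{
    size_t n = strlen(a);
    if (n != strlen(b))
        return false;

    long counts[UCHAR_MAX + 1] = {0};
    for (size_t i = 0; i < n; i++) {
        counts[(unsigned char)a[i]]++;   /* seen in a */
        counts[(unsigned char)b[i]]--;   /* cancelled by b */
    }

    /* Every count returns to zero exactly when the multisets match. */
    for (int i = 0; i <= UCHAR_MAX; i++) {
        if (counts[i] != 0)
            return false;
    }
    return true;
}
```

Upper and lower case stay distinct here, so "Ab" and "ab" are not treated as permutations; fold the case first if that is not what you want.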

2. This method is similar to what you have done, but it uses prime numbers instead of ASCII values and multiplication instead of addition. The problem with your method was the large number of possible collisions, as dystroy pointed out. If you chose to multiply the ASCII values instead, you would face the same problem. But what if, instead of multiplying ASCII values, we multiply prime numbers assigned to each character?

Here we first allocate an array that stores the first 26 prime numbers, starting from 2. We then read each string character by character and multiply together the primes assigned to its characters. Finally we compare the two large integers: because prime factorization is unique, they are equal exactly when the strings are permutations of each other.

A possible implementation:

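#include <stdbool.h>
#include <stddef.h>

/* Assumes both strings contain only lowercase letters 'a'..'z'. */
bool is_permutation_primes(const char *str1, const char *str2)
{
    /* First 26 primes: arr[0] belongs to 'a', arr[25] to 'z'. */
    static const unsigned long long arr[26]={2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101};
    unsigned long long prd1=1,prd2=1;

    /* Multiply the primes of all characters. A 64-bit product cannot
       overflow for up to 9 letters; longer strings may wrap silently. */
    for(size_t i=0;str1[i]!='\0';i++)
        prd1=prd1*arr[str1[i]-'a'];
    for(size_t i=0;str2[i]!='\0';i++)
        prd2=prd2*arr[str2[i]-'a'];

    return prd1==prd2;
}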

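This method does not extend as well as the first one, because the products grow quickly with the length of the string. We can keep them bounded by reducing modulo a large prime after every multiplication, at the cost of rare collisions where two different products share a remainder:

prd1=(long long)prd1*arr[(int)str1[i]-(int)'a']%1000000009;
prd2=(long long)prd2*arr[(int)str2[i]-(int)'a']%1000000009;//or some other large prime number; the cast keeps the product from overflowing int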
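Put together, the modular variant might be sketched like this. The function name `probably_permutation_mod` and the use of `uint64_t` are choices made for this illustration; the modulus 1000000009 is the one from the answer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumes only lowercase letters 'a'..'z'. A false result is
   definitive; a true result can, rarely, be a collision. */
bool probably_permutation_mod(const char *a, const char *b)
{
    static const uint64_t primes[26] = {
         2,  3,  5,  7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
        43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
    };
    const uint64_t modulus = 1000000009ULL;
    uint64_t pa = 1, pb = 1;

    /* Each running value stays below the modulus, so multiplying by a
       prime of at most 101 cannot overflow 64 bits. */
    for (; *a != '\0'; a++)
        pa = pa * primes[*a - 'a'] % modulus;
    for (; *b != '\0'; b++)
        pb = pb * primes[*b - 'a'] % modulus;

    return pa == pb;
}
```

Reducing modulo a prime removes the length limit but gives up the uniqueness guarantee of exact factorization, so a match is strong evidence rather than proof.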

This cannot be achieved using a sum, since a number does not have a unique decomposition into summands (as previous answers mentioned).

This can be done by comparing character histograms: build one histogram per string and check that the two maps are equal.

Java code:

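import java.util.Map;
import java.util.TreeMap;

class Character_Histogram
{
    // Maps each character to the number of times it has been counted.
    public Map<Character,Integer> histogram;

    public Character_Histogram ()
    {
        histogram = new TreeMap<Character,Integer> ();
    }

    // Record one occurrence of c.
    public void count (Character c)
    {
        if (histogram.containsKey(c))
            histogram.put(c, histogram.get(c)+1);
        else
            histogram.put(c, 1);
    }

    // Record every character of str.
    public void count (String str)
    {
        for(char c : str.toCharArray())
            count(c); // autoboxed; new Character(c) is deprecated
    }
}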

