How do I compact contiguous repeated elements in a string in C?
An example of my problem:
input: "abcabcabcabcxyxyxyccccccc"
output: "abc4xy3c7"
So far, I've written code that counts all the characters in the string and stores those counts in an array indexed from 0 to 25, which represents the alphabet (I'm only considering lowercase letters). For the example above, my code would generate the following array:
letter_count = [4 4 4 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 3 0]
From this, I'd be able to know which substrings are which and print them accordingly. But I can't do it, no matter how I try.
Could someone help me?
Update: removing steps to convert to array
I assume you are starting off with a string, so...
First, initialize an empty auxiliary array that will hold the result.
Next, cycle through the string like an array, asking whether the current element is the same as the last element in your results array. If it is not, add the current element to the auxiliary array; otherwise, continue to the next element.
This should give you what you are looking for.
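A minimal sketch of the two steps above, in C. The function name `squeeze_adjacent` and the requirement that the caller supplies a large enough `out` buffer are assumptions made for illustration, not something given in the thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, skipping every character that
   equals the last character already written to `out`.
   Assumption: `out` has room for at least strlen(in) + 1 bytes. */
void squeeze_adjacent(const char *in, char *out)
{
    assert(in != NULL && out != NULL);

    size_t len = strlen(in);
    size_t w = 0;                       /* number of characters written */

    for (size_t r = 0; r < len; r++) {
        /* Keep the character only if it differs from the last one kept. */
        if (w == 0 || out[w - 1] != in[r])
            out[w++] = in[r];
    }
    out[w] = '\0';
}
```

Note that this only collapses runs of a single repeated character and produces no counts: for the input in the question it leaves `abcabcabcabcxyxyxyc`, so repeated multi-character groups such as `abc` are untouched.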
This sounds a bit home-worky, so I'll just put my thoughts down here. The letter_count array is of no use here. I think the approach to use is:
startindex = 0
while startindex < length of string
    for n = 1 to (length of string - startindex) / 2
        if substring (startindex, n) == substring (startindex+n, n) then
            found a repetition, count how many times substring is repeated
            output substring and repetition count
            set startindex to index of last character in repeated string
            break
    startindex = startindex + 1
There are a few things missing (for example, what to do if you find a non-repeated sequence, as in abcdcd), but it's a start, I guess.
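The pseudocode above could be fleshed out in C roughly as follows. This is a sketch under stated assumptions rather than a definitive solution: the names `compact` and `emit` are hypothetical, the shortest repeating unit is tried first at each position, and text that is not part of any repetition is written out with a count of 1 (one possible way to fill the gap mentioned above).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append `len` bytes of `s` followed by the decimal `count` to `out`.
   Returns the new write position, or (size_t)-1 if `out` is too small. */
static size_t emit(char *out, size_t pos, size_t cap,
                   const char *s, size_t len, size_t count)
{
    if (pos == (size_t)-1 || len >= cap - pos)
        return (size_t)-1;
    memcpy(out + pos, s, len);
    pos += len;
    int n = snprintf(out + pos, cap - pos, "%zu", count);
    if (n < 0 || (size_t)n >= cap - pos)
        return (size_t)-1;
    return pos + (size_t)n;
}

/* Hypothetical helper: compacts `in` into `out` (capacity `cap` bytes).
   Returns 0 on success, -1 if `out` is too small. */
int compact(const char *in, char *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    if (cap == 0)
        return -1;

    size_t len = strlen(in);
    size_t pos = 0;   /* write position in out */
    size_t lit = 0;   /* start of the pending unrepeated text */
    size_t i = 0;     /* startindex in the pseudocode */
    out[0] = '\0';

    while (i < len) {
        size_t n, reps = 1;

        /* Try the shortest candidate unit first. */
        for (n = 1; n <= (len - i) / 2; n++) {
            if (memcmp(in + i, in + i + n, n) == 0) {
                /* Count how many times the unit repeats back to back. */
                reps = 2;
                while (i + (reps + 1) * n <= len &&
                       memcmp(in + i, in + i + reps * n, n) == 0)
                    reps++;
                break;
            }
        }
        if (reps == 1) {   /* no repetition starts at i */
            i++;
            continue;
        }

        /* Assumption: unrepeated text is flushed with a count of 1. */
        if (lit < i)
            pos = emit(out, pos, cap, in + lit, i - lit, 1);
        pos = emit(out, pos, cap, in + i, n, reps);
        if (pos == (size_t)-1)
            return -1;
        i += reps * n;
        lit = i;
    }

    if (lit < len)
        pos = emit(out, pos, cap, in + lit, len - lit, 1);
    return pos == (size_t)-1 ? -1 : 0;
}
```

Called with the input from the question and a sufficiently large buffer, `compact` writes `abc4xy3c7`; for `abcdcd` it writes `ab1cd2`. Because counts are written as plain digits, the output is only unambiguous when the input contains no digits, which holds for the lowercase-only strings considered here.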
Thinking about the problem, what would the output for ababcdababcd be? Would it be ab2cd1ab2cd1 or ababcd2? Should the algorithm find the shortest compacted string?