I already have code that removes a substring from a string (word) in C, but I don't understand it. Can someone explain it to me? It doesn't use functions from the standard library. I tried to analyze it myself, but there are certain parts I still don't understand, and I marked them in the comments. I just need to figure out how this all works.
Thanks!
#include <stdio.h>
#include <stdlib.h>
void remove(char *s1, char *s2);
int main()
{
    char s1[101], s2[101];
    printf("First word: ");
    scanf("%100s", s1); // read at most 100 characters, leaving room for the terminator
    printf("Second word: ");
    scanf("%100s", s2);
    remove(s1, s2);
    printf("The first word after removing is '%s'.", s1);
    return 0;
}
void remove(char *s1, char *s2)
{
    int i = 0, j, k;
    while (s1[i]) // ITERATES THROUGH THE FIRST STRING s1?
    {
        for (j = 0; s2[j] && s2[j] == s1[i + j]; j++); // WHAT DOES THIS LINE DO?
        if (!s2[j]) // IF WE'RE AT THE END OF STRING s2?
        {
            for (k = i; s1[k + j]; k++) //WHAT DOES THIS ENTIRE BLOCK DO?
                s1[k] = s1[k + j];
            s1[k] = 0;
        }
        else
            i++; // ???
    }
}
The function works like this: it finds the part of s1 that matches s2, skips over it, and overwrites it with the rest of the string.
while (s1[i]) // Yes, it iterates through the first string s1
{
    for (j = 0; s2[j] && s2[j] == s1[i + j]; j++); // Skips over the part that is the same in both strings
    // The loop body is empty: it only advances j over the matching part and stores nothing in s1.
    if (!s2[j]) // We are at the end of s2, so all of s2 matched at position i
    {
        for (k = i; s1[k + j]; k++) // Copies the part after the match back over the match
            s1[k] = s1[k + j];
        s1[k] = 0;
    }
    else
        i++; // No full match at this position, so move on to the next character of s1
}
The first while (s1[i]) iterates through s1. Yes, you are right.
for (j = 0; s2[j] && s2[j] == s1[i + j]; j++);
The above for loop checks whether the substring s2 is present in s1 starting at s1[i]. If it matches, s2 is iterated completely. If not, s2[j] will not be the null character when the loop ends. Example: if s1 = ITERATE and s2 = RAT, the loop runs to completion only when i = 3.
So if the condition in if (!s2[j]) holds, we have found the substring, and i is its starting position in s1.
for (k = i; s1[k + j]; k++) //WHAT DOES THIS ENTIRE BLOCK DO?
    s1[k] = s1[k + j];
s1[k] = 0;
The above block removes the substring. For the ITERATE and RAT example, this is done by copying E and the null character to the positions where R and A were. The for loop achieves this. If s2[j] is not null after the matching loop, i is incremented to check for the substring from the next position of s1.
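To make the ITERATE / RAT walkthrough concrete, here is a minimal sketch of the same function with the trace written into the comments. The name remove_substring is my own choice (an assumption, not from the question), picked because <stdio.h> already declares a function called remove. The const on s2 is also my addition. The sketch assumes s2 is not empty.

```c
/* Sketch: same logic as the question's function, renamed (hypothetical name)
   to avoid clashing with remove() from <stdio.h>. Assumes s2 is non-empty. */
void remove_substring(char *s1, const char *s2)
{
    int i = 0, j, k;

    while (s1[i]) {
        /* Count how many characters of s2 match s1 starting at index i. */
        for (j = 0; s2[j] && s2[j] == s1[i + j]; j++)
            ;

        if (!s2[j]) {
            /* Full match. For s1 = "ITERATE", s2 = "RAT": i = 3, j = 3. */
            for (k = i; s1[k + j]; k++)
                s1[k] = s1[k + j];   /* k = 3: s1[3] = s1[6] = 'E' */
            s1[k] = 0;               /* k = 4: s1 is now "ITEE" */
            /* i is not advanced, so the same index is checked again. */
        } else {
            i++;                     /* no match here, try the next index */
        }
    }
}
```

Calling it on a writable buffer that holds ITERATE, with RAT as the second argument, leaves ITEE in the buffer.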
Here is the functionality, condensed into the comments:
void remove(char *s1, char *s2)
{
    int i = 0, j, k;
    while (s1[i]) // Iterates through s1 (until it finds a zero)
    {
        for (j = 0; s2[j] && s2[j] == s1[i + j]; j++); // Advances j while s2 has not ended and each character of s2 matches s1 from position i (on a full match, j points to the end of s2 => zero)
        if (!s2[j]) // If j points to the end of s2 => we've found a match
        {
            for (k = i; s1[k + j]; k++) // Remove the matching substring
                s1[k] = s1[k + j];
            s1[k] = 0;
        }
        else
            i++; // There is no match, so we continue to the next character of s1
    }
}
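Note: I have also noticed that this may be easily exploited, since the loops rely entirely on the null terminators to stay within the range of s1. If s1 is not properly terminated, it iterates out of range.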
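On the robustness point: with properly terminated strings, both loops stop at a terminator, but one case I can confirm goes wrong is an empty s2. The matching loop then "succeeds" at every index with j = 0, nothing is removed, i is never incremented, and the while loop never ends. Below is a minimal sketch of a guarded variant. The name remove_substring_checked, the null-pointer check, and the returned count are my own assumptions, not part of the question's code.

```c
#include <stddef.h>

/* Sketch with a hypothetical name: returns how many occurrences were removed. */
int remove_substring_checked(char *s1, const char *s2)
{
    int i = 0, j, k, removed = 0;

    /* Guard (my addition): reject null pointers and an empty s2.
       Without it, an empty s2 matches everywhere, removes nothing,
       never advances i, and the loop below would not terminate. */
    if (s1 == NULL || s2 == NULL || s2[0] == '\0')
        return 0;

    while (s1[i]) {
        for (j = 0; s2[j] && s2[j] == s1[i + j]; j++)
            ;
        if (!s2[j]) {
            for (k = i; s1[k + j]; k++)
                s1[k] = s1[k + j];
            s1[k] = 0;
            removed++;
        } else {
            i++;
        }
    }
    return removed;
}
```

The guard changes nothing for ordinary input. It only turns the empty-pattern case into a no-op.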
Let's break it down. You have
while (s1[i])
{
// Code
}
This iterates through s1. Once you get to the end of the string, you reach \0, which is the null terminator. When evaluated in a condition, it evaluates to false (0), so the loop stops. It may have been better to use a for here.
You then have
for (j = 0; s2[j] && s2[j] == s1[i + j]; j++);
This does nothing but increment j. Note that the loop has no braces and is terminated with a semicolon, so its body is empty and the if/else that follows is not part of it. The loop keeps running while s2[j] is not the null terminator and s2[j] == s1[i + j]. The second condition compares the character at position j of s2 with the character j places after position i in s1, so when the loop ends, j is the number of leading characters of s2 that match s1 starting at i. This part could likely be improved to remove unnecessary iterations.
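To see the empty-bodied loop in isolation, here is a small sketch that pulls it out into a helper. The name match_length is hypothetical. It takes a pointer to the position in s1 where matching should start, so s1[i + j] in the original becomes s1[j] here.

```c
/* Hypothetical helper: number of leading characters of s2 that match s1.
   The match is complete exactly when s2[result] is the null terminator. */
int match_length(const char *s1, const char *s2)
{
    int j;

    /* Stops at the end of s2 or at the first mismatch. If s1 ends first,
       its terminator differs from s2[j], so we never read past it. */
    for (j = 0; s2[j] && s2[j] == s1[j]; j++)
        ;
    return j;
}
```

With this helper, the test in the original code corresponds to calling match_length(s1 + i, s2) and checking whether the character of s2 at the returned index is the terminator.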
Then there's
if (!s2[j])
{
}
else
{
}
This checks whether the previous loop reached the end of s2, which means all of s2 matched at position i. If so, it removes the match; otherwise it increments i. It could be improved by returning in the else when s2 can no longer fit in the remainder of s1.
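The early-exit idea could look roughly like the sketch below. It is only one way to do it: instead of returning from the else branch, it folds the check into the loop condition. The names text_length and remove_substring_early_exit are hypothetical, and the lengths are counted by hand to stay in the spirit of not using the standard library.

```c
/* Hypothetical helper: string length without <string.h>. */
static int text_length(const char *s)
{
    int n = 0;
    while (s[n])
        n++;
    return n;
}

/* Sketch: stops scanning once s2 can no longer fit in the rest of s1. */
void remove_substring_early_exit(char *s1, const char *s2)
{
    int len1 = text_length(s1);
    int len2 = text_length(s2);
    int i = 0, j, k;

    if (len2 == 0)          /* nothing to remove (and avoids an endless loop) */
        return;

    while (i + len2 <= len1) {
        for (j = 0; s2[j] && s2[j] == s1[i + j]; j++)
            ;
        if (!s2[j]) {
            for (k = i; s1[k + j]; k++)
                s1[k] = s1[k + j];
            s1[k] = 0;
            len1 -= len2;   /* s1 just got shorter by one occurrence of s2 */
        } else {
            i++;
        }
    }
}
```

The saving is small, at most len2 - 1 start positions at the end of s1, so whether it is worth the extra length bookkeeping is a judgment call.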
for (k = i; s1[k + j]; k++)
    s1[k] = s1[k + j];
s1[k] = 0;
This is another somewhat strange-looking loop: because there are no braces, only s1[k] = s1[k + j] is in the loop body, and s1[k] = 0 runs once after the loop. The loop compacts the string by shifting each character at k+j down to k, which overwrites the occurrence of s2. After the loop, s1[k] = 0 writes the null terminator so that the shortened string ends properly.
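The shifting step can also be tried on its own. Here is a minimal sketch with the loop wrapped in a helper. The name close_gap and its parameters are hypothetical: start plays the role of i and width the role of j in the original.

```c
/* Hypothetical helper: deletes `width` characters of s beginning at `start`.
   Assumes start + width does not exceed the length of s. */
void close_gap(char *s, int start, int width)
{
    int k;

    /* Move every character after the gap `width` places to the left. */
    for (k = start; s[k + width]; k++)
        s[k] = s[k + width];

    /* k now indexes the first position past the shifted text. */
    s[k] = 0;
}
```

For example, on a buffer holding abcdef, close_gap(buf, 1, 2) drops bc and leaves adef.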
If you want a deeper understanding, it may be worth trying to write your own code to do the same thing and then comparing the two afterwards. I have found that this generally helps more than reading a bunch of text.