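I was trying to write some C code that replaces a substring in a given string, following this tutorial. I got it working, but one problem with the function is that it causes a buffer overflow when the replacement string is longer than the substring it replaces. In my program I know I will only ever pass dynamically allocated strings to this function, so I figured I could check whether the replacement is longer than the substring and, if it is, use realloc to resize the original string and make room for the replacement. At first the function seemed to work, but for some reason the string gets corrupted when I call it repeatedly to replace different substrings in the same string.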
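In the code below, GetFileContents reads the contents of a file named first_page.html into a dynamically allocated string and returns that string; it works fine and has no issues. StrReplaceSubstringFirstOccurance replaces the first occurrence of the substring I specify with a new string, and it also prints the resulting string, prefixed with Inside:. In the if statement I check whether the replacement string is longer than the substring, and I have two different approaches for resizing the string. Approach 2 uses realloc. Approach 1 uses malloc to allocate a new string, copies the contents of the original string into it, calls free on the original string, and then points the original pointer at the new string. Both approaches produce the same problem.
As you can see in the output, the first two calls to StrReplaceSubstringFirstOccurance run successfully. The third call (substring [[to]], replacement NOT CUSTOM!) prints the correct string inside the function, but when I print the string outside the function it shows garbage characters. On the fourth call, the function finds that the substring no longer occurs in the original string (because the original string now contains garbage), so it simply returns, and the garbage characters stay in the original string.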
Code:
static bool StrReplaceSubstringFirstOccurance(char* source, char* substring, char* replace) {
    char* substring_occurance = strstr(source, substring);
    if (substring_occurance == NULL) {
        printf("No substring: %s found.\n", substring);
        return false;
    }
    if (strlen(replace) > strlen(substring)) {
        size_t new_size = strlen(source) + (strlen(replace)-strlen(substring))+1;
        // Approach 1
        // char* temp = malloc(new_size);
        // memcpy(temp, source, strlen(source)+1);
        // free(source);
        // source = temp;
        // Approach 2
        source = realloc(source, new_size);
    }
    substring_occurance = strstr(source, substring);
    memmove(substring_occurance + strlen(replace),
            substring_occurance + strlen(substring),
            strlen(substring_occurance) - strlen(substring)+1);
    memcpy(substring_occurance, replace, strlen(replace));
    printf("\nInside: %s\n", source);
    return true;
}

int main(void) {
    char* first_page = GetFileContents("first_page.html");
    StrReplaceSubstringFirstOccurance(first_page, "[[say]]", "CUSTOM!");
    StrReplaceSubstringFirstOccurance(first_page, "[[say]]", "CUSTOM!");
    printf("\nFirst Print: %s\n", first_page);
    StrReplaceSubstringFirstOccurance(first_page, "[[to]]", "NOT CUSTOM!");
    printf("\nSecond Print: %s\n", first_page);
    StrReplaceSubstringFirstOccurance(first_page, "[[to]]", "NOT CUSTOM!");
    printf("\nThird Print: %s\n", first_page);
    printf("Reached the end!\n");
    return 0;
}
Output:
Inside: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! [[to]]</title>
<body>
[[say]]
<h1>This is a header!</h1>
[[to]]
</body>
</html>
Inside: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! [[to]]</title>
<body>
CUSTOM!
<h1>This is a header!</h1>
[[to]]
</body>
</html>
First Print: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! [[to]]</title>
<body>
CUSTOM!
<h1>This is a header!</h1>
[[to]]
</body>
</html>
Inside: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! NOT CUSTOM!</title>
<body>
CUSTOM!
<h1>This is a header!</h1>
[[to]]
</body>
</html>
Second Print: α#F^Ñ☻
No substring: [[to]] found.
Third Print: α#F^Ñ☻
Reached the end!
first_page.html
<!DOCTYPE html>
<html lang="en-US">
<title>[[say]] [[to]]</title>
<body>
[[say]]
<h1>This is a header!</h1>
[[to]]
</body>
</html>
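I managed to fix the issue by following pmg's comment: instead of passing the string itself, I now pass a pointer to the first_page variable (&first_page) to StrReplaceSubstringFirstOccurance, so that the pointer returned by realloc is written back to the caller's variable.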
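The reason this matters is that C passes arguments by value: assigning the result of realloc to a plain char* parameter only changes the function's local copy, and if realloc moved the block, the caller is left with a pointer to freed memory. A minimal sketch of the pointer-to-pointer pattern follows; grow_buffer is a hypothetical helper name, not something from the code in this post.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: resizes the heap block that *buf points to.
   Because the caller passes the ADDRESS of its pointer variable, the
   (possibly moved) block returned by realloc is stored back into the
   caller's variable. With a plain `char *buf` parameter, the assignment
   would only update a local copy. */
bool grow_buffer(char **buf, size_t new_size) {
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL) {
        return false;   /* *buf is untouched and still valid */
    }
    *buf = tmp;         /* visible to the caller */
    return true;
}
```

A caller would write grow_buffer(&text, new_size) and keep using text afterwards, since text now holds whatever address realloc returned.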
The new code looks like this:
static bool StrReplaceSubstringFirstOccurance(char** source, char* substring, char* replace) {
    char* substring_occurance = strstr(*source, substring);
    if (substring_occurance == NULL) {
        printf("No substring: %s found.\n", substring);
        return false;
    }
    if (strlen(replace) > strlen(substring)) {
        size_t new_size = strlen(*source) + (strlen(replace)-strlen(substring))+1;
        *source = realloc(*source, new_size);
    }
    substring_occurance = strstr(*source, substring);
    memmove(substring_occurance + strlen(replace),
            substring_occurance + strlen(substring),
            strlen(substring_occurance) - strlen(substring)+1);
    memcpy(substring_occurance, replace, strlen(replace));
    printf("\nInside: %s\n", *source);
    return true;
}

int main(void) {
    char* first_page = GetFileContents("first_page.html");
    StrReplaceSubstringFirstOccurance(&first_page, "[[say]]", "CUSTOM!");
    printf("\nFirst Print: %s\n", first_page);
    StrReplaceSubstringFirstOccurance(&first_page, "[[to]]", "NOT CUSTOM!");
    printf("\nThird Print: %s\n", first_page);
    printf("Reached the end!\n");
    return 0;
}
Output:
Inside: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! [[to]]</title>
<body>
[[say]]
<h1>This is a header!</h1>
[[to]]
[[say]][[to]]
</body>
</html>
First Print: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! [[to]]</title>
<body>
[[say]]
<h1>This is a header!</h1>
[[to]]
[[say]][[to]]
</body>
</html>
Inside: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! NOT CUSTOM!</title>
<body>
[[say]]
<h1>This is a header!</h1>
[[to]]
[[say]][[to]]
</body>
</html>
Third Print: <!DOCTYPE html>
<html lang="en-US">
<title>CUSTOM! NOT CUSTOM!</title>
<body>
[[say]]
<h1>This is a header!</h1>
[[to]]
[[say]][[to]]
</body>
</html>
Reached the end!
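One thing the fixed function still does is assign the result of realloc straight to *source, which would lose the original block if realloc ever returned NULL. Below is a slightly more defensive sketch of the same idea. The name replace_first is hypothetical, and the sketch assumes *source always points to a heap-allocated, NUL-terminated string. It remembers the match offset before resizing instead of searching a second time, and it reports failure without touching the caller's string.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of the function above. Replaces the first
   occurrence of `substring` in *source with `replace`. Returns false if
   the substring is absent or the buffer could not be grown; in both
   cases *source is left unchanged. */
bool replace_first(char **source, const char *substring, const char *replace) {
    char *hit = strstr(*source, substring);
    if (hit == NULL) {
        return false;
    }

    size_t src_len = strlen(*source);
    size_t sub_len = strlen(substring);
    size_t rep_len = strlen(replace);
    size_t offset  = (size_t)(hit - *source);   /* survives a moved block */

    if (rep_len > sub_len) {
        char *grown = realloc(*source, src_len + (rep_len - sub_len) + 1);
        if (grown == NULL) {
            return false;                       /* old block still valid */
        }
        *source = grown;
        hit = grown + offset;                   /* re-derive the match */
    }

    /* Shift the tail (including the terminating NUL), then drop in the
       replacement text. */
    memmove(hit + rep_len, hit + sub_len, src_len - offset - sub_len + 1);
    memcpy(hit, replace, rep_len);
    return true;
}
```

It is called the same way as the fixed function, for example replace_first(&first_page, "[[to]]", "NOT CUSTOM!"). When the replacement is shorter than the substring, the buffer is simply left at its old size.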