I am writing a simple parser in C as an assignment, and I am trying to reuse as much of my earlier code as possible. I have the following snippets:
Token:
typedef struct token {
    const char *lexeme; //the actual string value of the token
    Category category;  //the category of the token
} Token;

Token create_token(const char *lexeme, Category cat) {
    Token token = {.lexeme = lexeme, .category = cat};
    return token;
}
Automaton:
typedef struct automata {
    //does not directly store token but is easy to retrieve
    Category token_type; //which type of token is the automata supposed to scan
    char *scanned; //char buffer to store read characters from a string. Not realloced because it's been given reasonable capacity
    int lexeme_capacity; //capacity of scanned buffer. Dynamic change is not implemented, all tokens should be at most this length
    ...
} Automata;

Automata create_automata(int num_states, char *accepted_chars,
                         int num_accepted_states, const int *accepted_states,
                         Category token_type) {
    ...
    //create and return an automata
    Automata automata = {
        .token_type = token_type,
        //TODO malloc returns an allocated pointer so it is not null but it must take into account possible overflows
        .scanned = malloc(DEFAULT_LEXEME_LENGTH * sizeof(char)),
        .lexeme_capacity = DEFAULT_LEXEME_LENGTH,
        ...
    };
    return automata;
}

Token get_token(Automata *automata) {
    // if automata is not in accepting state, it did not recognize the lexeme
    Category category = accept(automata) ? automata->token_type : CAT_NONRECOGNIZED;
    // easy to understand if written like this
    Token value = {
        .lexeme = automata->scanned,
        .category = category,
    };
    return value;
}
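As far as I understand, any token I create with the get_token function will contain a char * pointing to heap memory. This worked fine for my previous assignment.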
Token allocate_token(const char *lexeme, size_t lexeme_len, Category cat) {
    //allocate and copy into heap memory the contents of the lexeme string
    char *heap_allocated = calloc(lexeme_len + 1, sizeof(char));
    memcpy(heap_allocated, lexeme, lexeme_len * sizeof(char));
    //create and return a token
    return create_token(heap_allocated, cat);
}

void free_token(Token *token) {
    free(&token->lexeme);
}
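However, CLion warns that the free_token function is "attempting to call free on non-heap object 'lexeme'". I don't understand how lexeme could be anything other than a heap-allocated pointer.
The code as posted calls free(&token->lexeme), which passes the address of the lexeme member (a location inside the Token object), not the pointer stored in it; that is what the warning refers to. If you were calling free(token->lexeme) instead, the compiler should complain that you are calling free with a const-qualified pointer, which is incorrect because free expects a non-const void * argument.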
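The difference between the two calls can be sketched as follows. This is a minimal illustration, not the asker's actual code: the Category enumerators are hypothetical stand-ins (the question does not show the enum), and the NULL check in allocate_token is an addition. It keeps the const char * member and casts the qualifier away when freeing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the Category enum, which the question does not show. */
typedef enum { CAT_NONRECOGNIZED, CAT_KEYWORD } Category;

typedef struct token {
    const char *lexeme;  /* same member type as in the question */
    Category category;
} Token;

/* Same idea as the question's allocate_token, with a NULL check added. */
Token allocate_token(const char *lexeme, size_t lexeme_len, Category cat) {
    Token token = {.lexeme = NULL, .category = cat};
    char *heap_allocated = calloc(lexeme_len + 1, sizeof(char));
    if (heap_allocated != NULL) {
        memcpy(heap_allocated, lexeme, lexeme_len);
        token.lexeme = heap_allocated;
    }
    return token;
}

void free_token(Token *token) {
    /* free(&token->lexeme) would pass the address of the member itself,
       which lies inside *token and was never returned by calloc. */
    /* Pass the stored pointer value instead. The cast drops const;
       without it, compilers typically emit a discarded-qualifier diagnostic. */
    free((void *)token->lexeme);
    token->lexeme = NULL;  /* guard against accidental reuse */
}
```

Because lexeme is the first member of Token, &token->lexeme is the same address as the Token object itself, which is why the analyzer reports a non-heap object when the token lives on the stack.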
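It is your responsibility to make sure free is called with a pointer to allocated memory, but storing that pointer in a const char * makes it harder to track object lifetimes and to tell pointers to heap objects apart from pointers to constant data such as string literals.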
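One possible way to make that distinction explicit is to record ownership next to the pointer. The sketch below is only an illustration under assumptions: OwnedToken, borrow_token, copy_token, release_token, and the Category enumerators are hypothetical names that do not appear in the question.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the Category enum, which the question does not show. */
typedef enum { CAT_NONRECOGNIZED, CAT_KEYWORD } Category;

typedef struct owned_token {
    const char *lexeme;
    bool owns_lexeme;    /* true only when lexeme was allocated by copy_token */
    Category category;
} OwnedToken;

/* Wraps a pointer the token does not own (string literal, shared buffer). */
OwnedToken borrow_token(const char *lexeme, Category cat) {
    OwnedToken token = {.lexeme = lexeme, .owns_lexeme = false, .category = cat};
    return token;
}

/* Copies the lexeme to the heap; the token owns the copy if allocation succeeds. */
OwnedToken copy_token(const char *lexeme, size_t lexeme_len, Category cat) {
    OwnedToken token = {.lexeme = NULL, .owns_lexeme = false, .category = cat};
    char *copy = calloc(lexeme_len + 1, sizeof(char));
    if (copy != NULL) {
        memcpy(copy, lexeme, lexeme_len);
        token.lexeme = copy;
        token.owns_lexeme = true;
    }
    return token;
}

/* Frees the lexeme only when this token owns it. */
void release_token(OwnedToken *token) {
    if (token->owns_lexeme) {
        free((void *)token->lexeme);
    }
    token->lexeme = NULL;
    token->owns_lexeme = false;
}
```

With this layout, a token built from a borrowed pointer, such as a string literal or a buffer owned by something else, can go through the same release function without its pointer ever reaching free.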