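I want to extract the names and types of the function parameters in a function pointer type. The use case is generating wrappers around some C types, for example: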
typedef struct { void (*f)(int x, int y); } vtable_t;
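I would like those wrappers to use the exact parameter names.
It is easy to get the parameter types with something like next(top.underlying_typedef_type.get_fields()).type.get_pointee().argument_types(), which returns a list of two integer types, but without names.
I can see that the parameter names are available from Cursor objects, but next(node.underlying_typedef_type.get_fields()).type.get_pointee().get_declaration() seems to return some kind of dummy cursor (<SourceLocation file None, line 0, column 0>, no children).
The only way I have found to get the parameter names is to traverse the full cursor tree of the typedef; for example, next(next(top.get_children()).get_children()).spelling evaluates to 'x'.
So it seems "possible" to extract a named parameter list by doing a fully parallel traversal of the cursor tree and the type tree, but that strategy seems complicated and fragile. Is there a simpler way?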
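In the cursor tree, the FIELD_DECL for f has a type that is a POINTER to FUNCTIONPROTO, the latter of which has a get_result() method to get the return type. The FIELD_DECL also has PARM_DECL children, each of which has a type attribute.
If you get the FUNCTIONPROTO parameter types from its argument_types() method, you miss out on their names, so the simplest solution is to ignore argument_types() and use only the PARM_DECL children and their types.
However, be aware that get_children() is inherently fragile, because the "children" of a node can contain an arbitrary mix of AST nodes depending on the syntax being examined. The Clang C++ API has sub-APIs that clearly separate children by role, but the C API, and therefore the Python API, lumps them all into a single list. You may need to check the kind of each child carefully (and perhaps more) in order to reliably identify its role within its parent.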
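As a rough sketch of the approach described above, the helper below reads the return type from the pointee of the field's type and takes the parameter names and types from the PARM_DECL children only. The names function_pointer_signature, fake_cursor and fake_type are hypothetical, and so is the TYPE_REF child in the demo, which is only there to show the filtering. Kinds are compared by name purely so that the helper can be exercised with stand-in objects without loading libclang; with real cursors one would more likely compare against CursorKind.PARM_DECL.

```python
from types import SimpleNamespace


def function_pointer_signature(field_cursor):
    """Return (return_type, [(param_name, param_type), ...]) for a field
    whose type is a pointer to a function prototype."""
    proto = field_cursor.type.get_pointee()
    params = [
        (child.spelling, child.type.spelling)
        for child in field_cursor.get_children()
        # get_children() can mix node kinds, so keep only the parameters.
        if child.kind.name == "PARM_DECL"
    ]
    return proto.get_result().spelling, params


def fake_type(spelling, pointee=None, result=None):
    """Hypothetical stand-in for a clang.cindex Type (demo only)."""
    return SimpleNamespace(
        spelling=spelling,
        get_pointee=lambda: pointee,
        get_result=lambda: result,
    )


def fake_cursor(kind, spelling, type_, children=()):
    """Hypothetical stand-in for a clang.cindex Cursor (demo only)."""
    return SimpleNamespace(
        kind=SimpleNamespace(name=kind),
        spelling=spelling,
        type=type_,
        get_children=lambda: iter(children),
    )


# Stand-in tree shaped like the field `void (*f)(int x, int y)`.
int_t = fake_type("int")
proto_t = fake_type("void (int, int)", result=fake_type("void"))
field = fake_cursor(
    "FIELD_DECL",
    "f",
    fake_type("void (*)(int, int)", pointee=proto_t),
    [
        fake_cursor("TYPE_REF", "some_t", int_t),  # made-up non-parameter child
        fake_cursor("PARM_DECL", "x", int_t),
        fake_cursor("PARM_DECL", "y", int_t),
    ],
)
print(function_pointer_signature(field))
```

With a real translation unit, the same helper would be called on the FIELD_DECL cursor found by walking tu.cursor.get_children().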
Example program

#!/usr/bin/env python3
"""Print function pointer declaration parameters."""

import sys

from clang.cindex import Config, CursorKind, Index, TypeKind


def cursor_loc(c):
    """Return the location of `c` as a string."""
    return f"{c.location.line}:{c.location.column}"


def print_type_details(t, indent_level):
    """Print details about the type `t`."""
    ind = "  " * indent_level

    print(f"{ind}kind: {t.kind}")

    if t.kind == TypeKind.POINTER:
        print(f"{ind}  pointee:")
        print_type_details(t.get_pointee(), indent_level + 2)

    elif t.kind == TypeKind.FUNCTIONPROTO:
        print(f"{ind}  return type:")
        print_type_details(t.get_result(), indent_level + 2)

        print(f"{ind}  parameter types:")
        for param_type in t.argument_types():
            print_type_details(param_type, indent_level + 2)

    # A comprehensive program would print details for more kinds of types
    # here. The above is just what is needed to demonstrate getting the
    # details for a declaration of a pointer to a function.


def print_decl_details(c, already_printed, indent_level):
    """Print details about the declaration `c`. `already_printed` is a
    map from location (as a string) to True for those declarations that
    have already been printed."""
    ind = "  " * indent_level

    loc = cursor_loc(c)
    if c.location:
        # Avoid printing the same declaration twice. This technique is very
        # crude (there can be multiple distinct declarations at the same
        # location) but suffices for use in a demonstration program.
        if loc in already_printed:
            print(f"{ind}{c.kind} at {loc} (already printed)")
            return
        already_printed[loc] = True

    print(f"{ind}{c.kind} at {loc}")
    print(f"{ind}  spelling: {c.spelling}")

    if c.type:
        print(f"{ind}  type:")
        print_type_details(c.type, indent_level + 2)

    for child in c.get_children():
        print_decl_details(child, already_printed, indent_level + 1)


def main():
    # Load the Clang module. On my Windows system, using Cygwin Python, I
    # seem to have to tell it explicitly the name of the DLL (it being on
    # the PATH is not enough).
    Config.set_library_file("/cygdrive/d/opt/winlibs-mingw64-13.2/bin/libclang.dll")
    index = Index.create()

    # Parse the C source code.
    tu = index.parse("test.c")

    # Stop if there were syntax errors.
    if len(tu.diagnostics) > 0:
        for d in tu.diagnostics:
            print(d)
        sys.exit(2)

    # Parse was successful. Inspect the AST.
    print_decl_details(tu.cursor, {}, 0)


main()

# EOF
When run on a test.c containing:

typedef struct {
  void (*f)(int x, int y);
} vtable_t;

it prints:

CursorKind.TRANSLATION_UNIT at 0:0
  spelling: test.c
  type:
    kind: TypeKind.INVALID
  CursorKind.STRUCT_DECL at 1:9
    spelling: vtable_t
    type:
      kind: TypeKind.RECORD
    CursorKind.FIELD_DECL at 2:10
      spelling: f
      type:
        kind: TypeKind.POINTER
          pointee:
            kind: TypeKind.FUNCTIONPROTO
              return type:
                kind: TypeKind.VOID        <--- return type
              parameter types:
                kind: TypeKind.INT
                kind: TypeKind.INT
      CursorKind.PARM_DECL at 2:17
        spelling: x                        <--- param 1 name
        type:
          kind: TypeKind.INT               <--- param 1 type
      CursorKind.PARM_DECL at 2:24
        spelling: y                        <--- param 2 name
        type:
          kind: TypeKind.INT               <--- param 2 type
  CursorKind.TYPEDEF_DECL at 3:3
    spelling: vtable_t
    type:
      kind: TypeKind.TYPEDEF
    CursorKind.STRUCT_DECL at 1:9 (already printed)

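The program above uses the already_printed mechanism in order to avoid printing the struct declaration twice. The fact that it appears twice in the AST is due to a quirk of how Clang represents the typedef struct idiom and how the C API exposes it (ordinarily, an abstract syntax tree really is a tree).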
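The program's own comments call the location-only key crude, since distinct declarations can share a location. One possible refinement, sketched below under assumptions, is to key on the cursor kind and file name as well as the line and column. The names decl_key, walk_once and fake_cursor are hypothetical; the stand-in objects expose only the attributes used here, and the kind is read by name so the sketch runs without libclang. This narrows the collisions but does not eliminate them.

```python
from types import SimpleNamespace


def decl_key(cursor):
    """Build a de-duplication key from the cursor's kind and location."""
    loc = cursor.location
    # The translation unit cursor has no file, so allow for None.
    file_name = loc.file.name if loc.file else None
    return (cursor.kind.name, file_name, loc.line, loc.column)


def walk_once(cursor, seen=None):
    """Yield each distinct cursor once, in pre-order."""
    if seen is None:
        seen = set()
    key = decl_key(cursor)
    if key in seen:
        return
    seen.add(key)
    yield cursor
    for child in cursor.get_children():
        yield from walk_once(child, seen)


def fake_cursor(kind, line, column, children=()):
    """Hypothetical stand-in for a clang.cindex Cursor (demo only)."""
    location = SimpleNamespace(
        file=SimpleNamespace(name="test.c"), line=line, column=column
    )
    return SimpleNamespace(
        kind=SimpleNamespace(name=kind),
        location=location,
        get_children=lambda: iter(children),
    )


# The struct is reachable twice: directly, and again under the typedef.
struct = fake_cursor("STRUCT_DECL", 1, 9)
typedef = fake_cursor("TYPEDEF_DECL", 3, 3, [struct])
unit = fake_cursor("TRANSLATION_UNIT", 0, 0, [struct, typedef])

visited = [c.kind.name for c in walk_once(unit)]
print(visited)
```

Unlike the demonstration program, this sketch silently skips the repeated struct instead of printing an "(already printed)" marker; which behavior is preferable depends on what the wrapper generator needs.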