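I have a YAML input that contains nested objects at various levels. I need a Python function that walks through everything and produces the desired output: a list of strings in which each nested field is joined to its parents with dots, e.g. Object1.Object2.Object3.Object4... See the example below.
I tried to implement this with a recursive function. My code snippet: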
tests = []
test2 = {}

def test(config, parent=None):
    previous_parent = None
    names = []
    for column in config:
        if column.get("dtype") in ["array", "struct"]:
            parent = column["name"]
            print(f"parent: {parent}")
            test(column["columns"], parent)
        else:
            value = column["name"]
            print(f"value: {value}")
            # names.append(value)
The output is:
value: PartitionDate
value: TransactionID
value: EventTimestamp
parent: ControlTransaction
value: StoreID
parent: RetailTransaction
value: StoreID
value: WorkstationID
...
Input:
columns:
  - name: PartitionDate
  - name: TransactionID
  - name: EventTimestamp
  - name: ControlTransaction
    dtype: struct
    columns:
      - name: StoreID
      - name: WorkstationID
      - name: Transaction
        dtype: struct
        columns:
          - name: TransactionID
      - name: TransactionNumber
  - name: ControlType
  - name: RetailTransaction
    dtype: struct
    columns:
      - name: StoreID
      - name: WorkstationID
Desired output:
[
PartitionDate,
TransactionID,
EventTimestamp,
ControlTransaction.StoreID,
ControlTransaction.WorkstationID,
ControlTransaction.Transaction.TransactionID,
ControlTransaction.TransactionNumber,
ControlType,
RetailTransaction.StoreID,
RetailTransaction.WorkstationID
]
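Just a few changes are needed:
1. Replace the parent=None parameter with parents=[] so that the function receives the full list of parent names.
2. If an item has "columns": add its name to a copy of the parents list, call the function recursively with that list, and add the result to names.
3. If an item does not have "columns": combine its name with parents, join this list with the . separator, and add the result to names.
4. Return names.

import yaml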
def test(config, parents=[]):
    # parents is never mutated here (it is copied below), so the shared default list is safe
    names = []
    for column in config:
        if column.get("dtype") in ["array", "struct"] and "columns" in column:
            # nested item: extend a copy of the parent path and recurse
            cur_parents = parents.copy()
            cur_parents.append(column["name"])
            children = test(column["columns"], cur_parents)
            names.extend(children)
        else:
            # leaf item: join the parent path and its own name with dots
            value = column["name"]
            value_path = parents + [value]
            names.append(".".join(value_path))
    return names


with open("input.yaml", "r") as inp:
    yaml_conf = yaml.safe_load(inp)

values = test(yaml_conf.get("columns"))
print("[\n{}\n]".format(",\n".join(values)))
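
If you want to try the approach without an input file, here is a minimal sketch of the same recursion run against an inline dictionary. The function name flatten_columns and the shortened sample data are my own assumptions for illustration; the dictionary stands in for what yaml.safe_load would return for a similar file. It also uses parents=None instead of a list default, which is purely a stylistic choice.

```python
def flatten_columns(columns, parents=None):
    """Return dotted paths for every leaf column (hypothetical helper name)."""
    parents = parents or []
    names = []
    for column in columns:
        # Path of this item: all parent names followed by its own name
        path = parents + [column["name"]]
        if column.get("dtype") in ("array", "struct") and "columns" in column:
            # Nested item: recurse and collect the children's dotted paths
            names.extend(flatten_columns(column["columns"], path))
        else:
            # Leaf item: emit the joined path
            names.append(".".join(path))
    return names


# Shortened, made-up stand-in for the parsed YAML (assumed shape, not the full input)
config = {
    "columns": [
        {"name": "PartitionDate"},
        {
            "name": "ControlTransaction",
            "dtype": "struct",
            "columns": [
                {"name": "StoreID"},
                {
                    "name": "Transaction",
                    "dtype": "struct",
                    "columns": [{"name": "TransactionID"}],
                },
            ],
        },
        {
            "name": "RetailTransaction",
            "dtype": "struct",
            "columns": [{"name": "WorkstationID"}],
        },
    ]
}

result = flatten_columns(config["columns"])
print(result)
```

Because each call builds a new path list with parents + [...], sibling branches never see each other's parent names, which is the part the original attempt was missing.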