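I'm very new to Python, so please treat me as such. When I try to convert the XML content into a list of dicts, I get output, but not the output I expect, and I have tried a lot.
XML content: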
    <project>
      <data>
        <row>
          <respondent>m0wxo5f6w42h3fot34m7s6xij</respondent>
          <timestamp>10-06-16 11:30</timestamp>
          <product>1</product>
          <replica>1</replica>
          <seqnr>1</seqnr>
          <session>1</session>
          <column>
            <question>Q1</question>
            <answer>a1</answer>
          </column>
          <column>
            <question>Q2</question>
            <answer>a2</answer>
          </column>
        </row>
        <row>
          <respondent>w42h3fot34m7s6x</respondent>
          <timestamp>10-06-16 11:30</timestamp>
          <product>1</product>
          <replica>1</replica>
          <seqnr>1</seqnr>
          <session>1</session>
          <column>
            <question>Q3</question>
            <answer>a3</answer>
          </column>
          <column>
            <question>Q4</question>
            <answer>a4</answer>
          </column>
          <column>
            <question>Q5</question>
            <answer>a5</answer>
          </column>
        </row>
      </data>
    </project>
The code I used:
    import xml.etree.ElementTree as ET

    tree = ET.parse(xml_file.xml)  # import xml from
    root = tree.getroot()
    data_list = []

    for item in root.find('./data'):  # find all projects node
        data = {}  # dictionary to store content of each projects
        for child in item:
            data[child.tag] = child.text  # add item to dictionary
            #-----------------for loop with subchild is not working as expected in my case
            for subchild in child:
                data[subchild.tag] = subchild.text
        data_list.append(data)
    print(data_list)

    headers = {k for d in data_list for k in d.keys()}  # headers for csv
    with open(csv_file,'w') as f:
        writer = csv.DictWriter(f, fieldnames = headers)  # creating a DictWriter object
        writer.writeheader()  # write headers to csv
        writer.writerows(data_list)
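The data_list output only carries the last question's information for each row into the list of dicts. I think the problem is in the subchild for loop, but I don't understand how to append the dictionaries to the list.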
    [{
        'respondent': 'anonymous_m0wxo5f6w42h3fot34m7s6xij',
        'timestamp': '10-06-16 11:30',
        'product': '1',
        'replica': '1',
        'seqnr': '1',
        'session': '1',
        'column': '\n    ',
        'question': 'Q2',
        'answer': 'a2'
     },
     {
        'respondent': 'w42h3fot34m7s6x',
        'timestamp': '10-06-16 11:30',
        'product': '1',
        'replica': '1',
        'seqnr': '1',
        'session': '1',
        'column': '\n    ',
        'question': 'Q2',
        'answer': 'a2'
     }.......
    ]
I expect the output below. I have tried a lot, but I cannot iterate over the column tags.
    [{
        'respondent': 'anonymous_m0wxo5f6w42h3fot34m7s6xij',
        'timestamp': '10-06-16 11:30',
        'product': '1',
        'replica': '1',
        'seqnr': '1',
        'session': '1',
        'question': 'Q1',
        'answer': 'a1'
     },
     {
        'respondent': 'anonymous_m0wxo5f6w42h3fot34m7s6xij',
        'timestamp': '10-06-16 11:30',
        'product': '1',
        'replica': '1',
        'seqnr': '1',
        'session': '1',
        'question': 'Q2',
        'answer': 'a2'
     },
     {
        'respondent': 'w42h3fot34m7s6x',
        'timestamp': '10-06-16 11:30',
        'product': '1',
        'replica': '1',
        'seqnr': '1',
        'session': '1',
        'question': 'Q3',
        'answer': 'a3'
     },
     {
        'respondent': 'w42h3fot34m7s6x',
        'timestamp': '10-06-16 11:30',
        'product': '1',
        'replica': '1',
        'seqnr': '1',
        'session': '1',
        'question': 'Q4',
        'answer': 'a4'
     },
     {
        'respondent': 'w42h3fot34m7s6x',
        'timestamp': '10-06-16 11:30',
        'product': '1',
        'replica': '1',
        'seqnr': '1',
        'session': '1',
        'question': 'Q5',
        'answer': 'a5'
     }
    ]
I have gone through a lot of Stack Overflow questions about XML trees, but they still did not help me.
Thanks for any help/suggestions.
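A minimal sketch of one way to get one dictionary per column, offered as an illustration rather than a definitive fix. The likely cause of the behaviour above is that a single dictionary is shared by all columns of a row, so each column's question and answer overwrite the previous ones. The sketch instead copies the row-level fields into a fresh dictionary for every column. The function names rows_to_records and records_to_csv are hypothetical, the XML is a trimmed copy of the sample inlined so the code runs on its own, and the CSV is written to an in-memory buffer instead of a file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML, inlined so the sketch is self-contained.
XML_TEXT = """
<project><data>
  <row>
    <respondent>m0wxo5f6w42h3fot34m7s6xij</respondent>
    <timestamp>10-06-16 11:30</timestamp>
    <product>1</product>
    <column><question>Q1</question><answer>a1</answer></column>
    <column><question>Q2</question><answer>a2</answer></column>
  </row>
  <row>
    <respondent>w42h3fot34m7s6x</respondent>
    <timestamp>10-06-16 11:30</timestamp>
    <product>1</product>
    <column><question>Q3</question><answer>a3</answer></column>
  </row>
</data></project>
"""


def rows_to_records(xml_text):
    """Return one dict per <column>, each carrying its parent <row>'s fields."""
    root = ET.fromstring(xml_text)
    records = []
    for row in root.findall("./data/row"):
        # Fields shared by every <column> of this row (everything except <column>).
        shared = {field.tag: field.text for field in row if field.tag != "column"}
        for column in row.findall("column"):
            record = dict(shared)  # fresh copy, so columns do not overwrite each other
            for entry in column:   # the <question> and <answer> elements
                record[entry.tag] = entry.text
            records.append(record)
    return records


def records_to_csv(records):
    """Serialize records to CSV text, using first-seen key order for the header."""
    # dict.fromkeys keeps insertion order, unlike a set comprehension.
    headers = list(dict.fromkeys(key for record in records for key in record))
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=headers)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


data_list = rows_to_records(XML_TEXT)
print(data_list)
print(records_to_csv(data_list))
```

For the trimmed sample this produces three dictionaries, one per column element, and none of them contains a 'column' key. When writing to a real file instead of the buffer, the csv module documentation recommends opening it with newline='' so that no extra blank lines appear on Windows.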
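I had trouble understanding what this code is supposed to do because it uses abstract variable names such as item, child, and subchild, which make the code hard to reason about. I'm not that clever, so I renamed the variables to row, tag, and column to make it easier for me to see what the code does. (In my book, even row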