Looping through lists of strings in a pandas column

Question  Votes: 0  Answers: 1

I have a JSON file that I converted to a pandas DataFrame:

import pandas as pd

# Bring in data
audit = pd.read_json('audit_2018-03-02.json')

Several of the resulting columns hold a list of strings in each row, for example:

    foo

    [By Audience, By Vendor]
    [By Month, By Keyword, By Ad Group, By Service]
    [By Month, By To Date, By Keyword, By Ad Group]

I am trying to loop over the column foo and build a DataFrame from it.

This is what I tried:

list_of_records = [
    (i['By Month'],
     i['By Keyword'],
     i['By Ad Group'],
     i['By Audience'],
     i['By Vender'],
     i['By Week'],
     i['By To Date'],
     i['By Creative'],
     i['By Strategy'],
     i['By Converstion'],
     i['By Geo'],
     i['By Campaign']
    )
    for i, in zip(audit['foo'])
]

Dimensions_Measured = pd.DataFrame.from_records(
list_of_records,
columns = ['By Month', 'By Keyword', 'By Ad Group', 'By Audience', 'By Vender', 
           'By Week', 'By To Date', 'By Creative', 'By Strategy', 'By Converstion', 
           'By Geo', 'By Campaign']
    )

But I get the error TypeError: list indices must be integers, not str

Any ideas on how to achieve this?

Should I apply some kind of one-hot encoding and then create the DataFrame?

python json pandas
1 Answer
1
vote

You can split a Series of lists into multiple columns via pd.Series.values.tolist(). Shorter lists are padded with None:

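import pandas as pd

# Sample data mirroring the foo column from the question
foo = pd.Series([['By Audience', 'By Vendor'],
                 ['By Month', 'By Keyword', 'By Ad Group', 'By Service'],
                 ['By Month', 'By To Date', 'By Keyword', 'By Ad Group']])

# Each inner list becomes one row; list positions become columns
df = pd.DataFrame(foo.values.tolist())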

#              0           1            2            3
# 0  By Audience   By Vendor         None         None
# 1     By Month  By Keyword  By Ad Group   By Service
# 2     By Month  By To Date   By Keyword  By Ad Group
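
For the one-hot encoding idea raised in the question, a minimal sketch follows, reusing the foo Series from the answer above as sample data. The original attempt fails because each element of foo is a plain list, which can only be indexed by integer position, so i['By Month'] raises the TypeError; what is needed instead is a test for whether each label is present. This sketch is not part of the original answer: the name indicators is hypothetical, and it assumes no label contains the '|' character used as a separator.

```python
import pandas as pd

# Sample data mirroring the foo column from the question
foo = pd.Series([['By Audience', 'By Vendor'],
                 ['By Month', 'By Keyword', 'By Ad Group', 'By Service'],
                 ['By Month', 'By To Date', 'By Keyword', 'By Ad Group']])

# Join each list into one '|'-delimited string, then split it back into
# one 0/1 indicator column per distinct label.
# Assumption: '|' never appears inside a label.
indicators = foo.str.join('|').str.get_dummies(sep='|')

print(indicators)
```

Each row then holds a 1 under every label present in its list and 0 elsewhere, so a single dimension can be selected by name, for example indicators['By Month'].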