Pandas table scraping

Question  Votes: 0  Answers: 1

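I am trying to figure out the best way to convert a table into JSON records. I currently get the output I need, but the format of the table is confusing me. The following example should explain: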

ID   Product        Item_Material   Owner           Interest %
123  Test Item 1    Electric        Elctrotech              60%
null null           null            Spark inc               40%
124  Test Item 2    Wood            TY Toys                 100%
125  Test Item 3    Plastic         NA Materials            100%

The newline-delimited JSON I get is what I want, but I would like nested table rows to be represented as nested JSON when they are part of a parent row.

{"ID":"Test Item 1", "Item_Material":"Electric", "Owner":"Elctrotech","Interest %":"60%"}
{"ID":null, "Item_Material":null, "Owner":"Spark inc","Interest %":"40%"}
{"ID":"Test Item 2", "Item_Material":"Wood", "Owner":"TY Toys","Interest %":"100%"}
{"ID":"Test Item 3","Item_Material":"Plastic","Owner":"NA Materials","Interest %":"100%"}

The goal is for the first JSON line to look like this:

{"ID":"Test Item 1", "Item_Material":"Electric", "Owners": [{"Owner": "Elctrotech", "Interest %":"60%", "Owner":"Spark inc","Interest %":"40%"}]}

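The data comes from scraping a table with Beautiful Soup. Each row of the table I provided sits in its own <tr> tag, so this is how it appears when pulled into a pandas DataFrame. I don't know whether pandas has a function to merge a row into the row above it, so that each 'Product' gets a single JSON record. Sometimes an item may have more than two owners.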

python json pandas
1 Answer
0
Votes

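The output dict rows are not exactly what you expected, but your dict syntax was wrong: the object inside "Owners" repeats the same keys, so only the last "Owner" / "Interest %" pair would survive. Try this. It uses only pandas.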

import pandas as pd

p = [[123, "Test Item 1", "Electric", "Elctrotech", "60%"],
     [124, "Test Item 2", "Wood", "TY Toys", "100%"],
     [125, "Test Item 1", "Plastic", "NA Materials", "100%"],
     [123, "Test Item 1", "Foo", "Bar", "80%"],
     [123, "Test Item 1", "Electric", "TRY TRY TRY", "70%"]]

x = pd.DataFrame(p, columns=["ID", "Product", "Item_Material", "Owner", "Interest %"])

# Group rows that share the same product and material, then collect
# the owners and interests of each group into lists.
x_gb = x.groupby(["Product", "Item_Material"])
grouped_Series_Owner = x_gb["Owner"].apply(list).to_dict()
grouped_Series_Interest = x_gb["Interest %"].apply(list).to_dict()

# The dict keys are (Product, Item_Material) tuples, one per group.
for product, material in grouped_Series_Owner:
    d = dict(ID=product, Item_Material=material,
             Owners={"Owner": grouped_Series_Owner[(product, material)],
                     "Interest %": grouped_Series_Interest[(product, material)]})
    print(d)
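
If you want something closer to the shape in the question (one record per product, with a list of owner objects), here is a minimal sketch. It assumes that a row whose ID, Product and Item_Material are empty always belongs to the nearest row above it, which matches the sample table but is not confirmed for the real data. The names `rows`, `parent_cols` and `records` are only illustrative.

```python
import json
import pandas as pd

# Sample rows from the question; None marks the empty cells of a
# continuation row (assumed to be how the scraped nulls arrive).
rows = [
    [123, "Test Item 1", "Electric", "Elctrotech", "60%"],
    [None, None, None, "Spark inc", "40%"],
    [124, "Test Item 2", "Wood", "TY Toys", "100%"],
    [125, "Test Item 3", "Plastic", "NA Materials", "100%"],
]
df = pd.DataFrame(rows, columns=["ID", "Product", "Item_Material", "Owner", "Interest %"])

parent_cols = ["ID", "Product", "Item_Material"]
# Assumption: an empty parent cell inherits the value from the row above.
df[parent_cols] = df[parent_cols].ffill()

records = []
# sort=False keeps the products in the order they appear in the table.
for (item_id, product, material), group in df.groupby(parent_cols, sort=False):
    records.append({
        "ID": int(item_id),  # plain int so json.dumps can serialize it
        "Product": product,
        "Item_Material": material,
        # One {"Owner": ..., "Interest %": ...} dict per table row in the group.
        "Owners": group[["Owner", "Interest %"]].to_dict(orient="records"),
    })

# One JSON record per line, as in the question.
for record in records:
    print(json.dumps(record))
```

Because every owner becomes its own dict in the "Owners" list, this should work for any number of owners per item, not just two.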