Web Scraping in Python: not getting the required data from a web page's drop-down menu


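I am trying to get data from a web page. Here is the link: https://www.cardekho.com/compare-cars. On this page, I need to compare the car data tables and their specifications after supplying a URL built from the car models and variants chosen in the drop-down menus. Here is my sample code.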

from bs4 import BeautifulSoup
import requests
import csv

def job():
    url = 'https://www.cardekho.com/compare/maruti-gypsy-and-maruti-omni.htm'

    headers = {'User-Agent': 'Mozilla/65.0'}
    response = requests.get(url, headers=headers)

    soup = BeautifulSoup(response.content, 'html.parser')

    stat_table_1 = soup.find_all('table')


    print(len(stat_table_1))

    tab_1 = stat_table_1[0]
    tab_2 = stat_table_1[1]
    tab_3 = stat_table_1[2]
    tab_4 = stat_table_1[3]
    tab_5 = stat_table_1[4]

    rows_tab_1 = tab_1.findAll('tr')
    rows_tab_2 = tab_2.findAll('tr')
    rows_tab_3 = tab_3.findAll('tr')
    rows_tab_4 = tab_4.findAll('tr')
    rows_tab_5 = tab_5.findAll('tr')

    csv_file_1 = open("D:/CarDekho_Data/maruti/maruti_2/overview.csv", 'wt', encoding="utf-8", newline='')
    csv_file_2 = open("D:/CarDekho_Data/maruti/maruti_2/engine.csv", 'wt', encoding="utf-8", newline='')
    csv_file_3 = open("D:/CarDekho_Data/maruti/maruti_2/transmission.csv", 'wt', encoding="utf-8", newline='')
    csv_file_4 = open("D:/CarDekho_Data/maruti/maruti_2/steering.csv", 'wt', encoding="utf-8", newline='')
    csv_file_5 = open("D:/CarDekho_Data/maruti/maruti_2/brake_system.csv", 'wt', encoding="utf-8", newline='')

    writer_1 = csv.writer(csv_file_1)
    writer_2 = csv.writer(csv_file_2)
    writer_3 = csv.writer(csv_file_3)
    writer_4 = csv.writer(csv_file_4)
    writer_5 = csv.writer(csv_file_5)

    try:
        for row in rows_tab_1:
            csv_row = []
            for cell in row.findAll(['td', 'th']):
                csv_row.append(cell.get_text())
            writer_1.writerow(csv_row)
    finally:
        csv_file_1.close()

    try:
        for row in rows_tab_2:
            csv_row = []
            for cell in row.findAll(['td', 'th']):
                csv_row.append(cell.get_text())
            writer_2.writerow(csv_row)
    finally:
        csv_file_2.close()

    try:
        for row in rows_tab_3:
            csv_row = []
            for cell in row.findAll(['td', 'th']):
                csv_row.append(cell.get_text())
            writer_3.writerow(csv_row)
    finally:
        csv_file_3.close()

    try:
        for row in rows_tab_4:
            csv_row = []
            for cell in row.findAll(['td', 'th']):
                csv_row.append(cell.get_text())
            writer_4.writerow(csv_row)
    finally:
        csv_file_4.close()

    try:
        for row in rows_tab_5:
            csv_row = []
            for cell in row.findAll(['td', 'th']):
                csv_row.append(cell.get_text())
            writer_5.writerow(csv_row)
    finally:
        csv_file_5.close()

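The problem is that, because of the URL, I cannot get exactly the data I need. That is, if I supply four car models and their variants to compare, the page returns data for that model picked seemingly at random from the drop-down menu mentioned above.

Can anyone explain how I can solve this and get exactly the data I need from that URL?

Any help would be appreciated.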

python-3.x web-scraping beautifulsoup request
1 Answer

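You are doing a lot of work to parse these tables. pandas can do that for you with .read_html().

It returns a list of DataFrames. Just pick the DataFrame you want and write it to CSV with pandas .to_csv().

If it were me, I would compress this into a loop that iterates over the tables, but I have written it out in full so you can see each step broken down (in case that is useful).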

import pandas as pd

url = 'https://www.cardekho.com/compare/maruti-gypsy-and-maruti-omni.htm'
# read_html fetches the page and returns one DataFrame per <table> element
tables = pd.read_html(url)
# Column labels for the compared cars, taken from the first table
compare_cols = list(tables[0].columns[1:])

overview = tables[0]

# The remaining tables keep their own first-column label and reuse the car names
engine = tables[1]
engine.columns = [engine.columns[0]] + compare_cols

transmission = tables[2]
transmission.columns = [transmission.columns[0]] + compare_cols

steering = tables[3]
steering.columns = [steering.columns[0]] + compare_cols

brakes_system = tables[4]
brakes_system.columns = [brakes_system.columns[0]] + compare_cols


# Write each table to its own CSV file without the DataFrame index
overview.to_csv('D:/CarDekho_Data/maruti/maruti_2/overview.csv', index=False)
engine.to_csv('D:/CarDekho_Data/maruti/maruti_2/engine.csv', index=False)
transmission.to_csv('D:/CarDekho_Data/maruti/maruti_2/transmission.csv', index=False)
steering.to_csv('D:/CarDekho_Data/maruti/maruti_2/steering.csv', index=False)
brakes_system.to_csv('D:/CarDekho_Data/maruti/maruti_2/brakes_system.csv', index=False)

Output:

print (overview)
                                      Overview                        ...                                                                       Omni
0                                On Road Price                        ...                                                               Rs.3,36,883*
1                                    Fuel Type                        ...                                                                     Petrol
2                     Engine Displacement (cc)                        ...                                                                        796
3                             Available Colors                        ...                          Fantasy BlackMetallic silky silverMetallic Pea...
4                                    Body Type                        ...                                                                    Minivan
5                                    Max Power                        ...                                                            34.2bhp@5000rpm
6                                 User Reviews                        ...                                                     4.5Based on 45 Reviews
7                               Mileage (ARAI)                        ...                                                                  16.8 kmpl
8                                 Cargo Volume                        ...                                                                 210-litres
9                           Fuel Tank Capacity                        ...                                                                   35Litres
10                            Seating Capacity                        ...                                                                          5
11                           Transmission Type                        ...                                                                     Manual
12                           Offers & Discount                        ...                                                            1 OfferView now
13                     Finance Available (EMI)                        ...                                                         Rs.6,510 Check Now
14                           Insurance SaveBig                        ...                                                          Rs.17,146Know how
15                                Service Cost                        ...                                                                   Rs.2,996
16                                         NaN                        ...                                                                        NaN
17                             Air Conditioner                        ...                                                                         No
18                                   Cd Player                        ...                                                                         No
19                    Anti Lock Braking System                        ...                                                                         No
20                              Power Steering                        ...                                                                         No
21                         Power Windows Front                        ...                                                                         No
22                          Power Windows Rear                        ...                                                                         No
23                               Leather Seats                        ...                                                                         No
24                Speed Sensing Auto Door Lock                        ...                                                                         No
25             Impact Sensing Auto Door Unlock                        ...                                                                          -
26                             Air Conditioner                        ...                                                                         No
27                                      Heater                        ...                                                                         No
28                         Adjustable Steering                        ...                                                                         No
29                                  Tachometer                        ...                                                                         No
..                                         ...                        ...                                                                        ...
47                       Adjustable Headlights                        ...                                                                        Yes
48                            Fog Lights Front                        ...                                                                         No
49                             Fog Lights Rear                        ...                                                                         No
50  Power Adjustable Exterior Rear View Mirror                        ...                                                                         No
51    Manually Adjustable Ext Rear View Mirror                        ...                                                                        Yes
52           Electric Folding Rear View Mirror                        ...                                                                         No
53                          Rain Sensing Wiper                        ...                                                                         No
54                           Rear Window Wiper                        ...                                                                         No
55                          Rear Window Washer                        ...                                                                         No
56                        Rear Window Defogger                        ...                                                                         No
57                                Wheel Covers                        ...                                                                         No
58                                Alloy Wheels                        ...                                                                         No
59                               Power Antenna                        ...                                                                         No
60                                Tinted Glass                        ...                                                                         No
61                                Rear Spoiler                        ...                                                                         No
62                Removable Or Convertible Top                        ...                                                                         No
63                                Roof Carrier                        ...                                                                         No
64                                    Sun Roof                        ...                                                                         No
65                                   Moon Roof                        ...                                                                         No
66                                Side Stepper                        ...                                                                         No
67    Outside Rear View Mirror Turn Indicators                        ...                                                                         No
68                          Integrated Antenna                        ...                                                                         No
69                               Chrome Grille                        ...                                                                         No
70                              Chrome Garnish                        ...                                                                         No
71                             Smoke Headlamps                        ...                                                                         No
72                                   Roof Rail                        ...                                                                         No
73                                    Lighting                        ...                                                                         No
74                                Trunk Opener                        ...                                                                      Lever
75                         Additional Features                        ...                          2 Speed Windshield WiperFront And Rear Thermop...
76                          Heated Wing Mirror                        ...                                                                         No

[77 rows x 3 columns]

...

print (engine)
                 Engine                 ...                                                  Omni
0                  Type                 ...                                        In-Line Engine
1          Displacement                 ...                                                   796
2             Max Power                 ...                                       34.2bhp@5000rpm
3                  Year                 ...                                                  2010
4            Max Torque                 ...                                          59Nm@2500rpm
5           Description                 ...                   0.8-litre 34.2bhp 6V In-Line Engine
6        No Of Cylinder                 ...                                                     3
7   Valves Per Cylinder                 ...                                                     2
8   Valve Configuration                 ...                                                  SOHC
9    Fuel Supply System                 ...                                                  MPFI
10         Bore XStroke                 ...                                                    No
11    Compression Ratio                 ...                                                    No
12        Turbo Charger                 ...                                                    No
13        Super Charger                 ...                                                    No

[14 rows x 3 columns]

And so on.
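
The loop version mentioned above might look roughly like the sketch below. The helper name save_tables and its parameters are hypothetical (they are not part of pandas), and it assumes every table has the same number of columns as the first one, as the tables on this page did. If a table has a different width, assigning the columns raises a ValueError.

```python
from pathlib import Path

import pandas as pd


def save_tables(tables, names, out_dir):
    """Write each DataFrame in `tables` to <out_dir>/<name>.csv.

    Hypothetical helper: `tables` is a list such as the one returned by
    pd.read_html, `names` gives one file stem per table, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Column labels for the compared cars come from the first table.
    compare_cols = list(tables[0].columns[1:])

    written = []
    # zip stops at the shorter list, so extra tables on the page are ignored.
    for name, table in zip(names, tables):
        table = table.copy()  # leave the caller's DataFrame untouched
        # Keep the table's own first-column label, reuse the car names for the rest.
        table.columns = [table.columns[0]] + compare_cols
        path = out_dir / f"{name}.csv"
        table.to_csv(path, index=False)
        written.append(path)
    return written


# Example call (needs network access and an HTML parser that pandas can use):
# url = 'https://www.cardekho.com/compare/maruti-gypsy-and-maruti-omni.htm'
# names = ['overview', 'engine', 'transmission', 'steering', 'brakes_system']
# save_tables(pd.read_html(url), names, 'D:/CarDekho_Data/maruti/maruti_2')
```

Keeping the file names in a single list means a sixth table only needs one more entry in names rather than another copy of the same three lines.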
