Improving the efficiency of Python REST requests

Problem description

Previously I used SQL joins to fetch the data; 10,000 rows took only a fraction of a second.

select technicians.name,
       profession.expertise,
       responsible.name,
       building.locker
from technicians
join profession on technicians.profession = profession.id
join responsible on technicians.responsible = responsible.id
join building on technicians.locker = building.id

After migrating to the cloud I can no longer access that database directly and have to use REST. Unfortunately, converting this to Python REST requests causes a 20-second stall for just 50 rows; 10,000 rows would take hours!? That is because I have to iterate over all the results and request the linked relation data for each one, right?

Pseudo-Python:

import requests

result_list = requests.get("https://sample.com/api/v1/technicians").json()
for technician in result_list:
    technician_id = technician["id"]
    profession = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/profession").json()
    responsible = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/responsible").json()
    building = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/building").json()

Question: am I doing something wrong in the way I make these requests?

python rest python-requests
1 Answer

After migrating to a REST-based solution you are running into a performance problem: fetching the related data for each technician now takes far longer than the SQL joins did. Here are several ways to optimize your REST API usage:

1. Check for batch or expand requests

First, check whether the API supports querying related entities in a single request. Many REST APIs provide include or expand parameters, which can dramatically reduce the number of API calls.

For example:

result_list = requests.get("https://sample.com/api/v1/technicians?include=profession,responsible,building").json()

This returns the technicians together with their profession, responsible, and building data in a single request, avoiding an extra call per relation.
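
If the API does support such a parameter (the exact name and the response shape vary between APIs), consuming the embedded data might look like the sketch below; the assumption that each relation comes back nested under its technician record is hypothetical:

import requests

# Hypothetical response shape: each technician record embeds its relations
response = requests.get(
    "https://sample.com/api/v1/technicians",
    params={"include": "profession,responsible,building"},
)
for technician in response.json():
    expertise = technician["profession"]["expertise"]
    locker = technician["building"]["locker"]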

2. Use concurrent requests

If the related data cannot be batched into a single request, you can improve performance by issuing the requests concurrently instead of sequentially. Python's concurrent.futures or asyncio can help with this (an asyncio sketch follows at the end of this section).

Here is an example using concurrent.futures:

import requests
from concurrent.futures import ThreadPoolExecutor

def get_related_data(technician):
    technician_id = technician['id']
    profession = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/profession").json()
    responsible = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/responsible").json()
    building = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/building").json()
    return {'technician': technician, 'profession': profession, 'responsible': responsible, 'building': building}

# Fetch the list of technicians
result_list = requests.get("https://sample.com/api/v1/technicians").json()

# Fetch related data concurrently
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(get_related_data, result_list))

# `results` will now contain all technicians and their related data

This approach reduces the wait time by fetching data in parallel rather than one request at a time.
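
Since asyncio was mentioned as an alternative, here is a minimal sketch of the same fan-out using asyncio with the third-party aiohttp package (pip install aiohttp); the endpoints are the same hypothetical ones as above:

import asyncio
import aiohttp

async def fetch_json(session, url):
    async with session.get(url) as resp:
        return await resp.json()

async def get_related_data(session, technician):
    base = f"https://sample.com/api/v1/technicians/{technician['id']}"
    # Fire the three relation requests for one technician in parallel
    profession, responsible, building = await asyncio.gather(
        fetch_json(session, f"{base}/profession"),
        fetch_json(session, f"{base}/responsible"),
        fetch_json(session, f"{base}/building"),
    )
    return {"technician": technician, "profession": profession,
            "responsible": responsible, "building": building}

async def main():
    # The session's default connector caps concurrent connections (100),
    # which keeps a large technician list from opening too many sockets
    async with aiohttp.ClientSession() as session:
        technicians = await fetch_json(session, "https://sample.com/api/v1/technicians")
        return await asyncio.gather(
            *(get_related_data(session, t) for t in technicians)
        )

results = asyncio.run(main())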

3. Cache repeated or infrequently changing data

If some entities (such as profession or responsible) are shared across many technicians or updated infrequently, caching this data locally can help cut down on redundant API calls.

Here is a simple caching implementation:

import requests

cache = {}

def fetch_with_cache(url):
    if url not in cache:
        cache[url] = requests.get(url).json()
    return cache[url]

result_list = requests.get("https://sample.com/api/v1/technicians").json()

for technician in result_list:
    profession = fetch_with_cache(f"https://sample.com/api/v1/technicians/{technician['id']}/profession")
    responsible = fetch_with_cache(f"https://sample.com/api/v1/technicians/{technician['id']}/responsible")
    building = fetch_with_cache(f"https://sample.com/api/v1/technicians/{technician['id']}/building")

This approach reduces the number of requests whenever the same resource is requested more than once. Note that the cache is keyed by the full URL, so it only helps when the exact same URL comes up repeatedly.
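
As a standard-library alternative to the hand-rolled dictionary, functools.lru_cache can memoize the fetch function by URL; this is a sketch, not part of the original answer:

import requests
from functools import lru_cache

@lru_cache(maxsize=None)
def fetch_with_cache(url):
    # Each distinct URL hits the network once; repeats are served from memory
    return requests.get(url).json()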

4. Limit the requested fields

If you are fetching more data than you need, check whether the API supports partial responses so that you can request only the specific fields you need. This reduces the payload size and improves response time.
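
For example, many APIs accept a field-selection query parameter; the fields name below is hypothetical and depends on what your API actually supports:

import requests

# Hypothetical: request only the columns the original SQL query needed
result_list = requests.get(
    "https://sample.com/api/v1/technicians",
    params={"fields": "id,name,profession,responsible,locker"},
).json()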
