Previously, I used a SQL join to fetch this data, and 10,000 rows took only a fraction of a second.
select technicians.name,
       profession.expertise,
       responsible.name,
       building.locker
from technicians
join profession on technicians.profession = profession.id
join responsible on technicians.responsible = responsible.id
join building on technicians.locker = building.id
After migrating to the cloud, I can no longer access that database directly and have to use REST. Unfortunately, converting this to Python REST requests causes a 20-second pause for just 50 rows — so 10,000 rows would take hours!? That's because I have to iterate over all the results and request the linked data for each relation, right?
Pseudo-Python:
result_list = requests.get("https://sample.com/api/v1/technicians").json()
for technician in result_list:
    profession = requests.get(f"https://sample.com/api/v1/technicians/{technician['id']}/profession")
    responsible = requests.get(f"https://sample.com/api/v1/technicians/{technician['id']}/responsible")
    building = requests.get(f"https://sample.com/api/v1/technicians/{technician['id']}/building")
Question: am I doing something wrong with the way I make these requests?
After migrating to a REST-based solution, you are facing a performance problem: fetching the related data for each technician now takes far longer than it did with SQL joins. Here are several ways to optimize your REST API usage:
First, check whether the API supports querying related entities in a single request. Many REST APIs provide an `include` or `expand` parameter, which can drastically reduce the number of API calls.
For example:
result_list = requests.get("https://sample.com/api/v1/technicians?include=profession,responsible,building")
This returns the technicians together with their `profession`, `responsible`, and `building` data in a single request, avoiding the extra calls per relationship.
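If the API does support `include`, the related records typically arrive embedded or side-loaded, and you re-join them in memory. The response shape below is purely hypothetical (the `included` key and the id-keyed maps are assumptions — check what your API actually returns); a sketch of that local join:

```python
# Hypothetical compound response; real APIs vary in how they
# side-load related records.
response = {
    "technicians": [
        {"id": 1, "name": "Ada", "profession": 10, "responsible": 20, "building": 30},
    ],
    "included": {
        "profession": {10: {"expertise": "electrical"}},
        "responsible": {20: {"name": "Grace"}},
        "building": {30: {"locker": "B-12"}},
    },
}

def join_included(response):
    """Re-join side-loaded records into flat rows, like the SQL join did."""
    inc = response["included"]
    rows = []
    for t in response["technicians"]:
        rows.append({
            "name": t["name"],
            "expertise": inc["profession"][t["profession"]]["expertise"],
            "responsible_name": inc["responsible"][t["responsible"]]["name"],
            "locker": inc["building"][t["building"]]["locker"],
        })
    return rows
```

The lookup per row is a dict access, so the in-memory join stays fast even for 10,000 rows.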
If the related data cannot be batched into one request, you can improve performance by issuing the requests concurrently rather than sequentially. Python's `concurrent.futures` or `asyncio` can help here.
Here is an example using `concurrent.futures`:
import requests
from concurrent.futures import ThreadPoolExecutor

def get_related_data(technician):
    technician_id = technician['id']
    profession = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/profession").json()
    responsible = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/responsible").json()
    building = requests.get(f"https://sample.com/api/v1/technicians/{technician_id}/building").json()
    return {'technician': technician, 'profession': profession, 'responsible': responsible, 'building': building}

# Fetch the list of technicians
result_list = requests.get("https://sample.com/api/v1/technicians").json()

# Fetch related data concurrently
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(get_related_data, result_list))

# `results` now contains all technicians and their related data
This approach reduces the wait time by fetching data in parallel rather than one request at a time.
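For the `asyncio` route, a minimal sketch using `asyncio.to_thread` to overlap the blocking calls. The `fetch` parameter is a stand-in so the structure is clear; in real use you would pass `lambda url: requests.get(url).json()`:

```python
import asyncio

async def gather_related(technician_ids, fetch):
    """Fetch the three relations for every technician concurrently.

    `fetch(url)` is any blocking function returning parsed JSON;
    asyncio.to_thread runs it off the event loop so calls overlap.
    """
    async def one(tid):
        base = f"https://sample.com/api/v1/technicians/{tid}"
        profession, responsible, building = await asyncio.gather(
            asyncio.to_thread(fetch, base + "/profession"),
            asyncio.to_thread(fetch, base + "/responsible"),
            asyncio.to_thread(fetch, base + "/building"),
        )
        return {'id': tid, 'profession': profession,
                'responsible': responsible, 'building': building}

    return await asyncio.gather(*(one(t) for t in technician_ids))
```

With this structure all 3 × N requests are in flight at once, bounded only by the thread pool, instead of 3 × N sequential round trips.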
If some entities (such as `profession` or `responsible`) are shared across many technicians or change infrequently, caching this data locally can help eliminate redundant API calls.
Here is a simple cache implementation:
import requests

cache = {}

def fetch_with_cache(url):
    if url not in cache:
        cache[url] = requests.get(url).json()
    return cache[url]

result_list = requests.get("https://sample.com/api/v1/technicians").json()
for technician in result_list:
    profession = fetch_with_cache(f"https://sample.com/api/v1/technicians/{technician['id']}/profession")
    responsible = fetch_with_cache(f"https://sample.com/api/v1/technicians/{technician['id']}/responsible")
    building = fetch_with_cache(f"https://sample.com/api/v1/technicians/{technician['id']}/building")
This reduces the number of requests when the same resource is fetched repeatedly.
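As a variant of the hand-rolled dict cache, the standard library's `functools.lru_cache` provides the same memoization with an optional size limit. The stub body below stands in for the HTTP call so the caching behaviour is visible; in real use it would be `requests.get(url).json()`:

```python
from functools import lru_cache

# Records each URL that actually triggers a "fetch", so cache hits
# are observable; only a stand-in for the real HTTP call.
calls = []

@lru_cache(maxsize=None)
def fetch_cached(url):
    calls.append(url)            # real code: return requests.get(url).json()
    return {"url": url}
```

Repeated calls with the same URL hit the cache and never reach the network; `fetch_cached.cache_clear()` resets it when the data may have changed.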
Finally, if you are fetching more data than you need, check whether the API supports partial responses that let you request only the specific fields you require. This reduces payload size and improves response time.
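As a sketch, assuming the API honours a `fields` query parameter for partial responses (the parameter name varies per API; `select` and `$select` are also common), the request URL could be built like this:

```python
from urllib.parse import urlencode

# Combine expansion and field selection in one request.
# `fields` here is an assumed parameter name; check the API docs.
params = {
    "include": "profession,responsible,building",
    "fields": "id,name,locker",
}
# safe="," keeps the comma-separated lists readable in the URL
url = "https://sample.com/api/v1/technicians?" + urlencode(params, safe=",")
# requests.get(url).json() would then return only the listed fields.
```

Combined with `include`, this can get the whole workload back down to a single, small request — close to what the original SQL join delivered.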