Is this the correct way to use a session with multiprocessing in Python in a process-safe way?


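I'm going to use a session to make requests in my application. I've read a few threads about thread-safety and process-safety issues with sessions, but none of them are recent.

My application runs more than 800 processes, so I'm a bit worried that the data retrieved from the API could get mixed up between them.

I came up with the implementation below and would like to know whether it is the right way to handle requests sessions in a multiprocess program.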

import multiprocessing
import time
import sys
import requests
from requests.packages.urllib3.util.retry import Retry

session = None

def initialize_session():
    global session
    if session is None:
        session = requests.Session()
        retry_strategy = Retry(
            total=3,
            status_forcelist=[429,500,502,503,504],
            allowed_methods=["POST"],
            backoff_factor=1
            )
        adapter = requests.adapters.HTTPAdapter( max_retries=retry_strategy, 
          pool_connections=1, pool_maxsize=1)
        session.mount("https://", adapter)


def worker(j):
    global session
    initialize_session()
    try:
        for i in range(10):
            print("I am process "+str(j)+" and my cookie is ")
            print(session.cookies.get_dict())
            session.cookies.set('worker', str(j))
            time.sleep(5) #do some api work, function calls, etc

    except:
        raise

processes = []
for j in range(0, 4):
    p = multiprocessing.Process(target=worker, args=(j,))
    processes.append(p)
    time.sleep(0.1)
    p.start()

for p in processes:
    p.join()

python python-3.x session python-requests python-multiprocessing
1 Answer

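The implementation you provided has some problems in how it handles the global session in a multiprocessing context. Each process maintains its own memory space, so a module-level requests.Session is never truly shared between processes: every worker ends up with its own copy. The risk is in how that copy comes about. If the session is created in the parent before the workers start and the fork start method is used, each child inherits a copy of the same session object, including any connections it already holds open, which can lead to corrupted state or other unexpected behavior. Creating the session explicitly inside each worker process avoids this.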
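
One way to make the "one session per process" rule explicit is to record which process created the session and to build a fresh one whenever the current PID differs. The sketch below is only an illustration under that assumption: `PerProcessSession` and its `factory` argument are hypothetical names introduced here, not part of requests.

```python
import os

import requests


class PerProcessSession:
    """Hypothetical helper: hands out a session that belongs to the calling process."""

    def __init__(self, factory=requests.Session):
        self._factory = factory  # callable that builds a new session
        self._session = None
        self._pid = None         # PID of the process that built self._session

    def get(self):
        pid = os.getpid()
        # Build a new session on first use, or when this object was
        # inherited from a different process (for example after a fork).
        if self._session is None or self._pid != pid:
            self._session = self._factory()
            self._pid = pid
        return self._session


if __name__ == "__main__":
    holder = PerProcessSession()
    # Repeated calls inside one process return the same session object.
    print(holder.get() is holder.get())
```

A worker would call `holder.get()` wherever it needs the session. The PID check is cheap, and it means a session created in the parent is never reused by a child.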

Here is how you can adjust the code to handle sessions correctly in a multiprocess setup:

import multiprocessing
import time
import requests
from urllib3.util.retry import Retry


def create_session():
    # Each process gets its own session
    session = requests.Session()
    retry_strategy = Retry(
        total=3,
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["POST"],
        backoff_factor=1
    )
    adapter = requests.adapters.HTTPAdapter(max_retries=retry_strategy)
    session.mount("https://", adapter)
    return session


def worker(j):
    # Each worker process initializes its own session
    session = create_session()
    
    try:
        for i in range(10):
            print(f"I am process {j} and my cookie is ")
            print(session.cookies.get_dict())
            # setting a cookie unique to the process
            session.cookies.set('worker', str(j))
            time.sleep(5)
    except Exception as e:
        print(f"Error in process {j}: {e}")


if __name__ == "__main__":
    processes = []
    for j in range(4):  # Creating 4 processes
        p = multiprocessing.Process(target=worker, args=(j,))
        processes.append(p)
        time.sleep(0.1)
        p.start()

    for p in processes:
        p.join()
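
If the work can be expressed as many short tasks rather than one long-running function per process, a worker pool with an initializer is another way to get one session per process. The following is a minimal sketch, not a drop-in replacement: `build_session`, `init_worker`, `handle_task` and the `last_task` cookie are hypothetical names, and the actual API call is left out because the endpoint is application-specific.

```python
import multiprocessing

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

_session = None  # set once per worker process by init_worker()


def build_session():
    """Create a session with the same retry policy used above."""
    session = requests.Session()
    retry_strategy = Retry(
        total=3,
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["POST"],
        backoff_factor=1,
    )
    session.mount("https://", HTTPAdapter(max_retries=retry_strategy))
    return session


def init_worker():
    """Pool initializer: runs once in every worker process."""
    global _session
    _session = build_session()


def handle_task(task_id):
    """Runs in a worker and reuses that worker's own session."""
    _session.cookies.set("last_task", str(task_id))
    # Real code would call _session.post(...) here.
    return task_id, _session.cookies.get("last_task")


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4, initializer=init_worker) as pool:
        for task_id, cookie in pool.imap_unordered(handle_task, range(8)):
            print(task_id, cookie)
```

Because the initializer runs inside each worker after it has started, the parent never creates a session and there is nothing for the children to inherit.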