Nested for loop to populate API parameters

Question (0 votes, 1 answer)

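I need to write a nested for loop for a specific API. I start by creating a calendar dataframe, which I use to populate the API parameters and the Date column of the output dataframe.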

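The calendar code seems to work. The Ms columns hold the millisecond timestamps the API requires. I read them into variables inside the for loop further down.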

import pandas as pd
from datetime import datetime, timedelta, timezone

calendar = pd.date_range(start=(datetime.today() - timedelta(days=6)).date(), end=datetime.today().date(), freq='d')
calendar = pd.DataFrame(calendar)
calendar.rename(columns={0:'Date'}, inplace=True)
calendar['startMs'] = calendar.apply(lambda x: int(round(datetime.combine(x['Date'], datetime.min.time()).astimezone(timezone.utc).timestamp() * 1000)), 1)
calendar['endMs'] = calendar.apply(lambda x: int(round(datetime.combine(x['Date'], datetime.max.time()).astimezone(timezone.utc).timestamp() * 1000)), 1)
Date startMs endMs
2024-05-29 1716955200000 1717041600000
2024-05-30 1717041600000 1717128000000
2024-05-31 1717128000000 1717214400000
2024-06-01 1717214400000 1717300800000
2024-06-02 1717300800000 1717387200000
2024-06-03 1717387200000 1717473600000
2024-06-04 1717473600000 1717560000000
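
For reference, the same day boundaries can be computed with the standard library alone. This is a minimal sketch, not the asker's code: `day_bounds_ms` is a hypothetical helper name, and the fixed UTC-4 offset is an assumption chosen because it reproduces the first row of the table above.

```python
from datetime import date, datetime, timedelta, timezone
from datetime import time as dt_time


def day_bounds_ms(day, tz=timezone.utc):
    """Return (start_ms, end_ms) for the calendar day `day` in timezone `tz`.

    start_ms is local midnight; end_ms is the following local midnight,
    both expressed as milliseconds since the Unix epoch.
    """
    start = datetime.combine(day, dt_time.min, tzinfo=tz)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


# Assumed offset: UTC-4, which matches the values shown in the table.
eastern_daylight = timezone(timedelta(hours=-4))
print(day_bounds_ms(date(2024, 5, 29), eastern_daylight))
# (1716955200000, 1717041600000)
```

Using the next midnight as the end of the day avoids rounding 23:59:59.999999 up to the same value.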

This is the loop I tried, but it doesn't work.

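The time.sleep(1) is there to keep the API from exceeding its call limit (our rate is 5 calls per second). The goal is to pull 7 days of data for each driver and then ETL it to a server. The driver loop works if I pull only one day, but it fails when I try multiple days. For testing purposes I have limited the drivers table to 20 records, and the output is always 20 records. I expected 140 records (20 drivers x 7 days). Today is 6/4, and the output I get shows 6/3 in the Date column (in case that helps).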

for i in range(0,6):
    date = calendar['Date'][i].date()
    startMs = calendar['startMs'][i]
    endMs = calendar['endMs'][i]
    df = pd.DataFrame()
    for driverId in drivers['id']:
        response = requests.request('GET', url, headers=headers).json()
        url = f'https://api.samsara.com/v1/fleet/drivers/{driverId}/safety/score?startMs={startMs}&endMs={endMs}'
        df = df._append(response, ignore_index=True)
        time.sleep(1)
    df['Date'] = date
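
On the rate limit mentioned above: with a limit of 5 calls per second, a fixed one-second pause waits longer than the limit requires. Below is a minimal sketch of a throttle that spaces calls at least 0.2 s apart. `Throttle` is a hypothetical helper (not part of `requests` or the Samsara API), and it assumes the limit is enforced as a simple per-second rate.

```python
import time


class Throttle:
    """Make successive wait() calls return at least 1/calls_per_second apart."""

    def __init__(self, calls_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = 1.0 / calls_per_second
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(calls_per_second=5)
for _ in range(3):
    throttle.wait()  # call the API right after this line
```

Because time spent inside the request itself counts toward the interval, the loop only sleeps for whatever part of the 0.2 s is left.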

Here is the info for the drivers dataframe:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   id        20 non-null     object
 1   name      20 non-null     object
 2   username  20 non-null     object
 3   timezone  20 non-null     object
dtypes: object(4)
memory usage: 772.0+ bytes

This loop works on its own:

# Today
value = 0
date = calendar['Date'][value].date()
startMs = calendar['startMs'][value]
endMs = calendar['endMs'][value]
df0 = pd.DataFrame()
for driverId in drivers['id']:
    response = requests.request('GET', url, headers=headers).json()
    url = f'https://api.samsara.com/v1/fleet/drivers/{driverId}/safety/score?startMs={startMs}&endMs={endMs}'
    df0 = df0._append(response, ignore_index=True)
    time.sleep(1)
df0['Date'] = date

So does this part of the code:

for i in range(0,6):
    date = calendar['Date'][i].date()
    startMs = calendar['startMs'][i]
    endMs = calendar['endMs'][i]

Or, written the suggested way:

for i in range(1):
    row = calendar.iloc[i]
    date = row['Date'].date()
    startMs = row['startMs']
    endMs = row['endMs']

I don't know how to make them work together.

When it works, the response output looks like this, except with real data:

for driverId in drivers['id']:
    response = requests.request('GET', url, headers=headers).json()
    url = f'https://api.samsara.com/v1/fleet/drivers/{driverId}/safety/score?startMs={startMs}&endMs={endMs}'
    pprint(response)

{'crashCount': 0,
 'driverId': 1234,
 'harshAccelCount': 0,
 'harshBrakingCount': 0,
 'harshEvents': [],
 'harshTurningCount': 0,
 'safetyScore': 100,
 'safetyScoreRank': '1',
 'timeOverSpeedLimitMs': 0,
 'totalDistanceDrivenMeters': 0,
 'totalHarshEventCount': 0,
 'totalTimeDrivenMs': 0}
{'crashCount': 0,
 'driverId': 1235,
 'harshAccelCount': 0,
 'harshBrakingCount': 0,
 'harshEvents': [],
 'harshTurningCount': 0,
 'safetyScore': 100,
 'safetyScoreRank': '1',
 'timeOverSpeedLimitMs': 0,
 'totalDistanceDrivenMeters': 0,
 'totalHarshEventCount': 0,
 'totalTimeDrivenMs': 0}
python pandas for-loop
1 Answer

df._append is meant for appending rows; in your case it looks as though each key in the dictionary is being treated as a new row.
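
Independently of how `_append` behaves, the loop structure in the question would by itself produce the reported symptoms: `df = pd.DataFrame()` sits inside the outer loop, so the frame is emptied at the start of every day, and `range(0,6)` stops at index 5, which is 6/3. The sketch below reproduces that with plain lists; the day strings and the driver ids 0 to 19 are stand-ins, not real data.

```python
DAYS = ["2024-05-29", "2024-05-30", "2024-05-31", "2024-06-01",
        "2024-06-02", "2024-06-03", "2024-06-04"]
DRIVER_IDS = list(range(20))  # stand-in for drivers['id']


def rows_with_reset():
    """Same shape as the loop in the question."""
    for i in range(0, 6):      # indices 0..5, so the last day visited is 2024-06-03
        rows = []              # reset on every day, like df = pd.DataFrame()
        for driver_id in DRIVER_IDS:
            rows.append({"driverId": driver_id, "Date": DAYS[i]})
    return rows


def rows_accumulated():
    """Accumulator created once, before both loops; every day is visited."""
    rows = []
    for day in DAYS:
        for driver_id in DRIVER_IDS:
            rows.append({"driverId": driver_id, "Date": day})
    return rows


print(len(rows_with_reset()), rows_with_reset()[0]["Date"])  # 20 2024-06-03
print(len(rows_accumulated()))                               # 140
```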

A more correct way to do it is:

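list_df = []

# iterate over every calendar row (all 7 days); range(0,6) stops one day short
for i_day in range(len(calendar)):

    row = calendar.iloc[i_day]
    date = row['Date'].date()
    startMs = row['startMs']
    endMs = row['endMs']

    for driverId in drivers['id']:
        # get json from the API (this dict is a placeholder for the real response)
        resp = {'id': 22, 'name': 'John', 'username':'John', 'timezone':'utc'}

        # serialize data in the dataframe
        # `index = [0]` indicates that we have only single row
        df1 = pd.DataFrame(resp, index=[0])
        # tag the row with the day it belongs to
        df1['Date'] = date

        # collect the per-call dataframes; they are combined once at the end
        list_df.append(df1)

# build the final dataframe once, with a fresh 0..n-1 index
df = pd.concat(list_df, ignore_index=True)
print(df)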
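
To connect this to the real request, note that the loop in the question builds `url` after calling `requests.request`, so each call uses the URL left over from the previous iteration. A minimal sketch of the combined loop follows. `build_url`, `collect_scores` and the `fetch` parameter are hypothetical names introduced here; the URL pattern is copied from the question, and `fetch` is injected so the loop can be tried without network access.

```python
import time

BASE_URL = "https://api.samsara.com/v1/fleet/drivers"  # pattern taken from the question


def build_url(driver_id, start_ms, end_ms):
    return f"{BASE_URL}/{driver_id}/safety/score?startMs={start_ms}&endMs={end_ms}"


def collect_scores(days, driver_ids, fetch, pause_s=1.0):
    """Return one dict per (day, driver).

    days       -- iterable of (date, start_ms, end_ms) tuples
    driver_ids -- sequence of driver ids (it is iterated once per day)
    fetch      -- callable taking a URL and returning the decoded JSON dict
    pause_s    -- seconds to sleep after each call (rate limiting)
    """
    rows = []  # created once, outside both loops
    for day, start_ms, end_ms in days:
        for driver_id in driver_ids:
            url = build_url(driver_id, start_ms, end_ms)  # build the URL first
            record = dict(fetch(url))                     # then make the request
            record["Date"] = day
            rows.append(record)
            time.sleep(pause_s)
    return rows
```

With the objects from the question, `days` could be `calendar[['Date', 'startMs', 'endMs']].itertuples(index=False, name=None)`, `fetch` could be `lambda url: requests.get(url, headers=headers).json()`, and `pd.DataFrame(rows)` then gives one row per driver per day. Building the frame from a list of dicts also keeps a list-valued field such as `harshEvents` in a single cell.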