bigframes package: DataFrame apply function not working


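I am a new user of the bigframes package from googleapis. I am trying to manipulate a DataFrame loaded from BigQuery.

I tried to run some code, but I ran into a problem I can't solve.

I tried to use the apply function on a DataFrame with the argument axis=1, but it doesn't seem to work. I always get an error message.

Can you help me solve this?

Thanks.

Code example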

# example
def condition(row):
    print(row )
    if 1 <= row["month"] <= 6:
        return f"{row['year']:02}S1{row['CODPY']}{row['CODDE']}"
    else:
        return f"{row['year']:02}S2{row['CODPY']}{row['CODDE']}"

valodetail_df['IDT'] = valodetail_df.apply(condition,axis=1)

Stack trace

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/core/log_adapter.py", line 44, in wrapper
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in apply
    results = {name: func(col, *args, **kwargs) for name, col in self.items()}
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in <dictcomp>
    results = {name: func(col, *args, **kwargs) for name, col in self.items()}
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<stdin>", line 3, in condition
  File "missing.pyx", line 419, in pandas._libs.missing.NAType.__bool__
TypeError: boolean value of NA is ambiguous
>>> valodetail_df['IDTDCI'] = valodetail_df.apply(condition,axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/core/log_adapter.py", line 44, in wrapper
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in apply
    results = {name: func(col, *args, **kwargs) for name, col in self.items()}
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in <dictcomp>
    results = {name: func(col, *args, **kwargs) for name, col in self.items()}
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: condition() got an unexpected keyword argument 'axis'
python-3.x dataframe google-cloud-platform google-bigquery google-api-python-client
1 Answer
axis=1 is not currently supported by DataFrame.apply in bigframes:
https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.dataframe.DataFrame#bigframes_dataframe_DataFrame_apply

There is a feature request for this: https://github.com/googleapis/python-bigquery-dataframes/issues/592
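The second traceback in the question is consistent with this. The line it shows from dataframe.py calls the function once per column and forwards every keyword argument to it, so axis=1 ends up being passed to condition() itself. The sketch below reproduces that behavior in plain Python. The helper columnwise_apply and the sample columns are hypothetical stand-ins modeled on the traceback line, not the actual bigframes implementation.

```python
def columnwise_apply(columns, func, *args, **kwargs):
    # Hypothetical helper mirroring the dict comprehension in the traceback:
    # func is called once per column, and all kwargs are forwarded to it.
    return {name: func(col, *args, **kwargs) for name, col in columns.items()}


def condition(row):
    # Stand-in for the user's function; it accepts no keyword arguments.
    return row


# Illustrative data only: a mapping of column name to column values.
columns = {"month": [1, 7], "year": ["8", "9"]}

try:
    columnwise_apply(columns, condition, axis=1)
    error_message = None
except TypeError as exc:
    # axis=1 was forwarded to condition(), which does not accept it.
    error_message = str(exc)

print(error_message)
```

In other words, the error does not come from the logic inside condition(); the axis keyword simply never reaches a row-wise code path.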

However, your specific use case can be achieved another way.

Here is a guess at the kind of DataFrame you are working with:

import bigframes.pandas as bpd

df = bpd.DataFrame({
        "month": [1,3,6,7,12],
        "year":  ["8", "9", "10", "11", "12"],
        "CODPY": ["PY", "PY", "PY", "PY", "PY"],
        "CODDE": ["DE", "DE", "DE", "DE", "DE"],
     })
df
    month   year    CODPY   CODDE
0       1      8       PY      DE
1       3      9       PY      DE
2       6     10       PY      DE
3       7     11       PY      DE
4      12     12       PY      DE

We can use other DataFrame and Series APIs to create the desired column:

condition = (df["month"] >= 1) & (df["month"] <= 6)

s1 = df["year"].str.pad(fillchar='0', width=2) + "S1" + df["CODPY"] + df["CODDE"]

s2 = df["year"].str.pad(fillchar='0', width=2) + "S2" + df["CODPY"] + df["CODDE"]

df['IDT'] = s1.where(condition, s2)
df
   month    year    CODPY   CODDE        IDT
0      1       8       PY      DE   08S1PYDE
1      3       9       PY      DE   09S1PYDE
2      6      10       PY      DE   10S1PYDE
3      7      11       PY      DE   11S2PYDE
4     12      12       PY      DE   12S2PYDE
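If you want to try the same pattern without a BigQuery session, here is a minimal sketch using plain pandas, which provides the same str.pad and where methods used above. The function name add_idt is hypothetical, and the sample data is the guessed DataFrame from this answer, not your real table.

```python
import pandas as pd


def add_idt(df):
    # Hypothetical helper: builds the IDT column with column-wise operations
    # instead of a row-wise apply.
    first_half = (df["month"] >= 1) & (df["month"] <= 6)
    # Left-pad the year string with zeros to a width of 2.
    year = df["year"].str.pad(width=2, fillchar="0")
    s1 = year + "S1" + df["CODPY"] + df["CODDE"]
    s2 = year + "S2" + df["CODPY"] + df["CODDE"]
    out = df.copy()
    # Keep s1 where the month is in the first half of the year, else use s2.
    out["IDT"] = s1.where(first_half, s2)
    return out


# Same guessed sample data as above.
df = pd.DataFrame({
    "month": [1, 3, 6, 7, 12],
    "year": ["8", "9", "10", "11", "12"],
    "CODPY": ["PY"] * 5,
    "CODDE": ["DE"] * 5,
})

result = add_idt(df)
print(result["IDT"].tolist())
```

Building both candidate strings and choosing between them with where keeps everything as column operations, which is what lets it run without a row-wise apply.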

Hope this helps.
