Merging columns with the same name in pandas

I have a DataFrame in which several columns share the same names, i.e.

items_group quanity items_group quanity 
KIT1259 0       
KIT1260 0       
KIT1261 0       
                       KIT1151  1
                       KIT1198A 4
                       KIT1198D 5
                       KIT1243  29
                       KIT1249  8

How can I merge them so that there is only one column per name? I tried

    df_expanded2 = df_expanded2.stack().dropna().unstack()

But this sometimes raises the following error:

Traceback (most recent call last):
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
    return view_func(*args, **kwargs)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/django/views/generic/base.py", line 71, in view
    return self.dispatch(request, *args, **kwargs)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/rest_framework/views.py", line 505, in dispatch
    response = self.handle_exception(exc)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/rest_framework/views.py", line 465, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/rest_framework/views.py", line 476, in raise_uncaught_exception
    raise exc
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/rest_framework/views.py", line 502, in dispatch
    response = handler(request, *args, **kwargs)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend copy/reports/views.py", line 1890, in get
    df_expanded2 = df_expanded2.stack().dropna().unstack()
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/pandas/core/series.py", line 4157, in unstack
    return unstack(self, level, fill_value)
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/pandas/core/reshape/reshape.py", line 491, in unstack
    unstacker = _Unstacker(
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/pandas/core/reshape/reshape.py", line 140, in __init__
    self._make_selectors()
  File "/Users/rahulsharma/PycharmProjects/Trakkia-Backend/venv/lib/python3.8/site-packages/pandas/core/reshape/reshape.py", line 192, in _make_selectors
    raise ValueError("Index contains duplicate entries, cannot reshape")
ValueError: Index contains duplicate entries, cannot reshape
python pandas
1 Answer

Assuming the row index is unique, this error can mean one of two things:

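  1. The "empty" cells are not truly missing (NaN/None) but hold empty strings, so dropna() leaves them in place.

Replace them with NaN first (this assumes numpy is imported as np; in older pandas versions, replace('', None) can forward-fill the previous value instead of inserting a missing value):

    out = df_expanded2.replace('', np.nan).stack().dropna().unstack()

  2. Some rows contain non-null values in more than one of the columns that share a given name.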

In that case, keep the first non-null value for each row and column name:

    out = df_expanded2.stack(dropna=True).groupby(level=[0, 1]).first().unstack()

Output:

  items_group quanity
0     KIT1259       0
1     KIT1260       0
2     KIT1261       0
3     KIT1151       1
4    KIT1198A       4
5    KIT1198D       5
6     KIT1243      29
7     KIT1249       8
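
To find out which of the two causes applies to your data, a small check along these lines may help. This is only a sketch under assumptions: the sample frame below is made up to mirror the question (it is not your real data), and the merge step uses a transposed groupby, which is an alternative to the stack()/unstack() chain above rather than the same method.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question: two column pairs share names.
columns = ["items_group", "quanity", "items_group", "quanity"]
df = pd.DataFrame(
    [
        ["KIT1259", 0, "", ""],            # empty strings, not NaN (cause 1)
        ["KIT1260", 0, np.nan, np.nan],
        [np.nan, np.nan, "KIT1151", 1],
        ["KIT1198A", 4, "KIT1198D", 5],    # both pairs filled (cause 2)
    ],
    columns=columns,
)

# Cause 1: turn empty strings into real missing values.
cleaned = df.replace("", np.nan)

# Cause 2: per row and per column name, count how many of the
# same-named columns hold a value; more than one means a conflict.
filled = cleaned.notna().T.groupby(level=0).sum().T
conflicting_rows = filled.index[(filled > 1).any(axis=1)].tolist()

# Merge: for each column name, keep the first non-missing value in every row.
merged = cleaned.T.groupby(level=0).first().T

print(conflicting_rows)
print(merged)
```

Rows listed in conflicting_rows are the ones where a value has to be discarded; here the first non-missing value wins, matching the first() choice above. Note that the transpose leaves the merged columns with object dtype, so you may want to convert them afterwards (for example with astype).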