Auto-increment option for Pandas DataFrame index

Question

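Is there a way to set an option that auto-increments the index of a pandas.DataFrame when adding new rows, or to define a function that manages the creation of new indices?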

Answer 1

You can set ignore_index=True when append-ing:

In [1]: df = pd.DataFrame([[1,2],[3,4]])

In [2]: row = pd.Series([5,6])

In [3]: df.append(row, ignore_index=True)
Out[3]: 
   0  1
0  1  2
1  3  4
2  5  6
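
For readers on pandas 2.0 or later, where DataFrame.append is no longer available, roughly the same result can be had with pd.concat. This is only a minimal sketch and not part of the original answer; converting the Series with to_frame().T is one assumed way of turning it into a one-row DataFrame first.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]])
row = pd.Series([5, 6])

# Turn the Series into a one-row DataFrame, then let concat
# renumber the resulting index from 0 with ignore_index=True.
result = pd.concat([df, row.to_frame().T], ignore_index=True)
print(result)
```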

Answer 2

Note that the accepted answer is dangerous if your existing index is meaningful. For example:

df = pd.DataFrame(
    [('Alice', 1010, 'sales'), ('Bob', 1011, 'service')],
    columns = ['name', 'emp_id', 'dept']
).set_index('emp_id')

# here's a new employee to append, who has no id:
row = pd.Series({'name': 'Eve', 'dept': 'r&d'})

# this will wipe all the existing employee id numbers:
df.append(row, ignore_index=True)
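
To make the problem concrete, here is a small sketch that checks what happens to the ids. It assumes pandas 2.x, so it uses pd.concat with ignore_index=True in place of df.append; the variable name result is just for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [('Alice', 1010, 'sales'), ('Bob', 1011, 'service')],
    columns=['name', 'emp_id', 'dept']
).set_index('emp_id')

row = pd.Series({'name': 'Eve', 'dept': 'r&d'})

# ignore_index=True discards the existing index labels and renumbers
# every row, so the employee ids 1010 and 1011 no longer appear.
result = pd.concat([df, row.to_frame().T], ignore_index=True)
print(result.index.tolist())
```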

One way to solve this is to increment the index manually:

def add_new_row(df, row):
    row.name = max(df.index)+1
    return df.append(row)

# the existing ids are now preserved:
add_new_row(df, row)

Answer 3

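FYI: df.append() has been deprecated since version 1.4 and is no longer available as of version 2.0.0. pd.concat() should be used instead (see the pandas documentation). However, I agree with @alex that the suggested answer does not address the loss of the index. So, assuming the following setup: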

df = pd.DataFrame(
    [('Alice', 1010, 'sales'), ('Bob', 1011, 'service')],
    columns = ['name', 'emp_id', 'dept']
).set_index('emp_id')

# here's a new employee to append, who has no id:
row = pd.Series({'name': 'Eve', 'dept': 'r&d'})

here is an updated version of his answer for pandas >= 2:

def add_new_row(df: pd.DataFrame, row: pd.Series) -> pd.DataFrame:
    index_name = df.index.name  # save name because concat deletes index name
    row.name = df.index.max() + 1
    new_df = pd.concat([df, row.to_frame().T])  # use the `row` argument (`s` was undefined)
    new_df.index.name = index_name
    return new_df

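It seems a bit odd to me that concat does not preserve the index name, but I could not find any option for it in concat. There may be a better way to do this, but this should still work.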
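
One possible alternative, offered as a sketch rather than a recommendation: assigning to a new label through .loc adds the row and keeps the index name. Unlike the function above, it modifies df in place instead of returning a new DataFrame. The variable new_id is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    [('Alice', 1010, 'sales'), ('Bob', 1011, 'service')],
    columns=['name', 'emp_id', 'dept']
).set_index('emp_id')

row = pd.Series({'name': 'Eve', 'dept': 'r&d'})

# Assigning to a label that does not exist yet enlarges the frame in place.
# The Series is matched to the columns by its labels ('name', 'dept').
new_id = df.index.max() + 1
df.loc[new_id] = row

print(df)
```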
