I have a DataFrame df with two columns, gender and score.
|---------------------|------------------|
| gender | score |
|---------------------|------------------|
| male | 34 |
|---------------------|------------------|
| female | 34 |
|---------------------|------------------|
| male | 34 |
|---------------------|------------------|
| female | 34 |
|---------------------|------------------|
| male | 34 |
|---------------------|------------------|
I want to set the score to 0 for the male rows (gender == 'male') between the 3rd and 5th rows. Expected output:
|---------------------|------------------|
| gender | score |
|---------------------|------------------|
| male | 34 |
|---------------------|------------------|
| female | 34 |
|---------------------|------------------|
| male | 0 |
|---------------------|------------------|
| female | 34 |
|---------------------|------------------|
| male | 0 |
|---------------------|------------------|
How can I combine iloc with that condition?
You can do this with two masks (conditions). This should be readable and easy to follow.
m1 = (df.gender == 'male')
m2 = (df.gender.duplicated())
df.loc[m1&m2, 'score'] = 0
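The duplicated() mask works on this data because the only male row outside rows 3 to 5 is the first one. As a rough sketch of a more general variant, one could build an explicit positional mask and combine it with the condition. The names pos, in_range and is_male are illustrative, not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'gender': ['male', 'female', 'male', 'female', 'male'],
    'score': 34,
})

# Positional mask: True for the 3rd through 5th rows (0-based positions 2..4).
pos = np.arange(len(df))
in_range = (pos >= 2) & (pos <= 4)

# Condition mask: True where the row is male.
is_male = (df['gender'] == 'male').to_numpy()

# Combine both boolean arrays and assign through .loc.
df.loc[in_range & is_male, 'score'] = 0
print(df)
```

Because both masks are plain boolean arrays of the same length as df, this does not depend on what the index labels are.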
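Alternatively, take the nonzero (True) positions of the mask and slice off the first one (requires import numpy as np). This should be faster.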
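# Integer positions of the male rows, dropping the first one
m = np.flatnonzero(df.gender == 'male')[1:]
# Positions equal labels here only because df has the default RangeIndex
df.loc[m, 'score'] = 0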
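The positions computed this way are integer offsets, not labels, so passing them to .loc relies on the default 0..n-1 index. A minimal sketch of the same idea with iloc, assuming a hypothetical frame whose index is made of strings (the index 'a'..'e' and the name positions are only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default (string) index, for illustration only.
df = pd.DataFrame(
    {'gender': ['male', 'female', 'male', 'female', 'male'], 'score': 34},
    index=list('abcde'),
)

# Integer positions of the male rows, skipping the first one.
positions = np.flatnonzero((df['gender'] == 'male').to_numpy())[1:]

# iloc is purely positional, so the column is looked up by position as well.
df.iloc[positions, df.columns.get_loc('score')] = 0
print(df)
```

Here df.columns.get_loc('score') turns the column name into the integer position that iloc expects.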
Full example:
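import pandas as pd
import numpy as np

df = pd.DataFrame({
    'gender': ['male', 'female', 'male', 'female', 'male'],
    'score': 34
})

# Option 1: two boolean masks
m1 = (df.gender == 'male')
m2 = (df.gender.duplicated())
df.loc[m1 & m2, 'score'] = 0

# Option 2: integer positions of the male rows, minus the first
m = np.flatnonzero(df.gender == 'male')[1:]
df.loc[m, 'score'] = 0

print(df)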
Returns:
   gender  score
0    male     34
1  female     34
2    male      0
3  female     34
4    male      0
I think you need:
# Label-based slice (end label is inclusive), then keep only the male rows
m = df.loc[2:5, :].loc[df['gender'] == 'male']
df.loc[m.index, 'score'] = 0
print(df)
   gender  score
0    male     34
1  female     34
2    male      0
3  female     34
4    male      0
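Note that df.loc[2:5] slices by label, which matches row positions here only because of the default index. Since the question asks about iloc specifically, here is a possible sketch that selects the window by position and then filters by the condition. The names window and male_labels are illustrative, and the approach assumes the index labels are unique.

```python
import pandas as pd

df = pd.DataFrame({
    'gender': ['male', 'female', 'male', 'female', 'male'],
    'score': 34,
})

# iloc slices by position with an exclusive end, so 2:5 covers rows 3 to 5.
window = df.iloc[2:5]

# Labels of the male rows inside that window.
male_labels = window.index[(window['gender'] == 'male').to_numpy()]

# Assign through .loc using those labels (assumes unique index labels).
df.loc[male_labels, 'score'] = 0
print(df)
```

Assigning through df.loc on the original frame, rather than on the window, avoids writing to a temporary copy.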