Python - How do I assign a value based on where the cumulative sum is breached for each row?

Question

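I am trying to update existing code that works well by adding logic that finds the point at which the cumulative sum of a variable reaches 1. Once that threshold is breached, the position is assigned to x and used for the rest of the calculation. However, when I run it, every row gets the same x value [it computes the x value for the last row and assigns it to all rows]. I have checked cumulative_sum and can confirm that it should not be the same for all rows.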

I reset x to 0 inside the loop, so I am not sure what else could be causing this.

row_count =-1
for index, row in Input.iterrows():
    row_count+=1
    if index % 1 == 0:
        print(str(index) + ' Accounts Done. ')
    IFRS9_PD_SEGMENT = row['EY_SEG']
    IFRS9_ACCOUNT_STATUS = row['IFRS9_ACCOUNT_STATUS']
    IFRS9_ACCOUNT_ID = row['IFRS9_ACCOUNT_ID']
    IFRS9_DEFAULT_FLAG = row['IFRS9_DEFAULT_FLAG']
    EAD_DEF = row['EAD_DEF']
    LGD_DEF_Base = row['LGD_DEF_Base']
    LGD_DEF_Good = row['LGD_DEF_Good']
    LGD_DEF_Bad = row['LGD_DEF_Bad']
    LGD_DEF_Bad_2 = row['LGD_DEF_Bad_2']
    PD_no_Macro = {}
    PD_base = {}
    PD_good = {}
    PD_bad = {}  
    PD_bad_2 = {}  
    EAD = {}
    LGD_base = {}
    LGD_good = {}
    LGD_bad = {}
    LGD_bad_2 = {}
    ECL_base = {}
    ECL_good = {}
    ECL_bad = {} 
    ECL_bad_2 = {}  
    PD_weighted = {}
    Default_PD = 0.0
    
    if IFRS9_PD_SEGMENT == "DEFAULT":
        Default_PD = 1.0
    if IFRS9_PD_SEGMENT != "DEFAULT" and IFRS9_PD_SEGMENT !="" :
        PD_Timeseries = PD_curves[PD_curves['PD_SEGMENT']==IFRS9_PD_SEGMENT]      
        PD_value = row['BEH_PD']        
        remaining_term = row['IFRS9_REMAINING_TERM']
        cumulative_sum = 0.0
        x = 0.0
        j = 1
        while j <=420:
            PD_no_Macro[j] = 0.0
            PD_base[j] =0.0
            PD_good[j] =0.0
            PD_bad[j] =0.0
            PD_bad_2[j] =0.0
            EAD[j] = 0.0
            LGD_base[j] = 0.0
            LGD_good[j]=0.0
            LGD_bad[j] =0.0
            LGD_bad_2[j] =0.0
            ECL_base[j] = 0.0
            ECL_good[j]=0.0
            ECL_bad[j] =0.0
            ECL_bad_2[j] =0.0
            PD_weighted[j] =0.0
            
            j+=1
        
        j= 1
        while j <= remaining_term:
            if j<= remaining_term:
                PD_no_Macro[j] = PD_value * PD_Timeseries.iloc[0,j+3]
            PD_base[j] = PD_no_Macro[j]*Macro_base.iloc[0,j+4]
            PD_good[j] = PD_no_Macro[j]*Macro_good.iloc[0,j+4]
            PD_bad[j] = PD_no_Macro[j]*Macro_bad.iloc[0,j+4]  
            PD_bad_2[j] = PD_no_Macro[j]*Macro_bad_2.iloc[0,j+4] 
            PD_weighted[j] = P_SC_BASE*PD_base[j] + P_SC_GOOD*PD_good[j] + P_SC_BAD*PD_bad[j] + P_SC_BAD_2*PD_bad_2[j]
            cumulative_sum += PD_weighted[j]
                    
            if cumulative_sum >= 1:
                x = j
                break
            j+=1
        
        j=1
            
        while j <= x:
            discount = 1 / ((1 + EIR_avg) ** (j / 12))
            EADx = float(ead_curves.iloc[index, j + 1])
            if math.isnan(EADx):
                EADx = 0.0
            EAD[j] = EADx
            LGD_base[j] = lgds_curve.iloc[index, j + 421]
            LGD_good[j] = lgds_curve.iloc[index, j + 1]
            LGD_bad[j] = lgds_curve.iloc[index, j + 841]
            LGD_bad_2[j] = lgds_curve.iloc[index, j + 1261]
            ECL_base[j] = PD_base[j] * LGD_base[j] * EAD[j] * discount
            ECL_good[j] = PD_good[j] * LGD_good[j] * EAD[j] * discount
            ECL_bad[j] = PD_bad[j] * LGD_bad[j] * EAD[j] * discount
            ECL_bad_2[j] = PD_bad_2[j] * LGD_bad_2[j] * EAD[j] * discount
            j += 1

Answer 1

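The key points here are:

Cumulative sum behavior: the cumulative sum is computed, and x is assigned, only in the while loop that runs up to remaining_term. If the cumulative sum reaches or exceeds 1 at a different period for each row, each row should end up with a different x.
Overwrite issue: x is a single variable that is reassigned on every iteration of the outer loop. If it is read outside the iteration that computed it (for example, after the loop has finished), it holds only the value from the last row.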

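How to fix it:

Make x row-specific: recompute x for each row and do not carry it over between rows. Because x is computed in the context of a single row, it should be reset for every row.
Fix the flow: you already reset x = 0.0 at the start of each row's processing, which is good. Still, make sure x is only used inside the iteration that computed it.
Break at the right point: make sure the inner loop breaks as soon as cumulative_sum >= 1 for the current row, so that x records the first period at which the threshold is reached.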
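The carry-over issue described above can be shown with a minimal sketch. It uses made-up weighted PD values rather than the real inputs, and the names `first_breach`, `weighted_pds` and `x_per_row` are hypothetical. The point is that a value stored inside the loop is kept per row, while reading x after the loop gives only the last row's value.

```python
def first_breach(values, threshold=1.0):
    """Return the 1-based period at which the running total first reaches
    the threshold, or 0 if it never does."""
    total = 0.0
    for j, value in enumerate(values, start=1):
        total += value
        if total >= threshold:
            return j
    return 0


# Hypothetical weighted PDs, one list per row.
weighted_pds = [
    [0.5, 0.5, 0.25],    # reaches 1.0 at period 2
    [0.25, 0.25, 0.25],  # never reaches 1.0
    [1.0, 0.5],          # reaches 1.0 at period 1
]

x_per_row = []
for values in weighted_pds:
    x = first_breach(values)  # recomputed for every row
    x_per_row.append(x)       # stored inside the loop, one entry per row

print(x_per_row)  # [2, 0, 1]
print(x)          # 1: after the loop, x holds only the last row's value
```

If the real code writes x to an output column, doing so inside the loop body (or collecting the values in a list as here) avoids ending up with the last row's value everywhere.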

Suggested fix: here is a cleaned-up version of the logic, with particular attention to keeping x isolated per row:

import math

row_count = -1
for index, row in Input.iterrows():
    row_count += 1
    if index % 1 == 0:  # always true, so progress is printed for every row
        print(f"{index} Accounts Done.")

    # Extract values
    IFRS9_PD_SEGMENT = row['EY_SEG']
    IFRS9_ACCOUNT_STATUS = row['IFRS9_ACCOUNT_STATUS']
    IFRS9_ACCOUNT_ID = row['IFRS9_ACCOUNT_ID']
    IFRS9_DEFAULT_FLAG = row['IFRS9_DEFAULT_FLAG']
    EAD_DEF = row['EAD_DEF']
    LGD_DEF_Base = row['LGD_DEF_Base']
    LGD_DEF_Good = row['LGD_DEF_Good']
    LGD_DEF_Bad = row['LGD_DEF_Bad']
    LGD_DEF_Bad_2 = row['LGD_DEF_Bad_2']

    # Initialize values specific to the current row
    PD_no_Macro, PD_base, PD_good, PD_bad, PD_bad_2 = {}, {}, {}, {}, {}
    EAD, LGD_base, LGD_good, LGD_bad, LGD_bad_2 = {}, {}, {}, {}, {}
    ECL_base, ECL_good, ECL_bad, ECL_bad_2, PD_weighted = {}, {}, {}, {}, {}
    Default_PD = 0.0
    cumulative_sum = 0.0
    x = 0  # integer, so that range(1, x + 1) below is valid

    # Check PD segment
    if IFRS9_PD_SEGMENT == "DEFAULT":
        Default_PD = 1.0
    elif IFRS9_PD_SEGMENT != "":
        # Retrieve PD timeseries
        PD_Timeseries = PD_curves[PD_curves['PD_SEGMENT'] == IFRS9_PD_SEGMENT]
        PD_value = row['BEH_PD']
        # range() needs an integer; cast in case the column is stored as float
        remaining_term = int(row['IFRS9_REMAINING_TERM'])

        # Initialize values for the first 420 periods (arbitrary upper bound)
        for j in range(1, 421):
            PD_no_Macro[j] = PD_base[j] = PD_good[j] = PD_bad[j] = PD_bad_2[j] = 0.0
            EAD[j] = LGD_base[j] = LGD_good[j] = LGD_bad[j] = LGD_bad_2[j] = 0.0
            ECL_base[j] = ECL_good[j] = ECL_bad[j] = ECL_bad_2[j] = PD_weighted[j] = 0.0

        # Calculate cumulative sum and find the value of x
        for j in range(1, remaining_term + 1):
            PD_no_Macro[j] = PD_value * PD_Timeseries.iloc[0, j + 3]
            PD_base[j] = PD_no_Macro[j] * Macro_base.iloc[0, j + 4]
            PD_good[j] = PD_no_Macro[j] * Macro_good.iloc[0, j + 4]
            PD_bad[j] = PD_no_Macro[j] * Macro_bad.iloc[0, j + 4]
            PD_bad_2[j] = PD_no_Macro[j] * Macro_bad_2.iloc[0, j + 4]

            # Weighted PD calculation
            PD_weighted[j] = (P_SC_BASE * PD_base[j] + P_SC_GOOD * PD_good[j] +
                              P_SC_BAD * PD_bad[j] + P_SC_BAD_2 * PD_bad_2[j])

            cumulative_sum += PD_weighted[j]

            # Break the loop and store x when cumulative sum >= 1
            if cumulative_sum >= 1:
                x = j  # x is specific to the current row
                break

        # Use the calculated x value for the rest of the computation
        # (if the threshold was never reached, x is 0 and this loop is skipped)
        for j in range(1, x + 1):
            discount = 1 / ((1 + EIR_avg) ** (j / 12))
            EADx = float(ead_curves.iloc[index, j + 1])
            if math.isnan(EADx):
                EADx = 0.0

            EAD[j] = EADx
            LGD_base[j] = lgds_curve.iloc[index, j + 421]
            LGD_good[j] = lgds_curve.iloc[index, j + 1]
            LGD_bad[j] = lgds_curve.iloc[index, j + 841]
            LGD_bad_2[j] = lgds_curve.iloc[index, j + 1261]

            # Calculate ECL for each scenario
            ECL_base[j] = PD_base[j] * LGD_base[j] * EAD[j] * discount
            ECL_good[j] = PD_good[j] * LGD_good[j] * EAD[j] * discount
            ECL_bad[j] = PD_bad[j] * LGD_bad[j] * EAD[j] * discount
            ECL_bad_2[j] = PD_bad_2[j] * LGD_bad_2[j] * EAD[j] * discount

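Main changes:

Row-specific x value: x is now computed per row and used only within that row's iteration. Each row gets its own x based on its cumulative sum.
Values reset per row: for each new row, all key variables (PD_no_Macro, cumulative_sum, and so on) are reset, which prevents carry-over from earlier iterations.
Loop bounds: the final inner loop runs the calculation only up to the row-specific x.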

Try running the modified code and check whether it resolves the problem of every row being assigned the same x value.
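To make the effect of the row-specific bound concrete, here is a small self-contained sketch of a discounted ECL sum that stops at x. The function name `discounted_ecl` and all input values are hypothetical placeholders, and a single flat LGD and EAD stand in for the per-period curves used in the real code.

```python
def discounted_ecl(x, pd_by_period, lgd, ead, annual_rate):
    """Sum the discounted expected credit loss over monthly periods 1..x.

    pd_by_period is a 0-based list, so period j reads pd_by_period[j - 1].
    All inputs here are hypothetical placeholders, not the real curves.
    """
    total = 0.0
    for j in range(1, x + 1):  # empty when x == 0, so nothing is added
        discount = 1 / ((1 + annual_rate) ** (j / 12))
        total += pd_by_period[j - 1] * lgd * ead * discount
    return total


pds = [0.5, 0.5, 0.25]
# With x = 2 only the first two periods contribute; the third is ignored.
print(discounted_ecl(2, pds, lgd=0.4, ead=1000.0, annual_rate=0.05))
# With x = 0 (threshold never reached) the result is 0.0.
print(discounted_ecl(0, pds, lgd=0.4, ead=1000.0, annual_rate=0.05))
```

Note that a row whose cumulative sum never reaches 1 keeps x at 0 and therefore contributes no ECL at all. Whether that is the intended behavior, or whether such rows should instead run up to remaining_term, depends on the business rule and is worth checking.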
