Is there another way to combine the values of a column into different groups, instead of calling 'df.replace()' many times as in the question below?

Question (votes: 0, answers: 1)

In:

char_df['Loan_Title'].unique()

Out:

array(['debt consolidation', 'credit card refinancing',
       'home improvement', 'credit consolidation', 'green loan', 'other',
       'moving and relocation', 'credit cards', 'medical expenses',
       'refinance', 'credit card consolidation', 'lending club',
       'debt consolidation loan', 'major purchase', 'vacation',
       'business', 'credit card payoff', 'credit card',
       'credit card refi', 'personal loan', 'cc refi', 'consolidate',
       'medical', 'loan 1', 'consolidation', 'card consolidation',
       'car financing', 'debt', 'home buying', 'freedom', 'consolidated',
       'get out of debt', 'consolidation loan', 'dept consolidation',
       'personal', 'cards', 'bathroom', 'refi', 'credit card loan',
       'credit card debt', 'house', 'debt consolidation 2013',
       'debt loan', 'cc refinance', 'home', 'cc consolidation',
       'credit card refinance', 'credit loan', 'payoff',
       'bill consolidation', 'credit card paydown', 'credit card pay off',
       'get debt free', 'myloan', 'credit pay off', 'my loan', 'loan',
       'bill payoff', 'cc-refinance', 'debt reduction', 'medical loan',
       'wedding loan', 'credit', 'pay off bills', 'refinance loan',
       'debt payoff', 'car loan', 'pay off', 'pool', 'credit payoff',
       'credit card refinance loan', 'cc loan', 'debt free', 'conso',
       'home improvement loan', 'loan consolidation', 'lending loan',
       'relief', 'cc', 'loan1', 'getting ahead', 'home loan', 'bills'],
      dtype=object)

In:

char_df = char_df.replace(['debt consolidation', 'debt consolidation loan', 'dept consolidation', 'debt consolidation 2013'], 'dept_consolidation')
char_df = char_df.replace(['personal', 'personal loan'], 'personal_loan')
char_df = char_df.replace(['credit card refinancing', 'credit card refi', 'credit card refinance', 'credit card refinance loan'], 'credit_card_refinance')
python pandas machine-learning data-science feature-engineering
1 Answer
2 votes

IIUC (the question is a bit hard to read), you could try the following:

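import pandas as pd

# Keys are regex patterns, values are the replacement labels.
# Anchoring with ^...$ replaces the whole title, not just the matched part.
patterns = {
    # [bp] covers both 'debt consolidation...' and the 'dept' misspelling
    r'^de[bp]t consolidation.*$': 'dept_consolidation',
    r'^personal.*$': 'personal_loan',
    # a bare 'credit card.*' would also swallow 'credit cards', 'credit card payoff', etc.
    r'^credit card refi.*$': 'credit_card_refinance'
}

# char_df is the DataFrame from the question
char_df['Loan_Title'] = char_df['Loan_Title'].replace(regex=patterns)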
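
If the groups are fixed lists of exact titles rather than patterns, another option is to write each group once in a dict and invert it into a single lookup, so only one `replace` call is needed. This is only a sketch: the small `char_df` below is made-up sample data, and the names `groups`, `title_to_group` and the `Loan_Group` column are hypothetical, not from the question.

```python
import pandas as pd

# Made-up sample data standing in for the asker's char_df (assumption).
char_df = pd.DataFrame({
    'Loan_Title': [
        'debt consolidation', 'dept consolidation', 'personal loan',
        'credit card refi', 'vacation',
    ]
})

# One entry per target group, listing the raw titles that belong to it.
groups = {
    'dept_consolidation': ['debt consolidation', 'debt consolidation loan',
                           'dept consolidation', 'debt consolidation 2013'],
    'personal_loan': ['personal', 'personal loan'],
    'credit_card_refinance': ['credit card refinancing', 'credit card refi',
                              'credit card refinance',
                              'credit card refinance loan'],
}

# Invert into a flat {raw title: group label} lookup.
title_to_group = {
    title: label
    for label, titles in groups.items()
    for title in titles
}

# A dict passed to Series.replace matches whole values exactly;
# titles not listed in any group are left unchanged.
char_df['Loan_Group'] = char_df['Loan_Title'].replace(title_to_group)
```

Writing to a new column (here the hypothetical `Loan_Group`) keeps the raw titles around, which makes it easy to check afterwards which titles were not assigned to any group.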