感谢这是一个简单的问题,
使用此处的教程https://introml.analyticsdojo.com/notebooks/nb-04-10-pca.html
使用下面的代码。
df 和 dv 在没有事先声明的情况下如何在函数中用作参数?
感谢您的帮助
import os
import pandas as pd
train = pd.read_csv('https://raw.githubusercontent.com/rpi-techfundamentals/spring2019-materials/master/input/train.csv')
test = pd.read_csv('https://raw.githubusercontent.com/rpi-techfundamentals/spring2019-materials/master/input/test.csv')
print(train.columns, test.columns)
from sklearn.impute import SimpleImputer
import numpy as np
cat_features = ['Pclass', 'Sex', 'Embarked']
num_features = [ 'Age', 'SibSp', 'Parch', 'Fare' ]
def preprocess(df, num_features, cat_features, dv):
features = cat_features + num_features
if dv in df.columns:
y = df[dv]
else:
y=None
#Address missing variables
print("Total missing values before processing:", df[features].isna().sum().sum() )
imp_mode = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
df[cat_features]=imp_mode.fit_transform(df[cat_features] )
imp_mean = SimpleImputer(missing_values=np.nan, strategy='mean')
df[num_features]=imp_mean.fit_transform(df[num_features])
print("Total missing values after processing:", df[features].isna().sum().sum() )
X = pd.get_dummies(df[features], columns=cat_features, drop_first=True)
return y,X
y, X = preprocess(train, num_features, cat_features, 'Survived')
test_y, test_X = preprocess(test, num_features, cat_features, 'Survived')
想象一个更简单的例子:当你有这个函数时:
def add(a, b):
return a + b
x = 1
y = 2
add(x, y)
在函数中,参数通过您在函数定义中选择的名称来识别(即 a
和
b
)。当您使用
add(x, y)
调用函数时,您设置了值,因此它们在函数范围内是已知的。