DataCamp sample exam: Python Associate exam - VoltBike Innovations


I am working through this practice exam. When I submitted it, I saw that several points were marked incorrect, and I don't know why.

Here is Task 1:

Task 1

My implementation is:

import pandas as pd

clean_data = pd.read_csv('ebike_data.csv')

clean_data['bike_type'] = clean_data['bike_type'].fillna('standard') 

clean_data['frame_material'] = clean_data['frame_material'].fillna('unknown') 
clean_data['frame_material'] = clean_data['frame_material'].str.lower() 

clean_data['production_cost'] = clean_data['production_cost'].fillna(clean_data['production_cost'].median()).astype(float)

clean_data['assembly_time'] = clean_data['assembly_time'].fillna(clean_data['assembly_time'].mean()).astype(int)

clean_data['top_speed'] = clean_data['top_speed'].fillna(clean_data['top_speed'].mean()).astype(int)

clean_data['battery_type'] = clean_data['battery_type'].fillna('other') 
clean_data['battery_type'] = clean_data['battery_type'].replace({'-':'other', 'liotherion': 'li-ion'})

clean_data['customer_score'] = clean_data['customer_score'].fillna(clean_data['customer_score'].mean()).clip(lower=1, upper=10).astype(int)

clean_data['motor_power'] = clean_data['motor_power'].str.replace('W','').astype(float)
clean_data['motor_power'] = clean_data['motor_power'].fillna(clean_data['motor_power'].median()).astype(int)

print(clean_data.info())
print(clean_data.isna().sum())

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2000 entries, 0 to 1999
Data columns (total 8 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   bike_type        2000 non-null   object 
 1   frame_material   2000 non-null   object 
 2   production_cost  2000 non-null   float64
 3   assembly_time    2000 non-null   int64  
 4   top_speed        2000 non-null   int64  
 5   battery_type     2000 non-null   object 
 6   motor_power      2000 non-null   int64  
 7   customer_score   2000 non-null   int64  
dtypes: float64(1), int64(4), object(3)
memory usage: 125.1+ KB
None
bike_type          0
frame_material     0
production_cost    0
assembly_time      0
top_speed          0
battery_type       0
motor_power        0
customer_score     0
dtype: int64

When I submit the project, this is the feedback I get:

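  • All required data has been created and has the required columns - CHECK
  • Task 1: Identify and replace missing values - UNCHECK
  • Task 1: Convert values between data types - CHECK
  • Task 1: Clean categorical and text data by manipulating strings - CHECK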

1 answer

I solved this "puzzle" today.

It helps to export the DataFrame to a CSV file so you can inspect the result, e.g. clean_data.to_csv("result.csv")
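
Besides exporting the file, listing every distinct value per text column is another way to spot placeholders that isna() does not count. A minimal sketch on a made-up frame; the sample values are illustrative assumptions, not the exam data:

```python
import pandas as pd

# Hypothetical sample; the real ebike_data.csv is not reproduced here.
sample = pd.DataFrame({
    "bike_type": ["standard", "-", "road", None],
    "frame_material": ["STEel", "steel", "aluminum", "carbon"],
})

# isna() only counts real NaN/None, so a "-" placeholder goes unnoticed.
na_counts = sample.isna().sum()

# value_counts(dropna=False) lists every distinct value, including NaN,
# which makes stray placeholders and odd capitalisation visible.
distinct = {col: sample[col].value_counts(dropna=False) for col in sample.columns}

print(na_counts)
for col, counts in distinct.items():
    print(col, counts.to_dict())
```

A column can therefore report zero missing values in isna().sum() and still contain entries that the grader treats as missing.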

The tricky parts are:

  • Before doing anything else with this column, replace "STEel" with "steel": prod_df["frame_material"] = prod_df["frame_material"].replace("STEel", "steel")

  • For all missing string values, use a negated isin check, for example: prod_df.loc[~prod_df["bike_type"].isin(['standard', 'folding', 'mountain', 'road']), "bike_type"] = "standard"

  • The mean used to fill the top_speed column is rounded to 2 decimal places: prod_df["top_speed"] = prod_df["top_speed"].fillna(prod_df["top_speed"].mean().round(2))
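
Put together, the three points above might look like the sketch below. The small frame is made up for illustration, and the list of allowed bike types is an assumption based on this answer rather than on the exam data itself.

```python
import pandas as pd

# Hypothetical sample standing in for ebike_data.csv.
prod_df = pd.DataFrame({
    "bike_type": ["standard", "-", None, "road"],
    "frame_material": ["STEel", "steel", "aluminum", "carbon"],
    "top_speed": [25.0, None, 30.5, 22.16],
})

# 1. Normalise the odd capitalisation before any further cleaning.
prod_df["frame_material"] = prod_df["frame_material"].replace("STEel", "steel")

# 2. Anything outside the allowed categories (NaN, "-", typos) is treated as missing.
allowed_types = ["standard", "folding", "mountain", "road"]  # assumed category list
prod_df.loc[~prod_df["bike_type"].isin(allowed_types), "bike_type"] = "standard"

# 3. Fill numeric gaps with the column mean rounded to 2 decimals,
#    keeping the column as float instead of casting it to int.
mean_speed = round(prod_df["top_speed"].mean(), 2)
prod_df["top_speed"] = prod_df["top_speed"].fillna(mean_speed)

print(prod_df)
```

Unlike fillna() on its own, the negated isin mask also catches placeholder strings such as "-", which may be why the missing-value check failed in the question's version.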

I hope this helps!
