I wrote a simple piece of Python code based on survey data.
`
import pandas as pd

# Load CSV data
csv_file = 'women_survey.csv'
data_women = pd.read_csv(csv_file)
# Convert object to String
data_women['physical_Assault'] = data_women['physical_Assault'].astype("string")
w_marital_status = data_women['physical_Assault']
# print sample data size and type
print(w_marital_status.shape)
print(w_marital_status.dtype)
# person 1's data
print(w_marital_status[0])
# Create a new List
new_list = [""] * 62
# Read each iteration of survey data and assign String to new List
for count, content in enumerate(w_marital_status, start=0):
    new_test = content.split(":", 1)[1]
    new_list[count] = new_test
    print(new_test)
    if type(new_test) == str:
        if new_test == "Yes":
            print("This is a Yes")
        else:
            print("This is a No")
    else:
        print("This is not a string.")`
This code produces the following output (partial):
(62,)
string
Have you ever been physically assaulted?: No
No
This is a No
No
This is a No
No
This is a No
No
This is a No
Yes
This is a No
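As the output shows, each person's row contains the entire question "Have you ever been physically assaulted?" with the answer "Yes" or "No" appended.
I only want to store the answer "Yes" or "No" so that I can later chart the data with Matplotlib, so I wrote a for loop to assign the answers to a new list. But as you can see, the string comparison never succeeds. I was not sure whether `new_test` was actually a string; it turns out it is, but the comparison still fails. I'm not sure what I'm doing wrong. Also, if there is a simpler way than my code, please let me know. Thanks.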
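Try this. Splitting on ":" keeps the space that follows the colon, so `new_test` is " Yes" rather than "Yes"; strip the whitespace before comparing: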
`
new_list = list()
# Read each iteration of survey data and assign String to new List
for content in w_marital_status:
    new_test = content.split(":", 1)[1]
    new_list.append(new_test)
    print(new_test)
    if type(new_test) is str:
        if new_test.strip().lower() == "yes":
            print("This is a Yes")
        else:
            print("This is a No")
    else:
        print("This is not a string.")`
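As for a simpler way, here is a minimal sketch. It assumes every row has the "question: answer" shape shown in the output above. The `responses` list is made-up sample data standing in for the real column, and `answers` and `counts` are illustrative names. It shows why the original comparison failed and collects clean answers in one pass.

```python
from collections import Counter

# Hypothetical sample rows in the same format as the survey column
responses = [
    "Have you ever been physically assaulted?: No",
    "Have you ever been physically assaulted?: Yes",
    "Have you ever been physically assaulted?: No",
]

# The raw split keeps the space after the colon, so it never equals "Yes" or "No"
raw_answer = responses[0].split(":", 1)[1]
print(repr(raw_answer))

# Split once on the first colon, keep the right-hand part, strip whitespace
answers = [row.split(":", 1)[1].strip() for row in responses]

# Tally the answers; the keys and values can later be passed to a bar chart
counts = Counter(answers)
print(answers)
print(counts)
```

With the real data, iterating over `w_marital_status` in place of `responses` should work the same way, provided no row is missing the colon. If you prefer to stay in pandas, `w_marital_status.str.split(":", n=1).str[1].str.strip()` is a vectorized equivalent.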