How do I get my code to read a CSV column correctly when the format is odd?

Question

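I am working with an old piece of software whose export is very odd, so my column titled "Specialization" is not being read correctly. I want my code to understand the values, which are basically the number of specialists and generalists per year. The image is a snapshot of the exported CSV file. In any case, every Specialization value is read as 0. Please help! Here is the code I have been trying.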

import pandas as pd
import glob
import ast

# Specify the directory where your CSV files are located
directory_path = 'myfilepath'  # Ensure this path is correct

# Use glob to find all CSV files matching your naming pattern
file_path_pattern = f"{directory_path}patch_turnover_data_*.csv"
all_files = glob.glob(file_path_pattern)

# Load all CSV files into a list of DataFrames
df_list = [pd.read_csv(file) for file in all_files]  # Assuming a comma delimiter
# Combine all DataFrames into one
df = pd.concat(df_list, ignore_index=True)

# Strip extra spaces from the column names
df.columns = df.columns.str.strip()

# Function to safely convert strings to lists using ast.literal_eval
def safe_literal_eval(value):
    try:
        # Reformat the string to add commas between words and then evaluate
        value = value.replace(' ', ',')  # Add commas between words
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # If conversion fails, return an empty list
        return []

# Apply the conversion function to the "Specialization" column
df['Specialization'] = df['Specialization'].str.strip('[]').apply(safe_literal_eval)

# Check if the conversion was successful
print(df['Specialization'].head())

Answer 1

There is no need to use ast.literal_eval(). Just split the string on whitespace.

df['Specialization'] = df['Specialization'].str.strip('[]').str.split()

This puts a list of words into the dataframe column.
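As a rough illustration, here is a minimal sketch on made-up data. The snapshot of the export is not reproduced here, so the cell format (bracketed, space-separated words), the labels "specialist" and "generalist", the "Year" column, and the n_specialist / n_generalist column names are all assumptions; adjust them to whatever your file actually contains. It also shows why the original attempt returned empty lists: bare words are not Python literals, so ast.literal_eval() raises and the except branch returns [].

```python
import ast
import pandas as pd

# Hypothetical sample mimicking the export: bracketed, space-separated words.
# The real labels and layout in your file may differ.
df = pd.DataFrame({
    "Year": [1, 2],
    "Specialization": [
        "[specialist generalist specialist]",
        "[generalist generalist]",
    ],
})

# Why the original approach fails: unquoted words are names, not literals,
# so literal_eval raises and the fallback (an empty list) is returned.
try:
    ast.literal_eval("specialist,generalist,specialist")
except (ValueError, SyntaxError) as exc:
    print("literal_eval failed:", type(exc).__name__)

# Strip the surrounding brackets, then split on whitespace.
df["Specialization"] = df["Specialization"].str.strip("[]").str.split()

# Count each (assumed) label per row.
df["n_specialist"] = df["Specialization"].apply(lambda words: words.count("specialist"))
df["n_generalist"] = df["Specialization"].apply(lambda words: words.count("generalist"))

print(df)
```

If the cells turn out to hold numbers rather than words, the same strip-and-split step still applies; you would then convert each piece with int() or float() instead of counting labels.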
