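I want to run LinearRegression() from the sklearn library on 5 cores. Since the documentation says that the n_jobs parameter will not result in multiprocessing unless n_targets > 1, I created random data with two y values and tried running the program. However, the CPU core graph shows that only 1 core has a usage above 50%. Is the graph normal, or is there an issue with the code?
The code I tried: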
import os

# Set environment variables to limit the number of threads
for env_var in ["OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS", "VECLIB_MAXIMUM_THREADS", "NUMEXPR_NUM_THREADS"]:
    os.environ[env_var] = "5"

# Importing the necessary libraries
import time
import psutil
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from multiprocessing import Process
from sklearn.model_selection import train_test_split
from joblib import parallel_backend, Parallel, delayed
from sklearn.metrics import mean_squared_error, r2_score


# Function to plot CPU usage for all cores
def plot_cpu_usage(usage_data):
    plt.figure(figsize=(15, 8))  # Adjust the size as needed
    # print(usage_data)
    for core, usage in enumerate(usage_data):
        ax = plt.subplot(4, 4, core + 1)  # 4x4 grid for 16 cores
        # Determine the color based on the maximum usage for the core
        max_usage = max(usage)
        if max_usage > 90:
            line_color = 'red'
        elif max_usage > 75:
            line_color = 'orange'
        elif max_usage < 50:
            line_color = 'blue'
        else:
            line_color = 'purple'
        # Plot the usage with the determined color
        ax.plot(usage, color=line_color)
        ax.set_title(f'Core {core}')
        ax.set_xlabel('Time (s)')
        ax.set_ylabel('Usage (%)')
    plt.tight_layout()
    plt.savefig('cpu_cores_usage [test#4].png')
    plt.show()  # This will display the graph in a window


# Function to monitor CPU usage
def monitor_cpu_usage(duration, interval):
    # Record the start time
    start_time = time.time()
    # Initialize usage data
    usage_data = [[] for _ in range(psutil.cpu_count())]
    while (time.time() - start_time) < duration:
        # Get per-core CPU usage
        cores_usage = psutil.cpu_percent(percpu=True)
        # Append usage data for each core
        for i, usage in enumerate(cores_usage):
            usage_data[i].append(usage)
        # Wait for the specified interval
        time.sleep(interval)
    # Call the plot function
    plot_cpu_usage(usage_data)


# Function to create a dataset using a seed and random generation
def create_data(seed, sample, all_f, real_f):
    # Seed set as 42 for reproducible results
    np.random.seed(seed)
    # Set the no. of samples and features and create a sxf matrix
    n_samples, n_features = sample, all_f
    X = np.random.randn(n_samples, n_features)
    # Let only the first real_f features actually affect value.
    # We create Y1 as the sum of the first real_f features and random noise
    real_p = real_f
    Y1 = np.sum(X[:, :real_p], axis=1) + np.random.normal(size=(n_samples,))
    # Create Y2 similar to Y1
    Y2 = np.sum(X[:, :real_p], axis=1) + np.random.normal(size=(n_samples,))
    # Combine Y1 and Y2 into a single matrix Y
    Y = np.column_stack((Y1, Y2))
    print(X[0:5])
    print(Y[0:5])
    return X, Y


# Function to run your lr.py program
def run_lr_program():
    X, Y = create_data(42, 10000, 5000, 2500)
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33)
    # Fit the model in parallel
    with parallel_backend('loky', n_jobs=5):
        reg = linear_model.LinearRegression(n_jobs=5)
        reg.fit(X_train, Y_train)
    # Make predictions using the testing set
    Y_pred = reg.predict(X_test)
    # The coefficients
    print("Coefficients: \n", reg.coef_)
    # The intercepts
    print("Intercepts: \n", reg.intercept_)
    # The mean squared error
    print("Mean squared error: %.2f" % mean_squared_error(Y_test, Y_pred))
    # The coefficient of determination: 1 is perfect prediction
    print("Coefficient of determination: %.2f" % r2_score(Y_test, Y_pred))


try:
    # Run the lr.py program in a separate process
    lr_process = Process(target=run_lr_program)
    lr_process.start()
    # Monitor CPU usage while lr.py is running
    monitor_cpu_usage(duration=60, interval=1)  # Monitor for 60 seconds with 1-second intervals
    # Wait for the lr.py program to finish
    lr_process.join()
finally:
    # Ensure proper cleanup
    lr_process.terminate()
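Expected result:
The graphs for 5 of the cores show higher activity than the other cores. (I am a beginner, and I have been tasked with learning how to train my model on a specified number of cores before running anything on the server.)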
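"As the documentation says that the n_jobs parameter will not result in multiprocessing unless n_targets > 1, I created random data with two y values and tried running the program. However, the CPU core graph shows that only 1 core has a usage above 50%. Is the graph normal, or is there an issue with the code?"
No, this is not going to attempt process-level parallelism.
The documentation here is a little confusing: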
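"The number of jobs to use for the computation. This will only provide speedup in case of sufficiently large problems, that is if firstly n_targets > 1 and secondly X is sparse or if positive is set to True. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. [...]"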
In pseudocode, that sentence means:
if (n_targets > 1) and (issparse(X) or positive == True):
    use parallelism
else:
    ignore n_jobs
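(If you want to check this yourself, you can read the source code.)
Since X is not sparse, and you did not pass positive=True, this will not attempt process-level parallelism. Even if it did attempt process-level parallelism, the level of parallelism is limited to the number of targets. Since you have 2 targets, it could create at most 2 processes to do the work.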
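To make the rule concrete, here is a minimal sketch of it as a plain Python function. `effective_workers` is a hypothetical helper written for this answer, not a scikit-learn API. It only restates the documented condition and the cap imposed by the number of targets, and it assumes n_jobs is a positive integer (it does not handle None or -1).

```python
def effective_workers(n_jobs, n_targets, x_is_sparse=False, positive=False):
    """Rough upper bound on the worker processes the fit could keep busy.

    Hypothetical helper for illustration only: it restates the rule quoted
    from the documentation and is not part of scikit-learn.
    Assumes n_jobs is a positive integer.
    """
    # n_jobs only matters for multi-target problems where X is sparse
    # or positive=True was requested.
    if n_targets > 1 and (x_is_sparse or positive):
        # There is one task per target, so workers beyond n_targets stay idle.
        return min(n_jobs, n_targets)
    # Otherwise n_jobs is ignored and the fit runs in a single process.
    return 1


# The setup from the question: dense X, 2 targets, positive not set.
print(effective_workers(n_jobs=5, n_targets=2))  # 1
# Same data with positive=True: still capped by the 2 targets.
print(effective_workers(n_jobs=5, n_targets=2, positive=True))  # 2
```

Either way, the result never reaches the 5 busy processes you were expecting from n_jobs=5.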
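It is doing some level of parallelism anyway, which is probably the result of BLAS-level parallelism. NumPy can parallelize some operations, depending on which BLAS implementation you have.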
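If you want to see which BLAS your NumPy is using and how many threads it has, one option is the threadpoolctl package (it is installed as a dependency of scikit-learn). The sketch below is only an illustration under my own assumptions: the array sizes are made up and much smaller than in the question, and np.linalg.lstsq stands in for the dense least-squares solve rather than reproducing what LinearRegression does internally.

```python
import numpy as np
from threadpoolctl import threadpool_info, threadpool_limits

# List the native thread pools (BLAS, OpenMP) loaded into this process.
for pool in threadpool_info():
    print(pool["user_api"], pool["internal_api"], pool["num_threads"])

# Small made-up problem with 2 targets, just to have something to solve.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 200))
B = rng.standard_normal((500, 2))

# Inside this block, BLAS calls are limited to a single thread.
with threadpool_limits(limits=1, user_api="blas"):
    coef_single, *_ = np.linalg.lstsq(A, B, rcond=None)

# Outside the block, the BLAS library goes back to its previous thread count.
coef_default, *_ = np.linalg.lstsq(A, B, rcond=None)

print(coef_single.shape)
```

Running your CPU monitor once with the limit at 1 and once at 5 should show whether the extra activity you see really comes from BLAS threads.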
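Note that the maximum parallelism when combining process-level parallelism and BLAS parallelism can be much higher than 5. If you have 5 processes with 5 threads each, you might have 25 concurrent threads running.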
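The arithmetic is simple, but it is worth spelling out when you need to stay inside a fixed core budget on a shared server. Both functions below are hypothetical helpers for illustration; splitting the budget evenly across processes is just one reasonable policy, not something scikit-learn or joblib prescribes.

```python
def total_threads(n_processes, threads_per_process):
    """Worst-case number of concurrently running threads."""
    return n_processes * threads_per_process


def threads_per_process_for_budget(core_budget, n_processes):
    """Per-process thread limit that keeps the total within core_budget.

    Uses integer division and never returns less than 1 thread.
    """
    return max(1, core_budget // n_processes)


# 5 processes, each with a 5-thread BLAS pool.
print(total_threads(5, 5))  # 25
# Staying within 5 cores while running 5 processes.
print(threads_per_process_for_budget(5, 5))  # 1
# Staying within 5 cores while running 2 processes (one per target).
print(threads_per_process_for_budget(5, 2))  # 2
```

The per-process value is the kind of number you would put into environment variables such as OMP_NUM_THREADS, as your script already does, instead of the full budget of 5.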