为什么曲线拟合在交互式 Python 模拟中有时有效，有时无效？

Question

我对 Python GUI 比较陌生，一直在开发一个与我的物理课相关的简单项目。

我一直在尝试使用蒙特卡罗方法模拟卢瑟福散射实验数据，以及使用 ipywidgets 进行交互式模拟，您可以使用滑块来更改方程中的值。

出现的问题是，当我尝试对模拟数据进行曲线拟合时，更改 N_i 值/动能/厚度/箔元素的参数，曲线拟合有时会起作用，但是当您将参数更改为某些值时不同的是，它在 10 度（x_values）后变平。然后如果你把参数改回原来的样子，有时会失败或又可以工作。

我在网上搜索，它说问题可能是由于初始参数关闭而引起的，但是当我尝试使用不同的值或以任何方式更新代码时，它仍然会导致曲线拟合中断。这是我能得到的与卢瑟福散射实验最接近的代码。

这是我正在使用的代码。每次按下“运行”按钮时，它都会绘制卢瑟福方程的模拟数据，该数据对于不同的 N_i 值是一致的。然后按“分析曲线”按钮，这将给出方程的理论曲线。然后“拟合曲线”按钮应该给出与模拟数据紧密匹配的曲线。有用于 N_i 值、动能、箔厚度和箔构成元素的滑块。

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import ipywidgets as widgets
from IPython.display import display, Latex
import time

# Constants
Z1 = 2  # Atomic number of alpha particle
e = 1.602e-19  # Elementary charge (C)
epsilon_0 = 8.854e-12  # Vacuum permittivity (F/m)
M_alpha = 6.644e-27  # Mass of alpha particle (kg)

# Simulation Function: Rutherford's scattering formula
def rutherford_scattering(theta, N_i, a, t, r, Z2, EK):
    """Compute N(theta) based on Rutherford's law."""
    first = (N_i * a * t) / (16 * r**2)
    second = ((Z1 * Z2 * e**2) / (4 * np.pi * epsilon_0 * EK))**2
    return first * second * (1 / np.sin(theta / 2)**4)

# Run the simulation (Monte Carlo simulation)
def monte_carlo_simulation(analytical, theta, EK_slider_value, Ni_slider_value, noise_level=0.2):
    """Add random noise to the analytical data to simulate experimental results."""
    # Apply more noise at larger angles (factor based on angle)
    angle_factor = 1 / np.sin(theta / 2)**4  # Using angle factor for noise
    
    # Scale noise with respect to kinetic energy (lower noise at higher KE)
    # Adjust the scaling factor based on a more realistic energy value in Joules (not multiplied by 1e-13 here)
    scaled_noise_level = 1 / np.sqrt(EK_slider_value)  # Inversely proportional to KE
    
    # Noise scaling based on the number of particles (Ni)
    noise_scaling = 1 / np.sqrt(Ni_slider_value)  # More noise for smaller Ni
    
    # Combine noise scaling
    total_noise_level = scaled_noise_level + noise_scaling + angle_factor

    # Generate noise (normally distributed)
    noise = np.random.normal(0, total_noise_level, size=theta.shape)  # Generate noise for each data point
    
    # Add noise to the analytical values
    simulated_data = analytical + noise
    
    # Ensure that N(theta) does not go below zero
    simulated_data = np.maximum(simulated_data, 0)  # Clamp values to be >= 0

    return simulated_data


# Widgets for input parameters
Ni_slider = widgets.IntSlider(value=13000, min=6000, max=20000, step=500, description="N<sub>i</sub>: No. alpha particles", style={'description_width': 'initial'}, layout={'width': '400px'})
t_input = widgets.FloatText(value=400, min=200, max=1400, step=100, description="t: thickness of foil (nm)", style={'description_width': 'initial'})
EK_slider = widgets.FloatSlider(value=5, min=1, max=10, step=1, description="E<sub>K</sub>: Kinetic energy (J)", style={'description_width': 'initial'}, layout={'width': '400px'})
Z2_dropdown = widgets.Dropdown(
    options={
        'Gold (79)': 79,
        'Silver (47)': 47,
        'Copper (29)': 29,
        'Aluminum (13)': 13,
        'Iron (26)': 26,
        'Titanium (22)': 22,
        'Carbon (6)': 6,
        'Oxygen (8)': 8,
        'Lead (82)': 82,
        'Mercury (80)': 80
    },
    value=79,
    description="Z<sub>2</sub>: Element of foil",
    style={'description_width': 'initial'}
)

# Buttons for running simulation, curve fitting, and displaying results
simulate_button = widgets.Button(description="Run Monte Carlo Simulation")
add_analytical_button = widgets.Button(description="Add Analytical Curve")
fit_button = widgets.Button(description="Fit Curve")

# Output plots and error messages
output_plot = widgets.Output()
output_error = widgets.Output()
output_analytical = widgets.Output()

# Layout for widgets
ui = widgets.VBox([
    widgets.HBox([Ni_slider]),
    widgets.HBox([EK_slider]),
    widgets.VBox([t_input]),
    widgets.VBox([Z2_dropdown]),
    widgets.HBox([simulate_button, add_analytical_button, fit_button]),
    output_plot,  # Main plot for simulation
    output_analytical,  # Dedicated plot for analytical curve
    output_error
])

# Global variables for storing simulated data and analytical data
theta_values = np.linspace(np.radians(1), np.radians(179), 500)
simulated_data = None
analytical_data = None


# Debounce function to prevent rapid successive calls
def debounce(f, wait=1):
    """Debounce a function to delay execution until user stops adjusting sliders."""
    last_call_time = [0]  # Use mutable object to retain state between calls

    def wrapped_function(change):
        now = time.time()
        if now - last_call_time[0] > wait:
            last_call_time[0] = now
            f(change)

    return wrapped_function




@debounce # Updated functions with debouncing applied
# Run the simulation (Monte Carlo simulation)
def run_simulation(change):
    global simulated_data
    with output_plot:
        output_plot.clear_output()  # Clear previous content in the output area

        # Read values from widgets
        Ni = Ni_slider.value
        t_nm = t_input.value  # Thickness in nanometers
        t = t_nm * 1e-9  # Convert thickness to meters
        EK = EK_slider.value * 1e-13  # Ensure the kinetic energy is properly converted to joules
        Z2 = Z2_dropdown.value
        r = 0.1  # Fixed distance from foil (m)
        a = 1e28  # Approx. atoms/m^3 for dense materials

        # Compute the theoretical Rutherford scattering distribution (analytical curve)
        analytical = rutherford_scattering(theta_values, Ni, a, t, r, Z2, EK)
        global analytical_data
        analytical_data = analytical  # Store analytical data for future use

        # Simulate the data by adding noise to the theoretical model
        simulated_data = monte_carlo_simulation(analytical_data, theta_values, EK_slider.value, Ni_slider.value, noise_level=0.1)

        # Scatter plot: Simulated data
        plt.figure(figsize=(8, 6))
        plt.plot(np.degrees(theta_values), simulated_data, '.', color='green', label="Simulated Data")
        plt.xlabel("Scattering Angle (degrees)")
        plt.ylabel(r"$N(\theta)$")
        plt.yscale('log')
        plt.title("Simulated Data")
        plt.grid()
        plt.tight_layout()
        plt.legend()
        plt.show()
        


@debounce
# Add analytical Rutherford scattering curve to the simulated data
def add_analytical_curve(change):
    global simulated_data, analytical_data
    if analytical_data is None or simulated_data is None:
        with output_error:
            output_error.clear_output()
            print("Run the simulation first to generate simulated data and analytical curve.")
        return  # Ensure both simulated and analytical data exist

    # Use the main output area to display both curves together
    with output_plot:
        output_plot.clear_output()  # Clear previous content in the main plot area

        # Create the plot
        plt.figure(figsize=(8, 6))
        plt.plot(np.degrees(theta_values), simulated_data, '.', color='green', label="Simulated Data")
        plt.plot(np.degrees(theta_values), analytical_data, '-', color='blue', label="Analytical Curve")
        plt.xlabel("Scattering Angle (degrees)")
        plt.ylabel(r"$N(\theta)$")
        plt.yscale('log')
        plt.title("Simulated Data with Analytical Curve")
        plt.grid()
        plt.tight_layout()
        plt.legend()
        plt.show()

# Curve fitting function
def fit_curve_function(theta, A, B):
    """Model function for fitting a curve to the data."""
    return A / np.sin(theta / 2)**4 + B

@debounce
# Fit the simulated data
def fit_simulation(change):
    global simulated_data, analytical_data
    if analytical_data is None or simulated_data is None:
        with output_error:
            output_error.clear_output()
            print("Run the simulation first to generate simulated data and analytical curve.")
        return

    try:
        # Better initial parameter estimates
        A_init = np.max(simulated_data) * (np.sin(np.pi/4))**4
        B_init = np.min(simulated_data)

        # Try multiple initial guesses if the first fit fails
        initial_guesses = [
            [A_init, B_init],
            [A_init * 10, B_init],
            [A_init * 0.1, B_init],
            [A_init, 0]
        ]
        
        best_fit = None
        best_residual = np.inf
        
        for guess in initial_guesses:
            try:
                # Wider bounds
                bounds = ([0, 0], [np.inf, np.max(simulated_data)])
                
                # Add method='trf' and increase max iterations
                popt_try, pcov_try = curve_fit(
                    fit_curve_function, 
                    theta_values, 
                    simulated_data, 
                    p0=guess,
                    bounds=bounds,
                    method='trf',
                    maxfev=20000
                )
                
                # Calculate fit quality
                fitted = fit_curve_function(theta_values, *popt_try)
                residual = np.sum((simulated_data - fitted)**2)
                
                if residual < best_residual:
                    best_residual = residual
                    best_fit = (popt_try, pcov_try)
            
            except:
                continue
        
        if best_fit is None:
            raise RuntimeError("Failed to find a good fit with any initial parameters")
            
        popt, pcov = best_fit
        
        # Rest of your plotting code remains the same
        A, B = popt
        fitted_curve = fit_curve_function(theta_values, *popt)
        
        with output_plot:
            output_plot.clear_output() 

            plt.figure(figsize=(8, 6))
            plt.plot(np.degrees(theta_values), simulated_data, '.', color='green', label="Simulated Data")
            plt.plot(np.degrees(theta_values), analytical_data, '-', color='blue', label="Analytical Curve")
            plt.plot(np.degrees(theta_values), fitted_curve, '--', color='red', label="Fitted Curve")
            plt.xlabel("Scattering Angle (degrees)")
            plt.ylabel(r"$N(\theta)$")
            plt.yscale('log')
            plt.title("Curve Fitting to Simulated Data")
            plt.legend()
            plt.grid()

            plt.tight_layout()
            plt.show()

        # Display fit parameters and covariance matrix
        with output_error:
            output_error.clear_output() 
            print(f"Fitted Parameters: A = {A:.3e}, B = {B:.3e}")
            print(f"Covariance Matrix: \n{pcov}")

    except Exception as e:
        with output_error:
            print(f"Error during curve fitting: {e}")


# Connect buttons
simulate_button.on_click(run_simulation)
add_analytical_button.on_click(add_analytical_curve)
fit_button.on_click(fit_simulation)

# Display the widgets
display(ui)

此代码导致前两个按钮完全正常工作，滑块显示的绘图与现实生活中的情况相对接近。然而，有时当按下“拟合曲线”时，曲线拟合平坦度超过 10 度。它适用于某些值，但当再次使用这些值时则不起作用。

导致此问题的格式或布局是否有问题？或者是一个对数问题，因为我的 y 值是大值的对数？我已尽我所能尝试，但我不确定该往哪里更进一步。

谢谢你

编辑：我还注意到 KE 的噪音水平问题。通常，在高 KE 时，需要较少的噪声，因为散射较少，而对于低 KE，需要更多的噪声以产生更多的散射。我注意到它正在做完全相反的事情，但我已经做了相反的事情，但它仍然没有改变它，所以我假设它从其他地方积累了噪音

Answer 1

或者是一个对数问题，因为我的 y 值是大值的对数？

有点，是的。当您拟合这样一个从 10^7 到 10-1 的函数时，curve_fit() 会关心值 10^7 上的 1 错误，就像它关心值 1 上的 1 错误一样换句话说，如果它可以将 10^7 处的值的拟合度提高 0.001%，则值得使 10^-1 处的曲线拟合变得糟糕。

解决此问题的一种方法是通过将函数的对数拟合到数据的对数来从绝对误差切换为相对误差。

示例：

log_offset = 0.001
popt_try, pcov_try = curve_fit(
    lambda *args: np.log(fit_curve_function(*args) + log_offset), 
    theta_values, 
    np.log(simulated_data + log_offset), 
    p0=guess,
    bounds=bounds,
    method='trf',
    maxfev=20000
)

注意：我实际上正在应用函数

np.log(x + log_offset)

，因为你的数据包含零。

log_offset

是一个您可能应该根据您的应用程序进行调整的参数。它控制输入零转换为对数空间时的值有多大。或者，您可以从数据中删除零。

以下是没有进行此更改时的合身效果：

变更前：

更改后：

一旦我们进行对数变换，我们就会得到视觉上看起来更适合的东西，但是如果我们比较前后的残差，我们可以看到新的适合度实际上更差，从绝对意义上来说。

absolute error residual SSE 1.1637203281333856e+16
relative error residual SSE 1.922495566518763e+16

为什么曲线拟合在交互式 Python 模拟中有时有效，有时无效？

问题描述投票：0回答：1

1个回答

最新问题

为什么曲线拟合在交互式 Python 模拟中有时有效，有时无效？

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1