在从事个人项目时,我偶然发现了一个问题,可以表述如下:
您有一个包含 N 行和 M 列的网格。该表包含一些红色和一些绿色单元格。目标是找到红色到绿色单元格的分配,使得匹配中红色和绿色单元格之间的总距离最小。此外,两个相邻(具有公共边或顶点)的红色网格单元可以配对在一起并移动到另一对相邻的绿色单元。
假设限制是 N <= 1000, M <= 1000. It is not guaranteed that the number of red cells is equal to the number of green cells, i.e. they are different most of the time. Moving a group into another group that doesn’t have the same look(say two on top of each other into two that are diagonal to each other) is allowed.
我尝试找到类似的问题,很多都接近它,但是没有配对部分。我尝试将其表示为运输问题、分配问题或最大流量问题(容量为 1 的边,可能还包括一个额外的顶点“层”来表示组),但是我还没有找到一种方法来正确包含配对部分。
我已经设法将其表示为 ILP 问题: 红色和绿色单元格表示为完整二分图中的顶点 - R1、R2、.. G1、G2、..,其中 Ri 和 Gj 之间的边权重是网格中这两个单元格之间的欧几里德距离。红色和绿色单元格的所有有效分组形成第二个完整的二部图,其中该组的位置只是两个单元格的平均值。两个图中的每条边都分配有一个布尔变量,该变量表示我们是否将所述单元格/组移动到另一个单元格/组中。约束条件是与 Ri 相关的每条边(即来自 Ri 和包含 Ri 的组的边)的总和等于 1,并且进入 Gi(单元格或组)的边的总和为<= 1. The objective function is the minimum sum of the boolean variable multiplied by the edge weight it is associated with.
这是解决方案的Python实现
import pulp
import math
# manhattan or euclidean
DISTANCE_CALCULATION = "euclidean"
# default or gurobi
SOLVER = "gurobi"
# change these value pairs
red_cell_cords = [(1, 8),(1, 9),(1, 19),(1, 20),(2, 8),(2, 9),(2, 19),(2, 20),(3, 9),(3, 19),(6, 9),(6, 19),(7, 18),(8, 11),(8, 17),(9, 12),(9, 16),(11, 13),(11, 15),(12, 14)]
green_cell_cords = [(0, 6),(0, 21),(0, 22),(1, 6),(1, 23),(2, 5),(2, 6),(2, 23),(2, 24),(3, 4),(3, 5),(3, 6),(3, 24),(3, 25),(4, 4),(4, 5),(4, 9),(4, 19),(4, 25),(5, 5),(5, 25),(6, 5),(6, 7),(6, 21),(6, 25),(7, 4),(7, 5),(7, 25),(8, 4),(8, 25),(9, 4),(9, 6),(9, 22),(9, 26),(10, 5),(10, 27),(11, 5),(11, 27),(12, 27),(14, 24),(17, 25),(19, 25)]
red_cells = [f"R{i}" for i in range(len(red_cell_cords))]
green_cells = [f"G{i}" for i in range(len(green_cell_cords))]
red_group_cords = []
green_group_cords = []
red_groups = []
green_groups = []
# Generate RED groups and coordinates
for r1_idx, r1_cord in enumerate(red_cell_cords):
for r2_idx, r2_cord in enumerate(red_cell_cords):
if r1_idx < r2_idx:
if abs(r1_cord[0] - r2_cord[0]) <= 1 and abs(r1_cord[1] - r2_cord[1]) <= 1:
red_groups.append(f"R{r1_idx}-R{r2_idx}-")
red_group_cords.append(((r1_cord[0] + r2_cord[0]) / 2, (r1_cord[1] + r2_cord[1]) / 2))
# Generate GREEN groups and coordinates
for g1_idx, g1_cord in enumerate(green_cell_cords):
for g2_idx, g2_cord in enumerate(green_cell_cords):
if g1_idx < g2_idx:
if abs(g1_cord[0] - g2_cord[0]) <= 1 and abs(g1_cord[1] - g2_cord[1]) <= 1:
green_groups.append(f"G{g1_idx}-G{g2_idx}-")
green_group_cords.append(((g1_cord[0] + g2_cord[0]) / 2, (g1_cord[1] + g2_cord[1]) / 2))
costs = {}
# Edge weights for singular cells
for r_idx, r_cord in enumerate(red_cell_cords):
for g_idx, g_cord in enumerate(green_cell_cords):
red_label = red_cells[r_idx]
green_label = green_cells[g_idx]
if DISTANCE_CALCULATION == "manhattan":
distance = abs(r_cord[0] - g_cord[0]) + abs(r_cord[1] - g_cord[1])
elif DISTANCE_CALCULATION == "euclidean":
distance = math.sqrt(math.pow(abs(r_cord[0] - g_cord[0]), 2) + math.pow(abs(r_cord[1] - g_cord[1]), 2))
else:
raise BaseException("Invalid parameter as DISTANCE_CALCULATION. Check for typos")
costs[(red_label, green_label)] = distance
# Edge weights for groups
for r_idx, r_cord in enumerate(red_group_cords):
for g_idx, g_cord in enumerate(green_group_cords):
red_label = red_groups[r_idx]
green_label = green_groups[g_idx]
if DISTANCE_CALCULATION == "manhattan":
distance = abs(r_cord[0] - g_cord[0]) + abs(r_cord[1] - g_cord[1])
elif DISTANCE_CALCULATION == "euclidean":
distance = math.sqrt(math.pow(abs(r_cord[0] - g_cord[0]), 2) + math.pow(abs(r_cord[1] - g_cord[1]), 2))
else:
raise BaseException("Invalid parameter as DISTANCE_CALCULATION. Check for typos")
costs[(red_label, green_label)] = distance
prob = pulp.LpProblem("RedGreenAssignment", pulp.LpMinimize)
# Variables
## Singular movement
vars = pulp.LpVariable.dicts("Assign", (red_cells, green_cells), 0, 1, pulp.LpBinary)
## Group movement
vars_groups = pulp.LpVariable.dicts("Assign group", (red_groups, green_groups), 0, 1, pulp.LpBinary)
# Objective function
## Singular movement & Group movement
prob += pulp.lpSum([vars[r][g] * costs[(r, g)] for r in red_cells for g in green_cells]) + pulp.lpSum([vars_groups[r][g] * costs[(r, g)] for r in red_groups for g in green_groups])
# Constraints
## For each red cell, ensure it's either assigned individually or through a group
for i, r in enumerate(red_cells):
related_groups = [group for group in red_groups if f"R{i}-" in group]
prob += (pulp.lpSum([vars[r][g] for g in green_cells]) + \
pulp.lpSum([vars_groups[group][g_group] for group in related_groups for g_group in green_groups])) == 1
## For each green cell, ensure it assigned individually or through a group at most once
for i, g in enumerate(green_cells):
related_groups = [group for group in green_groups if f"G{i}-" in group]
prob += (pulp.lpSum([vars[r][g] for r in red_cells]) + \
pulp.lpSum([vars_groups[r_group][group] for group in related_groups for r_group in red_groups])) <= 1
# Solve
if SOLVER == "default":
prob.solve()
elif SOLVER == "gurobi":
solver = pulp.GUROBI_CMD(path=r'C:\gurobi1103\win64\bin\gurobi_cl.exe') # include your path to the gurobi solver
prob.solve(solver)
else:
raise BaseException("Invalid parameter as SOLVER. Check for typos")
# Result
for r in red_cells:
for g in green_cells:
if pulp.value(vars[r][g]) == 1:
print(f"{r} is assigned to {g}")
for r in red_groups:
for g in green_groups:
if pulp.value(vars_groups[r][g]) == 1:
print(f"{r} is assigned to {g}")
print(f"Total Cost: {pulp.value(prob.objective)}")
拥有大约 2500 个边,可以在 0.5 秒内给出解决方案,即我当前的解决方案运行得相当快。
用黄色箭头标记的是红色和绿色单元格的“匹配”。成对的分组通过在两个分组之间划一条线来标记。 手动找到 22x29 网格的解决方案(不一定是最优的),无需考虑对: ILP 求解器针对 22x29 网格找到的解决方案(最佳),并考虑到配对:
我的问题是:与 ILP 解决方案相比,是否有更有效的算法(仍能找到最佳分配)可以处理红绿单元配对约束并在网格大小限制内提高性能?
与我讨论过该问题的人对 ILP 解决方案在合理的时间内有效感到惊讶(因为 ILP 通常是 NP 难的)。
这是我的第一个问题,请随意纠正我的陈述或建议 ILP 公式的正确数学表示。
cords
拼写为 coords
LpVariable.matrix
而不是 LpVariable.dicts
例如,
import typing
import numpy as np
import pandas as pd
import pulp
from matplotlib import pyplot as plt
def make_colour_frames(
coord_data: np.ndarray, name: typing.Literal['R', 'G'],
) -> tuple[pd.DataFrame, pd.DataFrame]:
coords = pd.DataFrame(
index=name + pd.RangeIndex(name='name', stop=len(coord_data)).astype(pd.StringDtype()),
columns=('y', 'x'), data=coord_data,
)
i, j = np.triu_indices(n=len(coords), k=1)
combos = pd.concat(
(coords.iloc[i].reset_index(), coords.iloc[j].reset_index()),
axis='columns', keys=('1', '2'),
)
yx = combos.loc[:, (slice(None), ['y', 'x'])]
predicates = (
(yx['2'] - yx['1']).abs() <= 1
).all(axis='columns')
combos = combos[predicates]
groups = pd.DataFrame(
data={
'name': combos[('1', 'name')] + '-' + combos[('2', 'name')],
'name1': combos[('1', 'name')],
'name2': combos[('2', 'name')],
'y1': combos[('1', 'y')],
'y2': combos[('2', 'y')],
'x1': combos[('1', 'x')],
'x2': combos[('2', 'x')],
'y': combos.loc[:, (slice(None), 'y')].mean(axis=1),
'x': combos.loc[:, (slice(None), 'x')].mean(axis=1),
},
).set_index('name')
return coords, groups
def get_norm(
combos: pd.DataFrame, norm: typing.Literal['euclidean', 'manhattan'],
) -> pd.Series:
diff = combos[['y_R', 'x_R']] - combos[['y_G', 'x_G']].values
match norm:
case 'euclidean':
return np.linalg.norm(diff, axis=1)
case 'manhattan':
return diff.abs().sum(axis=1)
case _:
raise ValueError('Invalid norm ' + norm)
def make_combos(
red: pd.DataFrame, green: pd.DataFrame, norm: typing.Literal['euclidean', 'manhattan'],
) -> pd.DataFrame:
combos = pd.merge(
left=red.reset_index(), right=green.reset_index(), how='cross',
suffixes=('_R', '_G'),
)
combos['dist'] = get_norm(combos, norm)
combos['name'] = combos['name_R'] + '-' + combos['name_G']
combos.set_index('name', inplace=True)
# Movement
combos['assign'] = pulp.LpVariable.matrix(
name='assign', indices=combos.index, cat=pulp.LpBinary,
)
return combos
def make_vars(
red_coord_data: np.ndarray,
green_coord_data: np.ndarray,
norm: typing.Literal['euclidean', 'manhattan'] = 'euclidean',
) -> tuple[
pd.DataFrame, # single combos
pd.DataFrame, # group combos
]:
# Generate RED groups and coordinates
red_coords, red_groups = make_colour_frames(red_coord_data, name='R')
# Generate GREEN groups and coordinates
green_coords, green_groups = make_colour_frames(green_coord_data, name='G')
# Edge weights for singular cells
single_combos = make_combos(red_coords, green_coords, norm)
# Edge weights for groups
group_combos = make_combos(red_groups, green_groups, norm)
return single_combos, group_combos
def get_cost(combos: pd.DataFrame) -> pulp.LpAffineExpression:
return pulp.lpDot(combos['dist'], combos['assign'])
def make_problem(
single_combos: pd.DataFrame, group_combos: pd.DataFrame,
) -> pulp.LpProblem:
prob = pulp.LpProblem(name='RedGreenAssignment', sense=pulp.LpMinimize)
# Objective function
## Singular movement & Group movement
prob.setObjective(get_cost(single_combos) + get_cost(group_combos))
return prob
def add_constraints(
prob: pulp.LpProblem,
single_combos: pd.DataFrame,
group_combos: pd.DataFrame,
) -> None:
# Each red cell is assigned individually or through a group exactly once
for name, for_name in single_combos.groupby('name_R'):
total = pulp.lpSum(for_name['assign']) + pulp.lpSum(
group_combos.loc[
(group_combos['name1_R'] == name) |
(group_combos['name2_R'] == name), 'assign',
]
)
prob.addConstraint(name='excl_red_' + name, constraint=total == 1)
# Each green cell is assigned individually or through a group at most once
for name, for_name in single_combos.groupby('name_G'):
total = pulp.lpSum(for_name['assign']) + pulp.lpSum(
group_combos.loc[
(group_combos['name1_G'] == name) |
(group_combos['name2_G'] == name), 'assign',
]
)
prob.addConstraint(name='excl_green_' + name, constraint=total <= 1)
def prune(df: pd.DataFrame) -> pd.DataFrame:
return df[
df['assign'].map(pulp.value) > 0.5
].drop(columns='assign')
def solve(
prob: pulp.LpProblem,
single_combos: pd.DataFrame,
group_combos: pd.DataFrame,
solver: str = pulp.LpSolverDefault,
):
prob.solve(solver=solver)
if prob.status != pulp.LpStatusOptimal:
raise ValueError(prob.status)
return prune(single_combos), prune(group_combos)
def dump(
prob: pulp.LpProblem,
single_combos: pd.DataFrame,
group_combos: pd.DataFrame,
) -> None:
# already shown in solver output
# print('Total Cost: ', prob.objective.value())
pd.options.display.max_columns = 20
pd.options.display.width = 200
cols = ['y_R', 'x_R', 'y_G', 'x_G', 'dist']
print(single_combos[cols])
print(group_combos[cols])
def graph(
red_coords: np.ndarray,
green_coords: np.ndarray,
single_combos: pd.DataFrame,
group_combos: pd.DataFrame,
) -> plt.Figure:
fig, ax = plt.subplots()
height = 1 + max(red_coords[:, 0].max(), green_coords[:, 0].max())
width = 1 + max(red_coords[:, 1].max(), green_coords[:, 1].max())
image = np.full(shape=(height, width, 3), fill_value=255, dtype=np.uint8)
source = red = 209, 77, 77
dest_used = green = 115, 218, 80
dest_unused = dark_green = 71, 142, 47
image[red_coords[:, 0], red_coords[:, 1]] = source
image[green_coords[:, 0], green_coords[:, 1]] = dest_unused
image[single_combos['y_G'], single_combos['x_G']] = dest_used
image[group_combos['y1_G'], group_combos['x1_G']] = dest_used
image[group_combos['y2_G'], group_combos['x2_G']] = dest_used
ax.imshow(image)
for name, row in single_combos.iterrows():
ax.arrow(
row['x_R'], row['y_R'],
row['x_G'] - row['x_R'], row['y_G'] - row['y_R'],
length_includes_head=True, head_width=0.5, head_length=0.8,
facecolor='yellow',
)
for name, row in group_combos.iterrows():
ax.fill(
row[['x1_R', 'x2_R', 'x2_G', 'x1_G']],
row[['y1_R', 'y2_R', 'y2_G', 'y1_G']],
edgecolor='black', facecolor='yellow', alpha=0.5,
)
return fig
def main() -> None:
red_coords = np.array((
(1, 8), (1, 9), (1, 19), (1, 20), (2, 8), (2, 9), (2, 19), (2, 20), (3, 9), (3, 19), (6, 9),
(6, 19), (7, 18), (8, 11), (8, 17), (9, 12), (9, 16), (11, 13), (11, 15), (12, 14),
))
green_coords = np.array((
(0, 6), (0, 21), (0, 22), (1, 6), (1, 23), (2, 5), (2, 6), (2, 23), (2, 24), (3, 4), (3, 5),
(3, 6), (3, 24), (3, 25), (4, 4), (4, 5), (4, 9), (4, 19), (4, 25), (5, 5), (5, 25), (6, 5),
(6, 7), (6, 21), (6, 25), (7, 4), (7, 5), (7, 25), (8, 4), (8, 25), (9, 4), (9, 6), (9, 22),
(9, 26), (10, 5), (10, 27), (11, 5), (11, 27), (12, 27), (14, 24), (17, 25), (19, 25),
))
single_combos, group_combos = make_vars(
red_coord_data=red_coords, green_coord_data=green_coords,
)
prob = make_problem(single_combos=single_combos, group_combos=group_combos)
add_constraints(
prob=prob, single_combos=single_combos, group_combos=group_combos,
)
single_combos, group_combos = solve(prob=prob, single_combos=single_combos, group_combos=group_combos)
dump(prob=prob, single_combos=single_combos, group_combos=group_combos)
graph(
red_coords=red_coords, green_coords=green_coords,
single_combos=single_combos, group_combos=group_combos,
)
plt.show()
if __name__ == '__main__':
main()
Result - Optimal solution found
Objective value: 52.33277240
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.05
Time (Wallclock seconds): 0.05
Option for printingOptions changed from normal to all
Total time (CPU seconds): 0.05 (Wallclock seconds): 0.05
y_R x_R y_G x_G dist
name
R8-G16 3 9 4 9 1.00000
R9-G17 3 19 4 19 1.00000
R10-G22 6 9 6 7 2.00000
R18-G32 11 15 9 22 7.28011
y_R x_R y_G x_G dist
name
R0-R1-G0-G3 1.0 8.5 0.5 6.0 2.549510
R2-R3-G1-G2 1.0 19.5 0.0 21.5 2.236068
R4-R5-G6-G11 2.0 8.5 2.5 6.0 2.549510
R6-R7-G4-G7 2.0 19.5 1.5 23.0 3.535534
R11-R12-G20-G24 6.5 18.5 5.5 25.0 6.576473
R13-R15-G21-G26 8.5 11.5 6.5 5.0 6.800735
R14-R16-G27-G29 8.5 16.5 7.5 25.0 8.558621
R17-R19-G31-G34 11.5 13.5 9.5 5.5 8.246211