I need to convert Excel coordinates (for example "AD45") into integer positions, here X=30 and Y=45.
I have this PHP code:
/**
 * @param String $coordinates
 *
 * @return array
 */
public function getCoordinatesPositions($coordinates) {
    $letters = preg_replace('/[^a-zA-Z]/', '', $coordinates);
    $numbers = preg_replace('/[^0-9]/', '', $coordinates);
    $letters = strtoupper($letters);
    $columnCoordinate = 0;
    $alphabetIterate = 0;
    $alphabetRange = range('A', 'Z');
    $alphabetCount = count($alphabetRange);
    $splittedLetters = str_split($letters);
    $lettersCount = count($splittedLetters);
    $i = 1;
    if ($lettersCount === 1) {
        $columnCoordinate = array_search($splittedLetters[0], $alphabetRange) + 1;
    } else {
        foreach ($splittedLetters as $letter) {
            if ($i !== $lettersCount) {
                $position = (array_search($letter, $alphabetRange) + 1) * $alphabetCount;
            } else {
                $position = (array_search($letter, $alphabetRange) + 1);
            }
            $columnCoordinate += $position;
            $i++;
        }
    }
    return array('column' => $columnCoordinate, 'row' => $numbers);
}
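My problem is that this function does not return the correct column value when the coordinates contain 3 or more letters ("ABC45"). My colleague also says the algorithm performs poorly.
Do you have any ideas for a simpler, better-performing algorithm? Thank you.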
In principle the algorithm is fine. You can simplify it and make it more general like this:
function getCoordinatesPositions($coordinates) {
    $letters = preg_replace('/[^a-zA-Z]/', '', $coordinates);
    $numbers = preg_replace('/[^0-9]/', '', $coordinates);
    $letters = strtoupper($letters);
    $alphabetRange = range('A', 'Z');
    $alphabetCount = count($alphabetRange);
    $splittedLetters = str_split($letters);
    $lettersCount = count($splittedLetters);
    $columnCoordinate = 0;
    $i = 1;
    foreach ($splittedLetters as $letter) {
        $columnCoordinate += (array_search($letter, $alphabetRange) + 1) * pow($alphabetCount, $lettersCount - $i);
        $i++;
    }
    return array('column' => $columnCoordinate, 'row' => intval($numbers));
}
var_dump(getCoordinatesPositions("ABC456"));
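To see what the `pow()` term contributes, it can help to expand one case by hand. "ABC" is read as a base-26 number whose digits run from 1 to 26, so its column should be 1·26² + 2·26 + 3 = 731. The following is only a sketch of that arithmetic; `letterValue()` is a hypothetical helper introduced for illustration and is not part of the answer's code.

```php
<?php
// Hypothetical helper (not from the answer): maps 'A' => 1 ... 'Z' => 26.
function letterValue(string $letter): int
{
    return array_search($letter, range('A', 'Z'), true) + 1;
}

// "ABC" expanded by hand: each letter is weighted by 26^(distance from the right).
$column = letterValue('A') * pow(26, 2)   // 1 * 676
        + letterValue('B') * pow(26, 1)   // 2 * 26
        + letterValue('C') * pow(26, 0);  // 3 * 1

echo $column, PHP_EOL; // 731
```

The same expansion for "AD" gives 1·26 + 4 = 30, which matches the value expected in the question.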
For PHPExcel, see "PHPExcel how to get column index from cell".
@Axel Richter's answer is a good solution and works fine, but it can be improved.
Here is the proposed version:
function getCoordinatesPositions($coordinates) {
    if (preg_match('/^([a-z]+)(\d+)$/i', $coordinates, $matches)) {
        $level = strlen($matches[1]);
        $matches[1] = array_reduce(
            str_split(strtoupper($matches[1])),
            function($result, $letter) use (&$level) {
                return $result + (ord($letter) - 64) * pow(26, --$level);
            }
        );
        return array_splice($matches, 1);
    }
    // (returns NULL when wrong $coordinates)
}
The initial preg_match() guards against invalid coordinates and directly extracts the column part into $matches[1].
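The main improvement is using ord($letter) to compute each letter's individual value: it avoids creating the temporary range('A', 'Z') array and simplifies the evaluation.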
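Then array_reduce() allows the column part to be processed more compactly. Because the result is written back into $matches in place, the final return is reduced to a simple slice of $matches.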
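The `ord($letter) - 64` step relies on 'A' having the byte value 65 in ASCII, so that 'A' through 'Z' land on 1 through 26. A minimal sketch checking that this agrees with the `range('A', 'Z')` lookup used in the earlier answer, assuming uppercase ASCII input:

```php
<?php
// ord() returns the byte value of a character; 'A' is 65 in ASCII,
// so subtracting 64 maps 'A'..'Z' onto 1..26 without a lookup array.
$alphabet = range('A', 'Z');

foreach ($alphabet as $letter) {
    $viaOrd    = ord($letter) - 64;
    $viaSearch = array_search($letter, $alphabet, true) + 1;
    if ($viaOrd !== $viaSearch) {
        throw new LogicException("Mismatch for $letter");
    }
}

echo ord('A') - 64, ' ', ord('D') - 64, ' ', ord('Z') - 64, PHP_EOL; // 1 4 26
```

Lowercase letters would give different values (`ord('a')` is 97), which is why the function calls `strtoupper()` before reducing.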
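I tested both answers with 1,000,000 iterations on a MacBook Air M2, resolving the row and column for "XX50".
The original code posted with the question finished in 2.60 seconds, while the accepted answer finished in 2.11 seconds (an 18.84% improvement).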
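The harness used for these timings is not shown, so the following is only a guess at its shape: a loop that calls the function under test a fixed number of times and measures the elapsed time with `hrtime()` (available since PHP 7.3). The names `benchmark()` and `$candidate` are hypothetical, and the stand-in closure should be replaced with whichever version is being measured.

```php
<?php
// Time $iterations calls of $fn($input) and return the elapsed seconds.
// hrtime(true) returns a monotonic timestamp in nanoseconds.
function benchmark(callable $fn, string $input, int $iterations): float
{
    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return (hrtime(true) - $start) / 1e9;
}

// Stand-in for the function under test; swap in any of the versions above.
$candidate = function (string $coordinates): int {
    return strlen($coordinates);
};

$seconds = benchmark($candidate, 'XX50', 1000000);
printf("%d iterations: %.2f s\n", 1000000, $seconds);
```

Absolute numbers depend on the machine and PHP version, so only the relative differences between versions measured in the same run are meaningful.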
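I needed this for a project in which I have to process a large number of very large Excel files in various ways, and I need all the speed I can get. So here is my code, which on my machine completes the test in 1.46 seconds, making it 30.71% faster than the accepted answer.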
function getCoordinatesPositions($coordinates) {
    // continue only if the coordinates are valid
    if (preg_match('/^([a-z]+)(\d+)$/is', strtolower($coordinates), $matches)) {
        // there's nothing faster than having this as static values - ord() is a lot slower
        // even having them start at `1` instead of `0` and not having to do an addition
        // later on helps
        $lookup = array(
            'a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5, 'f' => 6, 'g' => 7,
            'h' => 8, 'i' => 9, 'j' => 10, 'k' => 11, 'l' => 12, 'm' => 13, 'n' => 14,
            'o' => 15, 'p' => 16, 'q' => 17, 'r' => 18, 's' => 19, 't' => 20, 'u' => 21,
            'v' => 22, 'w' => 23, 'x' => 24, 'y' => 25, 'z' => 26
        );
        // no need to `str_split` to get the letters as an array
        // as in PHP $string[$index] works well and it is a lot faster
        $length = strlen($matches[1]);
        $column = 0;
        // iterate over every letter; $length stays fixed so that columns
        // with 3 or more letters are not cut short
        for ($i = 0; $i < $length; $i++) {
            // this is significantly faster than pow(26, $length - $i - 1) albeit not as pretty
            for ($j = 0, $pow = 1; $j < $length - $i - 1; $j++) $pow *= 26;
            // multiply the value of each letter by its positional weight
            // (26^(position from the right))
            $column += $lookup[$matches[1][$i]] * $pow;
        }
        return array('column' => $column, 'row' => $matches[2]);
    }
    // if we get this far it means the coordinates were invalid
    return false;
}
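The per-letter power can also be avoided altogether with Horner's method: multiply the running total by 26 before adding each letter. This is a different technique from the one in the answer above, offered only as a sketch. `columnIndex()` is a hypothetical name, it assumes uppercase A-Z input that has already been validated, and it has not been benchmarked against the lookup-table version.

```php
<?php
// Hypothetical alternative (not the answer's code): Horner's method.
// Multiplying the running total by 26 before adding each letter gives every
// letter its positional weight without computing a power at all.
function columnIndex(string $letters): int
{
    $column = 0;
    $length = strlen($letters);
    for ($i = 0; $i < $length; $i++) {
        // ord('A') is 65, so this maps 'A'..'Z' to 1..26
        $column = $column * 26 + (ord($letters[$i]) - 64);
    }
    return $column;
}

echo columnIndex('AD'), ' ', columnIndex('XX'), ' ', columnIndex('ABC'), PHP_EOL; // 30 648 731
```

Whether this beats the static lookup table on a given machine would need to be measured with the same test as above.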