PHP short hash like URL-shortening websites

Question

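I am looking for a PHP function that creates a short hash from a string or a file, similar to URL-shortening websites such as tinyurl.com.

The hash should not be longer than 8 characters.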

Answer 1

TinyURL doesn't hash anything; it uses Base 36 integers (or even Base 62, using lowercase and uppercase letters) to indicate which record to visit.

Base 36 to integer:

intval($str, 36);

Integer to Base 36:

base_convert($val, 10, 36);

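So then, rather than redirecting to a route like

/url/1234

it becomes

/url/ya

This is far more useful than a hash, because there are no collisions. With this you can easily check whether a URL already exists and return the proper existing ID in Base 36, without the user knowing that it was already in the database.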

Don't hash; use other bases for this kind of thing. (It's faster and can be made collision-proof.)
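
As a minimal sketch of the round trip described above, assuming the record ID is a positive integer that fits in PHP's native int: the helper names idToCode and codeToId are made up for illustration and are not part of any library.

```php
<?php
// Hypothetical helpers: map a numeric record ID to a Base 36 code and back.
function idToCode(int $id): string
{
    // base_convert() takes and returns strings; digits above 9 come out as lowercase letters
    return base_convert((string) $id, 10, 36);
}

function codeToId(string $code): int
{
    // intval() with base 36 accepts 0-9 and a-z (case-insensitive)
    return intval($code, 36);
}

echo idToCode(1234), "\n"; // ya
echo codeToId('ya'), "\n"; // 1234
```

Because the code is just another spelling of the primary key, looking up /url/ya is a plain primary-key lookup after codeToId().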

Answer 2

I wrote a tiny library to generate obfuscated hashes from integers.

https://github.com/KevBurnsJr/pseudocrypt

$ids = range(1,10);
foreach($ids as $id) {
  echo PseudoCrypt::unhash($id) . "\n";
}

m8z2p
8hy5e
uqx83
gzwas
38vdh
phug6
bqtiv
xzslk
k8ro9
6hqy

7/14/2015: Adding the actual code below, since it has become difficult to find:

<?php
/**
 * PseudoCrypt by KevBurns (http://blog.kevburnsjr.com/php-unique-hash)
 * Reference/source: http://stackoverflow.com/a/1464155/933782
 * 
 * I want a short alphanumeric hash that’s unique and whose sequence is difficult to deduce. 
 * I could run it out to md5 and trim the first n chars but that’s not going to be very unique. 
 * Storing a truncated checksum in a unique field means that the frequency of collisions will increase 
 * geometrically as the number of unique keys for a base 62 encoded integer approaches 62^n. 
 * I’d rather do it right than code myself a timebomb. So I came up with this.
 * 
 * Sample Code:
 * 
 * echo "<pre>";
 * foreach(range(1, 10) as $n) {
 *     echo $n." - ";
 *     $hash = PseudoCrypt::hash($n, 6);
 *     echo $hash." - ";
 *     echo PseudoCrypt::unhash($hash)."<br/>";
 * }
 * 
 * Sample Results:
 * 1 - cJinsP - 1
 * 2 - EdRbko - 2
 * 3 - qxAPdD - 3
 * 4 - TGtDVc - 4
 * 5 - 5ac1O1 - 5
 * 6 - huKpGQ - 6
 * 7 - KE3d8p - 7
 * 8 - wXmR1E - 8
 * 9 - YrVEtd - 9
 * 10 - BBE2m2 - 10
 */
 
class PseudoCrypt {
 
    /* Key: Next prime greater than 62 ^ n / 1.618033988749894848 */
    /* Value: modular multiplicative inverse */
    private static $golden_primes = array(
        '1'                  => '1',
        '41'                 => '59',
        '2377'               => '1677',
        '147299'             => '187507',
        '9132313'            => '5952585',
        '566201239'          => '643566407',
        '35104476161'        => '22071637057',
        '2176477521929'      => '294289236153',
        '134941606358731'    => '88879354792675',
        '8366379594239857'   => '7275288500431249',
        '518715534842869223' => '280042546585394647'
    );
 
    /* Ascii :                    0  9,         A  Z,         a  z     */
    /* $chars = array_merge(range(48,57), range(65,90), range(97,122)) */
    private static $chars62 = array(
        0=>48,1=>49,2=>50,3=>51,4=>52,5=>53,6=>54,7=>55,8=>56,9=>57,10=>65,
        11=>66,12=>67,13=>68,14=>69,15=>70,16=>71,17=>72,18=>73,19=>74,20=>75,
        21=>76,22=>77,23=>78,24=>79,25=>80,26=>81,27=>82,28=>83,29=>84,30=>85,
        31=>86,32=>87,33=>88,34=>89,35=>90,36=>97,37=>98,38=>99,39=>100,40=>101,
        41=>102,42=>103,43=>104,44=>105,45=>106,46=>107,47=>108,48=>109,49=>110,
        50=>111,51=>112,52=>113,53=>114,54=>115,55=>116,56=>117,57=>118,58=>119,
        59=>120,60=>121,61=>122
    );
 
    public static function base62($int) {
        $key = "";
        while(bccomp($int, 0) > 0) {
            $mod = bcmod($int, 62);
            $key .= chr(self::$chars62[$mod]);
            $int = bcdiv($int, 62);
        }
        return strrev($key);
    }
 
    public static function hash($num, $len = 5) {
        $ceil = bcpow(62, $len);
        $primes = array_keys(self::$golden_primes);
        $prime = $primes[$len];
        $dec = bcmod(bcmul($num, $prime), $ceil);
        $hash = self::base62($dec);
        return str_pad($hash, $len, "0", STR_PAD_LEFT);
    }
 
    public static function unbase62($key) {
        $int = 0;
        foreach(str_split(strrev($key)) as $i => $char) {
            $dec = array_search(ord($char), self::$chars62);
            $int = bcadd(bcmul($dec, bcpow(62, $i)), $int);
        }
        return $int;
    }
 
    public static function unhash($hash) {
        $len = strlen($hash);
        $ceil = bcpow(62, $len);
        $mmiprimes = array_values(self::$golden_primes);
        $mmi = $mmiprimes[$len];
        $num = self::unbase62($hash);
        $dec = bcmod(bcmul($num, $mmi), $ceil);
        return $dec;
    }
 
}
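
The trick behind the $golden_primes table can be shown with native integers. The sketch below assumes the n = 3 pair from the table (147299 and 187507) and a modulus of 62^3; the function name mulMod is made up for illustration. Multiplying by a number that is coprime to the modulus permutes the range 0 to 62^3 - 1, and multiplying by its modular inverse undoes the permutation.

```php
<?php
// Hypothetical helper: multiply modulo $modulus (safe here because the products fit in 64 bits).
function mulMod(int $n, int $multiplier, int $modulus): int
{
    return ($n * $multiplier) % $modulus;
}

$modulus = 62 ** 3; // 238328 possible values for a 3-character Base 62 code
$prime   = 147299;  // key taken from the table above
$inverse = 187507;  // value taken from the table above

echo mulMod($prime, $inverse, $modulus), "\n";   // 1, so the two numbers are inverses

$scrambled = mulMod(42, $prime, $modulus);       // obfuscated value, still below 62^3
echo mulMod($scrambled, $inverse, $modulus), "\n"; // 42
```

The class above does the same thing with bcmath so that the longer pairs, whose products overflow a 64-bit integer, still work.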

Answer 3

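URL-shortening services instead use an auto-incremented integer value (like a supplementary database ID) and encode it with Base64 or another encoding to carry more information per character (64 possible values instead of just 10 for digits).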
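
A rough sketch of that idea, assuming a non-negative 64-bit integer ID and the URL-safe Base64 alphabet ('-' and '_' instead of '+' and '/'). The helper names encodeId and decodeId are hypothetical.

```php
<?php
// Hypothetical helpers: encode an auto-increment ID as URL-safe Base64 and back.
function encodeId(int $id): string
{
    // Pack as 8 big-endian bytes, then drop leading zero bytes to keep the code short
    $bytes = ltrim(pack('J', $id), "\0");
    if ($bytes === '') {
        $bytes = "\0"; // ID 0 still needs one byte
    }
    // Swap '+' and '/' for URL-safe characters and strip the '=' padding
    return rtrim(strtr(base64_encode($bytes), '+/', '-_'), '=');
}

function decodeId(string $code): int
{
    $bytes = base64_decode(strtr($code, '-_', '+/'));
    // Restore the leading zero bytes that encodeId() removed
    return unpack('J', str_pad($bytes, 8, "\0", STR_PAD_LEFT))[1];
}

echo encodeId(1234), "\n";           // BNI
echo decodeId(encodeId(1234)), "\n"; // 1234
```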

Answer 4

An MD5 hash is 32 characters long, but you can use just its first 8 characters:

echo substr(md5('http://www.google.com'), 0, 8);
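
Eight hex characters carry only 32 bits, so collisions become likely once there are tens of thousands of entries. A minimal sketch of one way to cope, assuming the codes are stored somewhere with a uniqueness check: the function name shortCode is hypothetical, and the $taken array stands in for a database table with a unique index.

```php
<?php
// Hypothetical helper: derive an 8-character code, re-hashing with a counter on collision.
// $taken maps code => URL and stands in for a table with a unique index on the code.
function shortCode(string $url, array &$taken): string
{
    for ($attempt = 0; ; $attempt++) {
        // The first attempt hashes the URL itself; later attempts append a counter
        $input = $attempt === 0 ? $url : $url . '#' . $attempt;
        $code = substr(md5($input), 0, 8);

        if (!isset($taken[$code])) {
            $taken[$code] = $url; // free slot: claim it
            return $code;
        }
        if ($taken[$code] === $url) {
            return $code;         // this URL was shortened before
        }
        // Otherwise the code belongs to a different URL: try the next attempt
    }
}

$taken = [];
echo shortCode('http://www.google.com', $taken), "\n";
```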

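Update: Here is another class found here, written by Travell Perkins, which takes a record number and creates a short hash for it. A 14-digit number produces an 8-character string. By the time you reach that number you will be more popular than tinyurl ;)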

class BaseIntEncoder {

    //const $codeset = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    //readable character set excluded (0,O,1,l)
    const codeset = "23456789abcdefghijkmnopqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ";

    static function encode($n){
        $base = strlen(self::codeset);
        $converted = '';

        while ($n > 0) {
            $converted = substr(self::codeset, bcmod($n,$base), 1) . $converted;
            $n = self::bcFloor(bcdiv($n, $base));
        }

        return $converted ;
    }

    static function decode($code){
        $base = strlen(self::codeset);
        $c = '0';
        for ($i = strlen($code); $i; $i--) {
            $c = bcadd($c,bcmul(strpos(self::codeset, substr($code, (-1 * ( $i - strlen($code) )),1))
                    ,bcpow($base,$i-1)));
        }

        return bcmul($c, 1, 0);
    }

    static private function bcFloor($x)
    {
        return bcmul($x, '1', 0);
    }

    static private function bcCeil($x)
    {
        $floor = self::bcFloor($x);
        return bcadd($floor, ceil(bcsub($x, $floor)));
    }

    static private function bcRound($x)
    {
        $floor = self::bcFloor($x);
        return bcadd($floor, round(bcsub($x, $floor)));
    }
}

Here is an example of how to use it:

BaseIntEncoder::encode('1122344523');//result:3IcjVE
BaseIntEncoder::decode('3IcjVE');//result:1122344523

Answer 5

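For a short, URL-friendly hash, where possible duplicate content is not allowed, we can use hash(), especially with the CRC or Adler-32 algorithms, since they are made for exactly that: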

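Cyclic Redundancy Check

A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to raw data. Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents. On retrieval, the calculation is repeated and, in the event the check values do not match, corrective action can be taken. https://en.wikipedia.org/wiki/Cyclic_redundancy_check

Adler-32 is a checksum algorithm (...). Compared to a cyclic redundancy check of the same length, it trades reliability for speed (preferring the latter). https://en.wikipedia.org/wiki/Adler-32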

echo hash("crc32", "Content of article...");
// Output fd3e7c6e
echo hash("adler32", "Content of article...");
// Output 55df075f
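
If 8 hex characters are still too long, the same 32-bit value can be written in Base 36, which needs at most 7 characters. This is only a sketch: the helper name shortChecksum is made up, and it does nothing about collisions, which a 32-bit checksum will eventually produce.

```php
<?php
// Hypothetical helper: render a 32-bit checksum in Base 36 instead of hex.
function shortChecksum(string $content): string
{
    $hex = hash('crc32', $content);    // 8 hex characters
    return base_convert($hex, 16, 36); // 0xffffffff is "1z141z3" in Base 36
}

echo shortChecksum("Content of article..."), "\n";
```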

[YouTube] How do CRCs work?

Answer 6

Best answer yet: the smallest unique "hash-like" string given a unique database ID - a PHP solution, no third-party libraries required.

Here's the code:

<?php
/*
THE FOLLOWING CODE WILL PRINT:
A database_id value of 200 maps to 5K
A database_id value of 1 maps to 1
A database_id value of 1987645 maps to 16LOD
*/
$database_id = 200;
$base36value = dec2string($database_id, 36);
echo "A database_id value of $database_id maps to $base36value\n";
$database_id = 1;
$base36value = dec2string($database_id, 36);
echo "A database_id value of $database_id maps to $base36value\n";
$database_id = 1987645;
$base36value = dec2string($database_id, 36);
echo "A database_id value of $database_id maps to $base36value\n";

// HERE'S THE FUNCTION THAT DOES THE HEAVY LIFTING...
function dec2string ($decimal, $base)
// convert a decimal number into a string using $base
{
    //DebugBreak();
   global $error;
   $string = null;

   $base = (int)$base;
   if ($base < 2 || $base > 36 || $base == 10) {
      echo 'BASE must be in the range 2-9 or 11-36';
      exit;
   } // if

   // maximum character string is 36 characters
   $charset = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';

   // strip off excess characters (anything beyond $base)
   $charset = substr($charset, 0, $base);

   if (!preg_match('/^[0-9]{1,50}$/', trim($decimal))) { // ereg() was removed in PHP 7
      $error['dec_input'] = 'Value must be a positive integer with < 50 digits';
      return false;
   } // if

   do {
      // get remainder after dividing by BASE
      $remainder = bcmod($decimal, $base);

      $char      = substr($charset, $remainder, 1);   // get CHAR from array
      $string    = "$char$string";                    // prepend to output

      //$decimal   = ($decimal - $remainder) / $base;
      $decimal   = bcdiv(bcsub($decimal, $remainder), $base);

   } while ($decimal > 0);

   return $string;

}

?>

Answer 7

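I made an algorithm of my own.

It is:

Easier to understand, tune, and change
Uses only the symbols you allow (so it can easily be made case-sensitive)
Also unique
Case-insensitive out of the box
Only for positive integers (for IDs)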

<?php

class PseudoCrypt1
{
    private static $keychars = 'CZPXD5H2FIWB81KE76JY93V4ORLAMT0QSUNG'; // Dictionary of allowed unique symbols, shuffle it for yourself or remove unwanted chars (don't forget to call testParameters after changing)
    private static $divider = 19; // Tune divider for yourself (don't forget to call testParameters after changing)
    private static $biasDivider = 14; // Tune bias divider for yourself (don't forget to call testParameters after changing)
    private static $noise = 53; // Any positive number

    public static function testParameters()
    {
        if (strlen(static::$keychars) < static::$divider + static::$biasDivider - 1) {
            throw new Exception('Check your divider and biasDivider. It must be less than keychars length');
        }
    }

    public static function encode(int $i): string
    {
        if ($i < 0) {
            throw new Exception('Expected positive integer');
        }

        $keychars = static::$keychars;
        $i = $i + static::$noise; // add noise to a number
        $bias = $i % static::$biasDivider;

        $res = '';

        while ($i > 0) {
            $div = $i % static::$divider;
            $i = intdiv($i, static::$divider);
            $res .= $keychars[$div + $bias];
        }

        // Current version of an algorithm is one of these chars (if in the future you will need to identify a version)
        // Remember this chars on migrating to a new algorithm/parameters
        $res .= str_shuffle('LPTKEZG')[0];
        $res .= $keychars[$bias]; // Encoded bias

        return $res;
    }

    public static function decode($code)
    {
        $keychars = static::$keychars;
        $biasC = substr($code, -1);
        $bias = strpos($keychars, $biasC);
        $code = substr($code, 0, -2);
        $code = str_split(strrev($code));

        $val = 0;

        foreach ($code as $c) {
            $val *= static::$divider;
            $val += strpos($keychars, $c) - $bias;
        }

        return $val - static::$noise;
    }
}

Output

36926 -> 7IWFZX
927331 -> F4WIKP2
9021324 -> AT66R7P1

You can test it with this small test (it doesn't include a uniqueness test, but the algorithm produces unique codes):

PseudoCrypt1::testParameters();

for ($i = 4000000; $i < 9500000; $i++) {
    $hash = PseudoCrypt1::encode($i);
    echo $i.':'.strlen($hash).':'.$hash.PHP_EOL;
    if ($i != PseudoCrypt1::decode($hash)) {
        echo 'FAIL:'.$i.PHP_EOL;
        die();
    }
}

Answer 8

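Actually, the best solution for having "random" hashes is to generate a list of random hashes and put it in MySQL with a unique index (you can write a simple UDF to insert 100,000 rows in 1 second).

A status column indicates whether each hash is still free.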
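
A rough sketch of the pre-generation step in plain PHP, assuming a Base 62 alphabet and 6-character codes. The function name generateCodePool is hypothetical; the array keys play the role of the unique index, and in the database each row would additionally carry the status column.

```php
<?php
// Hypothetical helper: pre-generate $count distinct random codes of $length characters.
// $count must stay well below 62 ** $length, otherwise the loop cannot finish.
function generateCodePool(int $count, int $length = 6): array
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $max = strlen($alphabet) - 1;
    $pool = [];

    while (count($pool) < $count) {
        $code = '';
        for ($i = 0; $i < $length; $i++) {
            $code .= $alphabet[random_int(0, $max)];
        }
        $pool[$code] = true; // a duplicate just overwrites itself, like a unique index rejecting it
    }

    // PHP turns all-digit keys into integers, so cast the keys back to strings
    return array_map('strval', array_keys($pool));
}

print_r(generateCodePool(5));
```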

Answer 9

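I am making a URL shortener. In my case I use the database "id" to create a unique short URL every time.

What I did is, first:

Insert data such as the "original URL" and "creation date" into the database, leaving the "short URL" empty. Then get the "id" from there and pass it into the function below.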

<?php
function genUniqueCode($id) {
    // Add a large offset so that even small IDs yield an 8-character code
    $id = $id + 100000000000;
    return base_convert($id, 10, 36);
}

// Get the unique code using the ID.
// The ID below is retrieved from the database after inserting the original URL.
$data['id'] = 10;
$uniqueCode = genUniqueCode($data['id']);

// Generate the URL. SERVER_PROTOCOL holds e.g. "HTTP/1.1" even over TLS, so check HTTPS instead.
$protocol = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') ? 'https' : 'http';
echo "<a href='{$protocol}://{$_SERVER['HTTP_HOST']}/{$uniqueCode}'>{$protocol}://{$_SERVER['HTTP_HOST']}/{$uniqueCode}</a>";

?>

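Then update the value of the short URL code in the database.

Here I'm using the "id" to create the short code. Since the ID can't be the same for multiple entries, it is unique, and therefore the resulting code or URL will be unique as well.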
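
Since the code is only the offset ID written in Base 36, the ID can be recovered without storing the code at all. A minimal sketch, assuming 64-bit PHP and the same offset of 100000000000 as above; the function name idFromUniqueCode is made up for illustration.

```php
<?php
// Hypothetical inverse of the scheme above: recover the database ID from a short code.
function idFromUniqueCode(string $code): int
{
    // Undo the Base 36 conversion, then subtract the offset that was added when encoding
    return intval($code, 36) - 100000000000;
}

// Build a code the same way as above (ID 10 plus the offset), then reverse it
$code = base_convert((string) (10 + 100000000000), 10, 36);
echo strlen($code), "\n";            // 8
echo idFromUniqueCode($code), "\n";  // 10
```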

Answer 10

Well, maybe we could use MurmurHash?

function hash_murmur3($string)
{
    $string = array_values(unpack('C*', $string));
    $klen = count($string);
    $h1 = 0;
    $remainder = 0;
    $i = 0;

    for ($bytes = $klen - ($remainder = $klen & 3); $i < $bytes;) {
        $k1 = $string[$i] | ($string[++$i] << 8) | ($string[++$i] << 16) | ($string[++$i] << 24);
        ++$i;
        $k1 = (((($k1 & 0xffff) * 0xcc9e2d51) + (((((($k1 >= 0) ? ($k1 >> 16) : (($k1 & 0x7fffffff) >> 16) | 0x8000)) * 0xcc9e2d51) & 0xffff) << 16))) & 0xffffffff;
        $k1 = $k1 << 15 | (($k1 >= 0) ? ($k1 >> 17) : (($k1 & 0x7fffffff) >> 17) | 0x4000);
        $k1 = (((($k1 & 0xffff) * 0x1b873593) + (((((($k1 >= 0) ? ($k1 >> 16) : (($k1 & 0x7fffffff) >> 16) | 0x8000)) * 0x1b873593) & 0xffff) << 16))) & 0xffffffff;
        $h1 ^= $k1;
        $h1 = $h1 << 13 | (($h1 >= 0) ? ($h1 >> 19) : (($h1 & 0x7fffffff) >> 19) | 0x1000);
        $h1b = (((($h1 & 0xffff) * 5) + (((((($h1 >= 0) ? ($h1 >> 16) : (($h1 & 0x7fffffff) >> 16) | 0x8000)) * 5) & 0xffff) << 16))) & 0xffffffff;
        $h1 = ((($h1b & 0xffff) + 0x6b64) + (((((($h1b >= 0) ? ($h1b >> 16) : (($h1b & 0x7fffffff) >> 16) | 0x8000)) + 0xe654) & 0xffff) << 16));
    }

    $k1 = 0;

    switch ($remainder) {
        case 3:
            $k1 ^= $string[$i + 2] << 16;

        case 2:
            $k1 ^= $string[$i + 1] << 8;

        case 1:
            $k1 ^= $string[$i];
            $k1 = ((($k1 & 0xffff) * 0xcc9e2d51) + (((((($k1 >= 0) ? ($k1 >> 16) : (($k1 & 0x7fffffff) >> 16) | 0x8000)) * 0xcc9e2d51) & 0xffff) << 16)) & 0xffffffff;
            $k1 = $k1 << 15 | (($k1 >= 0) ? ($k1 >> 17) : (($k1 & 0x7fffffff) >> 17) | 0x4000);
            $k1 = ((($k1 & 0xffff) * 0x1b873593) + (((((($k1 >= 0) ? ($k1 >> 16) : (($k1 & 0x7fffffff) >> 16) | 0x8000)) * 0x1b873593) & 0xffff) << 16)) & 0xffffffff;
            $h1 ^= $k1;
    }

    $h1 ^= $klen;
    $h1 ^= (($h1 >= 0) ? ($h1 >> 16) : (($h1 & 0x7fffffff) >> 16) | 0x8000);
    $h1 = ((($h1 & 0xffff) * 0x85ebca6b) + (((((($h1 >= 0) ? ($h1 >> 16) : (($h1 & 0x7fffffff) >> 16) | 0x8000)) * 0x85ebca6b) & 0xffff) << 16)) & 0xffffffff;
    $h1 ^= (($h1 >= 0) ? ($h1 >> 13) : (($h1 & 0x7fffffff) >> 13) | 0x40000);
    $h1 = (((($h1 & 0xffff) * 0xc2b2ae35) + (((((($h1 >= 0) ? ($h1 >> 16) : (($h1 & 0x7fffffff) >> 16) | 0x8000)) * 0xc2b2ae35) & 0xffff) << 16))) & 0xffffffff;
    $h1 ^= (($h1 >= 0) ? ($h1 >> 16) : (($h1 & 0x7fffffff) >> 16) | 0x8000);

    return base_convert(sprintf("%u", $h1), 10, 32);
}

Usage example:

echo hash_murmur3('foo'); // prints '3rabh10'
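
On PHP 8.1 and later the hash extension ships MurmurHash3 as the "murmur3a" algorithm, so a userland port may not be needed. The sketch below is an alternative to the function above, not a drop-in replacement: the helper name shortMurmur is hypothetical, it uses Base 36 rather than Base 32, and it is not guaranteed to return the same strings as hash_murmur3().

```php
<?php
// Hypothetical helper: use the built-in 32-bit MurmurHash3 where the PHP build provides it.
function shortMurmur(string $input): ?string
{
    if (!in_array('murmur3a', hash_algos(), true)) {
        return null; // older PHP build without murmur3a
    }
    // 8 hex characters in, at most 7 Base 36 characters out
    return base_convert(hash('murmur3a', $input), 16, 36);
}

var_dump(shortMurmur('foo'));
```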
