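This is just a hobby project, nothing official or serious. I do have a question, though: when I run this code it computes about 400 million MD5 hashes in the first second, then drops to roughly 10 million hashes per second. Is there any way to speed it up?
Here is the code: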
using System;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;
using System.Runtime.InteropServices;
using Org.BouncyCastle.Crypto.Digests;
using Org.BouncyCastle.Crypto.Paddings;

public class SimdMD5HashFinder
{
    private static readonly int numberOfThreads = 16; // Number of threads (cores)
    private static readonly CancellationTokenSource cts = new CancellationTokenSource(); // Shared token used to stop all threads
    private const int batchSize = 25000000; // Number of random byte arrays to process in each batch
    private const int byteArraySize = 16; // Size of each random byte array (16 bytes of MD5 input)
    public static long n = 0; // Shared progress counter, updated atomically by every thread

    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentThread();

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr SetThreadAffinityMask(IntPtr hThread, IntPtr dwThreadAffinityMask);

    public static void Main(string[] args)
    {
        Console.WriteLine("Starting multiple threads to find MD5 hashes with the first three bytes equal to zero...");
        Task[] tasks = new Task[numberOfThreads];
        for (int i = 0; i < numberOfThreads; i++)
        {
            int threadId = i; // Capture the current thread index
            tasks[i] = Task.Run(() => FindHash(threadId));
        }
        // Wait for tasks to complete
        Task.WaitAll(tasks);
        Console.WriteLine("All threads have completed or a match was found.");
        Console.ReadKey();
    }

    private static void FindHash(int threadId)
    {
        // Pin the thread to a specific CPU core
        PinThreadToCore(threadId);
        // Create MD5 digest object from BouncyCastle
        MD5Digest md5 = new MD5Digest();
        byte[] randomBytes = new byte[batchSize * byteArraySize]; // Single large array to hold all random bytes for the batch
        byte[] hashBytes = new byte[md5.GetDigestSize()]; // MD5 hash is always 16 bytes
        RandomNumberGenerator rng = RandomNumberGenerator.Create();
        while (!cts.Token.IsCancellationRequested) // Loop until cancellation is requested
        {
            // Add batchSize to the shared counter atomically (this happens before the batch is hashed)
            long total = Interlocked.Add(ref n, batchSize);
            if (total % 1000000 == 0)
            {
                Console.WriteLine(total);
            }
            // Fill the large buffer with random bytes
            rng.GetBytes(randomBytes);
            // Process the buffer in chunks, each chunk is a 16-byte random array
            for (int i = 0; i < randomBytes.Length; i += byteArraySize)
            {
                // Calculate MD5 hash of the 16-byte chunk
                md5.BlockUpdate(randomBytes, i, byteArraySize);
                md5.DoFinal(hashBytes, 0);
                // Check whether the first four bytes of the hash are all zero
                // (the startup message says three; the condition tests four)
                if (hashBytes[0] == 0 && hashBytes[1] == 0 && hashBytes[2] == 0 && hashBytes[3] == 0)
                {
                    Console.WriteLine($"\nThread {threadId} found a match!");
                    Console.WriteLine($"Random Bytes: {BitConverter.ToString(randomBytes, i, byteArraySize).Replace("-", "").ToLower()}"); // Show the chunk of random bytes as hex
                    Console.WriteLine($"MD5 Hash Bytes: {BitConverter.ToString(hashBytes).Replace("-", "").ToLower()}");
                    // Signal cancellation to stop all other threads
                    cts.Cancel();
                    return; // Exit the method once a match is found
                }
                // Reset the MD5 digest for the next round
                md5.Reset();
            }
        }
    }

    // Pins the calling thread to a specific core
    private static void PinThreadToCore(int coreId)
    {
        IntPtr mask = new IntPtr(1 << coreId); // Create affinity mask for the given core
        IntPtr thread = GetCurrentThread(); // Get current thread handle
        // Set the affinity mask to bind the thread to the specific core
        SetThreadAffinityMask(thread, mask);
    }
}
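I have tried almost everything, and as far as I can tell this is the fastest way to do it. I allocate one buffer of size N up front, because otherwise the overhead of creating a new MD5 object every time slows things down. I looked into GPU computing with ILGPU, but found it too complicated.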
There are a few possible improvements.
Side note: I am not very familiar with C#, so I may make mistakes.
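Don't use random bytes as the hash input. Any decent cryptographic hash turns its input into something that "looks completely random", so generating random values adds a lot of work when a counter would do just as well.
You could use a BigInteger or an unsigned long counter. BigInteger is simpler, but much slower. Alternatively, you can use an unsigned long counter (one per task) together with a single-byte ID that is unique to each task. That gives each thread about 18 quintillion iterations before it runs out.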
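The counter idea above could look roughly like the sketch below. The names `CounterInputSketch`, `WriteInput`, and `Search`, and the byte layout (ID in byte 0, counter in bytes 1 to 8), are assumptions made for illustration. To keep the example self-contained it uses the built-in `MD5.HashData` (available from .NET 5) instead of the BouncyCastle `MD5Digest` from the question.

```csharp
using System;
using System.Buffers.Binary;
using System.Security.Cryptography;

public static class CounterInputSketch
{
    // Builds the 16-byte input for one attempt (hypothetical layout):
    // byte 0 is the task ID, bytes 1..8 hold the counter (little-endian),
    // bytes 9..15 stay zero.
    public static void WriteInput(Span<byte> input, byte taskId, ulong counter)
    {
        input.Clear();
        input[0] = taskId;
        BinaryPrimitives.WriteUInt64LittleEndian(input.Slice(1, 8), counter);
    }

    // Tries `attempts` consecutive counter values beginning at `start` and
    // returns the first counter whose MD5 hash begins with `zeroBytes` zero
    // bytes, or null if none of them matches.
    public static ulong? Search(byte taskId, ulong start, ulong attempts, int zeroBytes)
    {
        Span<byte> input = stackalloc byte[16];
        Span<byte> hash = stackalloc byte[16];
        for (ulong i = 0; i < attempts; i++)
        {
            ulong counter = start + i;
            WriteInput(input, taskId, counter);
            MD5.HashData(input, hash); // No random number generator involved

            bool match = true;
            for (int k = 0; k < zeroBytes; k++)
            {
                if (hash[k] != 0) { match = false; break; }
            }
            if (match) return counter;
        }
        return null;
    }
}
```

Each task would call `Search` with its own `taskId`, so no two tasks ever hash the same input. A one-byte ID allows up to 256 tasks, and a 64-bit counter covers 2^64 (about 1.8 × 10^19) values per task, which is where the 18 quintillion figure comes from.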
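You should remove the logging statement that prints every 1,000,000 iterations. It is a useful feature for debugging, but it adds a division and a branch, and in the code above every thread updates the same shared counter to do it.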
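If some progress reporting is still wanted, one option is to keep the hot loop free of logging and shared state, and publish a count only after each batch has actually been hashed. The sketch below is a minimal illustration under that assumption; `ProgressCounter`, `ProgressSketch.Worker`, and the `hashOne` callback are hypothetical names, not part of the question's code. Note that the question's code adds `batchSize` to `n` before the batch is hashed, so its first printed numbers count batches started rather than hashes computed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Thread-safe counter that workers touch once per batch, not once per hash.
public sealed class ProgressCounter
{
    private long total;

    public void Add(long count) => Interlocked.Add(ref total, count);

    public long Total => Interlocked.Read(ref total);
}

public static class ProgressSketch
{
    // Runs `batches` batches of `batchSize` calls to `hashOne`.
    // `hashOne` stands in for "hash one input and check it".
    public static void Worker(ProgressCounter progress, int batches, int batchSize, Action hashOne)
    {
        for (int b = 0; b < batches; b++)
        {
            for (int i = 0; i < batchSize; i++)
            {
                hashOne(); // Hot loop: no logging, no modulo, no shared writes
            }
            progress.Add(batchSize); // Publish only after the work is done
        }
    }
}
```

A separate timer or the main thread could then read `Total` once per second and print the difference from the previous reading, which gives a hashes-per-second figure without slowing the workers down.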
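If you want to go faster still...
You could reimplement it in C or assembly. C# has a little overhead, and assembly or C lets you use AVX or even SHA-NI (if you are able to use SHA-1 instead of MD5).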