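Best way to track (direct) file downloads

Question  Votes: 0  Answers: 5

What is the best way to track direct file downloads?

Google Analytics only works through JavaScript, so it cannot track direct file downloads.

Ideally the solution should be secure and self-hosted.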

php html apache .htaccess statistics
5 Answers
14
votes

Feel free to use :)

.htaccess:

RewriteEngine on
RewriteRule ^(.*)\.(rar|zip|pdf)$ http://xy.com/downloads/download.php?file=$1.$2 [R,L]

mysql:

CREATE TABLE `download` (
    `filename` varchar(255) NOT NULL,
    `stats` int(11) NOT NULL,
    PRIMARY KEY  (`filename`)
)

download.php

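<?php
// Note: the mysql_* functions were removed in PHP 7.0, so this version needs PHP 5.

mysql_connect("localhost", "name", "password")
    or die("Sorry, can't connect to database.");
mysql_select_db("dbname");

$baseDir = "/home/public_html/downloads";
// basename() strips any directory part, so "../" cannot escape $baseDir
$file = basename($_GET['file']);
// realpath() returns false when the file does not exist
$path = realpath($baseDir . "/" . $file);

if ($path !== false && dirname($path) == $baseDir) {
    if (!is_bot()) {
        // The first download inserts the row with stats=1; later ones increment it
        mysql_query("INSERT INTO download SET filename='" . mysql_real_escape_string($file) . "', stats=1 ON DUPLICATE KEY UPDATE stats=stats+1");
    }

    header("Cache-Control: public");
    header("Content-Description: File Transfer");
    header("Content-Disposition: attachment; filename=\"" . $file . "\"");
    header("Content-Length: " . filesize($path));
    header("Content-Type: application/force-download");
    header("Content-Transfer-Encoding: binary");
    // Discard any buffered output so it cannot corrupt the file
    if (ob_get_level() > 0) {
        ob_end_clean();
    }
    readfile($path);
}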

function is_bot()
{

    $botlist = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi",
    "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory",
    "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot",
    "crawler", "www.galaxy.com", "Googlebot", "Scooter", "Slurp",
    "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz",
    "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot",
    "Mediapartners-Google", "Sogou web spider", "WebAlta Crawler","TweetmemeBot",
    "Butterfly","Twitturls","Me.dium","Twiceler");

    foreach($botlist as $bot)
    {
        if(strpos($_SERVER['HTTP_USER_AGENT'],$bot)!==false)
        return true;    // Is a bot
    }

    return false;
}

?>

Source - gayadesign.com
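
The basename()/realpath() check is what keeps requests inside the downloads folder. Below is a minimal sketch of that check pulled out into a standalone function. The helper name resolve_download_path() is hypothetical (it is not part of the answer above), and the sketch also resolves the base directory itself, on the assumption that the folder might be reached through a symlink.

```php
<?php
// Hypothetical helper (not from the original answer): returns the absolute
// path of a requested file, or null if it is missing or outside $baseDir.
function resolve_download_path(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // base directory does not exist
    }
    // basename() drops any directory components such as "../"
    $path = realpath($base . DIRECTORY_SEPARATOR . basename($requested));
    if ($path === false || dirname($path) !== $base || !is_file($path)) {
        return null;
    }
    return $path;
}
```

Called as resolve_download_path($baseDir, $_GET['file']), a null result would mean the script should stop before sending any download headers.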


6
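votes

Your Apache logs should already contain a lot of information, but I think you are asking for more control over what gets logged and when. So what you want is two pages: one containing the link to the file, and another that tracks the download, like so: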

file_page.php

<a href="download.php?id=1234">Download File!</a>

download.php

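<?php
// Code to track the file using PHP, whether that means storing data in a database,
// saving to a log, or emailing you. I'd use a DB, like so:

session_start(); // required before reading $_SESSION

// Prep the vars; the link on file_page.php passes "id", and the int cast sanitizes it
$file_id = (int) $_GET['id'];
$file_path = '/files/' . $file_id . '.pdf';
$user_id = isset($_SESSION['user_id']) ? (int) $_SESSION['user_id'] : 0;
$now = date('Y-m-d H:i:s');

// Save data to database.
// Assumption: $db is an already open mysqli connection (the original used the
// removed mysql_query() with unquoted values).
$stmt = $db->prepare('INSERT INTO download_log
    SET file_id = ?, date_downloaded = ?, user_id = ?');
$stmt->bind_param('isi', $file_id, $now, $user_id);
$stmt->execute();

// Now find the file and download it
header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename="' . $file_id . '.pdf"'); // or whatever the file name is
readfile($file_path);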

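Something like that, anyway.

The page will be blank when it finishes, but all browsers should start downloading the file as soon as the page loads.

What I did here is save the file ID, the current date and time, and the user ID of the person downloading it (from the $_SESSION variable). You may want to store more information, such as the user's IP address, HTTP_REFERER, or other $_SERVER data, so you can track where users came from, when they downloaded, and what they downloaded.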
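
A minimal sketch of gathering those extra fields follows. The helper name download_context() and the array keys it returns are hypothetical, chosen only for illustration; the $_SERVER keys themselves are standard (note that the referrer key is spelled HTTP_REFERER). Values a client does not send come back as null.

```php
<?php
// Hypothetical helper (not from the original answer): gathers request details
// worth logging next to each download. Missing values become null.
function download_context(array $server): array
{
    return [
        'ip'         => $server['REMOTE_ADDR'] ?? null,
        'referer'    => $server['HTTP_REFERER'] ?? null,
        'user_agent' => $server['HTTP_USER_AGENT'] ?? null,
        // REQUEST_TIME is the Unix timestamp at which the request started
        'downloaded' => date('Y-m-d H:i:s', $server['REQUEST_TIME'] ?? time()),
    ];
}
```

In download.php this would be called as download_context($_SERVER), and the returned values could be bound to additional columns in the log table.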

Good luck.


1
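votes

To make the code work with PHP 7.3, mysqli and the MariaDB engine, I made two changes to the original answer:

1. "mysqli_real_escape_string" did not work for me, so I used "basename($_GET['file'])" instead.

2. MariaDB SQL does not accept "SET" after "UPDATE", so I used "ON DUPLICATE KEY UPDATE stats = stats + 1".

Because of this, I posted a new question here: Download counter for direct download links using .htaccess, mysql and php

See the full code below...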

<?php

$conn = mysqli_connect('localhost', 'user_name', 'password', 'database');
if (!$conn) {
    die("Connection failed: " . mysqli_connect_error());
}

$baseDir = '/home/user/domains/mydomain.com/public_html/downloads';
$file = basename($_GET['file']); // this is what I used instead of mysqli_real_escape_string
$path = realpath($baseDir . '/' . $file);

$sql = 'INSERT INTO downloads VALUES ("' . $file . '", 1) ON DUPLICATE KEY UPDATE stats = stats + 1';

//***************************************************
// The following SQL line inserted a record with an empty filename value, although it updated the counter as expected.
// (mysqli_real_escape_string() expects the connection as its first argument, which is missing here.)
// $sql = 'INSERT INTO download  VALUES ('.mysqli_real_escape_string(basename($_GET['file'])).", 1)' ON DUPLICATE KEY UPDATE stats=stats+1";
//**********************************************************

// realpath() returns false when the file does not exist
if ($path !== false && dirname($path) == $baseDir) {
    if (!is_bot()) {
        mysqli_query($conn, $sql);
    }
    mysqli_close($conn);

    header("Cache-Control: public");
    header("Content-Description: File Transfer");
    header("Content-Disposition: attachment; filename=\"" . $file . "\"");
    header("Content-Length: " . filesize($path));
    header("Content-Type: application/force-download");
    header("Content-Transfer-Encoding: binary");
    // Discard any buffered output so it cannot corrupt the file
    if (ob_get_level() > 0) {
        ob_end_clean();
    }
    readfile($path);
}

function is_bot()
{

    $botlist = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi",
    "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory",
    "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot",
    "crawler", "www.galaxy.com", "Googlebot", "Scooter", "Slurp",
    "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz",
    "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot",
    "Mediapartners-Google", "Sogou web spider", "WebAlta Crawler","TweetmemeBot",
    "Butterfly","Twitturls","Me.dium","Twiceler");

    foreach($botlist as $bot)
    {
        // Some clients send no User-Agent header, so fall back to an empty string
        if(strpos($_SERVER['HTTP_USER_AGENT'] ?? '', $bot)!==false)
        return true;    // Is a bot
    }

    return false;
}


function alert($msg) {
    echo "<script type='text/javascript'>alert('$msg');</script>";
}

?>

1
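votes

I also made some improvements to the original code posted here. I added some extra statistics, such as country, region, latitude and longitude, and so on...

Here is the updated code: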

<?php

// Replace the placeholders with your own database credentials
$conn = mysqli_connect('localhost', 'db_user', 'db_password', 'db_name');
if (!$conn) {
    die("Connection failed: " . mysqli_connect_error());
}

// Only known keys map to real files
if (isset($_GET['f']) && $_GET['f'] == "workbook") {
    $file = "work_log_book_public.pdf";
} else {
    die("File not found.");
}

// No trailing slash here, otherwise the dirname() comparison below can never match
$baseDir = substr(__FILE__, 0, strpos(__FILE__, "downloads")) . "downloads";
$path = realpath($baseDir . '/' . $file);

$ipAddress = get_client_ip();

// Look up the visitor's location; allowed_classes=false stops unserialize() from creating objects
$response = file_get_contents("http://www.geoplugin.net/php.gp?ip=" . urlencode($ipAddress));
$geo = ($response !== false) ? unserialize($response, ['allowed_classes' => false]) : false;
if (!is_array($geo)) {
    $geo = []; // lookup failed: store empty values instead
}
$User_Ipaddress = $geo["geoplugin_request"] ?? $ipAddress;
$User_Country = $geo["geoplugin_countryName"] ?? '';
$User_City = $geo["geoplugin_city"] ?? '';
$User_Region = $geo["geoplugin_region"] ?? '';
$User_CurrencySymbol = $geo["geoplugin_currencySymbol"] ?? '';
$User_CurrencyCode = $geo["geoplugin_currencyCode"] ?? '';
$User_Latitude = $geo["geoplugin_latitude"] ?? '';
$User_Longitude = $geo["geoplugin_longitude"] ?? '';

// Prepared statement, so quotes in the remote data cannot break the query
$stmt = mysqli_prepare($conn, 'INSERT INTO downloads VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, 1) ON DUPLICATE KEY UPDATE stats = stats + 1');
mysqli_stmt_bind_param($stmt, 'sssssssss', $User_Longitude, $User_Latitude, $User_CurrencyCode, $User_CurrencySymbol, $User_Region, $User_City, $User_Country, $User_Ipaddress, $file);

if ($path !== false && dirname($path) == $baseDir) {
    if (!is_bot()) {
        mysqli_stmt_execute($stmt);
        mysqli_close($conn);

        header("Cache-Control: public");
        header("Content-Description: File Transfer");
        header("Content-Disposition: attachment; filename=\"" . $file . "\"");
        header("Content-Length: " . filesize($path));
        header("Content-Type: application/force-download");
        header("Content-Transfer-Encoding: binary");
        // Discard any buffered output so it cannot corrupt the file
        if (ob_get_level() > 0) {
            ob_end_clean();
        }
        readfile($path);
    }
} else {
    die("error on loading file");
}



function is_bot(){
    $botlist = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi",
    "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory",
    "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot",
    "crawler", "www.galaxy.com", "Googlebot", "Scooter", "Slurp",
    "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz",
    "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot",
    "Mediapartners-Google", "Sogou web spider", "WebAlta Crawler","TweetmemeBot",
    "Butterfly","Twitturls","Me.dium","Twiceler");

    foreach($botlist as $bot)
    {
        // Some clients send no User-Agent header, so fall back to an empty string
        if(strpos($_SERVER['HTTP_USER_AGENT'] ?? '', $bot)!==false)
        return true;    // Is a bot
    }

    return false;
}


function alert($msg) {
    echo "<script type='text/javascript'>alert('$msg');</script>";
}


function get_client_ip() {
    $ipaddress="";
    if (isset($_SERVER['HTTP_CLIENT_IP']))
        $ipaddress = $_SERVER['HTTP_CLIENT_IP'];
    else if(isset($_SERVER['HTTP_X_FORWARDED_FOR']))
        $ipaddress = $_SERVER['HTTP_X_FORWARDED_FOR'];
    else if(isset($_SERVER['HTTP_X_FORWARDED']))
        $ipaddress = $_SERVER['HTTP_X_FORWARDED'];
    else if(isset($_SERVER['HTTP_FORWARDED_FOR']))
        $ipaddress = $_SERVER['HTTP_FORWARDED_FOR'];
    else if(isset($_SERVER['HTTP_FORWARDED']))
        $ipaddress = $_SERVER['HTTP_FORWARDED'];
    else if(isset($_SERVER['REMOTE_ADDR']))
        $ipaddress = $_SERVER['REMOTE_ADDR'];
    else
        $ipaddress = 'UNKNOWN';

    return $ipaddress;
   
}


?>

0
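votes

After struggling with the code above for a while, I decided to post my working version as well; it could save everyone a lot of time!

First, when working with code like this, make sure caching is disabled. For a while I kept wondering why my changes had no effect and my old (bad) results kept coming back. In the end I had to go into my web host's admin panel, open Performance and Caching and disable Varnish, then go to the PHP settings and disable opcache. I also added a few lines at the top of the PHP file telling it not to cache. After that, troubleshooting went much better :)

MySQL: (my table is slightly different, so I'll post the structure as well)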

CREATE TABLE `downloads` (
  `id` int(11) NOT NULL,
  `filename` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `created` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `stats` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

ALTER TABLE `downloads`
  ADD PRIMARY KEY (`id`),
  ADD UNIQUE KEY `filename` (`filename`);

ALTER TABLE `downloads`
  MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
COMMIT;

download.php:

<?php
// Tell browsers and proxies not to cache this script's response
header("Expires: Sat, 01 Jan 2000 00:00:00 GMT");
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header("Cache-Control: no-store, no-cache, must-revalidate, max-age=0");
header("Cache-Control: post-check=0, pre-check=0", false);
header("Pragma: no-cache");
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

$baseDir = "/data/sites/web/www/WHEREEVERYOUHAVEYOURFILESLOCATED";
$file = basename($_GET['file']);
// realpath() returns false when the file does not exist
$path = realpath($baseDir . '/' . $file);

if ($path !== false && dirname($path) == $baseDir) {
    // Count the download only for existing files and non-bot visitors
    if (!is_bot()) {
        $mysqli = new mysqli("YOURDATABASESERVERNAME", "DBUSERNAME", "DBPASSWORD", "DATABASENAME");
        // stats has no default value in the table, so the first insert sets it to 1
        $stmt = $mysqli->prepare("INSERT INTO downloads (filename, stats) VALUES (?, 1) ON DUPLICATE KEY UPDATE stats = stats + 1");
        $stmt->bind_param("s", $file);
        $stmt->execute();
    }

    // "Cache-Control: public" is left out so it does not override the no-cache headers above
    header("Content-Description: File Transfer");
    header("Content-Disposition: attachment; filename=\"" . $file . "\"");
    header("Content-Length: " . filesize($path));
    header("Content-Type: application/force-download");
    header("Content-Transfer-Encoding: binary");
    // Discard any buffered output so it cannot corrupt the file
    if (ob_get_level() > 0) {
        ob_end_clean();
    }
    readfile($path);
}

function is_bot()
{

    $botlist = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi",
    "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory",
    "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot",
    "crawler", "www.galaxy.com", "Googlebot", "Scooter", "Slurp",
    "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz",
    "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot",
    "Mediapartners-Google", "Sogou web spider", "WebAlta Crawler","TweetmemeBot",
    "Butterfly","Twitturls","Me.dium","Twiceler");

    foreach($botlist as $bot)
    {
        // Some clients send no User-Agent header, so fall back to an empty string
        if(strpos($_SERVER['HTTP_USER_AGENT'] ?? '', $bot)!==false)
        return true;    // Is a bot
    }

    return false;
}


function alert($msg) {
    echo "<script type='text/javascript'>alert('$msg');</script>";
}
?>

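As you can see, this is a mix of all the solutions above, with a few lines added. I'm new to PHP, so I don't claim to know everything; I'm just trying to help as best I can. I hope this helps...

Usage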

https://www.yoursite.com/download.php?file=yourfilename.txt