How can I extract only the text from a PHP variable that contains HTML markup? [duplicate]

Question (votes: 0, answers: 1)

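I have HTML strings stored in a database, and I want to extract only the text content, with all HTML tags removed. What is the best way to do this in PHP?

This is what I have so far. It uses strip_tags(), but I want plain text only. How can I achieve that?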

<?php
    require_once 'includes/connection.inc.php';

    $fetch_data = "SELECT * FROM emails";
    $fetch_data_conn = mysqli_query($connection, $fetch_data);

    while ($row = mysqli_fetch_assoc($fetch_data_conn)) {
        $content = $row["content"];

        $textContent = strip_tags($content);
    
        echo $textContent;
    }
    ?>

Current output:

[Image: Output]

php domdocument text-extraction strip-tags
1 Answer

Votes: 1

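<?php
    require_once 'includes/connection.inc.php';

    // Remove every element with the given tag name from the document.
    // getElementsByTagName() returns a live list, so iterate backwards
    // to keep the remaining indexes valid while nodes are removed.
    function removeElementsByTagName($tagName, $document)
    {
        $nodeList = $document->getElementsByTagName($tagName);
        for ($nodeIdx = $nodeList->length; --$nodeIdx >= 0;) {
            $node = $nodeList->item($nodeIdx);
            $node->parentNode->removeChild($node);
        }
    }

    $fetch_data = "SELECT * FROM emails";
    $fetch_data_conn = mysqli_query($connection, $fetch_data);

    // Collect libxml parse warnings internally instead of printing them;
    // stored emails often contain markup that is not valid HTML.
    libxml_use_internal_errors(true);

    while ($row = mysqli_fetch_assoc($fetch_data_conn)) {
        $content = $row["content"];

        // loadHTML() rejects an empty string, so skip empty rows.
        if ($content === null || trim($content) === '') {
            continue;
        }

        $doc = new DOMDocument();
        $doc->loadHTML($content);
        libxml_clear_errors();

        // strip_tags() removes only the tags and keeps the text between them,
        // so first drop the elements whose contents should not be shown.
        removeElementsByTagName('script', $doc);
        removeElementsByTagName('style', $doc);
        removeElementsByTagName('link', $doc);

        $textContent = strip_tags($doc->saveHTML());

        echo $textContent;
        echo "<hr>";
    }
    ?>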
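
As a variation on the code above, the same idea can be wrapped in a reusable function that needs no database. This is only a sketch under a few assumptions: the name htmlToPlainText is hypothetical (it is not part of PHP or of the answer above), the input is assumed to be ASCII or in an encoding that libxml detects correctly, and runs of whitespace are collapsed into single spaces. It reads textContent instead of calling strip_tags() on saveHTML(), so entities such as &amp;amp; come back decoded.

```php
<?php
// Hypothetical helper, not from the original answer: HTML fragment in, plain text out.
function htmlToPlainText(string $html): string
{
    if (trim($html) === '') {
        return ''; // loadHTML() rejects an empty string
    }

    // Collect parse warnings internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->documentElement;
    if ($root === null) {
        return ''; // nothing parseable, e.g. the input was only a comment
    }

    // Remove elements whose contents are code or CSS rather than visible text.
    // The node list is copied to an array first so removal cannot disturb iteration.
    $xpath = new DOMXPath($doc);
    foreach (iterator_to_array($xpath->query('//script | //style')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // textContent concatenates all remaining text nodes, with entities decoded.
    // Assumption: collapsing whitespace to single spaces is acceptable here.
    return trim((string) preg_replace('/\s+/u', ' ', $root->textContent));
}

$html = '<style>p{color:red}</style><p>Hello <b>world</b> &amp; friends</p><script>alert(1)</script>';

// strip_tags() alone keeps the CSS and JavaScript source as text.
echo strip_tags('<style>p{color:red}</style><p>Hello</p>'), PHP_EOL; // p{color:red}Hello

// The DOM-based helper drops it.
echo htmlToPlainText($html), PHP_EOL; // Hello world & friends
```

Because the returned text is decoded, pass it through htmlspecialchars() before echoing it into an HTML page.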