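I have a Perl CGI script that finds a keyword in different types of text and generates sorted output.
The current output looks like this for the keyword "the".
But I want the keyword "the" to be highlighted in bold in this output.
How can I do that?
I tried using Term::ANSIColor, but it prints the whole sentence in bold rather than just the keyword.
My CGI script looks like this: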
#!/usr/bin/perl
# required pragmas and modules
use strict;
use warnings;
use diagnostics;
use CGI;
use File::Basename;
my $q = CGI->new;
print "Content-type: text/plain\n\n";
#initialize variables
my $target = $q->param( "keyword" );
my $radius = $q->param( "span" );
my $ordinal = $q->param( "ord" );
my $width = 2 * $radius;
#print standard output
print "****************************\n";
print "* SEARCH RESULTS *\n";
print "****************************\n";
print "Your Search word is: $target\n\n";
print "Your Radius is : $radius\n\n";
print "Your Ordinal is : $ordinal\n\n";
print "-----------------------------------------------------------\n\n";
#normal concordance for abstract text
my @files = glob( 'ABS/*.txt' );
for my $file ( @files ) {
my $file_name = basename( $file ); # name of the file currently being processed
# initialize
my $count = 0;
my @lines = ();
$/ = ""; # Paragraph read mode
# open the file, and process each line in it
open( FILE, '<', $file ) or die( "Can not open $file ($!).\n" );
while ( <FILE> ) {
# re-initialize
my $extract = '';
# normalize the data
chomp;
s/\n/ /g; # Replace new lines with spaces
s/\b--\b/ -- /g; # Add spaces around dashes
# process each item if the target is found
# capture the match so that $1 is defined; \Q...\E treats the keyword literally
while ( $_ =~ /\b(\Q$target\E)\b/gi ) {
# find start position
my $match = $1;
my $pos = pos;
my $start = $pos - $radius - length( $match );
# extract the snippets
if ( $start < 0 ) {
$extract = substr( $_, 0, $width + $start + length( $match ) );
$extract = ( " " x -$start ) . $extract;
}
else {
$extract = substr( $_, $start, $width + length( $match ) );
my $deficit = $width + length( $match ) - length( $extract );
if ( $deficit > 0 ) {
$extract .= ( " " x $deficit );
}
}
# add the extracted text to the list of lines, and increment
$lines[$count] = $extract;
++$count;
}
}
sub removePunctuation {
my $string = $_[0];
$string = lc( $string ); # Convert to lowercase
$string =~ s/[^-a-z ]//g; # Remove non-alphabetic characters
$string =~ s/--+/ /g; # Replace 2+ hyphens with a space
$string =~ s/-//g; # Remove hyphens
$string =~ s/\s+/ /g; # Collapse runs of whitespace into one space
return ( $string );
}
sub onLeft {
#USAGE: $word = onLeft($string, $radius, $ordinal);
my $left = substr( $_[0], 0, $_[1] );
$left = removePunctuation( $left );
my @word = split( /\s+/, $left );
return ( $word[ -$_[2] ] );
}
sub byLeftWords {
my $left_a = onLeft( $a, $radius, $ordinal );
my $left_b = onLeft( $b, $radius, $ordinal );
lc( $left_a ) cmp lc( $left_b );
}
# process each line in the list of lines
my $line_number = 0;
print "File name: $file_name \n\n";
foreach my $x ( sort byLeftWords @lines ) {
++$line_number;
printf "%5d", $line_number;
print " $x\n\n";
}
print "------------------------------------------------------------\n\n";
}
# done
exit;
Thanks.
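You can't control the character styling shown in a text/plain document. Term::ANSIColor certainly won't work, because a browser doesn't recognise ANSI terminal escape sequences; those only work in your console.
If you write your output as HTML instead, then you just need to put <strong>...</strong> around the words you want in bold.
Or, better still, you can write a CSS style like this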
.bold {
font-weight: bold;
}
and wrap the words in <span class="bold">...</span>.
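As a minimal sketch of the wrapping step, the substitution below puts a pair of tags around every whole-word, case-insensitive match of the keyword. The helper name wrap_keyword and the sample line are hypothetical, and the sketch assumes the text contains no characters that are special in HTML.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each whole-word, case-insensitive match of
# $keyword in the given opening and closing tags.
sub wrap_keyword {
    my ($text, $keyword, $open, $close) = @_;
    # \Q...\E makes the keyword match literally, even if it contains
    # regex metacharacters; $1 preserves the original capitalisation.
    $text =~ s/\b(\Q$keyword\E)\b/$open$1$close/gi;
    return $text;
}

my $line = 'over the hills and The valley';    # sample data, not from the script
print wrap_keyword($line, 'the', '<strong>', '</strong>'), "\n";
print wrap_keyword($line, 'the', '<span class="bold">', '</span>'), "\n";
```

The same helper serves both styles; only the tag pair passed in changes.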
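To convert a text document to HTML, the simplest way is to wrap it in a <pre>...</pre> ("preformatted") element and put that inside a standard HTML structure. The less-than <, greater-than > and ampersand & characters must be replaced with the corresponding entities &lt;, &gt; and &amp; respectively, and you can wrap the words you want in bold in <strong> tags.
A minimal HTML5 document that also keeps most modern browsers happy looks like this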
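The order of the two steps matters: escaping after adding the tags would turn the tags into visible text, and adding tags after escaping could match a keyword inside an entity such as &amp;. One possible approach, sketched here with the hypothetical helpers escape_html and highlight_keyword, is to split the text on the keyword and escape each piece separately.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the three characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first, or the other entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: escape a snippet and wrap whole-word matches in <strong>.
sub highlight_keyword {
    my ($text, $keyword) = @_;
    # split with a capturing group keeps the matched keywords in the list,
    # so odd-numbered pieces are matches and even-numbered pieces are context.
    my @pieces = split /\b(\Q$keyword\E)\b/i, $text;
    my $html = '';
    for my $i (0 .. $#pieces) {
        my $escaped = escape_html($pieces[$i]);
        $html .= $i % 2 ? "<strong>$escaped</strong>" : $escaped;
    }
    return $html;
}

# Sample data, not taken from the script above.
print highlight_keyword('The cat & the <hat>', 'the'), "\n";
# prints: <strong>The</strong> cat &amp; <strong>the</strong> &lt;hat&gt;
```

In the script from the question, a call like this would fit at the point where each sorted line is printed, so that sorting still works on the plain text.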
<!doctype html>
<html lang=en>
<head>
<meta charset=UTF-8>
<title>Search Results</title>
</head>
<body>
<pre>
Content of my report
with words to be in bold wrapped in &lt;strong&gt; tags
and characters `<`, `>`, and `&` replaced by
`&lt;`, `&gt;` and `&amp;` respectively
</pre>
</body>
</html>
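For a CGI script to send such a document, the Content-type header has to say text/html rather than text/plain. The following is a rough sketch using a here-document; the helper name html_page is hypothetical, and the body passed to it is assumed to be already escaped and highlighted.

```perl
use strict;
use warnings;

# Hypothetical helper: build the full CGI response (header plus document)
# for a report body that has already been escaped and highlighted.
sub html_page {
    my ($title, $body) = @_;
    return <<"END_HTML";
Content-type: text/html; charset=UTF-8

<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>$title</title>
<style>.bold { font-weight: bold; }</style>
</head>
<body>
<pre>
$body
</pre>
</body>
</html>
END_HTML
}

# Sample report line, not taken from real output.
print html_page('Search Results', '    1  over <strong>the</strong> hills');
```

The blank line after the header is required, because it separates the CGI headers from the document itself.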