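I have many links in .html and .txt files that I want to modify. I mainly use Kate as my text editor, which is why I tagged this question with kate. Here are some example links: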
<li>
<a href="http://sk1project.org/">
sK1
</a> is an open source vector graphics editor similar to CorelDRAW, Adobe Illustrator, or Freehand. First of all sK1 is oriented for PostScript processing. UniConvertor is a universal vector graphics translator. It uses sK1 engine to convert
one format to another. Development of the import/export modules for this program goes through different stages, quality and feature coverage are different among formats.
</li>
<li>
<a href="http://tango.freedesktop.org/Tango_Desktop_Project">
The Tango Desktop Project
</a> exists to help create a consistent graphical user interface experience for free and Open Source software. While the look and feel of an application is determined by many individual components, some organization is necessary in order to
unify the appearance and structure of individual icon sets used within those components. The Tango Desktop Project defines an icon style guideline to which artists and designers can adhere. A sample implementation of the style is available as an icon
theme based upon a standardized icon naming specification. In addition, the project provides transitional utilities to assist in creating icon themes for existing desktop environments, such as GNOME and KDE.
</li>
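I found "Regular expression to extract URL from an HTML link" on Stack Overflow, so I know how to use href=[\'"]?([^\'" >]+">)
to capture the text from href through ">. What I don't know is how to keep the text from href up to just before "> and then append the following text: ' rel="nofollow noopener noreferrer">'.
I would like the final result to look like this: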
<li>
<a href="http://sk1project.org/" rel="nofollow noopener noreferrer">
sK1
</a> is an open source vector graphics editor similar to CorelDRAW, Adobe Illustrator, or Freehand. First of all sK1 is oriented for PostScript processing. UniConvertor is a universal vector graphics translator. It uses sK1 engine to convert
one format to another. Development of the import/export modules for this program goes through different stages, quality and feature coverage are different among formats.
</li>
<li>
<a href="http://tango.freedesktop.org/Tango_Desktop_Project" rel="nofollow noopener noreferrer">
The Tango Desktop Project
</a> exists to help create a consistent graphical user interface experience for free and Open Source software. While the look and feel of an application is determined by many individual components, some organization is necessary in order to
unify the appearance and structure of individual icon sets used within those components. The Tango Desktop Project defines an icon style guideline to which artists and designers can adhere. A sample implementation of the style is available as an icon
theme based upon a standardized icon naming specification. In addition, the project provides transitional utilities to assist in creating icon themes for existing desktop environments, such as GNOME and KDE
</li>
How can I do this with a regular expression in Kate?
Thanks.
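Parsing HTML with regular expressions is not recommended, but since you are using the Kate editor, you can use this regex to capture the <a tag together with its href attribute,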
(<a\s+.*?href=(['"]?)\S*\2)
and replace it with,
\1 rel="nofollow noopener noreferrer"
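I have never used the Kate editor, so I am not sure whether \1 or $1 is the right backreference syntax there.
Let me know whether this works.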
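If you want to check the pattern outside the editor first, here is a minimal Python sketch of the same search and replace. It assumes Python's re module behaves closely enough to Kate's regex engine for this pattern, which I have not verified. The names LINK_RE, REL_ATTR and add_rel are made up for this illustration.

```python
import re

# Same pattern as above: group 1 spans from "<a" through the closing quote
# of the href value; group 2 is the quote character (it may be empty).
LINK_RE = re.compile(r"""(<a\s+.*?href=(['"]?)\S*\2)""")

# Hypothetical constant holding the attribute text to append.
REL_ATTR = ' rel="nofollow noopener noreferrer"'


def add_rel(html: str) -> str:
    """Append the rel attribute right after the href attribute of each <a tag."""
    # Python's re.sub uses \1 for the first captured group in the replacement.
    return LINK_RE.sub(r"\1" + REL_ATTR, html)


if __name__ == "__main__":
    sample = '<a href="http://sk1project.org/">\nsK1\n</a>'
    print(add_rel(sample))
```

Note that this is not idempotent: running it a second time on the same text would add the rel attribute again, so apply it only once per file (the same holds for the replace in the editor).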