We use this approach to find a single keyword:
Get-Content $SourceFile | Select-String -Pattern "search keyword value"
However, we need to extract 4 values: the embedded pound (£) amounts, which vary, and two literal substrings, as shown below:
# Sample input
$String =' in the case of a single acquisition the Total Purchase Price of which (less the amount
funded by Acceptable Funding Sources (Excluding Debt)) exceeds £5,000,000 (or its
equivalent) but is less than or equal to £10,000,000 or its equivalent, the Parent shall
supply to the Agent for the Lenders not later than the date a member of the Group
legally commits to make the relevant acquisition, a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;'
# Values to extract
$Value1 = ' in the case of a single acquisition the Total Purchase Price '
$Value2 = ' £5,000,000'
$Value3 = ' £10,000,000'
$Value4 = ' a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;'
# Define the regex patterns to search for individually, as elements of an array.
$patterns =
# A string literal; escape it, to be safe.
[regex]::Escape(' in the case of a single acquisition the Total Purchase Price '),
# A regex that matches a currency amount in pounds.
# (Literal ' £', followed by at least one ('+') non-whitespace char ('\S');
# this could be made more stringent by matching digits and commas only.)
' £\S+',
# A string literal that *needs* escaping due to use of '(' and ')'
# Note the use of a literal here-string (@'<newline>...<newline>'@)
[regex]::Escape(@'
a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;
'@)
# - Use Get-Content -Raw to read the file *as a whole*
# - Use Select-String -AllMatches to find *multiple* matches (per input string)
# - ($patterns -join '|') joins the individual regexes with an alternation (|)
# so that matches of any one of them are returned.
Get-Content -Raw $SourceFile | Select-String -AllMatches -Pattern ($patterns -join '|') |
ForEach-Object {
# Loop over the matches, each of which is a [Match] object whose .Value
# property holds the captured substring, and collect them in an *array*,
# $capturedSubstrings.
# Note: You could use `Set-Variable` to create individual variables $Value1, ...
# but it's usually easier to work with an array.
$capturedSubstrings = foreach ($match in $_.Matches) { $match.Value }
# Output the array elements in diagnostic form.
$capturedSubstrings | % { "[$_]" }
}
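The comment on the currency pattern mentions that it could be made stricter. Here is a minimal sketch of one way to do that, assuming the amounts use comma-grouped thousands with optional two-digit pence; the names $strictPound and $sample are illustrative and not part of the original script:

```powershell
# Illustrative only: a stricter alternative to ' £\S+'.
# Assumes amounts look like £5,000,000 or £1,250.50 (comma-grouped thousands,
# optional two-digit pence); other formats would need a different pattern.
$strictPound = ' £\d{1,3}(?:,\d{3})*(?:\.\d{2})?'

# Made-up sample in which the second amount is followed directly by a comma.
$sample = 'exceeds £5,000,000 (or its equivalent) but is less than or equal to £10,000,000, or its equivalent'

# [regex]::Matches returns every match in the input string.
$amounts = [regex]::Matches($sample, $strictPound) | ForEach-Object { $_.Value }

# Output the matches in diagnostic form.
$amounts | ForEach-Object { "[$_]" }
```

Unlike ' £\S+', this pattern would not swallow punctuation that directly follows an amount, such as the comma after the second amount in the sample.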
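Note that -Pattern normally accepts an array of values, so passing -Pattern $patterns directly should work (albeit with slightly different behavior), but as of PowerShell Core 6.1.0 it does not, due to a bug.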
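Caveat: This assumes that your script uses the same newline style (CRLF vs. LF-only) as $SourceFile; if the two differ, more work is needed, and the symptom would be that the last pattern (the multi-line one) fails to match.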
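If the newline styles might differ, one possible workaround is to build the multi-line pattern so that it accepts either style. The following is a sketch, not part of the original solution; ConvertTo-NewlineAgnosticRegex is a hypothetical helper name, and the sample text is a shortened excerpt:

```powershell
# Hypothetical helper: turn a multi-line literal into a regex that matches it
# whether the input uses CRLF or LF-only newlines.
function ConvertTo-NewlineAgnosticRegex {
  param([string] $Literal)
  # Split into lines, escape each line on its own, then rejoin with '\r?\n',
  # which matches an optional CR followed by LF.
  ($Literal -split '\r?\n' | ForEach-Object { [regex]::Escape($_) }) -join '\r?\n'
}

# A two-line literal that uses LF-only newlines.
$literal = "hold harmless letter) and a copy of the acquisition agreement under which the`nAcquisition Target is to be acquired;"
$pattern = ConvertTo-NewlineAgnosticRegex $literal

# The same text with CRLF newlines, to simulate a file saved on Windows.
$crlfText = $literal -replace "`n", "`r`n"

$lfOk   = $literal  -match $pattern   # LF-only input
$crlfOk = $crlfText -match $pattern   # CRLF input
"LF-only input matches: $lfOk; CRLF input matches: $crlfOk"
```

The result of the helper could take the place of the [regex]::Escape(@'...'@) element in $patterns.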
With a file containing the contents of $String above, this yields:
[ in the case of a single acquisition the Total Purchase Price ]
[ £5,000,000]
[ £10,000,000]
[a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;]
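If individual variables are preferred over an array, PowerShell's multiple assignment is a simpler option than Set-Variable. The sketch below is self-contained and runs against a shortened, made-up input string ($text) rather than the file, so it captures only three values:

```powershell
# Shortened sample input (an assumption; the real input comes from $SourceFile).
$text = ' in the case of a single acquisition the Total Purchase Price of which exceeds £5,000,000 but is less than or equal to £10,000,000'

$patterns =
  [regex]::Escape(' in the case of a single acquisition the Total Purchase Price '),
  ' £\S+'

# Collect all matched substrings in an array.
$capturedSubstrings =
  ($text | Select-String -AllMatches -Pattern ($patterns -join '|')).Matches |
    ForEach-Object { $_.Value }

# Multiple assignment: each variable receives one array element, in order.
$Value1, $Value2, $Value3 = $capturedSubstrings

"[$Value1]", "[$Value2]", "[$Value3]"
```

Be aware that if the array has fewer elements than there are variables, the surplus variables are set to $null, and if it has more, the last variable receives all remaining elements as an array.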