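I am writing a method that replaces text in a docx document using Apache POI. The document itself contains x "variables" marked with "<<>>". For example, the doc might look like this: "On << date >>, the person << Name >> did << Task >>." The gist of the method is that it takes an XWPFParagraph and a Map<String, String>: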
private void replaceTextInParagraph(XWPFParagraph paragraph, Map<String, String> variableValues) {
    List<XWPFRun> runs = paragraph.getRuns();
    StringBuilder paragraphText = new StringBuilder();
    // Step 1: Gather all text from the paragraph
    for (XWPFRun run : runs) {
        String text = run.getText(0);
        if (text != null) {
            paragraphText.append(text);
        }
    }
    // Step 2: Perform replacements on the complete text
    String replacedText = paragraphText.toString();
    for (Map.Entry<String, String> entry : variableValues.entrySet()) {
        replacedText = replacedText.replace("<<" + entry.getKey() + ">>", entry.getValue());
    }
    // Step 3: Clear all existing runs
    for (XWPFRun run : runs) {
        run.setText("", 0);
    }
    // Step 4: Add the replaced text back to the paragraph
    if (!replacedText.isEmpty()) {
        XWPFRun newRun = paragraph.createRun();
        newRun.setText(replacedText);
    }
}
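This is the latest iteration of the code, but it has a fundamental problem: it messes up the document's styling. It changes the font size (for example from 11 to 10, though not consistently across the file). Also, if a "<<>>" variable is bold, the value that replaces it must be bold as well. In practice the document mixes both cases: one paragraph is entirely bold, another is entirely non-bold, and yet in both paragraphs only some of the variables are bold.
These are the methods I use to walk through the document.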
public void replaceVariables(XWPFDocument document, Map<String, String> variableValues) {
    for (XWPFParagraph paragraph : document.getParagraphs()) {
        replaceTextInParagraph(paragraph, variableValues);
    }
    for (XWPFHeader header : document.getHeaderList()) {
        for (XWPFParagraph paragraph : header.getParagraphs()) {
            replaceTextInParagraph(paragraph, variableValues);
        }
        processTables(header.getTables(), variableValues);
    }
    for (XWPFFooter footer : document.getFooterList()) {
        for (XWPFParagraph paragraph : footer.getParagraphs()) {
            replaceTextInParagraph(paragraph, variableValues);
        }
        processTables(footer.getTables(), variableValues);
    }
    processTables(document.getTables(), variableValues);
}
private void processTables(List<XWPFTable> tables, Map<String, String> variableValues) {
    for (XWPFTable table : tables) {
        for (XWPFTableRow row : table.getRows()) {
            for (XWPFTableCell cell : row.getTableCells()) {
                for (XWPFParagraph paragraph : cell.getParagraphs()) {
                    replaceTextInParagraph(paragraph, variableValues);
                }
            }
        }
    }
}
I have also included two older versions of the code to show what I tried before. The last one shown is the closest I got to a working program.
private void replaceTextInParagraph(XWPFParagraph paragraph, Map<String, String> variableValues) {
    List<XWPFRun> runs = paragraph.getRuns();
    // List<XWPFRun> newRuns = new ArrayList<>();
    StringBuilder paragraphText = new StringBuilder();
    for (XWPFRun run : runs) {
        // XWPFRun newRun = run;
        // newRun.setText("", 0);
        // newRuns.add(newRun);
        String text = run.getText(0);
        if (text != null) {
            paragraphText.append(text);
        }
    }
    String replacedText = paragraphText.toString();
    for (Map.Entry<String, String> entry : variableValues.entrySet()) {
        replacedText = replacedText.replace("<<" + entry.getKey() + ">>", entry.getValue());
    }
    // while (!paragraph.getRuns().isEmpty()) {
    //     paragraph.removeRun(0);
    // }
    String[] textParts = replacedText.split(" ");
    for (int i = 0; i < runs.size(); i++) {
        int j = 0;
        runs.get(i).setText(textParts[j], 0);
        paragraph.addRun(runs.get(i));
    }
    //
    // XWPFRun newRun = paragraph.createRun();
    // newRun.setText(replacedText);
}
private void replaceTextInParagraph(XWPFParagraph paragraph, Map<String, String> variableValues) {
    StringBuilder paragraphText = new StringBuilder();
    List<XWPFRun> runs = paragraph.getRuns();
    // First, get the entire text of the paragraph
    for (XWPFRun run : runs) {
        String text = run.getText(0);
        if (text != null) {
            paragraphText.append(text);
        }
    }
    // Perform the replacement on the entire paragraph text
    String replacedText = paragraphText.toString();
    for (Map.Entry<String, String> entry : variableValues.entrySet()) {
        replacedText = replacedText.replace("<<" + entry.getKey() + ">>", entry.getValue());
    }
    // Split the replaced text back into runs
    int currentIndex = 0;
    for (XWPFRun run : runs) {
        String text = run.getText(0);
        if (text != null) {
            int remainingLength = replacedText.length() - currentIndex; // Calculate remaining length in the replaced text
            String subText;
            if (text.length() <= remainingLength) {
                // Safe to use the entire length of the run
                subText = replacedText.substring(currentIndex, currentIndex + text.length());
            } else {
                // Limit the length to the remaining text length
                subText = replacedText.substring(currentIndex);
            }
            run.setText(subText, 0); // Update text in each run
            currentIndex += subText.length();
        }
    }
}
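I cannot stress this enough: the document's styling must stay the same. If you remove all the runs and create new ones, you lose all the images in the document. If you write the new text back into the old runs, the text is longer than before and gets cut off, for example at the end of a name or date. So please, I need help. I don't know how to solve this and have tried everything I can think of.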
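To make the cut-off concrete, here is a minimal sketch of what the last attempt does, using plain strings in place of runs so it can run without POI. The class and method names (`TruncationDemo`, `redistributeByOriginalLength`) are made up for this illustration. Each run receives only as many characters as it held before the replacement, so a value longer than its placeholder pushes the tail of the text out of the final run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the run loop: strings play the role of run texts.
final class TruncationDemo {
    /** Writes replacedText back using each run's ORIGINAL length, as in the attempt above. */
    static List<String> redistributeByOriginalLength(List<String> runTexts, String replacedText) {
        List<String> result = new ArrayList<>();
        int currentIndex = 0;
        for (String text : runTexts) {
            int remaining = replacedText.length() - currentIndex;
            // Never take more than the run originally held, nor more than is left
            int take = Math.min(text.length(), remaining);
            result.add(replacedText.substring(currentIndex, currentIndex + take));
            currentIndex += take;
        }
        return result; // anything past the combined original length is silently dropped
    }
}
```

With runs "<<Name>>" and " did it." and the replaced text "Alexander did it.", the two runs end up as "Alexande" and "r did it": the final period is lost and the value straddles two runs with possibly different formatting.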
P.S. I know I could switch to bookmarks, but before doing that I want to see whether this can work with plain text.
The following logic unnecessarily iterates over the text once for every variable:
String replacedText = paragraphText.toString();
for (Map.Entry<String, String> entry : variableValues.entrySet()) {
    replacedText = replacedText.replace("<<" + entry.getKey() + ">>", entry.getValue());
}
You can do this in a single pass using a regular expression:
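// Requires java.util.regex.Matcher and java.util.regex.Pattern
Pattern pattern = Pattern.compile("<<([^>]+)>>");
StringBuffer buffer = new StringBuffer();
Matcher matcher = pattern.matcher(paragraphText.toString());
while (matcher.find()) {
    // trim() tolerates placeholders written with spaces, e.g. "<< date >>"
    String key = matcher.group(1).trim();
    // Keep unknown placeholders as they are instead of passing null
    String replacement = variableValues.getOrDefault(key, matcher.group());
    // quoteReplacement stops '$' and '\' in a value being read as group references
    matcher.appendReplacement(buffer, Matcher.quoteReplacement(replacement));
}
matcher.appendTail(buffer);
String replacedText = buffer.toString();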
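The single-pass replacement still leaves open how to get the text back into the runs without disturbing the styling. One possible approach, sketched below on plain strings rather than real `XWPFRun` objects, is to remember where each run starts in the joined text, put each value into the run where its placeholder begins, and hand every other character back to the run it came from. `RunTextReplacer` and `replaceAcrossRuns` are hypothetical names, not POI API, and the sketch assumes a placeholder should take the formatting of the run holding its opening "<<".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Apache POI: strings stand in for run texts.
final class RunTextReplacer {
    // "<<", optional spaces, the key, optional spaces, ">>"
    private static final Pattern PLACEHOLDER = Pattern.compile("<<\\s*([^<>]+?)\\s*>>");

    /** Returns one new text per input run; unknown placeholders are left as they are. */
    static List<String> replaceAcrossRuns(List<String> runTexts, Map<String, String> values) {
        int n = runTexts.size();
        StringBuilder joined = new StringBuilder();
        int[] end = new int[n]; // end[i] = offset just past run i in the joined text
        List<StringBuilder> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            joined.append(runTexts.get(i));
            end[i] = joined.length();
            out.add(new StringBuilder());
        }

        Matcher m = PLACEHOLDER.matcher(joined);
        int pos = 0; // first offset not yet copied to the output
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                continue; // unknown key: copied later as literal text
            }
            copyLiteral(joined, pos, m.start(), end, out);
            // The whole value goes into the run where the placeholder starts
            out.get(runAt(m.start(), end)).append(value);
            pos = m.end();
        }
        copyLiteral(joined, pos, joined.length(), end, out);

        List<String> result = new ArrayList<>();
        for (StringBuilder sb : out) {
            result.add(sb.toString());
        }
        return result;
    }

    // Index of the run that contains the given offset of the joined text.
    private static int runAt(int offset, int[] end) {
        int i = 0;
        while (offset >= end[i]) {
            i++;
        }
        return i;
    }

    // Copies joined[from, to) back into the runs the characters came from.
    private static void copyLiteral(CharSequence joined, int from, int to,
                                    int[] end, List<StringBuilder> out) {
        for (int k = from; k < to; k++) {
            out.get(runAt(k, end)).append(joined.charAt(k));
        }
    }
}
```

In the POI method, the input list would presumably be built from `run.getText(0)` for each run whose text is not null, and each result written back with `run.setText(newText, 0)`. Runs are never removed or created, so runs that carry only an image are not touched. This has not been tested against a real document, and `getText(0)` reads only the first text node of a run, so runs with several text nodes would need extra handling.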