Easiest way to compare two Excel files in Java?

Question

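I'm writing a JUnit test for some code that produces an Excel file (which is binary). I have another Excel file that contains my expected output. What's the easiest way to compare the actual file to the expected file?

Sure, I could write the code myself, but I was wondering whether there's an existing method in a trusted third-party library (e.g. Spring or Apache Commons) that already does this.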

Answer 1

You might consider using my project simple-excel, which provides a bunch of Hamcrest Matchers to do the job.

When you do something like the following,

assertThat(actual, WorkbookMatcher.sameWorkbook(expected));

you'd see, for example,

java.lang.AssertionError:
Expected: entire workbook to be equal
     but: cell at "C14" contained <"bananas"> expected <nothing>,
          cell at "C15" contained <"1,850,000 EUR"> expected <"1,850,000.00 EUR">,
          cell at "D16" contained <nothing> expected <"Tue Sep 04 06:30:00">
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)

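That way, you can run it from your automated tests and get meaningful feedback while you're developing.

You can read more about it in this article on my website.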

Answer 2

Here's what I ended up doing (with the heavy lifting being done by DBUnit):

/**
 * Compares the data in the two Excel files represented by the given input
 * streams, closing them on completion
 * 
 * @param expected can't be <code>null</code>
 * @param actual can't be <code>null</code>
 * @throws Exception
 */
private void compareExcelFiles(InputStream expected, InputStream actual)
  throws Exception
{
  try {
    Assertion.assertEquals(new XlsDataSet(expected), new XlsDataSet(actual));
  }
  finally {
    IOUtils.closeQuietly(expected);
    IOUtils.closeQuietly(actual);
  }
}

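This compares the data in the two files, so there is no risk of spurious failures caused by irrelevant metadata that might differ. Hope this helps someone.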

Answer 3

A simple file comparison can easily be done using a checksum (such as MD5) or by just reading both files.
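A minimal sketch of the checksum idea, using only the JDK. The class and method names (`ChecksumCompare`, `md5Hex`, `sameChecksum`) are made up for illustration and do not come from any library. It reads each file fully into memory, which should be fine for test fixtures.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for checksum-based file comparison. */
final class ChecksumCompare {

    /** Returns the MD5 digest of the given bytes as a lowercase hex string. */
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
    }

    /** True when both files produce the same MD5 digest. */
    static boolean sameChecksum(Path expected, Path actual) throws IOException {
        return md5Hex(Files.readAllBytes(expected))
                .equals(md5Hex(Files.readAllBytes(actual)));
    }
}
```

In a test this could be called as `assertTrue(ChecksumCompare.sameChecksum(expectedPath, actualPath))`, where both paths are placeholders for your own files.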

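However, as James Burgess pointed out, Excel files contain a lot of metadata, so the files will probably never be byte-for-byte identical. You therefore need a different kind of comparison for your test.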

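I'd recommend somehow generating a "canonical" form from the Excel file, i.e. reading the generated Excel file and converting it to a simpler format (CSV or something similar) that retains only the information you want to check. You can then compare this canonical form with your expected result (also in canonical form, of course).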
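As a rough sketch of the canonical-form step: assume the cell values have already been read into rows of strings (for example with Apache POI, which is not shown here). The hypothetical `CanonicalForm` class below turns those rows into a CSV string that can be compared with `assertEquals`. Treating `null` cells as empty and the quoting rules used are assumptions, not part of any library.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that renders rows of cell text as a canonical CSV string. */
final class CanonicalForm {

    /** Quotes a field when it contains a comma, a double quote, or a line break. */
    static String escape(String value) {
        String v = (value == null) ? "" : value; // assumption: null means empty cell
        if (v.contains(",") || v.contains("\"") || v.contains("\n") || v.contains("\r")) {
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }

    /** Joins cells with commas and rows with '\n', so equal data yields equal text. */
    static String toCsv(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream()
                        .map(CanonicalForm::escape)
                        .collect(Collectors.joining(",")))
                .collect(Collectors.joining("\n"));
    }
}
```

The test would then compare `CanonicalForm.toCsv(expectedRows)` with `CanonicalForm.toCsv(actualRows)`; a string mismatch also gives a readable diff in most test runners.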

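Apache POI might be useful for reading the file.

BTW: reading a whole file to check its correctness would generally not be considered a unit test. That's an integration test...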

Answer 4

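I needed to do something similar and was already using the Apache POI library in my project to create Excel files. So I opted to use the included ExcelExtractor interface to export both workbooks as text strings and assert that the strings were equal. There are implementations for both HSSF (.xls) and XSSF (.xlsx).

Dump to a string: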

XSSFWorkbook xssfWorkbookA = ...;
String workbookA = new XSSFExcelExtractor(xssfWorkbookA).getText();

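ExcelExtractor has some options for what should be included in the string dump. I found its defaults useful: it includes the sheet names as well as the text content of the cells.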

Answer 5

The easiest way I found is to use Tika. I use it like this:

private void compareXlsx(File expected, File result) throws IOException, TikaException {
     Tika tika = new Tika();
     String expectedText = tika.parseToString(expected);
     String resultText = tika.parseToString(result);
     assertEquals(expectedText, resultText);
}


<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>1.13</version>
    <scope>test</scope>
</dependency>

Answer 6

You could use javaxdelta to check whether the two files are the same. It's available from here:

http://javaxdelta.sourceforge.net/

Answer 7

Just found out there's something in commons-io's FileUtils. Thanks for the other answers.

Answer 8

Please take a look at this site for comparing binary files: http://www.velocityreviews.com/forums/t123770-re-java-code-for-determining-binary-file-equality.html

Tiger

Answer 9

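You could use Beyond Compare 3, which can be started from the command line and supports different ways of comparing Excel files, including:

Comparing Excel sheets as database tables
Checking all textual content
Checking textual content with some formatting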

Answer 10

This tests only the contents of the first sheet, in Kotlin (it can easily be converted to Java).

private fun checkEqualityExcelDocs(doc : XSSFWorkbook, doc1 : XSSFWorkbook) : Boolean{
        val mapOfCellDoc = doc.toList().first().toList().flatMap { row -> row.map { Pair(PivotExcelCreator.IndexInThePivotTable(it.rowIndex,it.columnIndex),it.stringCellValue) }}.toMap()
        val mapOfCellDoc1 = doc1.toList().first().toList().flatMap { row -> row.map { Pair(PivotExcelCreator.IndexInThePivotTable(it.rowIndex,it.columnIndex),it.stringCellValue) }}.toMap()
        if(mapOfCellDoc.size == mapOfCellDoc1.size){
            return mapOfCellDoc.entries.all { mapOfCellDoc1.containsKey(it.key) && mapOfCellDoc[it.key] == mapOfCellDoc1[it.key]}
        }
        return false
    }

data class IndexInThePivotTable(val row: Int, val col: Int)

And add an assertion in your code:

    assertTrue(checkEqualityExcelDocs(expected, actual), "Docs aren't equal!")

As you can see,

doc.toList().first()

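takes only the first sheet of the document; if you need to compare each sheet separately, change the code slightly.

It would also be a good idea to ignore cells that contain only the empty string "". I didn't need that, but it is easy to add if you do.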
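One possible way to ignore empty cells, sketched in Java rather than Kotlin: drop empty values from both cell maps before comparing them. The `CellMapCompare` class and the use of string keys such as "A1" are assumptions made for this illustration; the Kotlin code above uses its own `IndexInThePivotTable` key type.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that compares two cell maps while ignoring empty cells. */
final class CellMapCompare {

    /** Returns a copy of the map without entries whose value is null or "". */
    static Map<String, String> withoutEmpty(Map<String, String> cells) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : cells.entrySet()) {
            String value = entry.getValue();
            if (value != null && !value.isEmpty()) {
                result.put(entry.getKey(), value);
            }
        }
        return result;
    }

    /** True when both maps hold the same non-empty cells with the same values. */
    static boolean sameContent(Map<String, String> expected, Map<String, String> actual) {
        return withoutEmpty(expected).equals(withoutEmpty(actual));
    }
}
```

With this, a sheet that has an explicit empty cell at B1 and a sheet that has no B1 cell at all count as equal.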

This may also be useful:

//first doc I've got from outputstream such way
val out = ByteArrayOutputStream()
//some method which writes excel to outputstream
val firstDoc = XSSFWorkbook(ByteArrayInputStream(out.toByteArray()))

and the second document, to compare against, comes from a file:

val secondDoc = XSSFWorkbook(Test::class.java.getClassLoader().getResource("yourfile.xlsx").path)

Answer 11

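The best approach for me after long research: use Apache POI as an external library, with XSSF for xlsx files. If you want to compare xls files, you must use HSSF instead.

Good luck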

import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public static void compareTwoExcellFiles04(String file01, String file02) throws IOException {
    // try-with-resources closes the streams and workbooks even if reading fails
    try (FileInputStream fis1 = new FileInputStream(file01);
         FileInputStream fis2 = new FileInputStream(file02);
         XSSFWorkbook wb1 = new XSSFWorkbook(fis1);
         XSSFWorkbook wb2 = new XSSFWorkbook(fis2)) {
        // Only the first sheet of each workbook is compared
        List<String> values1 = readCellTexts(wb1.getSheetAt(0));
        List<String> values2 = readCellTexts(wb2.getSheetAt(0));
        if (values1.equals(values2)) {
            System.out.println("the files are equal");
        } else {
            System.out.println("NOT EQUAL");
        }
    }
}

// Collects the text of every defined cell, row by row
private static List<String> readCellTexts(Sheet sheet) {
    // DataFormatter renders any cell type as text, whereas
    // getStringCellValue() throws IllegalStateException on numeric cells
    DataFormatter formatter = new DataFormatter();
    List<String> values = new ArrayList<>();
    for (Row row : sheet) {
        for (Cell cell : row) {
            values.add(formatter.formatCellValue(cell));
        }
    }
    return values;
}

Answer 12

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class ExcelComparatorNew {
    public static void main(String[] args) throws IOException {
        FileInputStream file1 = new FileInputStream("C:\\Users\\xxxx\\Downloads\\xlsx1.xlsx");
        FileInputStream file2 = new FileInputStream("C:\\Users\\xxxx\\Downloads\\xlsx2.xlsx");
        Workbook workbook1 = WorkbookFactory.create(file1);
        Workbook workbook2 = WorkbookFactory.create(file2);

        Sheet sheet1 = workbook1.getSheetAt(0);
        Sheet sheet2 = workbook2.getSheetAt(0);

        Workbook diffWorkbook = new XSSFWorkbook();
        Sheet diffSheet = diffWorkbook.createSheet("Differences");

        int columnCount1 = sheet1.getRow(0).getLastCellNum();
        int columnCount2 = sheet2.getRow(0).getLastCellNum();
        if (columnCount1 != columnCount2) {
            System.out.println("Column counts are different");
            return;
        }

        Row headerRow1 = sheet1.getRow(0);
        Row headerRow2 = sheet2.getRow(0);
        for (int i = 0; i < columnCount1; i++) {
            String columnName1 = getCellValueAsString(headerRow1.getCell(i));
            String columnName2 = getCellValueAsString(headerRow2.getCell(i));
            if (!columnName1.equals(columnName2)) {
                System.out.println("Column names are different");
                return;
            }
        }

        int diffRowCount = 0;

        // Compare rows from file1 with every row in file2
        for (int i = 1; i <= sheet1.getLastRowNum(); i++) {
            Row row1 = sheet1.getRow(i);
            boolean rowMatched = false;

            // Compare row1 with every row in file2
            for (int j = 1; j <= sheet2.getLastRowNum(); j++) {
                Row row2 = sheet2.getRow(j);
                if (compareRows(row1, row2, columnCount1)) {
                    rowMatched = true;
                    break;
                }
            }

            if (!rowMatched) {
                // Row from file1 not found in file2, write to differences sheet
                Row diffRow = diffSheet.createRow(diffRowCount++);
                diffRow.createCell(0).setCellValue("Row from file1 not found in file2");
                for (int k = 0; k < columnCount1; k++) {
                    Cell cell1 = row1.getCell(k);
                    diffRow.createCell(k + 1).setCellValue(cellToString(cell1));
                }
            }
        }

        // Write differences to a new Excel file
        FileOutputStream diffFileOut = new FileOutputStream("C:\\Users\\XXXX\\Downloads\\differences.xlsx");
        diffWorkbook.write(diffFileOut);

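        // Close the output stream, all three workbooks, and the input streams
        diffFileOut.close();
        diffWorkbook.close();
        workbook1.close();
        workbook2.close();
        file1.close();
        file2.close();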
    }

    private static boolean compareRows(Row row1, Row row2, int columnCount) {
        if (row1 == null || row2 == null) {
            return false;
        }
        for (int i = 0; i < columnCount; i++) {
            Cell cell1 = row1.getCell(i);
            Cell cell2 = row2.getCell(i);
            if (!compareCells(cell1, cell2)) {
                return false;
            }
        }
        return true;
    }

    private static boolean compareCells(Cell cell1, Cell cell2) {
        if (cell1 == null && cell2 == null) {
            return true;
        }
        if (cell1 == null || cell2 == null) {
            return false;
        }
        if (cell1.getCellType() == CellType.BLANK && cell2.getCellType() == CellType.BLANK) {
            return true;
        }
        if (cell1.getCellType() != cell2.getCellType()) {
            return false;
        }
        switch (cell1.getCellType()) {
            case NUMERIC:
                return Double.compare(cell1.getNumericCellValue(), cell2.getNumericCellValue()) == 0;
            case BOOLEAN:
                return cell1.getBooleanCellValue() == cell2.getBooleanCellValue();
            case STRING:
                return cell1.getStringCellValue().equals(cell2.getStringCellValue());
            case FORMULA:
                return cell1.getCellFormula().equals(cell2.getCellFormula());
            case BLANK:
                return true; // Both are BLANK
            default:
                return false;
        }
    }

    private static String getCellValueAsString(Cell cell) {
        if (cell == null) {
            return "";
        }
        switch (cell.getCellType()) {
            case NUMERIC:
                if (DateUtil.isCellDateFormatted(cell)) {
                    return cell.getDateCellValue().toString();
                } else {
                    return String.valueOf(cell.getNumericCellValue());
                }
            case BOOLEAN:
                return String.valueOf(cell.getBooleanCellValue());
            case STRING:
                return cell.getStringCellValue();
            case FORMULA:
                return cell.getCellFormula();
            case BLANK:
                return "";
            default:
                return "";
        }
    }

    private static String cellToString(Cell cell) {
        if (cell == null) {
            return "";
        }
        switch (cell.getCellType()) {
            case NUMERIC:
                return String.valueOf(cell.getNumericCellValue());
            case BOOLEAN:
                return String.valueOf(cell.getBooleanCellValue());
            case STRING:
                return cell.getStringCellValue();
            case FORMULA:
                return cell.getCellFormula();
            case BLANK:
                return "";
            default:
                return "";
        }
    }
}

Answer 13

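Maybe... compare the MD5 digests of the two files? I'm sure there are plenty of ways to do it. You could also just open both files and compare them byte by byte.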
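A byte-by-byte comparison could look roughly like the sketch below. It uses only the JDK, and `ByteCompare.contentEquals` is a made-up name for this example. The caller is assumed to open and close the two streams.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper for comparing two streams byte by byte. */
final class ByteCompare {

    /** True when both streams yield exactly the same bytes and end together. */
    static boolean contentEquals(InputStream first, InputStream second) throws IOException {
        // Buffering avoids one system call per byte when reading from files
        InputStream a = new BufferedInputStream(first);
        InputStream b = new BufferedInputStream(second);
        int x;
        do {
            x = a.read();
            if (x != b.read()) {
                return false; // differing byte, or one stream ended early
            }
        } while (x != -1);
        return true; // both streams reached the end at the same position
    }
}
```

For files, the streams could come from `Files.newInputStream(path)` inside a try-with-resources block.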

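EDIT: James explained how the XLS format may differ in its metadata. Perhaps you should use the same interface you used to generate the xls files to open them and compare the values cell by cell?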
