Apache Avro does not write BigDecimal to a Parquet file. Error: java.math.BigDecimal cannot be cast to java.nio.ByteBuffer


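I need to create a Parquet file with the apache-avro library on Java 8. The POJO is generated automatically from an ".avsc" file by Maven's generate-sources phase, but I am having trouble with the BigDecimal field in the file. I have studied the apache-avro library documentation (apache-avro) and was able to generate the POJO with the field types I need, but I get an exception at the write stage. I have seen similar questions asked before, but none of their solutions fixed my problem.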

Here is the code I am working on (github code):

Employee_schema.avsc

{"type": "record",
  "namespace": "com.avro.example",
  "name": "Employee",
  "fields": [
    {"name": "name","type": "string"},
    {"name": "email","type": "string"},
    {"name": "salary",
      "type": { "type": "bytes",
                "logicalType": "decimal",
                "precision": 4,
                "scale": 2
      }}]
}

Main.class

public static void main(String[] args) throws IOException {

    File outputParquet = new File("./output.parquet");
    Files.deleteIfExists(outputParquet.toPath());
    Employee employee = new Employee("john", "[email protected]",BigDecimal.TEN);
    ParquetWriter<Employee> writer = AvroParquetWriter.<Employee>builder(new Path(outputParquet.getAbsolutePath()))
            .withCompressionCodec(CompressionCodecName.SNAPPY)
            .withSchema(employee.getSchema())
            .build();
    try {
        writer.write(employee);
    } catch (Exception e) {
        e.printStackTrace();
    }
    writer.close();
}

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.avro.example</groupId>
    <artifactId>avro-example</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.parquet</groupId>
            <artifactId>parquet-avro</artifactId>
            <version>1.11.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.avro</groupId>
            <artifactId>avro</artifactId>
            <version>1.11.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>1.2.1</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.avro</groupId>
                <artifactId>avro-maven-plugin</artifactId>
                <version>1.11.1</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>schema</goal>
                        </goals>
                        <phase>generate-sources</phase>
                        <configuration>
                            <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                            <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
                            <enableDecimalLogicalType>true</enableDecimalLogicalType>
                            <fieldVisibility>private</fieldVisibility>
                            <stringType>String</stringType>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
java parquet avro classcastexception
1 Answer

I don't know the technical reason, but you have to configure a decimal conversion.

First, create a data model:

GenericData dataModel = new SpecificData();

Then add the decimal conversion:

dataModel.addLogicalTypeConversion(new Conversions.DecimalConversion());

In the AvroParquetWriter builder, you can pass the new dataModel with the withDataModel method.

Result:

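import org.apache.avro.Conversions;
import org.apache.avro.generic.GenericData;
import org.apache.avro.specific.SpecificData;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

GenericData dataModel = new SpecificData();
dataModel.addLogicalTypeConversion(new Conversions.DecimalConversion());
ParquetWriter<Employee> writer = AvroParquetWriter.<Employee>builder(output)
    .withSchema(employee.getSchema())
    .withDataModel(dataModel)
    .build();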
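
As for why a conversion is needed, here is a likely explanation rather than a confirmed one. The schema declares salary as Avro bytes with the decimal logical type, and the Avro specification stores such a value as the big-endian two's-complement bytes of the unscaled integer. The generated class holds a BigDecimal, so without a registered conversion the writer probably treats the field as a plain ByteBuffer, which would match the ClassCastException in the title. The sketch below uses only the JDK to show that byte layout. The class and method names are made up for illustration and are not part of Avro.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.nio.ByteBuffer;

// Hypothetical helper (not Avro API): illustrates the byte layout that the
// Avro spec describes for the "decimal" logical type on a bytes field.
public class DecimalBytesSketch {

    // Encode: bring the value to the schema's scale, then store the unscaled
    // integer as big-endian two's-complement bytes.
    public static ByteBuffer toBytes(BigDecimal value, int scale) {
        // UNNECESSARY throws ArithmeticException if digits would be lost.
        BigDecimal scaled = value.setScale(scale, RoundingMode.UNNECESSARY);
        return ByteBuffer.wrap(scaled.unscaledValue().toByteArray());
    }

    // Decode: read the remaining bytes as an integer and re-apply the scale.
    public static BigDecimal fromBytes(ByteBuffer buffer, int scale) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.duplicate().get(bytes); // duplicate() leaves the caller's position untouched
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        int scale = 2; // matches "scale": 2 in Employee_schema.avsc
        ByteBuffer encoded = toBytes(BigDecimal.TEN, scale);
        System.out.println(encoded.remaining() + " bytes"); // 2 bytes (1000 = 0x03E8)
        System.out.println(fromBytes(encoded, scale));      // 10.00
    }
}
```

The scale passed here mirrors the schema, so a value such as BigDecimal.TEN (scale 0) is written as 10.00.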

Other logical types need their own conversion. For example, to write java.time.LocalDateTime values, add this one:

dataModel.addLogicalTypeConversion(new TimeConversions.LocalTimestampMillisConversion());
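
For context, the Avro specification describes local-timestamp-millis as a long counting milliseconds from 1 January 1970 00:00:00.000, with no time zone attached. The JDK-only sketch below shows that mapping, assuming the field in the schema is a long annotated with that logical type. The class and method names are hypothetical and are not part of Avro. Using ZoneOffset.UTC here is only a way to do the arithmetic, not a claim about the zone of the stored value.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper (not Avro API): maps LocalDateTime to and from a
// millisecond count measured from 1970-01-01T00:00:00.000.
public class LocalTimestampMillisSketch {

    // Encode: interpret the local date-time at offset zero and take epoch millis.
    public static long toMillis(LocalDateTime value) {
        return value.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Decode: rebuild the local date-time from the millisecond count.
    public static LocalDateTime fromMillis(long millis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(1970, 1, 2, 0, 0, 0);
        long millis = toMillis(ts);
        System.out.println(millis);             // 86400000
        System.out.println(fromMillis(millis)); // 1970-01-02T00:00
    }
}
```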