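I have a Hadoop cluster with one master server and three slave servers, and I want to add Apache Impala to it. I downloaded the tarball from here and want to build Impala, but I am not sure what the prerequisites are. There are two different sources. One is the README.md file in the apache-impala directory that is created after extracting the tarball. Quoting it:
Impala can be built with pre-built components downloaded from S3, or with an in-place toolchain located in the thirdparty directory (not recommended). The components needed to build Impala are Apache Hadoop, Hive, HBase, and Sentry.
I am confused by these two sources. What should I do? A definitive set of dependencies for Apache Impala would be great!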
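If you read the Impala Requirements carefully, you will see that Hadoop support is implied, while the Sentry requirement is tucked away in the Impala Security link near the bottom of the page.
Under the Java Dependencies section, it says:
All Java dependencies are packaged in the impala-dependencies.jar file, which is located at /usr/lib/impala/lib/. These map to everything that is built under fe/target/dependency.
If you look at the corresponding pom.xml, you will see all of the dependencies. Grepping for artifactId shows the following: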
$ grep artifactId fe/pom.xml
<artifactId>impala-parent</artifactId>
<artifactId>impala-frontend</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>impala-data-source-api</artifactId>
<artifactId>hadoop-hdfs</artifactId>
<artifactId>hadoop-common</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>hadoop-auth</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>hadoop-aws</artifactId>
<artifactId>hadoop-azure-datalake</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>sentry-core-common</artifactId>
<artifactId>yarn-extras</artifactId>
<artifactId>sentry-core-model-db</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>sentry-provider-common</artifactId>
<artifactId>sentry-provider-db</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>sentry-provider-file</artifactId>
<artifactId>sentry-provider-cache</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>sentry-policy-common</artifactId>
<artifactId>sentry-binding-hive</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>sentry-policy-engine</artifactId>
<artifactId>sentry-service-api</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>parquet-hadoop-bundle</artifactId>
<artifactId>hbase-client</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>hbase-common</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>hbase-protocol</artifactId>
<artifactId>commons-lang</artifactId>
<artifactId>java-cup</artifactId>
<artifactId>libthrift</artifactId>
<artifactId>hive-service</artifactId>
<artifactId>hive-llap-server</artifactId>
<artifactId>json-smart</artifactId>
<artifactId>hive-serde</artifactId>
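The listing above repeats json-smart many times, because a plain grep cannot tell a declared dependency apart from any other element that happens to contain an artifactId. One plausible explanation (an assumption, not something the output itself confirms) is that json-smart appears inside exclusion entries. A minimal Python sketch of a structure-aware alternative follows; SAMPLE_POM is a hypothetical miniature POM written for illustration, not the real fe/pom.xml, and the helper names are made up here.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace, needed for ElementTree lookups.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical miniature POM for illustration only; NOT the real fe/pom.xml.
SAMPLE_POM = """<?xml version="1.0"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>impala-frontend</artifactId>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <exclusions>
        <exclusion>
          <groupId>net.minidev</groupId>
          <artifactId>json-smart</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
    </dependency>
  </dependencies>
</project>
"""


def direct_dependencies(pom_text):
    """Return artifactIds of <dependency> entries under <project>/<dependencies>."""
    root = ET.fromstring(pom_text)
    found = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        artifact = dep.find("m:artifactId", POM_NS)
        if artifact is not None and artifact.text:
            found.append(artifact.text.strip())
    return found


def excluded_artifacts(pom_text):
    """Return the sorted, de-duplicated artifactIds found inside <exclusion> elements."""
    root = ET.fromstring(pom_text)
    return sorted({
        node.text.strip()
        for node in root.findall(".//m:exclusion/m:artifactId", POM_NS)
        if node.text
    })


if __name__ == "__main__":
    print("dependencies:", direct_dependencies(SAMPLE_POM))
    print("exclusions:  ", excluded_artifacts(SAMPLE_POM))
```

To try this on a real checkout, read the contents of fe/pom.xml into a string and pass it to both functions. Note that the sketch ignores dependencies declared in a parent POM or under dependencyManagement.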
So the README.md is correct in stating that you need Hadoop, Hive, HBase, and Sentry to build Impala.
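To make that conclusion easier to check, the artifact names can be grouped by component. The following is a rough Python sketch; the prefix-to-component mapping and the function name are assumptions introduced for illustration, and ARTIFACTS is only a subset of the names from the grep output above.

```python
from collections import defaultdict

# Assumed mapping from artifactId prefix to component (illustrative only).
COMPONENT_PREFIXES = {
    "hadoop-": "Hadoop",
    "hive-": "Hive",
    "hbase-": "HBase",
    "sentry-": "Sentry",
}

# A subset of the artifactIds from the grep output, duplicates included.
ARTIFACTS = [
    "hadoop-hdfs", "hadoop-common", "json-smart",
    "sentry-core-common", "hbase-client", "json-smart",
    "libthrift", "hive-service", "hive-serde",
]


def group_by_component(artifact_ids):
    """Group artifactIds by component prefix; unmatched names go under 'other'."""
    groups = defaultdict(set)  # sets drop repeated names such as json-smart
    for name in artifact_ids:
        for prefix, component in COMPONENT_PREFIXES.items():
            if name.startswith(prefix):
                groups[component].add(name)
                break
        else:
            groups["other"].add(name)
    return {component: sorted(names) for component, names in groups.items()}


if __name__ == "__main__":
    for component, names in sorted(group_by_component(ARTIFACTS).items()):
        print(f"{component}: {', '.join(names)}")
```

Each of the four components named in the README ends up with at least one artifact, which matches the answer's reading of the pom.xml.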