I am trying to run the typical first Flume example, which fetches tweets and stores them in HDFS using Apache Flume.
[Hadoop version 3.1.3; Apache Flume 1.9.0]
I have configured flume-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
export CLASSPATH=$CLASSPATH:/FLUME_HOME/lib/*
The agent is configured in the TwitterStream.properties file as follows:
# Naming the components on the current agent.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
# Describing/Configuring the source
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey = **********************
TwitterAgent.sources.Twitter.consumerSecret = **********************
TwitterAgent.sources.Twitter.accessToken = **********************
TwitterAgent.sources.Twitter.accessTokenSecret = **********************
TwitterAgent.sources.Twitter.keywords = tutorials point, java, bigdata, mapreduce, mahout, hbase, nosql
# Describing/Configuring the sink
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:9000/user/twitter_data/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
# Describing/Configuring the channel
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
# Binding the source and sink to the channel
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sinks.HDFS.channel = MemChannel
Then I run the command:
bin/flume-ng agent -c /home/jiglesia/hadoop/flume/conf/ -f TwitterStream.properties -n TwitterAgent -Dflume.root.logger=INFO, console -n TwitterAgent
During execution I get the following ERROR:
Info: Sourcing environment configuration script /home/jiglesia/hadoop/flume/conf/flume-env.sh
Info: Including Hadoop libraries found via (/home/jiglesia/hadoop/bin/hadoop) for HDFS access
/home/jiglesia/hadoop/libexec/hadoop-functions.sh: line 2360: HADOOP_ORG.APACHE.FLUME.TOOLS.GETJAVAPROPERTY_USER: bad substitution
/home/jiglesia/hadoop/libexec/hadoop-functions.sh: line 2455: HADOOP_ORG.APACHE.FLUME.TOOLS.GETJAVAPROPERTY_OPTS: bad substitution
Info: Including Hive libraries found via () for Hive access
I don't know why it says "bad substitution".
In case it tells you anything, here is the whole log:
Info: Sourcing environment configuration script /home/jiglesia/hadoop/flume/conf/flume-env.sh
Info: Including Hadoop libraries found via (/home/jiglesia/hadoop/bin/hadoop) for HDFS access
/home/jiglesia/hadoop/libexec/hadoop-functions.sh: line 2360: HADOOP_ORG.APACHE.FLUME.TOOLS.GETJAVAPROPERTY_USER: bad substitution
/home/jiglesia/hadoop/libexec/hadoop-functions.sh: line 2455: HADOOP_ORG.APACHE.FLUME.TOOLS.GETJAVAPROPERTY_OPTS: bad substitution
Info: Including Hive libraries found via () for Hive access
+ exec /usr/lib/jvm/java-8-openjdk-amd64/jre//bin/java -Xmx20m -Dflume.root.logger=INFO, -cp '/home/jiglesia/hadoop/flume/conf:/home/jiglesia/hadoop/flume/lib/*:/home/jiglesia/hadoop/etc/hadoop:/home/jiglesia/hadoop/share/hadoop/common/lib/*:/home/jiglesia/hadoop/share/hadoop/common/*:/home/jiglesia/hadoop/share/hadoop/hdfs:/home/jiglesia/hadoop/share/hadoop/hdfs/lib/*:/home/jiglesia/hadoop/share/hadoop/hdfs/*:/home/jiglesia/hadoop/share/hadoop/mapreduce/lib/*:/home/jiglesia/hadoop/share/hadoop/mapreduce/*:/home/jiglesia/hadoop/share/hadoop/yarn:/home/jiglesia/hadoop/share/hadoop/yarn/lib/*:/home/jiglesia/hadoop/share/hadoop/yarn/*:/lib/*' -Djava.library.path=:/home/jiglesia/hadoop/lib/native org.apache.flume.node.Application -f TwitterStream.properties -n TwitterAgent console -n TwitterAgent
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jiglesia/hadoop/flume/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jiglesia/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (org.apache.flume.node.Application).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Environment variables configured in the bashrc file:
# HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
export HADOOP_HOME=/home/jiglesia/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
# HADOOP VARIABLES END
# FLUME VARIABLES START
FLUME_HOME=/home/jiglesia/hadoop/flume
PATH=$PATH:/FLUME_HOME/bin
CLASSPATH=$CLASSPATH:/FLUME_HOME/lib/*
# FLUME VARIABLES END
Thanks for your help!
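You are not referencing the bash variables correctly: /FLUME_HOME without the leading $ is a literal path, not the value of the variable.
Try this: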
FLUME_HOME=$HADOOP_HOME/flume
PATH=$PATH:$FLUME_HOME/bin
CLASSPATH=$CLASSPATH:$FLUME_HOME/lib/*.jar
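To see why the missing dollar sign matters, here is a minimal sketch. It reuses the FLUME_HOME value from the question; the variable names `wrong` and `right` are made up for illustration only.

```shell
#!/usr/bin/env bash
# Value taken from the question's bashrc.
FLUME_HOME=/home/jiglesia/hadoop/flume

# Without the dollar sign, /FLUME_HOME is just a literal directory name
# at the filesystem root; the shell performs no substitution.
wrong="/FLUME_HOME/bin"

# With the dollar sign, the shell replaces $FLUME_HOME with its value.
right="$FLUME_HOME/bin"

echo "wrong: $wrong"
echo "right: $right"
```

This would also explain the trailing `/lib/*` entry at the end of the classpath in the posted log, which looks like what is left of `/FLUME_HOME/lib/*` rather than the intended Flume lib directory.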
Note: I would recommend not keeping Flume as a subdirectory of Hadoop.
I would suggest using Apache Ambari to install and configure the Hadoop and Flume processes.
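Separately, the exec line in the posted log shows `-Dflume.root.logger=INFO,` and `console` as two separate arguments, which is consistent with the space after the comma in the command. The sketch below only illustrates how the shell splits that text into words; `count_args` is a hypothetical helper, not part of Flume.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# With a space after the comma, the shell passes two separate words.
with_space=$(count_args -Dflume.root.logger=INFO, console)

# Without the space, the whole option is a single word.
without_space=$(count_args -Dflume.root.logger=INFO,console)

echo "with space: $with_space argument(s)"
echo "without space: $without_space argument(s)"
```

If that split is what causes the log4j "No appenders could be found" warning, writing the option as `-Dflume.root.logger=INFO,console` would be worth trying.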