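I have tried this many times on my own, but I keep running into the same problem. Can someone help me add the sbt dependency for PySpark and Flume integration? Below are my command and its output.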
spark-submit --packages 'org.apache.spark:spark-streaming-flume-assembly_2.12:2.4.5' spark_flume.py
Ivy Default Cache set to: /home/hduser/.ivy2/cache
The jars for the packages stored in: /home/hduser/.ivy2/jars
:: loading settings :: url = jar:file:/usr/local/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-streaming-flume-assembly_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-ab867e8f-f121-4402-a63c-942bac3932c1;1.0
confs: [default]
found org.apache.spark#spark-streaming-flume-assembly_2.12;2.4.5 in central
found org.spark-project.spark#unused;1.0.0 in central
:: resolution report :: resolve 812ms :: artifacts dl 15ms
:: modules in use:
org.apache.spark#spark-streaming-flume-assembly_2.12;2.4.5 from central in [default]
org.spark-project.spark#unused;1.0.0 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 2 | 0 | 0 | 0 || 2 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-ab867e8f-f121-4402-a63c-942bac3932c1
confs: [default]
0 artifacts copied, 2 already retrieved (0kB/18ms)
20/05/15 15:35:18 WARN Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.19.137 instead (on interface ens33)
20/05/15 15:35:18 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
20/05/15 15:35:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
File "/home/hduser/pyspark_data1/spark_stream1/spark_flume.py", line 6
artifactID=spark-streaming-flume_2.12
^
SyntaxError: invalid syntax
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[hduser@localhost spark_stream1]$
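This is a SyntaxError on line 6 of your spark_flume.py file, which reads:
artifactID=spark-streaming-flume_2.12
Python treats the unquoted hyphens as subtraction and then cannot parse the trailing .12, so the script fails before any Spark code runs. I believe you need to write spark-streaming-flume_2.12 as a string: "spark-streaming-flume_2.12".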
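To see the difference without running Spark, here is a minimal sketch that asks Python whether each form of the line compiles. The helper `parses` and the variables `bad_line` and `good_line` are hypothetical names used only for illustration; the full contents of spark_flume.py are not shown in the question, so only line 6 is reproduced here.

```python
# Line 6 as it appears in the traceback (unquoted), and a quoted variant.
bad_line = "artifactID=spark-streaming-flume_2.12"
good_line = 'artifactID = "spark-streaming-flume_2.12"'


def parses(source):
    """Return True if Python can compile the given source line."""
    try:
        compile(source, "<spark_flume.py>", "exec")
        return True
    except SyntaxError:
        return False


print(parses(bad_line))   # unquoted form: rejected by the parser
print(parses(good_line))  # quoted form: accepted as a string assignment
```

Note that quoting the value only removes the syntax error. The log above suggests that --packages already resolved the Flume artifact from central, so the assignment on line 6 may not be needed in the Python script at all.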