Running PySpark in PyCharm


On a Mac (version 10.14.5), I am trying to run a PySpark program in PyCharm Professional (version 19.2).

I know my simple PySpark program is fine, because when I run it from the terminal with spark-submit (outside of PyCharm), on the Spark I installed via brew, it works as expected. I have tried linking PyCharm to this version of Spark, but ran into other issues.
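
For context, a minimal sketch of what run.py might look like; only the SparkContext("local", "SimpleApp") call is confirmed by the traceback below, the rest is a plausible reconstruction:

    # run.py - minimal PySpark job (hypothetical reconstruction)
    from pyspark import SparkContext

    # The traceback below shows this exact constructor call at run.py line 6.
    sc = SparkContext("local", "SimpleApp")
    rdd = sc.parallelize([1, 2, 3, 4])
    print(rdd.map(lambda x: x * x).collect())  # [1, 4, 9, 16]
    sc.stop()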

Following several sets of instructions found online (for example, this stackoverflow thread), I installed pyspark in PyCharm (Preferences -> Project Interpreter) and set the SPARK_HOME environment variable to the appropriate venv directory (Run -> Edit Configurations -> Environment Variables). However, when I run the program I get this error message:

    Failed to find Spark jars directory (/Users/rahul/PycharmProjects/spark-demoII/venv/assembly/target/scala-2.12/jars).
    You need to build Spark with the target "package" before running this program.
    Traceback (most recent call last):
      File "/Users/rahul/PycharmProjects/spark-demoII/run.py", line 6, in <module>
        sc = SparkContext("local", "SimpleApp")
      File "/Users/rahul/virtualenvs/pyspark/lib/python3.7/site-packages/pyspark/context.py", line 133, in __init__
        SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
      File "/Users/rahul/virtualenvs/pyspark/lib/python3.7/site-packages/pyspark/context.py", line 316, in _ensure_initialized
        SparkContext._gateway = gateway or launch_gateway(conf)
      File "/Users/rahul/virtualenvs/pyspark/lib/python3.7/site-packages/pyspark/java_gateway.py", line 46, in launch_gateway
        return _launch_gateway(conf)
      File "/Users/rahul/virtualenvs/pyspark/lib/python3.7/site-packages/pyspark/java_gateway.py", line 108, in _launch_gateway
        raise Exception("Java gateway process exited before sending its port number")
    Exception: Java gateway process exited before sending its port number

    Process finished with exit code 1
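
A note on what this first error means: "Failed to find Spark jars directory" indicates that SPARK_HOME points at the venv, which contains no Spark jars, so the Java gateway cannot start. A quick way to check, using only the standard library:

    import os

    # Print the SPARK_HOME PySpark will use and whether it actually
    # contains a jars directory; a venv path normally will not.
    spark_home = os.environ.get("SPARK_HOME")
    print("SPARK_HOME =", spark_home)
    if spark_home:
        print("jars dir exists:", os.path.isdir(os.path.join(spark_home, "jars")))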
Does anyone know how to get PyCharm to run PySpark programs on a similar machine?

In response to @pissal's suggestion: I had tried this before, and that version of Spark did work at the time. In any case, I tried it again: after switching to the virtual environment, I ran pip install pyspark. To make sure this version of Spark works, I ran spark-submit run.py (outside of PyCharm), and this is the error message:

    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/Users/rahul/.virtualenvs/test1/lib/python3.7/site-packages/pyspark/jars/spark-unsafe_2.11-2.4.4.jar) to method java.nio.Bits.unaligned()
    WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:80)
        at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:611)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:791)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:761)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:634)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2422)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2422)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2422)
        at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:79)
        at org.apache.spark.deploy.SparkSubmit.secMgr$lzycompute$1(SparkSubmit.scala:348)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$secMgr$1(SparkSubmit.scala:348)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:356)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:356)
        at scala.Option.map(Option.scala:146)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:355)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:774)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 2
        at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3720)
        at java.base/java.lang.String.substring(String.java:1909)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:52)
        ... 25 more
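
The Caused by line points at a known failure mode: Hadoop's Shell class (Shell.java:52 in this trace) takes substring(0, 3) of the java.version system property, assuming the pre-Java-9 format "1.8.0_x"; on newer JVMs the property is just "13", which has length 2, hence "begin 0, end 3, length 2". A quick check of which JVM spark-submit will pick up:

    import subprocess

    # "java -version" writes to stderr; this shows the JVM on the PATH,
    # which is what spark-submit launches unless JAVA_HOME overrides it.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    print(result.stderr)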

macos pyspark pycharm
1 Answer
PySpark has not been updated to work with the most recent versions of Java. After removing Java version 13, I made sure that the brew installation of Spark uses Java version 1.8. Then I added the following to the environment variables under Run -> Edit Configurations in PyCharm:

    SPARK_HOME=/usr/local/Cellar/apache-spark/2.4.4/libexec
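
As a sketch, the same fix can also be applied in code rather than in the run configuration; the JAVA_HOME path below is an assumption, and /usr/libexec/java_home -v 1.8 prints the correct one on macOS:

    import os

    # Point PySpark at the brew-installed Spark rather than the venv copy.
    os.environ["SPARK_HOME"] = "/usr/local/Cellar/apache-spark/2.4.4/libexec"
    # Force a Java 1.8 JVM. This path is a hypothetical example; find yours
    # with: /usr/libexec/java_home -v 1.8
    os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"

    from pyspark import SparkContext  # import after the environment is set

    sc = SparkContext("local", "SimpleApp")
    print(sc.parallelize(range(10)).sum())  # 45 if the Java gateway starts
    sc.stop()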

With these settings, I can run PySpark jobs in PyCharm.