I need to run a PostgreSQL query from PySpark. This is what I tried:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("ReadFromPostgreSQL").getOrCreate()
url = "jdbc:postgresql://localhost:5432/database_example"
properties = {"user": "postgres", "password": "1234", "driver": "org.postgresql.Driver"}
query = "SELECT * FROM arpan.check_master_planning_family"
jdbcDF = spark.read.jdbc(url=url, table=query, properties=properties)
In my actual code, the URL, username, and password are those of my own server.
But I get this error:
Py4JJavaError:-org.postgresql.util.PSQLException: ERROR: syntax error at or near "SELECT"
The query runs fine in Postgres itself, but fails here. How can I fix this?
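Spark does not send the table argument as a standalone statement. It places it in the FROM clause of the query it generates, so a bare SELECT there causes the syntax error. Change
table=query
to a table name such as table="check_master_planning_family" (schema-qualified as arpan.check_master_planning_family if the arpan schema is not on your search_path).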
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("ReadFromPostgreSQL").getOrCreate()
# Pass a table name (schema-qualified here), not a SELECT statement
table_name = "arpan.check_master_planning_family"
query = "SELECT * FROM arpan.check_master_planning_family"
url = "jdbc:postgresql://localhost:5432/database_example"
properties = {"user": "postgres", "password": "1234", "driver": "org.postgresql.Driver"}
jdbcDF = spark.read.jdbc(url=url, table=table_name, properties=properties)
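If you want to keep the original SELECT instead of loading the whole table, one option is to wrap it in parentheses and give it an alias, since Spark accepts anything that is valid in a FROM clause as the table argument. This is a minimal sketch, assuming an existing SparkSession is passed in; the helper names as_jdbc_subquery and read_query and the alias subq are made up for illustration:

```python
def as_jdbc_subquery(query: str, alias: str = "subq") -> str:
    """Wrap a SELECT statement so it can be used where Spark expects a table name."""
    # A trailing semicolon would be a syntax error inside the parentheses
    cleaned = query.strip().rstrip(";").strip()
    return f"({cleaned}) AS {alias}"


def read_query(spark, url: str, query: str, properties: dict):
    """Run `query` on the database side and return the result as a DataFrame.

    `spark` is assumed to be an already created SparkSession.
    """
    return spark.read.jdbc(
        url=url,
        table=as_jdbc_subquery(query),
        properties=properties,
    )
```

With the variables defined above, read_query(spark, url, query, properties).show() would then push the SELECT down to PostgreSQL.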
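If you would rather query the loaded table with SQL, note that a DataFrame has no sql method. Register it as a temporary view and query it through the session:
jdbcDF.createOrReplaceTempView("check_master_planning_family")
spark.sql('''
SELECT * FROM check_master_planning_family
''').show()
or
spark.table("check_master_planning_family").show()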
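Spark 2.4 and later also accept a query option on the generic JDBC reader as an alternative to dbtable; the two cannot be specified together. The sketch below assumes an existing SparkSession and the same connection properties as above; the helper names jdbc_query_options and read_with_query_option are hypothetical:

```python
def jdbc_query_options(url: str, query: str, properties: dict) -> dict:
    """Build the option map for spark.read.format("jdbc") using `query`."""
    options = dict(properties)  # copy so the caller's dict is not modified
    options["url"] = url
    # Spark embeds the query as a subquery, so drop any trailing semicolon
    options["query"] = query.strip().rstrip(";").strip()
    return options


def read_with_query_option(spark, url: str, query: str, properties: dict):
    """Load the result of `query` as a DataFrame (assumes Spark 2.4 or later)."""
    options = jdbc_query_options(url, query, properties)
    return spark.read.format("jdbc").options(**options).load()
```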