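I have built an insert statement in Jupyter Notebooks using a Python/SQL function. Some of my tables have a PK column named `id`, and one table has a PK column named `order_id`. I want to use the same function on both kinds of table, with either `WHERE id = '"+str(key)+"'` or `WHERE order_id = '"+str(key)+"'`. I tried the `OR` operator, but it raises an error. Perhaps some kind of wildcard would be a solution?

The script below shows the function in question, with the `WHERE` clause pointing at a table whose PK column is named `id`. The problem is that I want the function to also work for the table whose PK column is named `order_id`.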

    import mysql.connector
    import pandas as pd
    ...
    op_cursor = op_connector.cursor()
    dwh_cursor = dwh_connector.cursor()
    ...
    def load_dim(instance):
        for key in instance.dim_id:
            # Check whether this key already exists in the DWH table
            sql = "SELECT "+instance.op_args+" FROM "+instance.dwh_table_name+" WHERE id = '"+str(key)+"'"
            dwh_cursor.execute(sql)
            result = dwh_cursor.fetchone()
            if result is None:
                # Key is missing: fetch the row from the operational table
                sqlQuery = "SELECT "+instance.dwh_args+" FROM "+instance.op_table_name+" WHERE id = '"+str(key)+"'"
                op_cursor.execute(sqlQuery)
                result_from_op = op_cursor.fetchone()
                dim_dict = dict(zip(instance.op_cols,result_from_op))
                # escape() is defined in the elided part of the script
                dwh_values = ",".join(map(escape,result_from_op))
                sqlInsert = "INSERT INTO "+instance.dwh_table_name+" ("+instance.dwh_args+") VALUES ("+dwh_values+")"
                dwh_cursor.execute(sqlInsert)
                dwh_connector.commit()
                billing_profile_op_id = dwh_cursor.lastrowid
            else:
                print('No updates to: ' + instance.dwh_table_name)
                break

This question is part of an ETL exercise: I am building a DWH and do not want to rename any columns.
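
Consider `try/except`, and avoid all the query building and `fetch` checks by using a single, pure insert-select SQL query with a `NOT IN` clause, since that reflects what is actually needed here: an append query that skips duplicates. See NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL.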
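
To make the three anti-join forms named above concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than MySQL, so it runs without a server. The `op` and `dwh` tables and their contents are hypothetical stand-ins for an operational table and a warehouse table; they are not taken from the question.

```python
import sqlite3

# Hypothetical toy schema: `op` is the source table, `dwh` the target.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE op  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dwh (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO op  VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO dwh VALUES (1, 'a');
""")

# Three equivalent ways to find source rows that are missing from the target.
anti_joins = {
    "NOT IN": "SELECT o.id FROM op o WHERE o.id NOT IN (SELECT d.id FROM dwh d)",
    "NOT EXISTS": "SELECT o.id FROM op o "
                  "WHERE NOT EXISTS (SELECT 1 FROM dwh d WHERE d.id = o.id)",
    "LEFT JOIN / IS NULL": "SELECT o.id FROM op o "
                           "LEFT JOIN dwh d ON d.id = o.id WHERE d.id IS NULL",
}
missing = {label: sorted(row[0] for row in cur.execute(query))
           for label, query in anti_joins.items()}

# The same filter turns a plain INSERT ... SELECT into a non-duplicate append.
append_sql = """
    INSERT INTO dwh (id, name)
    SELECT o.id, o.name
    FROM op o
    WHERE NOT EXISTS (SELECT 1 FROM dwh d WHERE d.id = o.id)
"""
first_run = cur.execute(append_sql).rowcount   # rows appended on the first pass
second_run = cur.execute(append_sql).rowcount  # nothing left to append
conn.commit()

dwh_ids = sorted(row[0] for row in cur.execute("SELECT id FROM dwh"))
print(missing, first_run, second_run, dwh_ids)
```

All three queries report the same missing keys here because `id` is never NULL. With nullable columns `NOT IN` behaves differently from the other two, which is one reason the comparison linked above is worth reading.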
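
The `load_dim` rewrite below uses `LIMIT 1` in place of `fetchone()`; depending on the RDBMS, use `TOP 1` or `FETCH FIRST 1 ROWS ONLY` instead. The parameter placeholder is `%s`; some Python DB-APIs use `?` instead. In future posts, please always tag the RDBMS and show the DB-API with its `import` lines.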

    def load_dim(instance):
        # Insert-select: append the row for one key only if the warehouse
        # table does not already hold that key. {pk} is the PK column name.
        sql = """INSERT INTO {dwh} ({dwh_cols})
                 SELECT {op_cols}
                 FROM {op}
                 WHERE {pk} = %s
                   AND {pk} NOT IN
                     (SELECT {pk} FROM {dwh})
                 LIMIT 1
              """
        for key in instance.dim_id:
            try:
                # ID APPEND
                dwh_cursor.execute(sql.format(dwh = instance.dwh_table_name,
                                              dwh_cols = instance.dwh_args,
                                              op_cols = instance.op_args,
                                              op = instance.op_table_name,
                                              pk = 'id'),
                                   (str(key),))      # PARAMS MUST BE A TUPLE
                dwh_connector.commit()
            except Exception as e:     # ADJUST TO DB-API SPECIFIC Error
                # ORDER_ID APPEND (TABLE HAS NO id COLUMN)
                dwh_cursor.execute(sql.format(dwh = instance.dwh_table_name,
                                              dwh_cols = instance.dwh_args,
                                              op_cols = instance.op_args,
                                              op = instance.op_table_name,
                                              pk = 'order_id'),
                                   (str(key),))
                dwh_connector.commit()

            billing_profile_op_id = dwh_cursor.lastrowid    # RETURNS 0 IF NO DATA APPENDED
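
As an alternative to falling back through `try/except`, the PK column name can be looked up per table before the query is built. The following is only a sketch under stated assumptions: it uses `sqlite3` (placeholder `?`) so it runs standalone, and the `PK_COLUMNS` mapping, the `load_row` helper, and the table names are all hypothetical, not part of the original script. With `mysql.connector` the placeholder would be `%s`.

```python
import sqlite3

# Hypothetical mapping from warehouse table name to its primary-key column.
PK_COLUMNS = {"dim_customers": "id", "dim_orders": "order_id"}

def load_row(conn, op_table, dwh_table, columns, key):
    """Copy one row from op_table into dwh_table unless the key is already there.

    Returns the number of rows appended (0 or 1).
    """
    # Identifiers cannot be bound as parameters, so the column name comes
    # from a fixed lookup; an unknown table raises KeyError.
    pk = PK_COLUMNS[dwh_table]
    cols = ", ".join(columns)
    sql = (
        f"INSERT INTO {dwh_table} ({cols}) "
        f"SELECT {cols} FROM {op_table} "
        f"WHERE {pk} = ? "
        f"AND NOT EXISTS (SELECT 1 FROM {dwh_table} WHERE {pk} = ?)"
    )
    cur = conn.execute(sql, (key, key))  # values are bound, not concatenated
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE op_customers  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE op_orders     (order_id INTEGER PRIMARY KEY, total REAL);
    CREATE TABLE dim_orders    (order_id INTEGER PRIMARY KEY, total REAL);
    INSERT INTO op_customers VALUES (1, 'Ann');
    INSERT INTO op_orders    VALUES (10, 99.5);
""")

first = load_row(conn, "op_customers", "dim_customers", ["id", "name"], 1)
second = load_row(conn, "op_customers", "dim_customers", ["id", "name"], 1)
orders = load_row(conn, "op_orders", "dim_orders", ["order_id", "total"], 10)
print(first, second, orders)
```

The lookup keeps the column names unchanged, as the question requires, and avoids relying on a failed query to discover which PK name a table uses.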