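I am running a Hive query from a Python script. When I use subprocess.getstatusoutput, the query runs without any problem, but I cannot store just the result in a variable. So I tried Popen instead, and I get an error.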
import subprocess

dd1 = '10-Sep-12'
table = 'testing_table'
query = "select distinct(input__file__name) from <db_name>." + table + " where as_of_date ='" + dd1 + "' limit 2"
cmd = 'beeline -u "jdbc:hive2:<connection string>" -e "' + query + ';"'
stat, query_output = subprocess.getstatusoutput(cmd)
This works, but when I print query_output it prints everything: the INFO-tagged log lines for every stage as well as the actual query output.
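A likely reason for the extra lines is that getstatusoutput merges stderr into the string it returns, and beeline typically writes its INFO and progress log to stderr. Below is a minimal sketch, assuming that is what happens here, which captures the two streams separately. `run_capture` is a hypothetical helper, and the Python interpreter is used as a stand-in for beeline so the example runs anywhere. Depending on the beeline version, an option such as --silent=true may also reduce the log noise.

```python
import subprocess
import sys


def run_capture(args):
    """Run args (a list) and return (exit_status, stdout_text, stderr_text)."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,   # query results are expected here
        stderr=subprocess.PIPE,   # log lines are kept apart from the results
        universal_newlines=True,  # decode bytes to str
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


# Stand-in child process (an assumption, not beeline itself):
# it writes one log line to stderr and one result row to stdout.
child_code = (
    "import sys\n"
    "sys.stderr.write('INFO : Compiling command\\n')\n"
    "sys.stdout.write('result-row\\n')\n"
)

stat, query_output, log_output = run_capture([sys.executable, "-c", child_code])
print(query_output, end="")
```

With the real command, `query_output` would hold only what beeline prints to stdout, and `log_output` can be ignored or written to a log file.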
When I use subprocess.Popen or subprocess.check_output instead of getstatusoutput, I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'beeline -u "<connection string>" -e "select distinct(input__file__name) from <db_name>.<table_name> where as_of_date =\'10-Sep-12\' limit 2;"'
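The error message shows the entire command line being treated as the name of a single executable. On POSIX systems that is how Popen and check_output handle a string argument when shell=True is not set. A sketch of the two usual fixes follows: pass an argument list, or split the existing string with shlex.split. `build_beeline_args` and `run_query` are hypothetical helpers, and the JDBC URL and database name are left as the placeholders from the question.

```python
import shlex
import subprocess


def build_beeline_args(jdbc_url, query):
    """Return the beeline command as an argument list (no shell quoting needed)."""
    return ["beeline", "-u", jdbc_url, "-e", query + ";"]


dd1 = "10-Sep-12"
table = "testing_table"
query = (
    "select distinct(input__file__name) from <db_name>." + table
    + " where as_of_date ='" + dd1 + "' limit 2"
)

# Fix 1: build the argument list directly.
args = build_beeline_args("jdbc:hive2:<connection string>", query)

# Fix 2: split an existing command string the way a POSIX shell would.
cmd = 'beeline -u "jdbc:hive2:<connection string>" -e "' + query + ';"'
assert shlex.split(cmd) == args


def run_query(args):
    """Run the command without a shell and return only its stdout as text."""
    return subprocess.check_output(args, universal_newlines=True)
```

Calling `run_query(args)` requires beeline on the PATH. Passing `shell=True` with the original string would also avoid the error, at the cost of relying on shell quoting.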
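The cmd variable holds the command to execute. It is run through the subprocess functions, and the output is stored in a variable that is later written to a file. The next set of steps reads the file created in the first step, runs another query for each entry, and writes the results to another file.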
import subprocess

# Step 1: list the tables and save the list to a file.
cmd = 'hive -e "use database; show tables;"'
# universal_newlines=True makes check_output return text instead of bytes.
val = subprocess.check_output(cmd, shell=True, universal_newlines=True)
with open('/home/ouput_all_table_list.txt', 'w') as fl:
    fl.write(val)

# Step 2: read the table names back and run a count query for each one.
with open('/home/ouput_all_table_list.txt', 'r') as fl:
    content = fl.read().splitlines()

for var in content:
    tbl_nm = "'" + var + "'"
    cmd_ay = 'hive -e "use database; select collect_list(cast(file_dt as string)) as dt, collect_list(cast(cnt as string)) as cnt, ' + tbl_nm + ' from (select count(1) cnt,file_dt from database.' + var + ' group by file_dt having count(1) > 0 order by file_dt desc) a;"'
    print(cmd_ay)
    cmd_out = subprocess.check_output(cmd_ay, shell=True, universal_newlines=True)
    print(cmd_out)
    # Append each table's result to the output file.
    with open('/home/ouput_all_hive_count_data.txt', 'a') as fh:
        fh.write(cmd_out)
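The same two-step flow can also be sketched with the command runner passed in as a parameter, which makes it possible to try the logic without a Hive installation. This is only an illustration under stated assumptions: `hive`, `count_query`, `collect_counts`, and `fake_hive` are hypothetical names, and this variant keeps the table list in memory rather than re-reading it from the intermediate file.

```python
import subprocess


def hive(sql):
    """Run a statement string via `hive -e` and return stdout as text."""
    return subprocess.check_output(["hive", "-e", sql], universal_newlines=True)


def count_query(database, table):
    """Build the per-table count query used in the steps above."""
    return (
        "use {db}; select collect_list(cast(file_dt as string)) as dt, "
        "collect_list(cast(cnt as string)) as cnt, '{tbl}' "
        "from (select count(1) cnt, file_dt from {db}.{tbl} "
        "group by file_dt having count(1) > 0 order by file_dt desc) a;"
    ).format(db=database, tbl=table)


def collect_counts(database, run=hive):
    """Return {table_name: query_output} for every table in the database."""
    # split() drops blank lines and surrounding whitespace in the table list.
    tables = run("use {}; show tables;".format(database)).split()
    return {tbl: run(count_query(database, tbl)) for tbl in tables}


def fake_hive(sql):
    """Stand-in runner (an assumption for the demo, not real Hive output)."""
    return "t1\nt2\n" if "show tables" in sql else "ok\n"


results = collect_counts("database", run=fake_hive)
print(sorted(results))
```

Dropping the `run=fake_hive` argument makes `collect_counts` call the real `hive` CLI, and the returned dictionary can then be written to a file in one place.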