visible in the same way as the other columns. In the description of the HBase Catalog it is mentioned:
def catalog = s"""{
|"table":{"namespace":"default", "name":"table1"},
|"rowkey":"key",
|"columns":{
|"col0":{"cf":"rowkey", "col":"key", "type":"string"},
|"col1":{"cf":"cf1", "col":"col1", "type":"boolean"},
|"col2":{"cf":"cf2", "col":"col2", "type":"double"},
|"col3":{"cf":"cf3", "col":"col3", "type":"float"},
|"col4":{"cf":"cf4", "col":"col4", "type":"int"},
|"col5":{"cf":"cf5", "col":"col5", "type":"bigint"},
|"col6":{"cf":"cf6", "col":"col6", "type":"smallint"},
|"col7":{"cf":"cf7", "col":"col7", "type":"string"},
|"col8":{"cf":"cf8", "col":"col8", "type":"tinyint"}
|}
|}""".stripMargin
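For context, a catalog string like this is passed to the shc connector when writing a DataFrame. A minimal sketch, assuming the shc-core dependency is on the classpath and a DataFrame `df` whose column names match the catalog (this is not runnable without a Spark session and a reachable HBase cluster):

import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

// Write df to HBase using the catalog above; "newTable" -> "5" asks shc
// to create the table with 5 regions if it does not exist yet.
df.write
  .options(Map(HBaseTableCatalog.tableCatalog -> catalog,
               HBaseTableCatalog.newTable -> "5"))
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .save()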
Note that the rowkey also has to be defined in detail as a column (col0), which has a special cf (rowkey).
Therefore, it will not show up although you have specified it in the
{
  "columns": {
    "RXSJ":     { "col": "RXSJ",     "cf": "info",   "type": "bigint" },
    "LATITUDE": { "col": "LATITUDE", "cf": "info",   "type": "float"  },
    "ZJHM":     { "col": "ZJHM",     "cf": "rowkey", "type": "string" },
    "AGE":      { "col": "AGE",      "cf": "info",   "type": "int"    }
  },
  "rowkey": "ZJHM",
  "table": { "namespace": "default", "name": "mongo_hbase_spark_out" }
}
section of your catalog.
The rowkey is only visible as the actual rowkey, as your screenshot also shows.

After testing, I solved the problem. The whole idea is to output the same rowkey column twice.
This is my newly generated SHC catalog. I think the rowkey column is a special column in Hortonworks-spark shc; it is always output as the rowkey only. The only way around this is to also output the same column to another cf. Let me know if you have any better suggestions.
Thanks!
I am using shc-core to write a Spark Dataset to HBase; see below for more details.

This is my current shc catalog. All the other fields are output normally, but the rowkey column is not. How can I also output the rowkey as an extra column?
{
  "columns": {
    "rowkey_ZJHM": { "col": "ZJHM", "cf": "rowkey", "type": "string" },
    "ZJHM":        { "col": "ZJHM", "cf": "info",   "type": "string" },
    "AGE":         { "col": "AGE",  "cf": "info",   "type": "int"    }
  },
  "rowkey": "ZJHM",
  "table": { "namespace": "default", "name": "mongo_hbase_spark_out" }
}
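With this catalog the DataFrame needs the key value twice, once under each name, so that the same value is written both as the row key and into info:ZJHM. A hedged sketch, assuming a DataFrame `df` that already has ZJHM and AGE columns and a `catalog` string holding the JSON above (not runnable without a Spark session and an HBase cluster):

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

// Duplicate ZJHM under the name the catalog maps to the rowkey cf.
val out = df.withColumn("rowkey_ZJHM", col("ZJHM"))

out.write
  .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .save()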