"table":{"namespace":"default", "name":"table1"}, ...

Question

You will not get the rowkey column visible in the same way as the other columns. In the description of the HBase catalog it is mentioned:

def catalog = s"""{
        |"table":{"namespace":"default", "name":"table1"},
        |"rowkey":"key",
        |"columns":{
          |"col0":{"cf":"rowkey", "col":"key", "type":"string"},
          |"col1":{"cf":"cf1", "col":"col1", "type":"boolean"},
          |"col2":{"cf":"cf2", "col":"col2", "type":"double"},
          |"col3":{"cf":"cf3", "col":"col3", "type":"float"},
          |"col4":{"cf":"cf4", "col":"col4", "type":"int"},
          |"col5":{"cf":"cf5", "col":"col5", "type":"bigint"},
          |"col6":{"cf":"cf6", "col":"col6", "type":"smallint"},
          |"col7":{"cf":"cf7", "col":"col7", "type":"string"},
          |"col8":{"cf":"cf8", "col":"col8", "type":"tinyint"}
        |}
      |}""".stripMargin

Note that the rowkey also has to be defined in details as a column (col0), which has a specific cf (rowkey).

Therefore, it will not show up as a regular column, although you have specified it in the columns section of your catalog:

{
    "columns": {
        "RXSJ": {
            "col": "RXSJ",
            "cf": "info",
            "type": "bigint"
        },
        "LATITUDE": {
            "col": "LATITUDE",
            "cf": "info",
            "type": "float"
        },
        "ZJHM": {
            "col": "ZJHM",
            "cf": "rowkey",
            "type": "string"
        },
        "AGE": {
            "col": "AGE",
            "cf": "info",
            "type": "int"
        }
    },
    "rowkey": "ZJHM",
    "table": {
        "namespace": "default",
        "name": "mongo_hbase_spark_out"
    }
}

The rowkey is only visible as the actual row key, as your screenshot also shows.
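To illustrate where the key does remain accessible, here is a minimal sketch of reading the table back through SHC with the same catalog. It assumes shc-core is on the classpath and that an existing SparkSession and the catalog string are passed in; the object name ReadWithCatalog and the helper readTable are made up for this example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

object ReadWithCatalog {
  // Hypothetical helper: loads the HBase table described by `catalog`.
  // The DataFrame field mapped to cf "rowkey" (ZJHM in the catalog above)
  // is filled from the HBase row key rather than from a stored cell.
  def readTable(spark: SparkSession, catalog: String): DataFrame =
    spark.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()
}
```

Something like ReadWithCatalog.readTable(spark, catalog).select("ZJHM", "AGE") should then return the key next to the regular columns, even though no ZJHM cell exists under the info column family in HBase.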

Answer 1

After testing, I solved the problem. The whole idea is to output the same column twice: once as the rowkey and once as a regular column. My newly generated SHC catalog is the one that contains the rowkey_ZJHM entry. I think the rowkey column is a special column in Hortonworks SHC: it is always output as the first column, so the only option I could think of was to output it again to another column family. Let me know if you have any better suggestions.

Thanks!

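I am using shc-core to write a Spark Dataset to HBase; more details follow. My current SHC catalog maps ZJHM to the rowkey column family.

Because of Stack Overflow's limit on code length, I can only show part of it.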

Answer 2

The other fields are output normally, but the rowkey column is not output.

How can I output the rowkey as an additional column?

{
    "columns": {
        "rowkey_ZJHM": {
            "col": "ZJHM",
            "cf": "rowkey",
            "type": "string"
        },
        "ZJHM": {
            "col": "ZJHM",
            "cf": "info",
            "type": "string"
        },
        "AGE": {
            "col": "AGE",
            "cf": "info",
            "type": "int"
        }
    },
    "rowkey": "ZJHM",
    "table": {
        "namespace": "default",
        "name": "mongo_hbase_spark_out"
    }
}
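The "same column twice" approach also needs a matching field on the Spark side, since the catalog now names both rowkey_ZJHM and ZJHM. The following is a minimal sketch, not the original poster's code: the sample rows, the object name and the helper withDuplicatedKey are made up for illustration, shc-core is assumed to be on the classpath, and the catalog is the partial one quoted above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

object WriteRowkeyTwice {
  // Catalog from the answer: the key is mapped once to cf "rowkey"
  // and once to the regular column family "info".
  val catalog: String =
    """{
      |  "table": {"namespace": "default", "name": "mongo_hbase_spark_out"},
      |  "rowkey": "ZJHM",
      |  "columns": {
      |    "rowkey_ZJHM": {"cf": "rowkey", "col": "ZJHM", "type": "string"},
      |    "ZJHM": {"cf": "info", "col": "ZJHM", "type": "string"},
      |    "AGE": {"cf": "info", "col": "AGE", "type": "int"}
      |  }
      |}""".stripMargin

  // Hypothetical helper: copy the key so the DataFrame has one field
  // per catalog entry (rowkey_ZJHM and ZJHM hold the same value).
  def withDuplicatedKey(df: DataFrame): DataFrame =
    df.withColumn("rowkey_ZJHM", col("ZJHM"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shc-rowkey-twice").getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for the real Dataset.
    val df = Seq(("id-001", 30), ("id-002", 41)).toDF("ZJHM", "AGE")

    withDuplicatedKey(df).write
      .options(Map(
        HBaseTableCatalog.tableCatalog -> catalog,
        // Asks SHC to create the table with 5 regions if it does not exist.
        HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .save()

    spark.stop()
  }
}
```

The duplication costs one extra cell per row in the info column family, which is the trade-off the answer accepts in exchange for seeing the key as an ordinary column.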
