Java自学者论坛

 找回密码
 立即注册

手机号码,快捷登录

恭喜Java自学者论坛(https://www.javazxz.com)已经为数万Java学习者服务超过8年了!积累会员资料超过10000G+
成为本站VIP会员,下载本站10000G+会员资源,会员资料板块,购买链接:点击进入购买VIP会员

JAVA高级面试进阶训练营视频教程

Java架构师系统进阶VIP课程

分布式高可用全栈开发微服务教程Go语言视频零基础入门到精通Java架构师3期(课件+源码)
Java开发全终端实战租房项目视频教程SpringBoot2.X入门到高级使用教程大数据培训第六期全套视频教程深度学习(CNN RNN GAN)算法原理Java亿级流量电商系统视频教程
互联网架构师视频教程年薪50万Spark2.0从入门到精通年薪50万!人工智能学习路线教程年薪50万大数据入门到精通学习路线年薪50万机器学习入门到精通教程
仿小米商城类app和小程序视频教程深度学习数据分析基础到实战最新黑马javaEE2.1就业课程从 0到JVM实战高手教程MySQL入门到精通教程
查看: 873|回复: 0

解决hue/hiveserver2对于hive date类型显示为NULL的问题

[复制链接]
  • TA的每日心情
    奋斗
    2024-4-6 11:05
  • 签到天数: 748 天

    [LV.9]以坛为家II

    2034

    主题

    2092

    帖子

    70万

    积分

    管理员

    Rank: 9Rank: 9Rank: 9

    积分
    705612
    发表于 2021-5-26 11:32:15 | 显示全部楼层 |阅读模式

     

    用户报在Hue中执行一条sql:select admission_date, discharge_date,birth_date from hm_004_20170309141149.inpatient_visit limit 20; 返回的结果中date类型的列全部显示为null,可是在hive客户端中能正常显示。

    验证一下:beeline -u jdbc:hive2://0.0.0.0:10000 -e "select admission_date, discharge_date,birth_date from hm_004_20170309141149.inpatient_visit limit 20;"

     怀疑是hiveserver2的问题,可查询另一个包含date的表,却显示正常:select part_dt from default.kylin_sales limit 50;

    于是怀疑是serde的问题,hm_004_20170309141149.inpatient_visit用的是org.openx.data.jsonserde.JsonSerDe,default.kylin_sales用的是TextInputFormat.

    这个JsonSerDe看着有点怪,一查果然是第三方的,同事之前引入 已向开发者报问题: https://github.com/rcongiu/Hive-JSON-Serde/issues/187 
    官方自带的是org.apache.hive.hcatalog.data.JsonSerDe( https://cwiki.apache.org/confluence/display/Hive/SerDe), 位于$HIVE_HOME/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar, 用此jar测试:
    CREATE EXTERNAL TABLE `default.inpatient_visit`(
      `age_m` int COMMENT 'from deserializer', 
      `discharge_date` date COMMENT 'from deserializer', 
      `address_code` string COMMENT 'from deserializer', 
      `admission_date` date COMMENT 'from deserializer', 
      `visit_dept_name` string COMMENT 'from deserializer', 
      `birth_date` date COMMENT 'from deserializer', 
      `outcome` string COMMENT 'from deserializer', 
      `age` int COMMENT 'from deserializer')
    ROW FORMAT SERDE 
      'org.apache.hive.hcatalog.data.JsonSerDe' 
    STORED AS INPUTFORMAT 
      'org.apache.hadoop.mapred.TextInputFormat' 
    OUTPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION
      'hdfs://xxxx/user/hive/warehouse/xx.db/inpatient_visit';

    本地测试beeline -u jdbc:hive2://0.0.0.0:10000 -e "add jar /home/work/hive/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar;select admission_date, discharge_date,birth_date from default.inpatient_visit limit 20;"

     

    在Hue中测试:

     


     

    【测试系统自带JsonSerDe是否功能一样】

    CREATE TABLE json_nested_test (
        count string,
        usage string,
        pkg map<string,string>,
        languages array<string>,
        store map<string,array<map<string,string>>>)
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    STORED AS TEXTFILE;

    遇到个报错:

    2017-04-25 15:46:38,655 WARN  [main]: data.JsonSerDe (JsonSerDe.java:deserialize(181)) - Error [java.io.IOException: Start of Array expected] parsing json text [{"count":2,"usage":91273,"pkg":{"weight":8,"type":"apple"},"languages":["German","French","Italian"],"store":{"fruit":[{"weight":8,"type":"apple"},{"weight":9,"type":"pear"}]}}].
    2017-04-25 15:46:38,656 ERROR [main]: CliDriver (SessionState.java:printError(960)) - Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start of Array expected
    java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start of Array expected
            at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
            at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
            at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
            at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670)
            at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
            at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
            at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
            at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
            at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
            at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start of Array expected
            at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:183)
            at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:488)
            ... 15 more
    Caused by: java.io.IOException: Start of Array expected
            at org.apache.hive.hcatalog.data.JsonSerDe.extractCurrentField(JsonSerDe.java:332)
            at org.apache.hive.hcatalog.data.JsonSerDe.extractCurrentField(JsonSerDe.java:356)
            at org.apache.hive.hcatalog.data.JsonSerDe.populateRecord(JsonSerDe.java:218)
            at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:174)
            ... 16 more

    经过多轮测试(具体测试过程见:http://www.cnblogs.com/aprilrain/p/6916359.html),发现这个SerDe对于复杂些的嵌套会报此错,例如map<string,array<string>>

    CREATE TABLE s6 (
        store map<string,array<string>>
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    STORED AS TEXTFILE;
    load data local inpath '/home/work/s6.txt' overwrite into table s6;
    select * from s6;
    6s.txt的内容
    {"store":{"fruit":["weight","8","type","apple"]}}
    {"store":{"fruit":["weight","9","type","orange"]}}

    向社区报了一个issue: https://issues.apache.org/jira/browse/HIVE-16526

    另外还有问题:不支持数据文件中的空行: https://issues.apache.org/jira/browse/HIVE-15475,见下面的例子

    org.openx.data.jsonserde.JsonSerDe 不支持空行的例子:
    CREATE TABLE json_nested_test_openx (
        count string,
        usage string,
        pkg map<string,string>,
        languages array<string>,
        store map<string,array<map<string,string>>>)
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    STORED AS TEXTFILE;
    hive> select pkg['weight'],languages[0],store['fruit'][0]['type'] from json_nested_test_openx;
    OK
    Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating store['fruit'][0]['type']
    解决:以上错误是由于数据文件的最后一行多了一个空行,去掉空行即可解决。

     

    哎...今天够累的,签到来了1...
    回复

    使用道具 举报

    您需要登录后才可以回帖 登录 | 立即注册

    本版积分规则

    QQ|手机版|小黑屋|Java自学者论坛 ( 声明:本站文章及资料整理自互联网,用于Java自学者交流学习使用,对资料版权不负任何法律责任,若有侵权请及时联系客服屏蔽删除 )

    GMT+8, 2024-5-13 22:48 , Processed in 0.072014 second(s), 29 queries .

    Powered by Discuz! X3.4

    Copyright © 2001-2021, Tencent Cloud.

    快速回复 返回顶部 返回列表