`

sqoop导数类型不支持解决办法:Hive does not support the SQL type for column

 
阅读更多
sqoop导数从oracle到hive中,遇到RAW等类型时,会报错:

013-09-17 19:33:12,184 ERROR org.apache.sqoop.tool.ImportTool: Encountered IOException running import job: java.io.IOException: [color=darkred]Hive does not support the SQL type for column RAW_TYPE_ID[/color]
  at rg.apache.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:195)
        at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:187)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:425)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)


看到这个错,原以为是sqoop在导入数据时,不认识RAW类型,但是查询资料发现,RAW类型在oracle中表示的类型是 java.sql.Types.BINARY 或 java.sql.Types.VARBINARY,而这两种类型,在sqoop转java类型时都有处理,它转成了BytesWritable类型,这个类型是sqoop专门为处理byte[]类型处理的。
ConnManager中对oracle类型的转换对应关系
  public String toJavaType(int sqlType) {
    // Mappings taken from:
    // http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html
    if (sqlType == Types.INTEGER) {
      return "Integer";
    } else if (sqlType == Types.VARCHAR) {
      return "String";
    } else if (sqlType == Types.CHAR) {
      return "String";
    } else if (sqlType == Types.LONGVARCHAR) {
      return "String";
    } else if (sqlType == Types.NVARCHAR) {
      return "String";
    } else if (sqlType == Types.NCHAR) {
      return "String";
    } else if (sqlType == Types.LONGNVARCHAR) {
      return "String";
    } else if (sqlType == Types.NUMERIC) {
      return "java.math.BigDecimal";
    } else if (sqlType == Types.DECIMAL) {
      return "java.math.BigDecimal";
    } else if (sqlType == Types.BIT) {
      return "Boolean";
    } else if (sqlType == Types.BOOLEAN) {
      return "Boolean";
    } else if (sqlType == Types.TINYINT) {
      return "Integer";
    } else if (sqlType == Types.SMALLINT) {
      return "Integer";
    } else if (sqlType == Types.BIGINT) {
      return "Long";
    } else if (sqlType == Types.REAL) {
      return "Float";
    } else if (sqlType == Types.FLOAT) {
      return "Double";
    } else if (sqlType == Types.DOUBLE) {
      return "Double";
    } else if (sqlType == Types.DATE) {
      return "java.sql.Date";
    } else if (sqlType == Types.TIME) {
      return "java.sql.Time";
    } else if (sqlType == Types.TIMESTAMP) {
      return "java.sql.Timestamp";
    } else if (sqlType == Types.BINARY
        || sqlType == Types.VARBINARY) {
      return BytesWritable.class.getName();
    } else if (sqlType == Types.CLOB) {
      return ClobRef.class.getName();
    } else if (sqlType == Types.BLOB
        || sqlType == Types.LONGVARBINARY) {
      return BlobRef.class.getName();
    } else {
      // TODO(aaron): Support DISTINCT, ARRAY, STRUCT, REF, JAVA_OBJECT.
      // Return null indicating database-specific manager should return a
      // java data type if it can find one for any nonstandard type.
      return null;
    }


后来再查看源码,发现这个报错是在创建hive表是报出来的,是在oracle类型转为对应的hive类型时报的:

TableDefWriter.getCreateTableStmt()方法中:

 String hiveColType = userMapping.getProperty(col);
      if (hiveColType == null) {
        hiveColType = connManager.toHiveType(inputTableName, col, colType);
      }
      if (null == hiveColType) {
        throw new IOException("Hive does not support the SQL type for column "
            + col);
      }


再查发现:org.apache.sqoop.hive.HiveTypes中确实没有对应的BINARY和VARBINARY的处理类型:
 public static String toHiveType(int sqlType) {

      switch (sqlType) {
          case Types.INTEGER:
          case Types.SMALLINT:
              return "INT";
          case Types.VARCHAR:
          case Types.CHAR:
          case Types.LONGVARCHAR:
          case Types.NVARCHAR:
          case Types.NCHAR:
          case Types.LONGNVARCHAR:
          case Types.DATE:
          case Types.TIME:
          case Types.TIMESTAMP:
          case Types.CLOB:
              return "STRING";
          case Types.NUMERIC:
          case Types.DECIMAL:
          case Types.FLOAT:
          case Types.DOUBLE:
          case Types.REAL:
              return "DOUBLE";
          case Types.BIT:
          case Types.BOOLEAN:
              return "BOOLEAN";
          case Types.TINYINT:
              return "TINYINT";
          case Types.BIGINT:
              return "BIGINT";
          default:
        // TODO(aaron): Support BINARY, VARBINARY, LONGVARBINARY, DISTINCT,
        // BLOB, ARRAY, STRUCT, REF, JAVA_OBJECT.
        return null;
      }
  }


于是问题定位到了:
    在默认创建hive表时,sqoop根据oracle的RAW类型无法找到对应的HIVE类型,所以报错。

于是解决方法为:
1.通过 --map-column-hive 自己提供列对应的类型
  如:
--map-column-hive RAW_TYPE_ID=STRING

这样就指定RAW_TYPE_ID对应的HIVE类型为STRING类型


擦,sqoop导入数据时居然每次都强迫创建一遍hive表,不能自动取消...

分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics