When importing data from Oracle into Hive with Sqoop, columns of types such as RAW cause the following error:
2013-09-17 19:33:12,184 ERROR org.apache.sqoop.tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive does not support the SQL type for column RAW_TYPE_ID
at org.apache.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:195)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:187)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:425)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
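At first I assumed Sqoop simply did not recognize the RAW type when importing data. After some research, though, I found that Oracle's RAW type is reported as java.sql.Types.BINARY or java.sql.Types.VARBINARY, and Sqoop handles both when converting to Java types: it maps them to BytesWritable, the type Sqoop uses specifically for byte[] values.
The SQL-to-Java type mapping in ConnManager: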
public String toJavaType(int sqlType) {
  // Mappings taken from:
  // http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html
  if (sqlType == Types.INTEGER) {
    return "Integer";
  } else if (sqlType == Types.VARCHAR) {
    return "String";
  } else if (sqlType == Types.CHAR) {
    return "String";
  } else if (sqlType == Types.LONGVARCHAR) {
    return "String";
  } else if (sqlType == Types.NVARCHAR) {
    return "String";
  } else if (sqlType == Types.NCHAR) {
    return "String";
  } else if (sqlType == Types.LONGNVARCHAR) {
    return "String";
  } else if (sqlType == Types.NUMERIC) {
    return "java.math.BigDecimal";
  } else if (sqlType == Types.DECIMAL) {
    return "java.math.BigDecimal";
  } else if (sqlType == Types.BIT) {
    return "Boolean";
  } else if (sqlType == Types.BOOLEAN) {
    return "Boolean";
  } else if (sqlType == Types.TINYINT) {
    return "Integer";
  } else if (sqlType == Types.SMALLINT) {
    return "Integer";
  } else if (sqlType == Types.BIGINT) {
    return "Long";
  } else if (sqlType == Types.REAL) {
    return "Float";
  } else if (sqlType == Types.FLOAT) {
    return "Double";
  } else if (sqlType == Types.DOUBLE) {
    return "Double";
  } else if (sqlType == Types.DATE) {
    return "java.sql.Date";
  } else if (sqlType == Types.TIME) {
    return "java.sql.Time";
  } else if (sqlType == Types.TIMESTAMP) {
    return "java.sql.Timestamp";
  } else if (sqlType == Types.BINARY
      || sqlType == Types.VARBINARY) {
    // RAW columns arrive here and are mapped to BytesWritable.
    return BytesWritable.class.getName();
  } else if (sqlType == Types.CLOB) {
    return ClobRef.class.getName();
  } else if (sqlType == Types.BLOB
      || sqlType == Types.LONGVARBINARY) {
    return BlobRef.class.getName();
  } else {
    // TODO(aaron): Support DISTINCT, ARRAY, STRUCT, REF, JAVA_OBJECT.
    // Return null indicating database-specific manager should return a
    // java data type if it can find one for any nonstandard type.
    return null;
  }
}
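Reading further into the source, I found that the error is actually raised while the Hive table is being created, at the point where the Oracle column type is converted to its corresponding Hive type.
In the TableDefWriter.getCreateTableStmt() method: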
String hiveColType = userMapping.getProperty(col);
if (hiveColType == null) {
  hiveColType = connManager.toHiveType(inputTableName, col, colType);
}
if (null == hiveColType) {
  throw new IOException("Hive does not support the SQL type for column "
      + col);
}
Digging one level deeper: org.apache.sqoop.hive.HiveTypes indeed has no mapping for BINARY or VARBINARY:
public static String toHiveType(int sqlType) {
  switch (sqlType) {
    case Types.INTEGER:
    case Types.SMALLINT:
      return "INT";
    case Types.VARCHAR:
    case Types.CHAR:
    case Types.LONGVARCHAR:
    case Types.NVARCHAR:
    case Types.NCHAR:
    case Types.LONGNVARCHAR:
    case Types.DATE:
    case Types.TIME:
    case Types.TIMESTAMP:
    case Types.CLOB:
      return "STRING";
    case Types.NUMERIC:
    case Types.DECIMAL:
    case Types.FLOAT:
    case Types.DOUBLE:
    case Types.REAL:
      return "DOUBLE";
    case Types.BIT:
    case Types.BOOLEAN:
      return "BOOLEAN";
    case Types.TINYINT:
      return "TINYINT";
    case Types.BIGINT:
      return "BIGINT";
    default:
      // TODO(aaron): Support BINARY, VARBINARY, LONGVARBINARY, DISTINCT,
      // BLOB, ARRAY, STRUCT, REF, JAVA_OBJECT.
      return null;
  }
}
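To see the gap concretely, here is a minimal stand-alone sketch. It copies a reduced version of the switch quoted above into a hypothetical class (HiveTypeGapSketch is my own name, not part of Sqoop) and prints what comes back for a supported type versus the two binary types. It only illustrates the lookup and does not call any Sqoop code.

```java
import java.sql.Types;

// Hypothetical stand-alone sketch; this class does not exist in Sqoop.
class HiveTypeGapSketch {

    // Reduced copy of the HiveTypes.toHiveType switch quoted above.
    static String toHiveType(int sqlType) {
        switch (sqlType) {
            case Types.INTEGER:
            case Types.SMALLINT:
                return "INT";
            case Types.VARCHAR:
            case Types.CHAR:
            case Types.CLOB:
                return "STRING";
            case Types.NUMERIC:
            case Types.DECIMAL:
            case Types.DOUBLE:
                return "DOUBLE";
            case Types.BIGINT:
                return "BIGINT";
            default:
                // BINARY and VARBINARY (Oracle RAW) have no case, so they land here.
                return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("VARCHAR   -> " + toHiveType(Types.VARCHAR));
        System.out.println("BINARY    -> " + toHiveType(Types.BINARY));
        System.out.println("VARBINARY -> " + toHiveType(Types.VARBINARY));
    }
}
```

The null returned for the binary types is exactly the value that getCreateTableStmt() turns into the IOException shown at the top.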
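So the problem is pinned down:
When it creates the Hive table with its default settings, Sqoop cannot find a Hive type that corresponds to Oracle's RAW type, so it fails.
The fix:
1. Supply the Hive type for the column yourself with --map-column-hive
For example:
--map-column-hive RAW_TYPE_ID=STRING
This sets the Hive type of RAW_TYPE_ID to STRING.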
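Why the option helps can be sketched in a few lines: the user-supplied mapping is consulted first, and the built-in lookup (which returns null for RAW) is only a fallback. The class below is a hypothetical, simplified illustration of that order of checks; parseMapping, defaultHiveType and resolve are names I made up, and the comma-separated parsing is an assumption, not Sqoop's actual option parser.

```java
import java.io.IOException;
import java.sql.Types;
import java.util.Properties;

// Hypothetical stand-alone sketch; these are not Sqoop's actual classes.
class MapColumnHiveSketch {

    // Parse "COL1=TYPE1,COL2=TYPE2" into Properties (simplified, assumed format).
    static Properties parseMapping(String spec) {
        Properties mapping = new Properties();
        if (spec == null || spec.isEmpty()) {
            return mapping;
        }
        for (String pair : spec.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Bad mapping: " + pair);
            }
            mapping.setProperty(kv[0].trim(), kv[1].trim());
        }
        return mapping;
    }

    // Stand-in for the default lookup; returns null for types it does not know.
    static String defaultHiveType(int sqlType) {
        switch (sqlType) {
            case Types.INTEGER:
                return "INT";
            case Types.VARCHAR:
                return "STRING";
            default:
                return null;
        }
    }

    // Same order of checks as the getCreateTableStmt() excerpt:
    // user mapping first, default lookup second, IOException if both fail.
    static String resolve(Properties userMapping, String col, int sqlType)
            throws IOException {
        String hiveColType = userMapping.getProperty(col);
        if (hiveColType == null) {
            hiveColType = defaultHiveType(sqlType);
        }
        if (hiveColType == null) {
            throw new IOException(
                "Hive does not support the SQL type for column " + col);
        }
        return hiveColType;
    }

    public static void main(String[] args) throws IOException {
        Properties mapping = parseMapping("RAW_TYPE_ID=STRING");
        System.out.println(resolve(mapping, "RAW_TYPE_ID", Types.VARBINARY));
    }
}
```

With an empty mapping, the same resolve call throws the IOException, which mirrors the failure described at the start.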
Annoyingly, Sqoop insists on going through the Hive table creation step on every import, and it cannot be skipped automatically...