Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 0.17.0
- Fix Version/s: None
- Component/s: None
- Labels: None
Description
I'm using Kite's Parquet support from Sqoop and I'm running into an issue importing a somewhat wide table (142 columns): Hive throws a metastore exception because the serialized Avro schema is longer than 4000 characters, which I'm assuming is the limit the Postgres-backed metastore enforces on table property values, including schema literals:
Caused by: org.postgresql.util.PSQLException: ERROR: value too long for type character varying(4000)
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
at com.jolbox.bonecp.PreparedStatementHandle.executeUpdate(PreparedStatementHandle.java:205)
at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeUpdate(ParamLoggingPreparedStatement.java:399)
at org.datanucleus.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:439)
at org.datanucleus.store.rdbms.scostore.JoinMapStore.internalPut(JoinMapStore.java:1069)
... 40 more
)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:24255)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:24223)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:24149)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:893)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:879)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:569)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:558)
at org.kitesdk.data.spi.hive.MetaStoreUtil$4.call(MetaStoreUtil.java:179)
at org.kitesdk.data.spi.hive.MetaStoreUtil$4.call(MetaStoreUtil.java:176)
at org.kitesdk.data.spi.hive.MetaStoreUtil.doWithRetry(MetaStoreUtil.java:66)
at org.kitesdk.data.spi.hive.MetaStoreUtil.createTable(MetaStoreUtil.java:191)
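For context, a minimal sketch of why a 142-column table blows past that limit. This is an assumption-heavy illustration, not the actual imported schema: it pretends every column maps to a nullable Avro string (the real column types change the exact length, but not the outcome), and the record name wide_table is made up.

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class SchemaLengthCheck {
    public static void main(String[] args) {
        // Build a hypothetical wide record (142 nullable string columns) to
        // mimic the imported table.
        SchemaBuilder.FieldAssembler<Schema> fields =
                SchemaBuilder.record("wide_table").fields();
        for (int i = 0; i < 142; i++) {
            fields = fields.optionalString("col_" + i);
        }
        Schema schema = fields.endRecord();

        // The serialized schema ends up as a table property value
        // (avro.schema.literal), and the Postgres-backed metastore stores
        // property values in a varchar(4000) column, per the error above.
        String literal = schema.toString();
        System.out.println("avro.schema.literal length: " + literal.length());
        System.out.println("exceeds 4000-char limit:    " + (literal.length() > 4000));
    }
}

Each nullable field serializes to roughly 50 characters of JSON, so 142 of them put the literal well past 4000 characters before the record wrapper is even counted.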
I would think the right solution here would be to have an option for writing large schemas to a file in HDFS that could be referenced from the Hive metastore via a URL.
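A rough sketch of that idea follows. This is not Kite's implementation; the helper and the .avsc path are hypothetical, and the only property name relied on is the standard Hive Avro SerDe key avro.schema.url, which points the table at a schema file instead of embedding the literal.

import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.avro.Schema;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SchemaToHdfs {
    /**
     * Writes the Avro schema to an .avsc file in HDFS and returns the
     * qualified URL that the Hive table could carry as avro.schema.url
     * instead of avro.schema.literal.
     */
    public static String writeSchema(Configuration conf, Schema schema, Path schemaPath)
            throws Exception {
        FileSystem fs = schemaPath.getFileSystem(conf);
        try (OutputStream out = fs.create(schemaPath, true /* overwrite */)) {
            // Pretty-printed schema JSON written once to HDFS.
            out.write(schema.toString(true).getBytes(StandardCharsets.UTF_8));
        }
        // The metastore row then only holds a short URL, so its length no
        // longer depends on how wide the schema is.
        return fs.makeQualified(schemaPath).toString();
    }
}

KITE-696, which this issue was closed as a duplicate of, takes essentially this approach by storing the Avro schema in HDFS by default for Hive datasets.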
Issue Links
- duplicates KITE-696: Store Avro schema in HDFS by default for Hive datasets (Resolved)