Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.2.0
- Fix Version/s: None
- Component/s: Server
- Labels: None
Description
The following error occurred when running an MR job on a Parquet table created from Hive:
I0316 09:59:46.307365 49245 status.cc:112] couldn't deserialize thrift msg:
No more data to read.
@ 0x82a459 impala::Status::Status()
@ 0xc648c9 impala::DeserializeThriftMsg<>()
@ 0xc59302 impala::HdfsParquetScanner::BaseScalarColumnReader::ReadDataPage()
@ 0xc5a476 impala::HdfsParquetScanner::BaseScalarColumnReader::NextPage()
@ 0xc67735 impala::HdfsParquetScanner::BaseScalarColumnReader::NextLevels<>()
@ 0xc68a7c impala::HdfsParquetScanner::ScalarColumnReader<>::ReadNonRepeatedValue()
@ 0xc6a92f impala::HdfsParquetScanner::AssembleRows<>()
@ 0xc5f1c0 impala::HdfsParquetScanner::ProcessSplit()
@ 0xc386a1 impala::HdfsScanNode::ScannerThread()
@ 0xbe8faf impala::Thread::SuperviseThread()
@ 0xbe9ef4 boost::detail::thread_data<>::run()
@ 0xe59c7a thread_proxy
@ 0x3e49807851 (unknown)
@ 0x3e494e894d (unknown)
When max_tasks is set to 0, the error sometimes disappears. We need to find the root cause.
Here is the statement used to create the Parquet table from Hive:
create table parquet_table stored as parquet as select * from text_table;
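As a first triage step, it may be worth ruling out a truncated or incompletely written data file, since a Thrift "No more data to read" inside ReadDataPage() would be consistent with the reader running past the end of the available bytes. That cause is only an assumption; nothing in this report confirms it. The sketch below checks only the outer framing of an unencrypted Parquet file (the leading and trailing `PAR1` magic and the 4-byte little-endian footer length). It does not parse page headers. The function name `check_parquet_framing` is hypothetical and is not part of Impala or Hive.

```python
import struct

MAGIC = b"PAR1"  # 4-byte magic at both ends of an unencrypted Parquet file


def check_parquet_framing(data: bytes) -> str:
    """Return 'ok' or a short description of the first framing problem found.

    Hypothetical helper for triage only: it validates the file envelope,
    not the column chunks or page headers inside it.
    """
    # Smallest possible layout: magic (4) + footer (>=1) + length (4) + magic (4)
    if len(data) < 12:
        return "too short"
    if data[:4] != MAGIC:
        return "bad header magic"
    if data[-4:] != MAGIC:
        return "bad footer magic"
    # Footer length is stored as a little-endian uint32 just before the trailing magic
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    if footer_len == 0 or footer_len > len(data) - 12:
        return "bad footer length"
    return "ok"
```

To use it, read one of the table's data files from the warehouse directory into memory (for example after copying it locally) and pass the bytes to the function. A result other than `ok` would point at the file written by Hive. An `ok` result would shift suspicion toward the scanner or the way the split is read.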