The readSequenceFile morphline command should not reuse the "key" and "value" Hadoop Writable objects across rows.
Downstream commands such as loadSolr or the HBase indexer buffer a batch of records before sending them to Solr. If every buffered record holds a reference to the same Hadoop Writable object as its primary key id, the result is nonsensical behaviour: all the records in the batch suddenly appear to be the same record (same id), because they all see whatever value was last written into the shared object.
A work-around is to insert the commands
immediately after the readSequenceFile command in your morphline. This converts the key and value from the Hadoop Writable to distinct String objects, so each row holds its own key and value instances instead of sharing one mutable object across rows.
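
The aliasing problem, and the effect of converting to a String, can be illustrated with a minimal, self-contained Java sketch. It does not use Hadoop or morphlines: MutableText is a hypothetical stand-in for a mutable Writable such as Text, and the List stands in for a downstream command's record buffer. The class and method names are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WritableReuseDemo {

    /** Hypothetical stand-in for a mutable Hadoop Writable such as Text. */
    static final class MutableText {
        private String value = "";
        void set(String v) { value = v; }
        @Override public String toString() { return value; }
    }

    /** Problematic pattern: one mutable key object is reused for every row. */
    static List<Object> bufferReused(String[] ids) {
        MutableText key = new MutableText();
        List<Object> buffer = new ArrayList<>();
        for (String id : ids) {
            key.set(id);      // overwrites the value seen by all earlier rows
            buffer.add(key);  // every entry refers to the same instance
        }
        return buffer;
    }

    /** Work-around pattern: convert to an immutable String before buffering. */
    static List<Object> bufferCopied(String[] ids) {
        MutableText key = new MutableText();
        List<Object> buffer = new ArrayList<>();
        for (String id : ids) {
            key.set(id);
            buffer.add(key.toString()); // distinct String object per row
        }
        return buffer;
    }

    public static void main(String[] args) {
        String[] ids = {"a", "b", "c"};
        System.out.println(bufferReused(ids)); // prints [c, c, c]
        System.out.println(bufferCopied(ids)); // prints [a, b, c]
    }
}
```

In the first method all buffered entries report the last id that was read, which matches the "same id" symptom described above. In the second, each row keeps its own value because a String cannot be modified after it is created.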