Package org.apache.avro.hadoop.io
Class AvroSequenceFile
java.lang.Object
org.apache.avro.hadoop.io.AvroSequenceFile
A wrapper around a Hadoop
SequenceFile that also
supports reading and writing Avro data.
The vanilla Hadoop SequenceFile contains a header
followed by a sequence of records. A record consists of a
key and a value. The key and value must either:
- implement the Writable interface, or
- be accepted by a Serialization registered with the SerializationFactory.
Since Avro data are Plain Old Java Objects (e.g., Integer for
data with schema "int"), they do not implement Writable.
Furthermore, a Serialization
implementation cannot determine whether an object instance of type
CharSequence that also implements Writable should
be serialized using Avro or WritableSerialization.
The solution implemented in AvroSequenceFile is to:
- wrap Avro key data in an AvroKey object,
- wrap Avro value data in an AvroValue object,
- configure and register AvroSerialization with the SerializationFactory, which will accept only objects that are instances of either AvroKey or AvroValue, and
- store the Avro key and value schemas in the SequenceFile header.
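The wrapping approach can be illustrated with a small, dependency-free model. This is only a sketch: the Writable, Serialization, AvroKey, and WritableText types below are simplified stand-ins defined for this example (they are not the Hadoop or Avro classes), and pick is a hypothetical helper that mimics a factory asking each registered serialization whether it accepts a class.

```java
import java.util.Arrays;
import java.util.List;

class WrapperDispatchSketch {
    /** Stand-in for Hadoop's Writable marker; NOT the real interface. */
    interface Writable {}

    /** Stand-in for a registered serialization that claims the classes it handles. */
    interface Serialization {
        String name();
        boolean accept(Class<?> c);
    }

    /** Simplified wrapper playing the role that AvroKey plays in the text above. */
    static final class AvroKey<T> {
        private final T datum;
        AvroKey(T datum) { this.datum = datum; }
        T datum() { return datum; }
    }

    /** The ambiguous case: a CharSequence that also implements Writable. */
    static final class WritableText implements CharSequence, Writable {
        private final String s;
        WritableText(String s) { this.s = s; }
        public int length() { return s.length(); }
        public char charAt(int i) { return s.charAt(i); }
        public CharSequence subSequence(int a, int b) { return s.subSequence(a, b); }
        @Override public String toString() { return s; }
    }

    /** Accepts anything that implements the Writable stand-in. */
    static final Serialization WRITABLE = new Serialization() {
        public String name() { return "writable"; }
        public boolean accept(Class<?> c) { return Writable.class.isAssignableFrom(c); }
    };

    /** Accepts only the wrapper type, never a bare datum. */
    static final Serialization AVRO = new Serialization() {
        public String name() { return "avro"; }
        public boolean accept(Class<?> c) { return AvroKey.class.isAssignableFrom(c); }
    };

    /** Hypothetical helper: returns the name of the first serialization that accepts the object. */
    static String pick(Object o) {
        List<Serialization> registered = Arrays.asList(WRITABLE, AVRO);
        for (Serialization s : registered) {
            if (s.accept(o.getClass())) {
                return s.name();
            }
        }
        throw new IllegalArgumentException("no serialization accepts " + o.getClass());
    }

    public static void main(String[] args) {
        WritableText text = new WritableText("hello");
        System.out.println(pick(text));                // prints "writable"
        System.out.println(pick(new AvroKey<>(text))); // prints "avro"
    }
}
```

Because the Avro-side serialization in this model accepts only the wrapper class, the same datum is routed unambiguously: unwrapped it goes to the Writable path, wrapped it goes to the Avro path.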
Nested Class Summary
Nested Classes
- static class AvroSequenceFile.Reader: A reader for SequenceFiles that may contain Avro data.
- static class AvroSequenceFile.Writer: A writer for an uncompressed SequenceFile that supports Avro data.
Field Summary
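Fields
- METADATA_FIELD_KEY_SCHEMA: The SequenceFile.Metadata field for the Avro key writer schema.
- METADATA_FIELD_VALUE_SCHEMA: The SequenceFile.Metadata field for the Avro value writer schema.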
Method Summary
Methods
- static SequenceFile.Writer createWriter(AvroSequenceFile.Writer.Options options): Creates a writer from a set of options.
Field Details
METADATA_FIELD_KEY_SCHEMA
The SequenceFile.Metadata field for the Avro key writer schema.
METADATA_FIELD_VALUE_SCHEMA
The SequenceFile.Metadata field for the Avro value writer schema.
Method Details
createWriter
public static SequenceFile.Writer createWriter(AvroSequenceFile.Writer.Options options) throws IOException
Creates a writer from a set of options.
Since there are different implementations of Writer depending on the compression type, this method constructs the appropriate subclass for the compression type given in the options.
Parameters:
- options - The options for the writer.
Returns:
- A new writer instance.
Throws:
- IOException - If the writer cannot be created.