Package org.apache.avro.file
Class DataFileWriter<D>
java.lang.Object
org.apache.avro.file.DataFileWriter<D>
- All Implemented Interfaces:
- Closeable, Flushable, AutoCloseable
Stores in a file a sequence of data conforming to a schema. The schema is
 stored in the file with the data. Each datum in a file is of the same schema.
 Data is written with a DatumWriter. Data is grouped into blocks. A
 synchronization marker is written between blocks, so that files may be split.
 Blocks may be compressed. Extensible metadata is stored in the file header,
 alongside the schema. Files may be appended to.
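As a rough illustration of the lifecycle described above (construct, create, append, close), here is a minimal sketch that writes a few generic records to an in-memory stream and reads them back. The `User` schema and the class and method names are hypothetical and exist only for this example; `GenericDatumWriter` is one possible `DatumWriter`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class BasicWriteSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  /** Writes `count` records and returns the resulting file bytes. */
  static byte[] writeUsers(int count) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // try-with-resources calls close(), which flushes the last block.
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, out); // writes the header, including the schema
      for (int i = 0; i < count; i++) {
        GenericRecord user = new GenericData.Record(SCHEMA);
        user.put("name", "user-" + i);
        writer.append(user);
      }
    }
    return out.toByteArray();
  }

  /** Reads the file back and counts its records. */
  static int countRecords(byte[] avroFile) throws IOException {
    int n = 0;
    try (DataFileStream<GenericRecord> in = new DataFileStream<>(
        new ByteArrayInputStream(avroFile), new GenericDatumReader<GenericRecord>())) {
      while (in.hasNext()) {
        in.next();
        n++;
      }
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    byte[] file = writeUsers(3);
    System.out.println(countRecords(file) + " records in " + file.length + " bytes");
  }
}
```

The reader does not need to be given the schema because it is stored in the file itself.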
- 
Nested Class Summary
- static class DataFileWriter.AppendWriteException: Thrown by append(Object) when an exception occurs while writing a datum to the buffer.
- 
Constructor Summary
- DataFileWriter(DatumWriter<D> dout): Construct a writer, not yet open.
- 
Method Summary
- void append(D datum): Append a datum to the file.
- void appendAllFrom(DataFileStream<D> otherFile, boolean recompress): Appends data from another file. otherFile must have the same schema.
- void appendEncoded(ByteBuffer datum): Expert: Append a pre-encoded datum to the file.
- DataFileWriter<D> appendTo(File file): Open a writer appending to an existing file.
- DataFileWriter<D> appendTo(SeekableInput in, OutputStream out): Open a writer appending to an existing file.
- void close(): Flush and close the file.
- DataFileWriter<D> create(Schema schema, File file): Open a new file for data matching a schema with a random sync.
- DataFileWriter<D> create(Schema schema, OutputStream outs): Open a new file for data matching a schema with a random sync.
- DataFileWriter<D> create(Schema schema, OutputStream outs, byte[] sync): Open a new file for data matching a schema with an explicit sync.
- void flush(): Calls sync() and then flushes the current state of the file.
- void fSync(): If this writer was instantiated using a File, FileOutputStream or Syncable instance, this method flushes all buffers for this writer to disk.
- boolean isFlushOnEveryBlock()
- static boolean isReservedMeta(String key)
- DataFileWriter<D> setCodec(CodecFactory c): Configures this writer to use the given codec.
- DataFileWriter<D> setEncoder(Function<OutputStream, BinaryEncoder> initEncoderFunc): Allows setting a different encoder than the default DirectBinaryEncoder.
- void setFlushOnEveryBlock(boolean flushOnEveryBlock): Set whether this writer should flush the block to the stream every time a sync marker is written.
- DataFileWriter<D> setMeta(String key, byte[] value): Set a metadata property.
- DataFileWriter<D> setMeta(String key, long value): Set a metadata property.
- DataFileWriter<D> setMeta(String key, String value): Set a metadata property.
- DataFileWriter<D> setSyncInterval(int syncInterval): Set the synchronization interval for this file, in bytes.
- long sync(): Return the current position as a value that may be passed to DataFileReader.seek(long).
- 
Constructor Details
DataFileWriter
Construct a writer, not yet open.
 
- 
- 
Method Details
setCodec
Configures this writer to use the given codec. May not be reset after writes have begun.
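A minimal sketch of configuring a codec before the file is opened. It assumes the deflate codec from `CodecFactory` and reads the codec name back from the `avro.codec` metadata entry; the schema and class name are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class CodecSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  /** Writes one deflate-compressed record and returns the codec name stored in the file. */
  static String writeWithDeflate() throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.setCodec(CodecFactory.deflateCodec(6)); // set before create()
      writer.create(SCHEMA, out);
      GenericRecord user = new GenericData.Record(SCHEMA);
      user.put("name", "alice");
      writer.append(user);
    }
    try (DataFileStream<GenericRecord> in = new DataFileStream<>(
        new ByteArrayInputStream(out.toByteArray()), new GenericDatumReader<GenericRecord>())) {
      return in.getMetaString("avro.codec");
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("codec: " + writeWithDeflate());
  }
}
```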
- 
setSyncInterval
Set the synchronization interval for this file, in bytes. Valid values range from 32 to 2^30. Suggested values are between 2K and 2M. The stream is flushed by default at the end of each synchronization interval. If setFlushOnEveryBlock(boolean) is called with the parameter set to false, the block may not be flushed to the stream after the sync marker is written; in that case flush() must be called to flush the stream. Invalid values throw IllegalArgumentException.
- Parameters:
- syncInterval - the approximate number of uncompressed bytes to write in each block
- Returns:
- this DataFileWriter
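The range check described for setSyncInterval can be probed with a small sketch like the one below. The class and helper names are hypothetical; the writer is never opened, so no file is produced.

```java
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class SyncIntervalSketch {
  /** Returns true if setSyncInterval accepts the value, false if it is rejected. */
  static boolean isAccepted(int syncInterval) {
    DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>());
    try {
      writer.setSyncInterval(syncInterval);
      return true;
    } catch (IllegalArgumentException e) {
      // Documented behavior for values outside the valid range.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("64K accepted: " + isAccepted(64 * 1024));
    System.out.println("16 accepted: " + isAccepted(16));
  }
}
```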
 
- 
setEncoder
Allows setting a different encoder than the default DirectBinaryEncoder.
- Parameters:
- initEncoderFunc - function to create a binary encoder
- Returns:
- this DataFileWriter
 
- 
create
Open a new file for data matching a schema with a random sync.
- Throws:
- IOException
 
- 
create
Open a new file for data matching a schema with a random sync.
- Throws:
- IOException
 
- 
create
Open a new file for data matching a schema with an explicit sync.
- Throws:
- IOException
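A sketch of the explicit-sync variant of create. It assumes a 16-byte marker, which is the sync marker length used by the Avro container format; the marker value, schema and class name are hypothetical. Because every block is followed by the marker, the file is expected to end with the bytes that were passed in.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class ExplicitSyncSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  /** Writes one record using the given sync marker and returns the file bytes. */
  static byte[] write(byte[] sync) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, out, sync);
      GenericRecord user = new GenericData.Record(SCHEMA);
      user.put("name", "alice");
      writer.append(user);
    }
    return out.toByteArray();
  }

  /** Returns true if the file ends with the given marker. */
  static boolean endsWith(byte[] file, byte[] sync) {
    if (file.length < sync.length) {
      return false;
    }
    byte[] tail = Arrays.copyOfRange(file, file.length - sync.length, file.length);
    return Arrays.equals(tail, sync);
  }

  public static void main(String[] args) throws IOException {
    byte[] sync = new byte[16]; // assumed marker length
    Arrays.fill(sync, (byte) 7);
    System.out.println("ends with marker: " + endsWith(write(sync), sync));
  }
}
```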
 
- 
setFlushOnEveryBlock
public void setFlushOnEveryBlock(boolean flushOnEveryBlock)
Set whether this writer should flush the block to the stream every time a sync marker is written. By default, the writer flushes the buffer each time a sync marker is written (that is, when the block size limit is reached or sync() is called).
- Parameters:
- flushOnEveryBlock - if set to false, this writer will not flush the block to the stream until flush() is explicitly called.
 
- 
isFlushOnEveryBlock
public boolean isFlushOnEveryBlock()
- Returns:
- true if this writer flushes the block to the stream every time a sync marker is written; otherwise false.
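The following sketch disables per-block flushing and then flushes explicitly. It records how many bytes have reached the underlying stream after sync() and after flush(); how much is held back in between depends on internal buffering, so the sketch only relies on the stream being complete after flush(). The schema and class name are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class FlushSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  /** Returns the flag value of a freshly constructed writer. */
  static boolean defaultFlag() {
    return new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))
        .isFlushOnEveryBlock();
  }

  /** Returns {bytes visible after sync(), bytes visible after flush()}. */
  static long[] sizesAfterSyncAndFlush() throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.setFlushOnEveryBlock(false);
      writer.create(SCHEMA, out);
      GenericRecord user = new GenericData.Record(SCHEMA);
      user.put("name", "alice");
      writer.append(user);
      writer.sync(); // ends the block; may leave it buffered
      long afterSync = out.size();
      writer.flush(); // pushes everything to the stream
      long afterFlush = out.size();
      return new long[] {afterSync, afterFlush};
    }
  }

  public static void main(String[] args) throws IOException {
    long[] sizes = sizesAfterSyncAndFlush();
    System.out.println("after sync: " + sizes[0] + ", after flush: " + sizes[1]);
  }
}
```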
 
- 
appendTo
Open a writer appending to an existing file.
- Throws:
- IOException
 
- 
appendTo
Open a writer appending to an existing file. Since 1.9.0 this method does not close in.
- Parameters:
- in - reading the existing file.
- out - positioned at the end of the existing file.
- Throws:
- IOException
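A sketch of appending to an existing file using the File-based appendTo overload: the file is first created with two records, then reopened and extended with a third. The schema, class name and helper names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class AppendToSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  static void appendNames(DataFileWriter<GenericRecord> writer, String... names)
      throws IOException {
    for (String name : names) {
      GenericRecord user = new GenericData.Record(SCHEMA);
      user.put("name", name);
      writer.append(user);
    }
  }

  /** Creates the file, reopens it for appending, and returns the final record count. */
  static int createAppendAndCount(File file) throws IOException {
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, file);
      appendNames(writer, "alice", "bob");
    }
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.appendTo(file); // schema and sync marker come from the existing file
      appendNames(writer, "carol");
    }
    int n = 0;
    try (DataFileReader<GenericRecord> reader =
        new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
      while (reader.hasNext()) {
        reader.next();
        n++;
      }
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    File file = File.createTempFile("append-sketch", ".avro");
    file.deleteOnExit();
    System.out.println("records: " + createAppendAndCount(file));
  }
}
```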
 
- 
setMeta
Set a metadata property.
- 
isReservedMeta
- 
setMeta
Set a metadata property.
- 
setMeta
Set a metadata property.
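A sketch of setting a user metadata property and reading it back, together with isReservedMeta. The key `owner` and its value are hypothetical. The sketch sets metadata before calling create, on the assumption that metadata is written with the file header, and it assumes that keys beginning with `avro.` are the reserved ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class MetaSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  /** Stores a metadata value under `key` and returns what a reader sees. */
  static String roundTrip(String key, String value) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.setMeta(key, value); // set before create()
      writer.create(SCHEMA, out);
    }
    try (DataFileStream<GenericRecord> in = new DataFileStream<>(
        new ByteArrayInputStream(out.toByteArray()), new GenericDatumReader<GenericRecord>())) {
      return in.getMetaString(key);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("owner = " + roundTrip("owner", "team-a"));
    System.out.println("avro.schema reserved: " + DataFileWriter.isReservedMeta("avro.schema"));
    System.out.println("owner reserved: " + DataFileWriter.isReservedMeta("owner"));
  }
}
```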
- 
append
Append a datum to the file.
- Throws:
- IOException
 
- 
appendEncoded
Expert: Append a pre-encoded datum to the file. No validation is performed to check that the encoding conforms to the file's schema. Appending non-conforming data may result in an unreadable file.
- Throws:
- IOException
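A sketch of appendEncoded, in which the datum is serialized separately with a `BinaryEncoder` and the resulting bytes are handed to the writer. The schema, class name and helper names are hypothetical. Since no validation is performed, the sketch encodes with the same schema that the file was created with.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

class AppendEncodedSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  /** Serializes a record to its Avro binary encoding. */
  static ByteBuffer encode(GenericRecord record) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(bytes, null);
    new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
    encoder.flush();
    return ByteBuffer.wrap(bytes.toByteArray());
  }

  /** Appends a pre-encoded record and returns the name read back from the file. */
  static String roundTrip(String name) throws IOException {
    GenericRecord user = new GenericData.Record(SCHEMA);
    user.put("name", name);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, out);
      writer.appendEncoded(encode(user)); // bytes are trusted as-is
    }
    try (DataFileStream<GenericRecord> in = new DataFileStream<>(
        new ByteArrayInputStream(out.toByteArray()), new GenericDatumReader<GenericRecord>())) {
      return in.next().get("name").toString();
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("read back: " + roundTrip("alice"));
  }
}
```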
 
- 
appendAllFrom
Appends data from another file. otherFile must have the same schema. Data blocks are copied without deserializing the data. If the codecs of the two files are compatible, data blocks are copied directly without decompression. If the codecs are not compatible, blocks from otherFile are decompressed and then compressed using this file's codec.
If the recompress flag is set, all blocks are decompressed and then compressed using this file's codec. This is useful when the two files have compatible compression codecs but different codec options. For example, one might append a file compressed with deflate at compression level 1 to a file with deflate at compression level 7. If recompress is false, blocks are copied without changing the compression level. If true, they are converted to the new compression level.
- Parameters:
- otherFile - the file to append; must have the same schema as this file
- recompress - if true, recompress every block with this file's codec
- Throws:
- IOException
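A sketch of appendAllFrom that merges the blocks of one in-memory file into another writer without recompression. Both files use the same hypothetical schema and the default codec, so the sketch assumes the blocks can be copied directly. The class and helper names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class AppendAllFromSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  static void appendNames(DataFileWriter<GenericRecord> writer, String... names)
      throws IOException {
    for (String name : names) {
      GenericRecord user = new GenericData.Record(SCHEMA);
      user.put("name", name);
      writer.append(user);
    }
  }

  static DataFileStream<GenericRecord> open(byte[] file) throws IOException {
    return new DataFileStream<>(
        new ByteArrayInputStream(file), new GenericDatumReader<GenericRecord>());
  }

  /** Writes a standalone file containing the given names. */
  static byte[] write(String... names) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, out);
      appendNames(writer, names);
    }
    return out.toByteArray();
  }

  /** Appends two records directly, then all blocks of another file; returns the total count. */
  static int mergedCount() throws IOException {
    byte[] other = write("carol", "dave", "erin");
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> writer =
            new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA));
        DataFileStream<GenericRecord> otherFile = open(other)) {
      writer.create(SCHEMA, out);
      appendNames(writer, "alice", "bob");
      writer.appendAllFrom(otherFile, false); // false: keep blocks as they are
    }
    int n = 0;
    try (DataFileStream<GenericRecord> in = open(out.toByteArray())) {
      while (in.hasNext()) {
        in.next();
        n++;
      }
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    System.out.println("merged records: " + mergedCount());
  }
}
```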
 
- 
sync
Return the current position as a value that may be passed to DataFileReader.seek(long). Forces the end of the current block, emitting a synchronization marker. By default, this will also flush the block to the stream. If setFlushOnEveryBlock(boolean) is called with the parameter set to false, this method may not flush the block; in that case flush() must be called to flush the stream.
- Throws:
- IOException
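A sketch of using the value returned by sync() with DataFileReader.seek(long). Two records are written, the block is ended with sync(), and two more records follow; seeking to the saved position should place the reader at the first record written after the sync. The schema, names and in-memory `SeekableByteArrayInput` are choices made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.file.SeekableByteArrayInput;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class SyncSeekSketch {
  // Hypothetical schema, used only for illustration.
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

  static void appendNames(DataFileWriter<GenericRecord> writer, String... names)
      throws IOException {
    for (String name : names) {
      GenericRecord user = new GenericData.Record(SCHEMA);
      user.put("name", name);
      writer.append(user);
    }
  }

  /** Returns the name of the first record found after seeking to the sync position. */
  static String firstNameAfterSync() throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    long position;
    try (DataFileWriter<GenericRecord> writer =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, out);
      appendNames(writer, "alice", "bob");
      position = writer.sync(); // ends the first block
      appendNames(writer, "carol", "dave");
    }
    try (DataFileReader<GenericRecord> reader = new DataFileReader<>(
        new SeekableByteArrayInput(out.toByteArray()),
        new GenericDatumReader<GenericRecord>())) {
      reader.seek(position); // jump straight to the second block
      return reader.next().get("name").toString();
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("first record after sync: " + firstNameAfterSync());
  }
}
```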
 
- 
flush
Calls sync() and then flushes the current state of the file.
- Specified by:
- flush in interface Flushable
- Throws:
- IOException
 
- 
fSync
If this writer was instantiated using a File, FileOutputStream or Syncable instance, this method flushes all buffers for this writer to disk. In other cases, this method behaves exactly like flush().
- Throws:
- IOException
 
- 
close
Flush and close the file.
- Specified by:
- close in interface AutoCloseable
- Specified by:
- close in interface Closeable
- Throws:
- IOException
 
 