Module datarush.library
Class DelimitedTextFormat
java.lang.Object
com.pervasive.datarush.operators.io.textfile.DelimitedTextFormat
All Implemented Interfaces:
DataFormat
public class DelimitedTextFormat extends Object implements DataFormat
Describes the format of a delimited text file. Normally, it is not necessary to construct these directly. Instead, use ReadDelimitedText and WriteDelimitedText to access data stored as delimited text.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface com.pervasive.datarush.operators.io.DataFormat
DataFormat.DataFormatter, DataFormat.DataParser
-
-
Constructor Summary
Constructors
Constructor and Description
DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding)
Create a data format for accessing delimited text data.
DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata)
Create a data format for accessing delimited text data.
DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata, boolean hasHeader, String lineComment, int skipCount)
Create a data format for accessing delimited text data.
-
Method Summary
Modifier and Type, Method, Description
DataFormat.DataParser createParser(ParsingOptions options)
Create a new parser for the format using the specified parsing options.
DataFormat.DataFormatter createWriter(FormattingOptions options)
Create a new writer for the format using the specified formatting options.
FileMetadata getMetadata()
Gets the metadata associated with the format.
RecordTokenType getType()
Gets the record type associated with the format.
boolean isSplittable()
Indicates if the format supports parsing of subsections of a file.
FileMetadata readMetadata(FileClient fileClient, ByteSource source)
Reads the metadata associated with the format.
void setMetadata(FileMetadata metadata)
Sets the metadata associated with the format.
void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)
Writes the provided metadata associated with the format.
-
-
-
Constructor Detail
-
DelimitedTextFormat
public DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding)
Create a data format for accessing delimited text data. The text is assumed to have no header and to use '#' as the line comment marker.
Parameters:
schema - the schema to use for records. This provides field names as well as formatting information for field values.
delimiters - a description of the delimiters used in the text
encoding - character set definition for data encoding in the text
-
DelimitedTextFormat
public DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata)
Create a data format for accessing delimited text data. The text is assumed to have no header and to use '#' as the line comment marker.
Parameters:
schema - the schema to use for records. This provides field names as well as formatting information for field values.
delimiters - a description of the delimiters used in the text
encoding - character set definition for data encoding in the text
metadata - the metadata associated with the data
-
DelimitedTextFormat
public DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata, boolean hasHeader, String lineComment, int skipCount)
Create a data format for accessing delimited text data.
Parameters:
schema - the schema to use for records. This provides field names as well as formatting information for field values.
delimiters - a description of the delimiters used in the text
encoding - character set definition for data encoding
metadata - the metadata associated with the data
hasHeader - indicates whether the first record is a header
lineComment - characters used to indicate line comments in the text
skipCount
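As a rough illustration of what the hasHeader and lineComment settings mean for a reader, the following plain-Java sketch filters a list of lines. It does not use or reproduce the DataRush parser: the class HeaderCommentSketch and the method dataLines are hypothetical, and the choice to discard comment lines before looking for the header is an assumption made for the example, not documented behavior. skipCount is left out because its meaning is not described here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not part of the DataRush API. */
class HeaderCommentSketch {

    /**
     * Returns the lines that would be treated as data records.
     * Assumption: comment lines are dropped first, then the first
     * remaining line is dropped if hasHeader is true.
     */
    static List<String> dataLines(List<String> lines, boolean hasHeader, String lineComment) {
        List<String> result = new ArrayList<>();
        boolean headerPending = hasHeader;
        for (String line : lines) {
            boolean isComment = lineComment != null
                    && !lineComment.isEmpty()
                    && line.startsWith(lineComment);
            if (isComment) {
                continue; // comment line: ignored entirely
            }
            if (headerPending) {
                headerPending = false; // first non-comment line is the header
                continue;
            }
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("# generated", "id,name", "1,a", "# note", "2,b");
        // Header present, '#' marks comments: only the two data rows remain.
        System.out.println(dataLines(lines, true, "#"));
        // No header (the default of the shorter constructors): the first row is data too.
        System.out.println(dataLines(lines, false, "#"));
    }
}
```

With the shorter constructors, the equivalent call in this sketch would be dataLines(lines, false, "#"), matching the stated defaults of no header and '#' as the comment marker.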
-
-
Method Detail
-
isSplittable
public boolean isSplittable()
Indicates if the format supports parsing of subsections of a file.
A format should only return true if it can, at least in some situations, support this sort of parsing. If a format requires reading the entire file, it must return false.
If a format is not splittable, a file in the format cannot be parsed in parallel; however, individual files can still be parsed independently in parallel, as when reading the contents of a directory or using a file globbing pattern.
Generally, delimited text data is splittable. However, if any fields contain the record separator in their delimited value, it may not be.
Specified by:
isSplittable in interface DataFormat
Returns:
true if the format supports parsing only a portion of the file, false otherwise
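The caveat about fields containing the record separator can be made concrete with a small plain-Java sketch. It is not the DataRush parser: the class SplitSketch and both methods are hypothetical, and the example assumes '\n' as the record separator and '"' as the field quote. A reader that starts in the middle of a file has no quote state to rely on, so it behaves like the naive splitter below and can cut a quoted field in two.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not part of the DataRush API. */
class SplitSketch {

    /**
     * Splits text into records at '\n', treating newlines inside
     * double-quoted fields as data. Assumes quotes are balanced.
     */
    static List<String> quoteAwareRecords(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : text.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes; // a doubled quote toggles twice, leaving the state unchanged
            }
            if (c == '\n' && !inQuotes) {
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    /** Treats every '\n' as a record boundary, as a reader without quote context would. */
    static List<String> naiveRecords(String text) {
        return List.of(text.split("\n"));
    }

    public static void main(String[] args) {
        // Two records; the first contains a newline inside a quoted field.
        String text = "1,\"line one\nline two\"\n2,plain\n";
        System.out.println(quoteAwareRecords(text).size()); // prints 2
        System.out.println(naiveRecords(text).size());      // prints 3
    }
}
```

The mismatch between the two counts is the situation in which a delimited file cannot safely be divided at arbitrary record separators.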
-
getType
public RecordTokenType getType()
Description copied from interface: DataFormat
Gets the record type associated with the format. Records produced by the associated parser or consumed by the associated formatter will be of this type.
For many formats, this may be derived from a schema object describing the format layout.
Specified by:
getType in interface DataFormat
Returns:
the format's record type
-
getMetadata
public FileMetadata getMetadata()
Description copied from interface: DataFormat
Gets the metadata associated with the format. Records produced by the associated parser or consumed by the associated formatter will use this metadata.
Specified by:
getMetadata in interface DataFormat
Returns:
the format's metadata
-
setMetadata
public void setMetadata(FileMetadata metadata)
Description copied from interface: DataFormat
Sets the metadata associated with the format.
Specified by:
setMetadata in interface DataFormat
-
readMetadata
public FileMetadata readMetadata(FileClient fileClient, ByteSource source)
Description copied from interface: DataFormat
Reads the metadata associated with the format.
Specified by:
readMetadata in interface DataFormat
Parameters:
fileClient - client used to read file
source - location of the files
-
writeMetadata
public void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)
Description copied from interface: DataFormat
Writes the provided metadata associated with the format.
Specified by:
writeMetadata in interface DataFormat
Parameters:
metadata - the metadata to write
fileClient - client used to write file
-
createParser
public DataFormat.DataParser createParser(ParsingOptions options)
Description copied from interface: DataFormat
Create a new parser for the format using the specified parsing options.
Specified by:
createParser in interface DataFormat
Parameters:
options - parsing options to use
Returns:
a new parser for reading external data
-
createWriter
public DataFormat.DataFormatter createWriter(FormattingOptions options)
Description copied from interface: DataFormat
Create a new writer for the format using the specified formatting options.
Specified by:
createWriter in interface DataFormat
Parameters:
options - formatting options to use
Returns:
a new formatter for writing external data