java.lang.Object
- com.pervasive.datarush.operators.io.textfile.DelimitedTextAnalyzer

```
public class DelimitedTextAnalyzer
extends Object
```
An analyzer for files containing delimited text. An analysis can perform a basic parsing of the file, permitting validation of delimiter configuration. The following information is provided as a result of analyzing a file:
- The values of fields for analyzed records.
- The record separator. If the properties specify auto-detection of newline style, the analyzer will determine whether the file uses Windows-style CRLF or UNIX-style LF.
- The field separator. If the properties specify auto-detection of the field separator, the analyzer will attempt to determine the appropriate separator from a known set: comma (','), tab ('\t'), semicolon (';'), pipe ('|'), and space (' ').
- The field delimiter. If the properties specify auto-detection of the field delimiter, the analyzer will attempt to determine the appropriate delimiter from a known set: single quote (') or double quote ("). If one cannot be determined, the text is assumed to be undelimited.
- The comment marker. If the properties specify auto-detection of the comment marker, the analyzer will attempt to determine the appropriate comment marker from a known set: #, %, and //. If one cannot be determined, it is assumed there is no comment marker.
This information can be used to generate a schema for the records, but also could be used to provide a preview of how a file would be parsed with given settings.
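
To make the idea of separator auto-detection concrete, here is a minimal sketch that uses only the Java standard library. It is not the analyzer's actual algorithm, which this page does not describe. The class `SeparatorSniffer` and its `detect` method are hypothetical names, and the heuristic (pick the candidate that appears the same non-zero number of times on every sample line) is an assumption that ignores quoted fields.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; NOT the DataRush implementation. */
final class SeparatorSniffer {
    // Candidate separators named in the class description.
    private static final char[] CANDIDATES = {',', '\t', ';', '|', ' '};

    /**
     * Returns the candidate that occurs the same non-zero number of times on
     * every sample line, preferring the candidate with the most occurrences.
     * Returns '\0' when no candidate qualifies.
     */
    static char detect(List<String> lines) {
        char best = '\0';
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int expected = -1;                 // count seen on the first line
            boolean consistent = !lines.isEmpty();
            for (String line : lines) {
                int count = 0;
                for (int i = 0; i < line.length(); i++) {
                    if (line.charAt(i) == candidate) {
                        count++;
                    }
                }
                if (expected < 0) {
                    expected = count;
                } else if (count != expected) {
                    consistent = false;        // field count differs between lines
                    break;
                }
            }
            if (consistent && expected > bestCount) {
                best = candidate;
                bestCount = expected;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("id;name;city", "1;Ann;Oslo");
        System.out.println("Detected separator: '" + detect(sample) + "'");
    }
}
```

A real analyzer would also have to account for separators that appear inside delimited (quoted) fields, which this sketch deliberately leaves out.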

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`static class`	`DelimitedTextAnalyzer.Analysis`	Contains the results of an analysis of a delimited text file.

Constructor Summary

Constructors
Constructor	Description
`DelimitedTextAnalyzer(FieldDelimiterSpecifier delimiters)`	Creates a new analyzer which uses the given delimiter information.

Method Summary

Modifier and Type	Method	Description
`DelimitedTextAnalyzer.Analysis`	`analyze(Path file, CharsetEncoding charsetSpec)`	Analyzes the specified file based on current configuration.
`DelimitedTextAnalyzer.Analysis`	`analyze(Path file, CharsetEncoding charsetSpec, FileClient client)`	Analyzes the specified file based on current configuration.
`DelimitedTextAnalyzer.Analysis`	`analyze(Reader input)`	Analyzes the specified text stream based on current configuration.
`DelimitedTextAnalyzer.Analysis`	`analyze(String file, CharsetEncoding charsetSpec)`	Analyzes the specified file based on current configuration.
`void`	`setAnalysisSize(int count)`	Sets the maximum number of characters to use in analysis.
`void`	`setHeaderSkipCount(int count)`	Sets the number of lines to skip at the beginning of the file.
`void`	`setLineComment(String lineComment)`	Sets the value of the indicator that a line is commented and should be ignored.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - DelimitedTextAnalyzer
```
public DelimitedTextAnalyzer(FieldDelimiterSpecifier delimiters)
```
    Creates a new analyzer which uses the given delimiter information. Initially, the analyzer is configured to allow unlimited record length and to parse only the first row.
    
    Parameters:
    
    delimiters - field structure information from which to initialize settings
- Method Detail
  - setAnalysisSize
```
public void setAnalysisSize(int count)
```
    Sets the maximum number of characters to use in analysis. This value should be large enough to contain at least one record. By default, 1MB is analyzed.
    
    Parameters:
    
    count - the number of characters to analyze
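
    The effect of a character limit can be pictured with the following sketch, which reads at most a fixed number of characters from a `Reader` before any analysis would start. It uses only the Java standard library; `SampleReader` and `readSample` are hypothetical names, and how the analyzer itself buffers input is not documented here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Illustrative sketch only; NOT the DataRush implementation. */
final class SampleReader {
    /** Reads up to maxChars characters from input, or fewer at end of stream. */
    static String readSample(Reader input, int maxChars) throws IOException {
        char[] buffer = new char[maxChars];
        int filled = 0;
        while (filled < maxChars) {
            // read() may return fewer characters than requested, so loop.
            int n = input.read(buffer, filled, maxChars - filled);
            if (n < 0) {
                break;                         // end of stream
            }
            filled += n;
        }
        return new String(buffer, 0, filled);
    }

    public static void main(String[] args) throws IOException {
        String sample = readSample(new StringReader("id,name\n1,Ann\n2,Bob\n"), 13);
        System.out.println(sample);
    }
}
```

    If the limit cuts a record in half, the truncated tail cannot be parsed reliably, which is why the limit should cover at least one complete record.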
  - setLineComment
```
public void setLineComment(String lineComment)
```
    Sets the value of the indicator that a line is commented and should be ignored. This line comment indicator must be found at the beginning of a line to be considered a comment.
    
    Parameters:
    
    lineComment - the string value indicating a line is commented out
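
    The rule that a marker only counts at the beginning of a line can be sketched as below, using only the Java standard library. `CommentFilter` and `dropCommentLines` are hypothetical names, and the sketch assumes "beginning of a line" means the very first character, with no leading whitespace allowed; the page does not say how the analyzer treats indented markers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; NOT the DataRush implementation. */
final class CommentFilter {
    /** Returns the lines that do not start with the given comment marker. */
    static List<String> dropCommentLines(List<String> lines, String lineComment) {
        List<String> kept = new ArrayList<>();
        boolean hasMarker = lineComment != null && !lineComment.isEmpty();
        for (String line : lines) {
            // A marker later in the line is ordinary data, not a comment.
            if (hasMarker && line.startsWith(lineComment)) {
                continue;
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("# exported file", "id,tag", "1,#blue");
        System.out.println(dropCommentLines(lines, "#"));
    }
}
```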
  - setHeaderSkipCount
```
public void setHeaderSkipCount(int count)
```
    Sets the number of lines to skip at the beginning of the file. Skipped lines are only analyzed for newline discovery; they are ignored in the remainder of the analysis. By default, no lines are skipped.
    
    Parameters:
    
    count - the number of lines at the start of the file to skip
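
    As a rough illustration of skipped lines still contributing to newline discovery, the sketch below inspects the first line break (which may sit inside the skipped header) and then drops the leading lines. It uses only the Java standard library; `HeaderSkipper`, `detectNewline`, and `skipLines` are hypothetical names, and looking only at the first line break is an assumption, not documented analyzer behavior.

```java
/** Illustrative sketch only; NOT the DataRush implementation. */
final class HeaderSkipper {
    /** Returns "\r\n" if the first line break is CRLF, "\n" if LF, or null if none. */
    static String detectNewline(String text) {
        int lf = text.indexOf('\n');
        if (lf < 0) {
            return null;
        }
        return (lf > 0 && text.charAt(lf - 1) == '\r') ? "\r\n" : "\n";
    }

    /** Returns the text that follows the first count lines. */
    static String skipLines(String text, int count) {
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int lf = text.indexOf('\n', pos);
            if (lf < 0) {
                return "";                     // fewer lines than requested
            }
            pos = lf + 1;
        }
        return text.substring(pos);
    }

    public static void main(String[] args) {
        String text = "Report\r\nGenerated today\r\nid,name\r\n1,Ann\r\n";
        // Newline style is taken from the full text, including skipped lines.
        System.out.println("CRLF? " + "\r\n".equals(detectNewline(text)));
        System.out.print(skipLines(text, 2));
    }
}
```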
  - analyze
```
public DelimitedTextAnalyzer.Analysis analyze(String file,
                                              CharsetEncoding charsetSpec)
                                       throws IOException
```
    Analyzes the specified file based on current configuration. The file will be processed assuming the delimiters with which the analyzer was constructed. The analysis will also indicate the delimiters used in the file. This will be the set of delimiters provided initially to the analyzer plus any discovered delimiters.
    
    Parameters:
    
    file - path to the delimited text file to analyze
    
    charsetSpec - description of the file's character set encoding
    
    Returns:
    
    an analysis of the delimited text file
    
    Throws:
    
    IOException - if an error occurs while reading the file
    
    RowTooLongException - if the first row exceeds the configured length
  - analyze
```
public DelimitedTextAnalyzer.Analysis analyze(Path file,
                                              CharsetEncoding charsetSpec)
                                       throws IOException
```
    Analyzes the specified file based on current configuration. The file will be processed assuming the delimiters with which the analyzer was constructed. The analysis will also indicate the delimiters used in the file. This will be the set of delimiters provided initially to the analyzer plus any discovered delimiters.
    
    Parameters:
    
    file - path to the delimited text file to analyze
    
    charsetSpec - description of the file's character set encoding
    
    Returns:
    
    an analysis of the delimited text file
    
    Throws:
    
    IOException - if an error occurs while reading the file
    
    RowTooLongException - if the first row exceeds the configured length
  - analyze
```
public DelimitedTextAnalyzer.Analysis analyze(Path file,
                                              CharsetEncoding charsetSpec,
                                              FileClient client)
                                       throws IOException
```
    Analyzes the specified file based on current configuration. The file will be processed assuming the delimiters with which the analyzer was constructed. The analysis will also indicate the delimiters used in the file. This will be the set of delimiters provided initially to the analyzer plus any discovered delimiters.
    
    Parameters:
    
    file - path to the delimited text file to analyze
    
    charsetSpec - description of the file's character set encoding
    
    client - the authorization context to use for accessing the file
    
    Returns:
    
    an analysis of the delimited text file
    
    Throws:
    
    IOException - if an error occurs while reading the file
    
    RowTooLongException - if the first row exceeds the configured length
  - analyze
```
public DelimitedTextAnalyzer.Analysis analyze(Reader input)
                                       throws IOException,
                                              RowTooLongException
```
    Analyzes the specified text stream based on current configuration. The text will be processed assuming the delimiters with which the analyzer was constructed. The analysis will also indicate the delimiters used in the text. This will be the set of delimiters provided initially to the analyzer plus any discovered delimiters.
    
    Parameters:
    
    input - the text data to analyze
    
    Returns:
    
    an analysis of the delimited text
    
    Throws:
    
    IOException - if an error occurs while reading the input
    
    RowTooLongException - if the first row exceeds the configured length
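
    To show what a preview of field values from a text stream might look like once a separator and delimiter are known, here is a minimal standard-library sketch. `PreviewParser`, `preview`, and `splitLine` are hypothetical names unrelated to `DelimitedTextAnalyzer.Analysis`, whose accessors are not listed on this page. The sketch assumes one record per line and does not handle escaped delimiters or fields that span lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; NOT the DataRush implementation. */
final class PreviewParser {
    /** Parses up to maxRecords lines from input into lists of field values. */
    static List<List<String>> preview(Reader input, char separator, char quote,
                                      int maxRecords) throws IOException {
        List<List<String>> records = new ArrayList<>();
        BufferedReader reader = new BufferedReader(input);
        String line;
        // readLine() accepts both CRLF and LF line endings.
        while (records.size() < maxRecords && (line = reader.readLine()) != null) {
            records.add(splitLine(line, separator, quote));
        }
        return records;
    }

    /** Splits one line on separator, ignoring separators inside quoted fields. */
    static List<String> splitLine(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                quoted = !quoted;              // toggle state; drop the quote itself
            } else if (c == separator && !quoted) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());          // last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) throws IOException {
        Reader input = new StringReader("id,name\r\n1,\"Smith, Ann\"\r\n");
        System.out.println(preview(input, ',', '"', 10));
    }
}
```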
