pl.edu.agh.cast.importer.base.tokenizer
Interface IImportTokenizer

All Known Implementing Classes:
AbstractImportTokenizer, AckTokenizer, CsvTokenizer, FixedWidthTokenizer, XlsTokenizer

public interface IImportTokenizer

Interface for classes responsible for the initial analysis of an input stream and its conversion to a list of raw record tables. Implementations depend on the input file format and may define their own specialised ITokenizerOption instances.

Author:
AGH CAST Team

Method Summary
 List<ITokenizerOption> getTokenizerOptions()
          Returns the tokenizer options.
 void setEncoding(Charset charset)
          Sets the input stream encoding.
 void setTokenizerOptions(List<ITokenizerOption> options)
          Sets the tokenizer options.
 List<RawTabularData> tokenize(InputStream dataIs, long rowsLimit, org.eclipse.core.runtime.IProgressMonitor monitor)
          Splits the given input stream into tokens, using the specified tokenizer options.
 

Method Detail

tokenize

List<RawTabularData> tokenize(InputStream dataIs,
                              long rowsLimit,
                              org.eclipse.core.runtime.IProgressMonitor monitor)
                              throws IOException
Splits the given input stream into tokens, using the specified tokenizer options.

Parameters:
dataIs - the data input stream to tokenize
rowsLimit - the maximum number of rows to be imported
monitor - the progress monitor for the tokenization operation
Returns:
the tokenized data in an unanalyzed tabular form
Throws:
IOException
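
The contract of tokenize can be illustrated with a minimal sketch. It is not the CAST implementation: Table and Monitor are hypothetical stand-ins for RawTabularData and org.eclipse.core.runtime.IProgressMonitor, and the charset and delimiter are passed as extra parameters, whereas a real tokenizer would presumably receive them through setEncoding and setTokenizerOptions. The sketch reads the stream line by line, stops at rowsLimit or on cancellation, and returns the cells as plain strings.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TokenizeSketch {

    /** Hypothetical stand-in for RawTabularData: one table of unanalyzed string cells. */
    public static final class Table {
        public final List<List<String>> rows = new ArrayList<>();
    }

    /** Hypothetical stand-in for IProgressMonitor, reduced to the two calls used here. */
    public interface Monitor {
        boolean isCanceled();
        void worked(int units);
    }

    /** A monitor that never cancels and ignores progress. */
    public static final Monitor NO_OP = new Monitor() {
        @Override public boolean isCanceled() { return false; }
        @Override public void worked(int units) { }
    };

    /** Splits each line on a literal delimiter; reads at most rowsLimit rows. */
    public static List<Table> tokenize(InputStream dataIs, long rowsLimit, Monitor monitor,
                                       Charset charset, String delimiter) throws IOException {
        Table table = new Table();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(dataIs, charset))) {
            String line;
            while (table.rows.size() < rowsLimit
                    && !monitor.isCanceled()
                    && (line = reader.readLine()) != null) {
                // A limit of -1 keeps trailing empty cells instead of dropping them.
                table.rows.add(Arrays.asList(line.split(Pattern.quote(delimiter), -1)));
                monitor.worked(1);
            }
        }
        List<Table> result = new ArrayList<>();
        result.add(table);
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a,b,c\n1,2,3\nx,y,z\n".getBytes(StandardCharsets.UTF_8);
        List<Table> tables = tokenize(new ByteArrayInputStream(data), 2, NO_OP,
                StandardCharsets.UTF_8, ",");
        System.out.println(tables.get(0).rows); // prints [[a, b, c], [1, 2, 3]]
    }
}
```

The result is a list because the interface returns List<RawTabularData>. This sketch always produces exactly one table; a format with several sheets or sections would presumably produce one table per unit.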

setEncoding

void setEncoding(Charset charset)
Sets the input stream encoding.

Parameters:
charset - the encoding of data contained in the input stream
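
Why the encoding matters can be shown with a small standalone sketch that uses only java.nio.charset and no CAST classes. The same bytes decode to different text depending on the Charset, so a tokenizer presumably needs the correct charset before it reads the stream. The decode helper is hypothetical and exists only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    /** Hypothetical helper: decodes raw bytes with the given charset. */
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "Krak\u00f3w" is "Krakow" with an o-acute, written as an escape so the
        // example does not depend on the encoding of the source file.
        byte[] raw = "Krak\u00f3w".getBytes(StandardCharsets.UTF_8);

        String correct = decode(raw, StandardCharsets.UTF_8);
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(correct.length()); // prints 6
        System.out.println(wrong.length());   // prints 7: the two-byte o-acute became two characters
    }
}
```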

setTokenizerOptions

void setTokenizerOptions(List<ITokenizerOption> options)
Sets the tokenizer options.

Parameters:
options - the options passed to the data tokenizer

getTokenizerOptions

List<ITokenizerOption> getTokenizerOptions()
Returns the tokenizer options.

Returns:
the tokenizer options
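
The documentation above does not say whether the returned list is a live view or a copy. One conservative way to use the getter and setter together is to read the options, edit a copy, and pass the copy back. The sketch below assumes that pattern: Option and Holder are hypothetical stand-ins for ITokenizerOption and an IImportTokenizer implementation, and the name/value shape of an option is an assumption, not the real interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OptionsSketch {

    /** Hypothetical stand-in for ITokenizerOption; the real interface is not shown here. */
    public static final class Option {
        public final String name;
        public final String value;

        public Option(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Hypothetical holder with the same getter/setter pair as IImportTokenizer. */
    public static final class Holder {
        private List<Option> options = new ArrayList<>();

        public void setTokenizerOptions(List<Option> options) {
            // Copy so that later changes to the caller's list do not leak in.
            this.options = new ArrayList<>(options);
        }

        public List<Option> getTokenizerOptions() {
            // Read-only view: callers must go through the setter to change options.
            return Collections.unmodifiableList(options);
        }
    }

    public static void main(String[] args) {
        Holder tokenizer = new Holder();

        // Read, edit a copy, write back.
        List<Option> edited = new ArrayList<>(tokenizer.getTokenizerOptions());
        edited.add(new Option("delimiter", ";"));
        tokenizer.setTokenizerOptions(edited);

        System.out.println(tokenizer.getTokenizerOptions().size()); // prints 1
    }
}
```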


Copyright © 2007-2009 IISG AGH-UST Krakow, Poland. All Rights Reserved.