File item reader message source · eMagiz Documentation

Auto create directory

Specify whether to automatically create the source directory if it does not yet exist when this adapter is being initialized.

If set to false and the directory does not exist upon initialization, an Exception will be thrown.

Default: true

Filename pattern

Only filenames matching this ant style expression will be picked up by this adapter.

The ant style expression uses the following rules:

'?' matches one character
'*' matches zero or more characters

Some examples:

t?st.xml - matches test.xml but also tast.xml or txst.xml
*.xml - matches all .xml files

Note: Use filename-regex for more advanced patterns.
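The two wildcard rules above can be tried out with a small sketch. It uses Python's standard `fnmatch` module as a stand-in for the adapter's own matcher, which is an assumption made for illustration: `fnmatch` additionally understands `[seq]` character classes, which the rules above do not mention. The helper name `matches_pattern` is hypothetical.

```python
from fnmatch import fnmatchcase


def matches_pattern(filename, pattern):
    """Return True if filename matches a '?' / '*' style pattern (case-sensitive)."""
    return fnmatchcase(filename, pattern)


# '?' stands for exactly one character
print(matches_pattern("test.xml", "t?st.xml"))
print(matches_pattern("tst.xml", "t?st.xml"))

# '*' stands for zero or more characters
print(matches_pattern("orders.xml", "*.xml"))
print(matches_pattern("orders.csv", "*.xml"))
```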

Ignore hidden

Whether hidden files shall be ignored by this adapter.

If disabled, hidden files will be processed just like normal files. If enabled, an IgnoreHiddenFileListFilter will be added to filter the hidden files.

Default is true.

Prevent duplicates

Specifies whether duplicates should be prevented, by keeping an (unbounded) list of file names in memory and only passing files the first time they are polled.

If enabled, this duplicate prevention is done before any other filtering, i.e. before applying the filename pattern, the filename regex or a custom filter.

Default is true.
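As a rough illustration of the idea, the sketch below keeps an unbounded set of names in memory and passes each name only the first time it is seen. The class name `AcceptOnceFilter` and its method are hypothetical; this is not the adapter's actual implementation.

```python
class AcceptOnceFilter:
    """Passes each file name only the first time it is offered."""

    def __init__(self):
        self._seen = set()  # unbounded: grows with every new file name

    def filter(self, names):
        fresh = []
        for name in names:
            if name not in self._seen:
                self._seen.add(name)
                fresh.append(name)
        return fresh


accept_once = AcceptOnceFilter()
first_poll = accept_once.filter(["a.xml", "b.txt"])
second_poll = accept_once.filter(["a.xml", "b.txt", "c.xml"])
print(first_poll, second_poll)
```

Because this filter runs before any other filtering, a name such as `b.txt` is remembered even if a later filename pattern would reject it.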

Delete files

If set to 'true', files are deleted once they are done processing, that is, once all items that could be read successfully have been converted to messages.

Default is 'false'.

Item reader type

This item reader will be used to read items from the Files and convert them to message payload objects.

Required.

Options:

1. Flat file item reader: Flat file item readers read lines of data from a flat file that typically describe records with fields of data defined by fixed positions in the file or delimited by some special character (e.g. a comma).

2. Stax event item reader: Item reader for reading XML input based on StAX. It extracts fragments from the input XML document which correspond to records for processing.

Fragment root element names

Comma-separated list of names of the root element(s) of the fragment(s). Can be either the local name of an XML element (in this case the namespace is not checked at all), or it can contain the namespace using the {namespace}localname notation (in this case the namespace must be a match).

Required
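The two naming notations can be illustrated with a short sketch. It uses Python's `xml.etree.ElementTree`, which happens to write qualified names in the same `{namespace}localname` form; it is a stand-in for the StAX-based reader, not its implementation. The function names and the sample document are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading '{namespace}' part, if any."""
    return tag.rsplit("}", 1)[-1]


def is_fragment_root(tag, root_names):
    for name in root_names:
        if name.startswith("{"):
            # {namespace}localname notation: namespace must match too
            if tag == name:
                return True
        elif local_name(tag) == name:
            # plain local name: namespace is not checked at all
            return True
    return False


def read_fragments(xml_text, root_names):
    """Collect every element whose name is one of the fragment root names."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    return [elem for _event, elem in ET.iterparse(source, events=("end",))
            if is_fragment_root(elem.tag, root_names)]


DOC = '<orders xmlns:a="urn:a"><order>1</order><a:order>2</a:order><other/></orders>'

print([f.text for f in read_fragments(DOC, ["order"])])
print([f.text for f in read_fragments(DOC, ["{urn:a}order"])])
```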

Strict

In strict mode the reader will throw an exception if the input resource does not exist.

Default is 'true'.

Result type

Sets the result type for the unmarshal operation.

Default is "String".

Queue size

Specify the maximum number of file names read into memory when scanning the directory. This is useful to limit the memory footprint of this endpoint.

A larger queue size reduces the number of directory listings needed, but it increases the chances of the internal queue getting out of sync with the actual files listed in the directory. Use 0 for small but volatile directories, use a large number for large directories that are only written to.

Using a stateful filter would counter this benefit, so accept once file list filter is not used when this attribute is specified.

If not specified (the default), all file names are read into memory. This makes it possible to apply stateful filters (such as the accept once file list filter), but this setting should not be used with directories that contain a vast number of files.

Filename regex

Only files matching this regular expression will be picked up by this adapter.

Examples:

^.*\.xml$ - matches all .xml files
^test\d{4}\.xml$ - matches all filenames that start with test followed by 4 digits and end with .xml
${FILENAME_REGEX} - matches all files that match the regular expression inside the global FILENAME_REGEX property
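The first two examples can be checked with a quick sketch. Python's `re` module is used here as a stand-in for the Java regular expression engine (the two agree for these simple patterns), and the dot before `xml` is escaped so that it matches a literal '.' rather than any character. The helper name `accepts` is hypothetical.

```python
import re

XML_FILES = re.compile(r"^.*\.xml$")
TEST_FILES = re.compile(r"^test\d{4}\.xml$")


def accepts(regex, filename):
    """Return True if the whole file name matches the regular expression."""
    return regex.fullmatch(filename) is not None


print(accepts(XML_FILES, "orders.xml"))
print(accepts(TEST_FILES, "test2024.xml"))
print(accepts(TEST_FILES, "test24.xml"))
```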

See the Java documentation about patterns (Pattern documentation) for the full syntax.

Filter

You can supply a custom filter to prevent creating messages for certain files. Use this if you need more control over the filtering process than is possible with the filename pattern or filename regex options.

Note that if prevent duplicates is enabled, an accept once file list filter with an unbounded queue is automatically applied before your custom filter is called.

Use watch service

By default this channel adapter will scan all items (files and directories!) in the specified source directory, but not in any of its subdirectories. By enabling this option you can change this default behaviour.

The watch service relies on file system events when new files are added to the directory. During initialization, the directory is registered to generate events; the initial file list is also built. While walking the directory tree, any subdirectories encountered are also registered to generate events. On the first poll, the initial file list from walking the directory is returned. On subsequent polls, files from new creation events are returned. If a new subdirectory is added, its creation event is used to walk the new subtree to find existing files, as well as registering any new subdirectories found.

Note that any specified filters are still applied after the watch service returns the list of files.

Watch events

Comma-separated list of system event types (CREATE, MODIFY, DELETE) the watch service will listen to.

Default is CREATE.

Channel

Channel where the generated messages should be sent to.

You can select the nullChannel here to silently drop the messages.

Required

Documentation

Specifies when and how the reading task is executed.

When empty, the default global poller is used.

Use default poller

Specifies if the global (default) poller should be used or an included poller.

The poller specifies when and how the reading task is executed.

If the global poller is used, it should be added as a separate support object.

Id

Name that uniquely identifies this flow component.

Required

Resource

Input resource file for the reader. The resource can be created using the JobLaunchRequest job parameters.

Example

file:#{jobParameters['input.file']}

Note that you need to set the scope to Step

Line mapper type

Sets the type of the used line mapper which maps lines to objects.

Required.

Options:

1. Default line mapper: Used when all of the records in a file have the same format. Converts and maps lines to domain objects.

2. Pattern matching composite line mapper: Used when there are multiple line types within a single input file. It selects the correct mapping for each line using pattern matching.

3. Pass through line mapper: Useful for passing the original String back directly rather than a mapped object.

Line tokenizer type

Specifies the type of line tokenizer to use.

Given a line of input, the tokenizer returns a Field Set representing that line. This Field Set can then be passed to a Field Set Mapper.

Delimited Line Tokenizer - Used for files where fields in a record are separated by a delimiter.
Fixed Length Tokenizer - Used for files where fields in a record are each a 'fixed width'.
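The difference between the two tokenizer types can be sketched as follows. The function names are hypothetical, and the 1-based inclusive column ranges used for the fixed-length case are an assumption made for illustration, not a statement about how the component is configured.

```python
import csv


def tokenize_delimited(line, delimiter=",", quote='"'):
    """Split a line on a delimiter, keeping quoted delimiters inside a field."""
    return next(csv.reader([line], delimiter=delimiter, quotechar=quote))


def tokenize_fixed_length(line, ranges):
    """Cut a line into fields using 1-based, inclusive (start, end) columns."""
    return [line[start - 1:end] for start, end in ranges]


print(tokenize_delimited('1,"Lovelace, Ada",UK'))
print(tokenize_fixed_length("AB123XYZ", [(1, 2), (3, 5), (6, 8)]))
```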

Field set mapper type

Specifies the field set mapper type

Required.

Options:

1. Xml field set mapper: Converts a FieldSet to an XML document by adding an XML element for each field to the XML root element. These XML elements have the same name as the field they represent, and the field value as their (text) content.

2. Pass through field set mapper: Useful for passing a FieldSet back directly rather than a mapped object.
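To make the XML mapping concrete, here is a minimal sketch of the described conversion: one child element per field, named after the field, with the field value as text. The function name and the root element name `record` are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET


def field_set_to_xml(names, values, root_name="record"):
    """Build an XML document with one child element per (name, value) field."""
    root = ET.Element(root_name)
    for name, value in zip(names, values):
        child = ET.SubElement(root, name)
        child.text = value
    return ET.tostring(root, encoding="unicode")


print(field_set_to_xml(["id", "name"], ["42", "Ada"]))
# prints <record><id>42</id><name>Ada</name></record>
```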

Scope

Enables late binding of references that are accessible from the StepContext, using #{..} placeholders. With this feature, bean properties can be pulled from the step or job execution context and from the job parameters.

Lines to skip

The number of lines to skip at the start of a file. Can be used if the file contains a header without useful (column name) information, and without a comment delimiter at the beginning of the lines.

Default is '0'.

Charset

Sets the character set used for reading the input data.

Default is the default character set of this Java virtual machine.

Comment prefix

Comment prefixes (separated by commas). Can be used to ignore header lines as well by using e.g. the first couple of column names as a prefix.

Default is '#'.
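The interplay between "Lines to skip" and "Comment prefix" can be sketched as below. The function name `read_data_lines` and the sample lines are hypothetical; the sketch only shows the two ways of getting rid of a header line that the descriptions above mention.

```python
def read_data_lines(lines, lines_to_skip=0, comment_prefixes=("#",)):
    """Drop the first N lines and every line starting with a comment prefix."""
    result = []
    for index, line in enumerate(lines):
        if index < lines_to_skip:
            continue  # header line(s) skipped by position
        if line.startswith(tuple(comment_prefixes)):
            continue  # comment line (or header matched by its first columns)
        result.append(line)
    return result


LINES = ["id,name", "# exported nightly", "1,Ada", "2,Bob"]

# Option 1: skip the header by position
print(read_data_lines(LINES, lines_to_skip=1))
# Option 2: treat the first column names as an extra comment prefix
print(read_data_lines(LINES, comment_prefixes=("#", "id,name")))
```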

Strict

In strict mode the reader will throw an exception if the input resource does not exist.

Default is 'true'.

Line separator policy

Used to determine where the line endings are and do things like continue over a line ending if inside a quoted string.

Simple record separator policy (default) - treats all lines as record endings.
Default record separator policy - treats all lines as record endings, as long as they do not have unterminated quotes, and do not end in a continuation marker.
Suffix record separator policy - looks for an exact match for a String at the end of a line (e.g. a semicolon).

Line suffix

Lines ending in this terminator String signal the end of a record.

Ignore whitespace

Flag to indicate that the decision to terminate a record should ignore whitespace at the end of the line.

Quote character

The quote character.

Defaults to double quote mark.

Continuation character

The continuation marker.

Defaults to back slash.
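The three record separator policies, together with the line suffix, ignore whitespace, quote character and continuation character settings, can be compared with a simplified sketch. All function names are hypothetical, and the sketch deliberately simplifies: lines belonging to one record are joined with a newline and the continuation marker is left in place, which is not necessarily what the component does.

```python
def simple_policy(record):
    """Every line ends a record."""
    return True


def default_policy(record, quote='"', continuation="\\"):
    """A record ends unless a quote is still open or a continuation marker ends it."""
    if record.count(quote) % 2 != 0:
        return False
    return not record.rstrip().endswith(continuation)


def suffix_policy(record, suffix=";", ignore_whitespace=True):
    """A record ends when the line ends in the given suffix."""
    text = record.rstrip() if ignore_whitespace else record
    return text.endswith(suffix)


def read_records(lines, is_end_of_record):
    """Accumulate lines until the policy says the record is complete."""
    records, parts = [], []
    for line in lines:
        parts.append(line)
        candidate = "\n".join(parts)
        if is_end_of_record(candidate):
            records.append(candidate)
            parts = []
    return records  # an unfinished trailing record is dropped in this sketch


QUOTED = ['a,"multi', 'line",b', "c,d"]
print(read_records(QUOTED, simple_policy))
print(read_records(QUOTED, default_policy))
print(read_records(["a,b", "c;", "d; "], suffix_policy))
```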

Delimiter

The delimiter character.

Default is a comma.

Strict

If true (the default), the number of tokens in a line must match the number of tokens defined (by Range, columns, etc.) in the LineTokenizer. If false, lines with fewer tokens are tolerated and padded with empty columns, and lines with more tokens are simply truncated.

Default is 'true'.

Include fields

The fields to include in the output by position (comma separated list of field numbers, starting at 0).

By default all fields are included, but this property can be set to pick out only a few fields from a larger set. Note that if field names are provided, their number must match the number of included fields.
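The strict flag and the include fields setting can be illustrated together. The sketch below is an assumption-laden approximation: the function name `tokenize`, the use of a plain `split`, and the order of operations (select included fields first, then compare the count against the field names) are chosen for illustration.

```python
def tokenize(line, names, delimiter=",", strict=True, include=None):
    """Map tokens to field names, honouring 'include fields' and 'strict'."""
    tokens = line.split(delimiter)
    if include is not None:
        # keep only the requested positions (0-based)
        tokens = [tokens[i] for i in include if i < len(tokens)]
    if len(tokens) != len(names):
        if strict:
            raise ValueError(
                f"expected {len(names)} tokens but found {len(tokens)}")
        # non-strict: pad short lines with empty columns, truncate long ones
        tokens = (tokens + [""] * len(names))[:len(names)]
    return dict(zip(names, tokens))


NAMES = ["id", "name", "country"]
print(tokenize("1,Ada", NAMES, strict=False))
print(tokenize("1,Ada,UK,extra", NAMES, strict=False))
print(tokenize("1,Ada,UK", ["id", "country"], include=[0, 2]))
```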

Max messages per poll

Specifies the maximum number of messages to receive within a given poll operation.

The poller will continue trying to receive without waiting until either no message is available or this maximum is reached.

For example, if a poller has a 10 second interval trigger and a maxMessagesPerPoll setting of 25, and it is polling a channel that has 100 messages in its queue, all 100 messages can be retrieved within 40 seconds. It grabs 25, waits 10 seconds, grabs the next 25, and so on.

Default is 1.
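The arithmetic in the example above can be reproduced with a small simulation. It assumes the first poll happens at time 0 and that receiving messages takes no time; the function name `poll_times` is hypothetical.

```python
def poll_times(queued, max_messages_per_poll, interval_seconds):
    """Return (time, messages_taken) for each poll until the queue is empty."""
    polls = []
    time = 0
    while queued > 0:
        taken = min(queued, max_messages_per_poll)
        polls.append((time, taken))
        queued -= taken
        time += interval_seconds
    return polls


print(poll_times(100, 25, 10))
# prints [(0, 25), (10, 25), (20, 25), (30, 25)]
```

Under these assumptions the last batch is picked up at the 30 second mark, so all 100 messages are indeed retrieved within 40 seconds.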

Receive timeout

Specifies the amount of time the poller should wait if no messages are available when receiving.

Send timeout

Specifies the timeout for sending out messages.

Task executor

Task executor to execute the scheduled tasks.

Default when empty: the TaskScheduler named 'taskScheduler', which is created if it does not exist.

Error channel

The channel that error messages will be sent to if a failure occurs in this poller's invocation. To completely suppress exceptions, provide a reference to the nullChannel here.

Trigger type

A trigger specifies the schedule of the poller.

Trigger types:

1. Fixed delay trigger Triggers with a periodic constant interval. Each execution is scheduled relative to the actual execution time of the previous execution. If an execution is delayed for any reason (such as garbage collection or other background activity), subsequent executions will be delayed as well.

2. Fixed rate trigger Triggers with a periodic constant interval. Each execution is scheduled relative to the scheduled execution time of the initial execution. If an execution is delayed for any reason, two or more executions will occur in rapid succession to "catch up."

3. Cron trigger Enables the scheduling of tasks based on cron expressions. Consider using a cron trigger for hourly, daily, and monthly settings.

Time unit

Specifies the time unit of the fixed delay or fixed rate value.

For hourly, daily or monthly settings, consider using a cron trigger instead.

Default is Milliseconds.

Fixed delay

Time between each two subsequent executions, measured from completion time.

Fixed rate

Time between each two subsequent executions, measured from start time.
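The difference between the two measurements can be seen by computing start times for a few executions of varying duration. This is a simplified model that assumes executions never overlap and that the first one starts at time 0; the function names are hypothetical.

```python
def fixed_delay_starts(durations, delay):
    """Each execution starts 'delay' after the previous one completes."""
    starts, now = [], 0
    for duration in durations:
        starts.append(now)
        now = now + duration + delay
    return starts


def fixed_rate_starts(durations, period):
    """Execution n is scheduled at n * period, or as soon as the previous one ends."""
    starts, free_at = [], 0
    for n, duration in enumerate(durations):
        start = max(n * period, free_at)
        starts.append(start)
        free_at = start + duration
    return starts


# The second execution is slow (15 time units) in both cases.
print(fixed_delay_starts([2, 15, 2], delay=10))
# prints [0, 12, 37]
print(fixed_rate_starts([2, 15, 2, 2], period=10))
# prints [0, 10, 25, 30]
```

With a fixed delay the slow execution pushes every later start back. With a fixed rate the late third execution starts immediately after the slow one, and the fourth is back on the original schedule.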

Cron

Pattern used by a cron-trigger to specify the trigger schedule.

The pattern is a list of six single space-separated fields, representing second minute hour day month weekday. Month and weekday names can be given as the first three letters of the English names.

Example patterns:

0 0 * * * * = the top of every hour of every day
0 0 8-10 * * * = 8, 9 and 10 o'clock of every day
0 0/30 8-10 * * * = 8:00, 8:30, 9:00, 9:30 and 10 o'clock every day
0 0 9-17 * * MON-FRI = on the hour nine-to-five weekdays
0 0 0 25 12 ? = every Christmas Day at midnight
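To show how the six fields are read, the sketch below splits a pattern into named fields and expands a numeric field into the values it covers. It is a deliberately limited illustration: only `*`, single numbers, ranges, lists and `start/step` are handled, names such as MON-FRI and the `?` placeholder are not, and the function names are hypothetical.

```python
FIELD_NAMES = ["second", "minute", "hour", "day", "month", "weekday"]


def split_cron(pattern):
    """Split a six-field pattern into a dict keyed by field name."""
    fields = pattern.split(" ")
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected six space-separated fields")
    return dict(zip(FIELD_NAMES, fields))


def expand_field(field, low, high):
    """Expand a numeric field into the sorted list of values it matches."""
    values = set()
    for part in field.split(","):
        value_range, _, step = part.partition("/")
        step = int(step) if step else 1
        if value_range == "*":
            start, end = low, high
        elif "-" in value_range:
            first, last = value_range.split("-")
            start, end = int(first), int(last)
        else:
            start = int(value_range)
            # 'start/step' runs to the end of the range; a bare number is one value
            end = high if "/" in part else start
        values.update(range(start, end + 1, step))
    return sorted(values)


fields = split_cron("0 0/30 8-10 * * *")
print(expand_field(fields["minute"], 0, 59))
print(expand_field(fields["hour"], 0, 23))
```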
