About
ILINX Capture Format Converter, a key component of ILINX Capture and the ILINX ECM Platform, automatically converts documents to PDF or TIFF format without user interaction.
This IXM allows for content format conversions on batch documents. These format transformations can include the conversion of Microsoft Office documents into either PDF, PDF/A, or multipage TIFF files. The ILINX Capture Format Converter can also leverage other market-leading technologies to perform advanced conversions, including the generation of searchable PDF documents and numerous image file formats.
This guide describes how to configure the ILINX Capture Format Converter application, including its administrator tools and options.
Note: This guide assumes that installation of the ILINX Capture product has taken place. For complete installation documentation, please see the ILINX Installation Guide.
User IDs and access
It is a recommended best practice to create a dedicated user account under which your ILINX Format Converter service runs. The type of account, either domain or local, should be dictated by your corporate security guidelines. A dedicated account lets you grant the service read and write privileges so that it can deliver metadata and/or content to the file system and maintain a working directory.
System requirements
For general information on supported and recommended hardware, operating systems, web browsers, databases, etc. for ILINX products, please see the document titled “ILINX Support Matrix.”
Installing
To install ILINX Capture Format Converter, perform the following steps:
Step 1: Run the .msi installer file.
Step 2: Accept the license agreement and follow the instructions in the installation wizard.
Step 3: Specify a different installation folder if desired. Otherwise, accept the default folder and complete the installation.
Note: If you wish to use the ABBYY FineReader and ABBYY Recognition Server methods, you will need to install and configure the ABBYY products separately. Otherwise, you can use ILINX Capture Format Converter’s built-in method. If you are converting MS Office documents or using ABBYY functions, you will need to change configuration settings and permissions.
Note: Though the configuration for the ILINX Format Converter takes place within the workflow, the conversion itself is performed on the server where the ILINX Format Converter service resides. Using this IXM therefore requires the ILINX Format Converter service to be installed, either on the same server as ILINX Capture or on a different one.
Post-Installation Configuration
Account permissions
Step 1: Go to Windows Start > Run, type “dcomcnfg”, and then press Enter.
Step 2: Click Component Services in the left panel, double-click Computers, then My Computer, and then DCOM Config.
Step 3: Right-click on ABBYY FineReader 10.0 Engine Loader (Local Server), and click Properties.
Step 4: When the dialog box opens, click the Security tab and go to Launch Permissions.
Step 5: Click Customize and then Edit to specify the accounts that can launch the application.
Note: On a 64-bit operating system the registered DCOM-application is available in the 32-bit MMC console, which can be run using the following command line (Run CMD as Administrator): mmc comexp.msc /32.
To register the FREngine.dll when installing your application on the server, use the following command line (Run CMD as Administrator): regsvr32 <path to FREngine.dll>.
We recommend that you use a Network license both for debugging your server application and for running it.
Registry configuration
To ensure ILINX Format Converter does not encounter errors when converting certain file formats (for example, from Excel documents to PDF), you will need to make this brief change to the registry. Follow these steps to complete the process.
Important Note: This fix involves manual registry changes. Before proceeding, a full registry backup is recommended.
Step 1: Launch Windows Registry Editor (click the Windows Start button > Run, then type regedit).
Step 2: Navigate to the following key: HKEY_CURRENT_USER\Software\Microsoft\Office\[OfficeVersionNumber]\Excel\Options
The [OfficeVersionNumber] must match the version number of your installed Microsoft Office. The version number corresponds to the release of Office as follows:
Office 2013 –> 15.0
Step 3: Insert a new DWORD value:
> Value name: FullCalcOnLoadOldFile
> Value type: REG_DWORD
> Value data: 0
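The registry change above can also be captured in a .reg file so that it is repeatable across servers. The following is a minimal sketch, not part of the ILINX product: the function name `build_reg_file` is hypothetical, and the sketch only produces the file text. Import the resulting file while logged on as the appropriate user, since the key lives under HKEY_CURRENT_USER, and back up the registry first as recommended above.

```python
def build_reg_file(office_version: str) -> str:
    """Return .reg file text that sets FullCalcOnLoadOldFile to 0.

    office_version is the Office version number, for example "15.0"
    for Office 2013. (Hypothetical helper, for illustration only.)
    """
    key = (
        "HKEY_CURRENT_USER\\Software\\Microsoft\\Office\\"
        + office_version
        + "\\Excel\\Options"
    )
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + key + "]",
        # REG_DWORD values are written as eight hexadecimal digits.
        '"FullCalcOnLoadOldFile"=dword:00000000',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file("15.0"))
```

Save the printed text to a file with a .reg extension and review it before importing.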
Configuring
ILINX Capture Format Converter’s functionality is controlled by configurable jobs that you will create for different scenarios. The job configuration interface is housed within the ILINX Capture workflow designer. This allows you to configure jobs for specific batch profiles and to easily automate the conversion process within the workflows.
GetBatchInfo/SetBatchInfo: These are not required. If your system administrator does not provide a batch variable to this activity, the IXM will retrieve and use the current batch instance from the database.
Parameters
Name | Type | Description |
---|---|---|
Batch | BatchInstance | (Optional) If you do not provide a batch variable to this activity, the IXM will retrieve and use the current batch instance from the database. |
Configure | N/A | Displays additional configuration screen. See below for more information. |
DisplayName | string | Enter a name to identify this FormatConverter in your workflow. |
Result | BatchInstance | The BatchInstance object that holds your batch as modified by the FormatConverter. Place your batch variable here if you passed it as a parameter into the Batch field described above. |
Initial Setup
Drag a FormatConverter IXM into the workflow. Enter the name of your batch variable into the Batch field in the right-hand panel. If you wish to store the Boolean result of the success of the converter, create a workflow variable and enter it in the Result field in the right-hand panel. Click OK to finish.
Step 1: Click the configure button next to the Configure field. This will open a dialog in which you connect to the ILINX Capture Format Converter web service.
Replace “localhost” with the server name where your ILINX Capture Format Converter service is installed.
Step 2: Click OK. You will now see the job configuration screen.
> Enter a job name.
> A web activation ID is automatically generated.
> Specify a temporary working folder.
> Selecting Searchable PDF will generate searchable PDF files from TIFFs as well as image-based PDF files without requiring a separate engine installation.
> When Merge to single PDF is checked, the individual pages of each document in the batch will be combined into a single PDF per document in the batch.
> If you want to use PDF/A-1b release mode, check the box.
You may then choose to activate database auditing, to keep the original file after conversion, and to resize the converted file to 8.5 by 11 inches.
Step 3: If you would like to maintain a log of the job’s activity, select the Use auditing check box. This is an optional function. If you select the check box, click the Update audit button.
> In the resulting dialog box, enter the Server name of your SQL Server or select it from the drop down list.
> Enter your User name and Password.
> Select your Database name from the drop down list.
> Select your Table name from the drop down list or click on the Or generate new audit table button.
> Map the fields between the database and audit fields.
> Set the Audit log retention time (days) (optional) using the up or down buttons provided.
Next, you will choose your method, mode, and details settings. These are each covered in detail in the following section.
Method, Mode, & Detail Settings
Each job will have a combination of method and mode options. The detail settings displayed in the configuration interface will change depending on the combination of method and mode you choose.
Built-in Method and To PDF Mode
> Built-in method paired with To PDF mode:
– Every page of the document will be converted to PDF. The batch can contain any mix of supported document types. The pages will not automatically be merged.
– You can choose to merge all of the pages into one PDF. This option will only work for image-only PDFs. The batch will go to the error queue if it contains a mix of image and non-image pages.
You may choose the option to throw an error if the job encounters a batch containing Microsoft Office documents or other document types.
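Because a merge job sends the batch to the error queue when image and non-image pages are mixed, it can help to check a batch's contents before routing it to a merge job. The sketch below is an illustration only: the helper names are hypothetical, and the set of extensions treated as images is an assumption based on the TIFF, JPEG, and PNG input types listed for the Built-in method, not a list taken from the product.

```python
from pathlib import PurePath

# Assumption: extensions treated as "image" pages for this check.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}


def is_image(filename: str) -> bool:
    """Return True if the file extension is in the assumed image set."""
    return PurePath(filename).suffix.lower() in IMAGE_EXTENSIONS


def batch_is_mixed(filenames: list[str]) -> bool:
    """Return True if the batch contains both image and non-image files."""
    kinds = {is_image(name) for name in filenames}
    return kinds == {True, False}


if __name__ == "__main__":
    batch = ["invoice_001.tif", "cover_letter.docx"]
    if batch_is_mixed(batch):
        print("Mixed batch: a merge job would likely go to the error queue.")
```

A check like this is only a pre-filter; the Format Converter service remains the authority on what it can merge.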
Built-in Method and To TIFF Mode
> Built-in method paired with To TIFF mode:
– Select options for the conversion of PDFs to TIFFs, including the format and DPI.
– Select what TIFF format image files should be converted to. Choose from among the following three conversion options:
» Each page to individual TIFF. Note: Non-image and non-PDF pages will be ignored; nothing will be routed to the error queue when mixing.
» Each file to individual TIFF. Note: Non-image and non-PDF pages will be ignored; nothing will be routed to the error queue when mixing.
» All converted files to one multipage TIFF
– Select the Resize image file to 8.5X11 check box (optional).
Note: The default maximum for the number of pages per document is 999 when using Built-in PDF to TIFF conversion. If you wish to change this, find this line in the web.config file of both ILINX Capture and ILINX Format Converter and change the value to your desired number of pages:
<add key="microsoft:xmldictionaryreader:maxmimeparts" value="50000"/>
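Since the same setting must be changed in two web.config files, scripting the edit can reduce the chance of a mismatch. The following is a minimal sketch using Python's standard library; the function name `set_max_mime_parts` is hypothetical, and the sketch assumes the setting sits in an `appSettings` element directly under the configuration root. It raises an error rather than adding the line when the key is not found.

```python
import xml.etree.ElementTree as ET

KEY = "microsoft:xmldictionaryreader:maxmimeparts"


def set_max_mime_parts(config_xml: str, pages: int) -> str:
    """Return config_xml with the maxmimeparts value set to pages."""
    root = ET.fromstring(config_xml)
    # Assumption: the <add> entry lives in <appSettings> under the root.
    for entry in root.findall("appSettings/add"):
        if entry.get("key") == KEY:
            entry.set("value", str(pages))
            return ET.tostring(root, encoding="unicode")
    raise LookupError("Setting not found: " + KEY)


if __name__ == "__main__":
    sample = (
        "<configuration><appSettings>"
        '<add key="microsoft:xmldictionaryreader:maxmimeparts" value="50000"/>'
        "</appSettings></configuration>"
    )
    print(set_max_mime_parts(sample, 2000))
```

Note that `xml.etree.ElementTree` drops comments and the XML declaration when it re-serializes a document, so keep a backup of each web.config and compare the result before deploying it.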
Advanced Method Paired With To PDF Mode
Note: The Advanced method provides enhanced performance and capabilities and is licensed separately from the ILINX Format Converter product.
The advanced method for conversion to PDF provides detailed options for the output files. First is a dropdown list of export modes:
> Text With Pictures
> Text On Image
> Image On Text
> Image Only
Next you may choose any combination of the following options:
> PDF/A-1b release mode
> Searchable PDF
> Use PDF MRC compression
> Merge images into a single PDF*
> Merge PDF and Microsoft Office file formats into a single PDF*
*These options will work only on image file formats. If any non-image files are detected in the batch, the job will ignore them.
Advanced Method Paired With To TIFF Mode
The advanced method for conversion to TIFF provides an extensive list of compression options for the output files. This list includes options for both full color and black and white conversions.
> TIFF Black and White Uncompressed
> TIFF Black and White CCITT Group 3
> TIFF Black and White CCITT Group 3 Fax
> TIFF Black and White CCITT Group 4
> TIFF Black and White PackBits
> TIFF Color Uncompressed
> TIFF Color PackBits
> TIFF Color JPEG JFIF
> TIFF Black and White Lempel-Ziv-Welch (LZW)
> TIFF Color LZW
> TIFF Black and White Zip
> TIFF Color Zip
After you have selected a compression option, choose the desired DPI of the output files. Finally, you may choose to convert your output files into single-strip TIFFs to reduce file size.
ABBYY Recognition Server Method
ILINX Format Converter can be used in conjunction with the ABBYY Recognition Server to perform advanced document conversion operations. ABBYY Recognition Server provides powerful server-based OCR functionality for automated document capture and PDF conversion. Designed for mid- to high-volume batch processing, it enables organizations and scanning service providers to establish cost-efficient processes for converting paper, as well as TIFF, JPEG, and PDF image documents into electronic files suitable for full-text search and long-term digital archiving.
If you wish to use this product, it can be purchased separately from ImageSource and configured to run in conjunction with your instance of ILINX Format Converter.
Note: The default maximum for the number of pages per document is 999 when using ABBYY Recognition Server conversion. If you wish to change this, find this line in the web.config file of both ILINX Capture and ILINX Format Converter and change the value to your desired number of pages:
<add key="microsoft:xmldictionaryreader:maxmimeparts" value="50000"/>
Completing Configuration
Once you have completed everything in the configuration screen, click on the OK button to continue or the Cancel button to exit without saving your changes.
Methods | Modes | Input File Types Supported | Options |
---|---|---|---|
Built-in | To PDF. Note: This option automatically converts most MS Office file formats* to PDF. | TIFF, JPEG, PNG, Searchable PDF | Searchable PDF; Merge to single PDF; PDF/A-1b release mode |
 | To TIFF | | Merge to single TIFF |
ABBYY FineReader | To PDF. Note: This option automatically converts most MS Office file formats* to PDF. | BMP; PCX, DCX; JPEG, JPEG 2000; JBIG2; PNG; GIF; TIFF; DjVu; WDP | PDF/A-1b release mode; PDF Export Mode (Text with pictures, Text on image, Image on text, Image only); Merge images into a single PDF (image format only); Merge PDF and most MS Office file formats into a single PDF |
 | To TIFF | BMP; PCX, DCX; JPEG, JPEG 2000; JBIG2; PNG; GIF; TIFF; DjVu; WDP | Black and White: Unpacked, TIFF / Black and White / Packbits, TIFF / Black and White / CCITT Group 4, TIFF / Black and White / ZIP compression, TIFF / Black and White / LZW compression. Gray: Unpacked, TIFF / Gray / Packbits, TIFF / Gray / JPEG compression, TIFF / Gray / ZIP compression, TIFF / Gray / LZW compression. Color: Unpacked, TIFF / Color / Packbits, TIFF / Color / JPEG compression, TIFF / Color / ZIP compression, TIFF / Color / LZW compression |
* Word, Excel, PowerPoint, Visio and Publisher
Note: As of ILINX version 6.0, embedded Excel macros are not currently tested or supported.
Getting Best Performance
Please note that there are many factors that affect the operational performance of the ILINX Capture Format Converter service. Physical factors such as the availability of shared machine resources (e.g., memory), processors, and even the amount of free disk space can impact the throughput. Network performance can also affect the ILINX Capture Format Converter operations as it communicates with other ILINX services, ABBYY Recognition Server (if deployed), and the database. The file size, number of pages, and type of documents being converted are the most critical variables that directly affect performance. Carefully monitor your system when processing.
> Generally, all individual file conversions should be limited to a maximum of 50,000 pages and a total file size of 250 MB regardless of the configured conversion. As noted above, there are many variables that could increase or decrease this recommended limit.
> Conversions to color TIFF: The color TIFF format generated by ILINX Capture Format Converter ensures superior document representation. However, this format can result in very large TIFF documents. The final size of the converted TIFF is dependent on the original document characteristics including format (TIFF, PDF, JPEG, etc.), number of pages, color depth, page complexity, etc.
> Increasing the resolution of an image will increase the size of the image.
> Using the merge option to create a single PDF or multipage TIFF from multiple images: the size and number of images being merged into the final document will be the critical factor with this conversion.
If you expect that your processing requirements will exceed the operational guidelines described here, please consider these options:
> Test the upper limits of your conversion needs in your environment with your largest documents.
> Increase the size of the server farm and/or machine resources (e.g., memory, processors).
> Break large documents into multiple, smaller documents.
> Lower inbound and/or outbound image resolutions.
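The general guideline of 50,000 pages and 250 MB per file conversion can be checked before documents are submitted. The sketch below is one possible pre-check, not a feature of the product: the function name `exceeds_guidelines` is hypothetical, the page count is assumed to be known to the caller, and 250 MB is interpreted here as 250 x 1024 x 1024 bytes.

```python
MAX_PAGES = 50_000
# Assumption: 250 MB is interpreted as 250 * 1024 * 1024 bytes.
MAX_BYTES = 250 * 1024 * 1024


def exceeds_guidelines(page_count: int, size_bytes: int) -> list[str]:
    """Return the reasons a file exceeds the general guidelines, if any."""
    reasons = []
    if page_count > MAX_PAGES:
        reasons.append(f"{page_count} pages exceeds {MAX_PAGES}")
    if size_bytes > MAX_BYTES:
        reasons.append(f"{size_bytes} bytes exceeds {MAX_BYTES}")
    return reasons


if __name__ == "__main__":
    # Example: a 60,000-page, 300 MB document exceeds both limits.
    for reason in exceeds_guidelines(60_000, 300 * 1024 * 1024):
        print(reason)
```

Files flagged by a check like this are candidates for splitting into smaller documents or for conversion at a lower resolution, as suggested above.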