Archive for April, 2008

SSIS Excel Data Source: Error with output column “Comments” (5691) on output “Excel Source Output” (5596). The column status returned was: “Text was truncated or one or more characters had no match in the target code page.”

April 28, 2008

Using SQL Server 2005 SSIS to import an Excel Spreadsheet into a SQL Server database,  I got the error shown in the title.

The problem was that the Excel spreadsheet column ‘Comments’ contained data which was bigger than the maximum size of the Excel Source External Column ‘Comments’ as defined in the Excel Data Source.

The data types of the Excel Source External Columns are automatically determined by SSIS based on sampling of the spreadsheet rows. For string columns if the sample does not reveal cells with greater than 255 characters, the data type is always DT_WSTR (Unicode string) which has maximum length 255. If the sample reveals cells with greater than 255 chars then the datatype will be DT_NTEXT (Unicode text stream).

I was unaware of how the Excel Source datatypes were derived and since the SSIS sample came up with only short strings in ‘Comments’, it assigned it a datatype of DT_WSTR. My full spreadsheet data for ‘Comments’, however, contained cells with strings of greater than this length, hence the truncation error when the SSIS package was executed.

The reason that the datatype for an Excel spreadsheet string column are always DT_WSTR or DT_NTEXT is given by this Microsoft Technet article, ‘Excel Source’ which explains that the datatypes are determined by the Jet Database Engine used by the SSIS Excel Connection Manager to communicate with Excel.

It is impossible to edit the datatype of an Excel Data Source column in SSIS therefore it is impossible to avoid this truncation error unless you edit your spreadsheet data, convert the Excel spreadsheet to a flat file, or increase the sample size as described in the Technet article in the link above.

I decided to convert the Excel spreadsheet to a flat file.

Convert Excel Spreadsheet To A Pipe-Delimited File

This is where my forty minutes turned into three hours.

Since I was not aware of how to increase SSIS sample size at the time, I tried to convert the Excel spreadsheet into a flat-file. Stupidly, I first converted it into a comma-delimited (.csv) file. Of course I ran into problems with phantom columns and incorrectly parsed data because my spreadsheet has embedded commas in its data.

So I tried a Find and Replace in Excel to convert commas to spaces and ran into the Excel ‘Formula Too Big’ error for cell data greater than 900 characters. Since Find and Replace could not work, I located a VBA macro written by aht-ga which did it. This Macro works, but took three hours to run over 1MB of data (cell selection was 1230 rows x 25 columns).

Gibbering with horror at the prospect of ever having to run this Macro again, an epiphany struck me in the form of rememberance of pipe-delimited files. Excel can be forced to make pipe-delimited files by this trick posted by Martin Naranjo.

I now possesed a beautiful pipe-delimited file which I included in my SSIS package using a Flat File Data Source. Flat File Data Sources, unlike Excel Data Sources, can have the datatype of their External Columns edited.

Editing The DataType Of SSIS Data Source Columns

Bring up the SSIS Connection Manager Editor by double clicking on the Connection Manager
for your Data Source in the package’s Connection Managers tab. When you do this for an Excel DataSource you will see a landscape as barren as the Gobi Desert Municipal Swimming Centre, but if you do this for the far more flexible Flat File Data Source a plethora of options will greet you. Click on ‘Advanced’ and slash away to your heart’s content.

Moral Of The Story
When using an Excel Data Source, increase your SSIS sample size to a humungous number so that it can assign the correct datatype to the Excel Data Source External Columns.

Problem Metrics:

    Life Energy Consumed: 1 Hr. (not including Macro running time)

Caveats, Apologies and Clarifications:

Aht-ga’s VBA Macro is excellent. It just took too long to run in my scenario.

Advertisements