C# component to read Excel data using OleDb in SSIS

This is how we can read data from Microsoft Excel using OleDb in SSIS;

public override void CreateNewOutputRows()
    {
        //Change this to your filename you do not need a connection manager
        string fileName = @"E:\SFTP\RSS\Results.xlsx";  
        string SheetName = "Sheet1";  
        string cstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=YES;IMEX=1\"";

        using (System.Data.OleDb.OleDbConnection xlConn = new System.Data.OleDb.OleDbConnection(cstr))
        {
           xlConn.Open();
           System.Data.OleDb.OleDbCommand xlCmd = xlConn.CreateCommand();
           xlCmd.CommandText = "Select * from [" + SheetName + "$]"; //I assume this is the data you want
           xlCmd.CommandType = CommandType.Text;
           using (System.Data.OleDb.OleDbDataReader rdr = xlCmd.ExecuteReader())
           {
              while (rdr.Read())
              {
                 for (int i = 4; i < rdr.FieldCount; i++) //loop from 5th column to last
                 {
                    //The first 4 columns are static and added to every row
                    Output0Buffer.AddRow();
                    Output0Buffer.UniqueID = Int32.Parse(rdr[0].ToString());
                    Output0Buffer.Year = Int32.Parse(rdr[1].ToString());
                    Output0Buffer.ReportingWave = rdr.GetString(2);
                    Output0Buffer.SubmissionDate = rdr.GetString(3);
                    Output0Buffer.Question = rdr.GetName(i);
                    Output0Buffer.Answer = rdr.GetString(i);
                 }

            }
          }
          xlConn.Close();
       }
    }

We need to add output columns with correct data type. There is no need to setup connection manager because we are using it in the code.

This code will successfully import a file that looks like this;

My recent experience with Excel Sheet.

Check the data type of the column. If the type is “General”, then ACE will determine data type based on first 8-10 rows. If those are numeric then it might ignore alpha-numeric data in the column. Try to change column data type to “Text” and see what happens. It will work. You would be able to see numeric and non-numeric data in your data pipe line.

Resources;

https://stackoverflow.com/questions/61514481/get-an-ssis-c-sharp-script-component-to-read-from-excel-via-oledb

https://docs.microsoft.com/en-us/sql/integration-services/extending-packages-scripting/data-flow-script-component/configuring-the-script-component-in-the-script-component-editor?view=sql-server-ver15

Don’t’ fail parent package if child package fails

I like to keep running child packages inside sequence container even if one of them fails. Currently they are failing;

The work around is to set Sequence container “MaximumErrorCount” property to 2 from 1. After doing that, I am still getting failure error.

What we need to do is to fake the result assuming master and child packages have proper logging in place if they fail. Go to the Properties of “ExecPkg – Child Package” and set “ForceExecutionResult” to Success.

Do the same for master package. Run the package;

You can see that child package has no issues despite the fact that this has been failed. We can confirm that from log table if it’s enabled on package level.

These are the default values for a new container.

Now lets stop and study. If we compare the package behavior against the property settings, this looks wrong. Here we have set FailPackageOnFailure=False, yet a Sequence Container failure is causing a Package failure. Why is this? Unintuitive attribute names. See this Microsoft Connect issue. You are not alone in your confusion. The official explanation from Microsoft is this.

Despite some pretty circular previous messages, we believe that the feature is behaving as designed. When you set FailParentOnFailure to false, the parent will not fail until the number of failures in the child exceeds the MaximumAllowedErrors threshold. When you set FailparentOnFailure to true, the parent will fail on the first occurence of an error regardless of the MaximiumAllowedErrors threshold.

Updating SSIS packages for specific SQL server version

Updating SSIS packages for a specific SQL server target version (valid only for SQL Server 2012 and above)

  1. Right click on a project, click properties and select your target server version from the drop down

Click Apply, read the warning then click Yes as shown below, and finally click Ok

Re-open the packages and notice how the development platform (control flow and data flow tabs) changes

Before

After

Check for any odd behavior (discontinued/deprecated tasks), build/rebuild your packages, test your packages, and deploy/redeploy them.

Configure Azure DevOps for SSIS Projects

Install SSIS DevOps Tool in Azure DevOps organization.

Click on “Browse marketplace” and Search for “SSIS DevOps Tools”. Here is the link.

https://docs.microsoft.com/en-us/sql/integration-services/devops/ssis-devops-overview?view=sql-server-ver15

If you want to install extensions using CLI, here is the link;

https://docs.microsoft.com/en-us/azure/devops/marketplace/overview?tabs=azure-devops-cli&view=azure-devops&preserve-view=true#add-an-extension

Select “SSIS DevOps Tool”;

Click Install and you are done.

Install a second extension “Azure DevOps Open in Excel”. This helps to view work items in Excel and VS.

Go back to your Azure DevOps organization. The installed extensions are here;

Follow this article to create SSIS DevOps pipeline.

https://docs.microsoft.com/en-us/sql/integration-services/devops/ssis-devops-overview?view=sql-server-ver15

Possible ways to run SSIS Packages

As far as launching the package, I think we have three reasonably safe options.  The first is xp_cmdshell, which would probably work fine and be pretty simple to implement on a queue (probably the best option in my opinion). 

The second would be to create a single job agent job for each allowed broker process and have the queue processor call sp_start_job only if the job is not already running (pretty easy to check). 

The third option would be to use a CLR procedure to call a web service and have the web service start the SSIS package.  You would have to dive into C# a bit and create your own web service to start an SSIS package, but it is not much code.  Doing it this way, you would not have to allow unsafe assemblies because the web service would be out of process.

Keep in mind that starting a job is asynchronous, and would require additional coding in both the job and the application that starts the processes, as well as creating a job for every SSIS package that needs to be run.  Using xp_cmdshell is synchronous, and requires that xp_cmdshell access be enabled (requires security configuration).  It does require the writing of a command line for each package, but that can also be stored in a table and pulled when needed.

BOL shows code for running SSIS packages in an application.  I am hoping, though not encouraged, that it may be possible to use a CLR stored procedure to run a SSIS package inside the database by simply passing the name of the package to run.  We won’t need 100’s of jobs, or enable xp_cmdshell to accomplish the task.  I am, however, pragmatic and will use which ever method turns out to be the best/easiest/quickest/etc way to meet the goals of the system: to reliably import the data from numerous source systems in the most efficient way possible and to recover from hardware/network failures/outages both planned and unplanned.

Here is another option;

https://www.mssqltips.com/sqlservertip/2992/how-to-execute-an-integration-services-ssis-package-from-a-sql-server-stored-procedure/