SSIS: Object variable

The SSIS Object variable is a generic object, but I have never seen it used as anything other than a dataset — which is the default behavior that is accessible to you when you dump records into the SSIS object type variable — the first table in the dataset object will contain your records.

If you want to see the values inside the object in debug mode, you will need to cast it as something in order to see anything, for example, in a script task, if you cast the SSIS object variable to a dataset, you can then debug into the script to look at it’s content and structure. Similarly, the foreach enumerator is casting the object as a dataset and you access the first tables columns and you can debug and see row by row the values in the set.

DataSet ds = Dts.Variables["User::vObjectList"].Value as DataSet;
foreach (DataTable tbl in ds.Tables)
{
   foreach(DataRow row in tbl.Rows)
   {
       foreach (DataColumn column in tbl.Columns)
       {
            MessageBox.Show(row[column].ToString());
       }
   }
}

Some more examples;

Here are a couple of examples to demonstrate what the mysterious object should be cast to in order to further explore it in .NET.

ADO.NET (using a System.Data.DataSet):

DataSet ds = (DataSet)Dts.Variables["obj"].Value;
MessageBox.Show(ds.Tables[0].Rows.Count.ToString());

OLE DB:

System.Data.OleDb.OleDbDataAdapter da = new System.Data.OleDb.OleDbDataAdapter();
DataTable dt = new DataTable();
da.Fill(dt, Dts.Variables["obj"].Value);
MessageBox.Show(dt.Rows.Count.ToString());

SSIS: Script task for connecting ADO.NET and Populating Data Table

This is how;

Using(SqlConnection conn = (SqlConnection)Dts.Connections["AdoNet"].AcquireConnection(Dts.Transaction)){

if (conn.State != ConnectionState.Open){
 conn.Open();}

SqlCommand cmd = new SqlCommand();
cmd.Connection = conn;
cmd.CommandType = CommandType.Text;
cmd.CommandText = queryString;
SqlDataAdapter da = new SqlDataAdapter(cmd);
da.Fill(myDataTable);
}

Resource;

https://stackoverflow.com/questions/41733531/ssis-script-task-connecting-the-ado-net-and-populating-datatable

KeyValue pair class

The KeyValue pair class stores a pair of values in a single list.

It’s super easy to create a list of single value. Here is an example;

List<string> firstList =  new List<string> {"'cover page$'", "'i# milestones$'", "'ii# tasks$'" };
List<string> secondList = new List<string> { "'cover page$'", "'i# milestones$'", "'ii# tasks$'" };
var exceptList = secondList.Except(firstList);
Console.WriteLine($"\nsingle string: Value in second list that are not in first List");
foreach (var val in exceptList)
{
     Console.WriteLine($"single string: {val}");
}

What if we want to store pair of values instead of creating any custom classes? We can use KeyValue pair class;

var parentList = new List<KeyValuePair<string, string>>()
{
    new KeyValuePair<string, string>("v2-2021", "'cover page$'"),
    new KeyValuePair<string, string>("v2-2021", "'i# milestones$'"),
    new KeyValuePair<string, string>("v2-2021", "'ii# tasks$'"),
    new KeyValuePair<string, string>("v2-2021", "'iii# spendplan$'"),
};
var parentSubList = new List<KeyValuePair<string, string>>()
{
    new KeyValuePair<string, string>("v2-2021", "'cover page$'"),
    new KeyValuePair<string, string>("v2-2021", "'i# milestones$'"),
    new KeyValuePair<string, string>("v2-2021", "'ii# tasks$'"),
};
var exceptList1 = parentSubList.Except(parentList);
Console.WriteLine($"\nparentSubList->parentList: Value in second list that are not in first List");
foreach (var val in exceptList1)
{
    Console.WriteLine($"{val}");
}
IsASubset = parentSubList.All(i => parentList.Contains(i));
Console.WriteLine($"\nparentSubList->parentList: all members of subset (parentSubList) exists in list1 (parentList): {IsASubset}");
}

KeyValue pair class can also be used like this;

var myList = new List<KeyValuePair<string, string>>();
//add elements now
myList.Add(new KeyValuePair<string, string>("v2-2021", "'cover page$'"));
myList.Add(new KeyValuePair<string, string>("v2-2021", "'i# milestones$'"));
myList.Add(new KeyValuePair<string, string>("v2-2021", "'ii# tasks$'"));
foreach (var val in myList)
{
    Console.WriteLine($"Another style: {val}");
}

LINQ methods, for example Except can be used without implementing any Comparer classes.

Using LINQ methods to compare objects of custom type

I have a master list;

v2-2021 – ‘cover page$’
v2-2021 – ‘i# milestones$’
v2-2021 – ‘ii# tasks$’
v2-2021 – ‘iii# spendplan$’

I have a sub list;

v2-2021 – ‘cover page$’
v2-2021 – ‘i# milestones$’
v2-2021 – ‘ii# tasks$’

I want to make sure that all elements in my sub list exists in master list.

To solve this i have created this class;

internal class ExcelVersions
{
    public string VersionNumber { get; set; }
    public string TableName { get; set; }

}

I have created following objects based on this class;

List<ExcelVersions> cfirstList = new List<ExcelVersions>
{
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'cover page$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'i# milestones$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'ii# tasks$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'iii# spendplan$'" }
            };
            List<ExcelVersions> csecondList = new List<ExcelVersions>
            {
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'cover page$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'i# milestones$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'ii# tasks$'" }
            };
            //var cexceptList = csecondList.Except(cfirstList, new ExcelVersionsComparer());
            var cexceptList = csecondList.Except(cfirstList);
            Console.WriteLine($"\ncSecondList-->cFirstList: Value in second list that are not in first List");
            foreach (var val in cexceptList)
            {
                Console.WriteLine($"{val.TableName}");
            }
            IsASubset = csecondList.All(i => cfirstList.Contains(i));
            Console.WriteLine($"\ncSecondList-->cFirstList: all members of subset (cSecondList) exists in list1 (cFirstList): {IsASubset}");
}

This is the result i get;

To my surprise, none of LINQ comparison method worked on custom class. What’s wrong? The answer is in the LINQ implementation. To be correctly processed by the Except method, a type must implement the IEquatable<T> interface and provide its own Equals and GetHashCode methods.

Re-writing out custom type;

internal class ExcelVersions : IEquatable<ExcelVersions>
{
        public string VersionNumber { get; set; }
        public string TableName { get; set; }

        public bool Equals(ExcelVersions other)
        {
            //check whether the compare object is null
            if (Object.ReferenceEquals(other, null)) return false;
            //check whether the compared object references the same data
            if (Object.ReferenceEquals(this, other)) return true;
            //check whether the object's properteis are equal
            return VersionNumber.Equals(other.VersionNumber) && TableName.Equals(other.TableName);
        }

        //if Equals returns true for a pair of objects
        //GetHashCode must return the same value for these objects
        public override int GetHashCode()
        {
            //Get the hash code for the version number
            int hashVersionNumber = VersionNumber == null ? 0 : VersionNumber.GetHashCode();
            //get the hash code for the table name
            int hashTableName = TableName.GetHashCode();

            //calculate the hash code for the object
            return hashVersionNumber ^ hashTableName;
    }
}

This time the results are;

OK. Custom class is working but what if we cannot modify the type? What if it was provided by a library and we have no way of implementing the IEquiatable<T> interface. The answer is to create our own equality comparer and pass it as a parameter to the Except method.

The equality comparer must implement the IEqualityComparer<T> interface and provide GetHashCode and Equals method like this;

internal class ExcelVersionsComparer : IEqualityComparer<ExcelVersions>
{
        public bool Equals(ExcelVersions x, ExcelVersions y)
        {
            if (Object.ReferenceEquals(x, y))
                return true;

            if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
                return false;
            return x.Equals(y);
        }

        public int GetHashCode(ExcelVersions excelVersion)
        {
            if (Object.ReferenceEquals(excelVersion, null)) return 0;

            int hashVersion = excelVersion.VersionNumber == null ? 0 : excelVersion.GetHashCode();
            int hashTable = excelVersion.TableName.GetHashCode();

            return hashVersion ^ hashTable;
        }
}

This is how we are going to pass the comparer to the Except method;

var cexceptList = csecondList.Except(cfirstList, new ExcelVersionsComparer());

These rules don’t just apply to Except method. For example, the same is true for the Distinct, Contains, Interset and Union methods. Generally, if you see that a LINQ method has an overload that accepts the IEqualityComparer<T> parameter, means that to use it with your own data type, you need to either implement IEquatable<T> in your class or create your own equality comparer.

If you want to use built-in class instead of creating custom class, consider this class;

Reference

https://docs.microsoft.com/en-us/archive/blogs/csharpfaq/how-to-use-linq-methods-to-compare-objects-of-custom-types

https://stackoverflow.com/questions/16824749/using-linq-except-not-working-as-i-thought

https://grantwinney.com/how-to-compare-two-objects-testing-for-equality-in-c/

https://www.tutorialspoint.com/how-to-find-items-in-one-list-that-are-not-in-another-list-in-chash


SSIS: Read Excel Tables

Define two variables; ExcelFile –> String and ExcelTables –> Object. Drop a script task on designer surface;

Here is the script;

public void Main()
{
            string excelFile;
            string connectionString;
            OleDbConnection excelConnection;
            DataTable tablesInFile;
            int tableCount = 0;
            string currentTable;
            int tableIndex = 0;

            //string[] excelTables = new string[5];
            string[] excelTables = new string[20];

            excelFile = Dts.Variables["ExcelFile"].Value.ToString();
            connectionString = $"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={excelFile};Extended Properties=Excel 12.0";
            excelConnection = new OleDbConnection(connectionString);
            excelConnection.Open();
            tablesInFile = excelConnection.GetSchema("Tables");
            tableCount = tablesInFile.Rows.Count;

            foreach (DataRow tableInFile in tablesInFile.Rows)
            {
                currentTable = tableInFile["TABLE_NAME"].ToString();
                char lastCharacter = currentTable[currentTable.Length - 2];
                if (lastCharacter == '$')
                {
                    excelTables[tableIndex] = currentTable;
                    tableIndex += 1;
                }
            }

            Dts.Variables["ExcelTables"].Value = excelTables;

            Dts.TaskResult = (int)ScriptResults.Success;
}

You can display object variables values using this;

string[] tablesInFile = (string[])Dts.Variables["ExcelTables"].Value;
            foreach (string tableInFile in tablesInFile)
            {
                results += " " + tableInFile + EOL;
            }

            MessageBox.Show(results, "Results", MessageBoxButtons.OK, MessageBoxIcon.Information);