KeyValue pair class

The KeyValue pair class stores a pair of values in a single list.

It’s super easy to create a list of single value. Here is an example;

List<string> firstList =  new List<string> {"'cover page$'", "'i# milestones$'", "'ii# tasks$'" };
List<string> secondList = new List<string> { "'cover page$'", "'i# milestones$'", "'ii# tasks$'" };
var exceptList = secondList.Except(firstList);
Console.WriteLine($"\nsingle string: Value in second list that are not in first List");
foreach (var val in exceptList)
{
     Console.WriteLine($"single string: {val}");
}

What if we want to store pair of values instead of creating any custom classes? We can use KeyValue pair class;

var parentList = new List<KeyValuePair<string, string>>()
{
    new KeyValuePair<string, string>("v2-2021", "'cover page$'"),
    new KeyValuePair<string, string>("v2-2021", "'i# milestones$'"),
    new KeyValuePair<string, string>("v2-2021", "'ii# tasks$'"),
    new KeyValuePair<string, string>("v2-2021", "'iii# spendplan$'"),
};
var parentSubList = new List<KeyValuePair<string, string>>()
{
    new KeyValuePair<string, string>("v2-2021", "'cover page$'"),
    new KeyValuePair<string, string>("v2-2021", "'i# milestones$'"),
    new KeyValuePair<string, string>("v2-2021", "'ii# tasks$'"),
};
var exceptList1 = parentSubList.Except(parentList);
Console.WriteLine($"\nparentSubList->parentList: Value in second list that are not in first List");
foreach (var val in exceptList1)
{
    Console.WriteLine($"{val}");
}
IsASubset = parentSubList.All(i => parentList.Contains(i));
Console.WriteLine($"\nparentSubList->parentList: all members of subset (parentSubList) exists in list1 (parentList): {IsASubset}");
}

KeyValue pair class can also be used like this;

var myList = new List<KeyValuePair<string, string>>();
//add elements now
myList.Add(new KeyValuePair<string, string>("v2-2021", "'cover page$'"));
myList.Add(new KeyValuePair<string, string>("v2-2021", "'i# milestones$'"));
myList.Add(new KeyValuePair<string, string>("v2-2021", "'ii# tasks$'"));
foreach (var val in myList)
{
    Console.WriteLine($"Another style: {val}");
}

LINQ methods, for example Except can be used without implementing any Comparer classes.

Using LINQ methods to compare objects of custom type

I have a master list;

v2-2021 – ‘cover page$’
v2-2021 – ‘i# milestones$’
v2-2021 – ‘ii# tasks$’
v2-2021 – ‘iii# spendplan$’

I have a sub list;

v2-2021 – ‘cover page$’
v2-2021 – ‘i# milestones$’
v2-2021 – ‘ii# tasks$’

I want to make sure that all elements in my sub list exists in master list.

To solve this i have created this class;

internal class ExcelVersions
{
    public string VersionNumber { get; set; }
    public string TableName { get; set; }

}

I have created following objects based on this class;

List<ExcelVersions> cfirstList = new List<ExcelVersions>
{
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'cover page$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'i# milestones$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'ii# tasks$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'iii# spendplan$'" }
            };
            List<ExcelVersions> csecondList = new List<ExcelVersions>
            {
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'cover page$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'i# milestones$'" },
                new ExcelVersions { VersionNumber = "v2-2021", TableName = "'ii# tasks$'" }
            };
            //var cexceptList = csecondList.Except(cfirstList, new ExcelVersionsComparer());
            var cexceptList = csecondList.Except(cfirstList);
            Console.WriteLine($"\ncSecondList-->cFirstList: Value in second list that are not in first List");
            foreach (var val in cexceptList)
            {
                Console.WriteLine($"{val.TableName}");
            }
            IsASubset = csecondList.All(i => cfirstList.Contains(i));
            Console.WriteLine($"\ncSecondList-->cFirstList: all members of subset (cSecondList) exists in list1 (cFirstList): {IsASubset}");
}

This is the result i get;

To my surprise, none of LINQ comparison method worked on custom class. What’s wrong? The answer is in the LINQ implementation. To be correctly processed by the Except method, a type must implement the IEquatable<T> interface and provide its own Equals and GetHashCode methods.

Re-writing out custom type;

internal class ExcelVersions : IEquatable<ExcelVersions>
{
        public string VersionNumber { get; set; }
        public string TableName { get; set; }

        public bool Equals(ExcelVersions other)
        {
            //check whether the compare object is null
            if (Object.ReferenceEquals(other, null)) return false;
            //check whether the compared object references the same data
            if (Object.ReferenceEquals(this, other)) return true;
            //check whether the object's properteis are equal
            return VersionNumber.Equals(other.VersionNumber) && TableName.Equals(other.TableName);
        }

        //if Equals returns true for a pair of objects
        //GetHashCode must return the same value for these objects
        public override int GetHashCode()
        {
            //Get the hash code for the version number
            int hashVersionNumber = VersionNumber == null ? 0 : VersionNumber.GetHashCode();
            //get the hash code for the table name
            int hashTableName = TableName.GetHashCode();

            //calculate the hash code for the object
            return hashVersionNumber ^ hashTableName;
    }
}

This time the results are;

OK. Custom class is working but what if we cannot modify the type? What if it was provided by a library and we have no way of implementing the IEquiatable<T> interface. The answer is to create our own equality comparer and pass it as a parameter to the Except method.

The equality comparer must implement the IEqualityComparer<T> interface and provide GetHashCode and Equals method like this;

internal class ExcelVersionsComparer : IEqualityComparer<ExcelVersions>
{
        public bool Equals(ExcelVersions x, ExcelVersions y)
        {
            if (Object.ReferenceEquals(x, y))
                return true;

            if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
                return false;
            return x.Equals(y);
        }

        public int GetHashCode(ExcelVersions excelVersion)
        {
            if (Object.ReferenceEquals(excelVersion, null)) return 0;

            int hashVersion = excelVersion.VersionNumber == null ? 0 : excelVersion.GetHashCode();
            int hashTable = excelVersion.TableName.GetHashCode();

            return hashVersion ^ hashTable;
        }
}

This is how we are going to pass the comparer to the Except method;

var cexceptList = csecondList.Except(cfirstList, new ExcelVersionsComparer());

These rules don’t just apply to Except method. For example, the same is true for the Distinct, Contains, Interset and Union methods. Generally, if you see that a LINQ method has an overload that accepts the IEqualityComparer<T> parameter, means that to use it with your own data type, you need to either implement IEquatable<T> in your class or create your own equality comparer.

If you want to use built-in class instead of creating custom class, consider this class;

Reference

https://docs.microsoft.com/en-us/archive/blogs/csharpfaq/how-to-use-linq-methods-to-compare-objects-of-custom-types

https://stackoverflow.com/questions/16824749/using-linq-except-not-working-as-i-thought

https://grantwinney.com/how-to-compare-two-objects-testing-for-equality-in-c/

https://www.tutorialspoint.com/how-to-find-items-in-one-list-that-are-not-in-another-list-in-chash


SSIS: Read Excel Tables

Define two variables; ExcelFile –> String and ExcelTables –> Object. Drop a script task on designer surface;

Here is the script;

public void Main()
{
            string excelFile;
            string connectionString;
            OleDbConnection excelConnection;
            DataTable tablesInFile;
            int tableCount = 0;
            string currentTable;
            int tableIndex = 0;

            //string[] excelTables = new string[5];
            string[] excelTables = new string[20];

            excelFile = Dts.Variables["ExcelFile"].Value.ToString();
            connectionString = $"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={excelFile};Extended Properties=Excel 12.0";
            excelConnection = new OleDbConnection(connectionString);
            excelConnection.Open();
            tablesInFile = excelConnection.GetSchema("Tables");
            tableCount = tablesInFile.Rows.Count;

            foreach (DataRow tableInFile in tablesInFile.Rows)
            {
                currentTable = tableInFile["TABLE_NAME"].ToString();
                char lastCharacter = currentTable[currentTable.Length - 2];
                if (lastCharacter == '$')
                {
                    excelTables[tableIndex] = currentTable;
                    tableIndex += 1;
                }
            }

            Dts.Variables["ExcelTables"].Value = excelTables;

            Dts.TaskResult = (int)ScriptResults.Success;
}

You can display object variables values using this;

string[] tablesInFile = (string[])Dts.Variables["ExcelTables"].Value;
            foreach (string tableInFile in tablesInFile)
            {
                results += " " + tableInFile + EOL;
            }

            MessageBox.Show(results, "Results", MessageBoxButtons.OK, MessageBoxIcon.Information);

Azure DevOps Commit error: Object reference not set to an instance of an object

I am using Microsoft SQL Server Data Tools for Visual Studio 2017 (SSDT) to develop SSIS packages. Recently i started getting “Object reference not set to an instance of an object” error on commits to Azure DevOps. This is what i did to resolve my errors;

  1. Close Visual Studio
  2. Navigate to “C:\Users\[User Folder]\AppData\Local\Microsoft\Team Foundation\7.0\Cache” folder. Delete all contents.
  3. Restart Visual Studio and try to commit.

All Excel – Programmatically Reading/Writing

ERR: External table is not in the expected format;

The error might be the result of wrong extended properties in connection string. Check code examples below;

Following code will help you to open *.xls files;

string connString = string.Format("Provider = Microsoft.ACE.OLEDB.12.0; Data Source ={0}; Extended Properties = 'EXCEL 8.0 XML;HDR=YES;IMEX=1'", fullPathToExcel);
using (OleDbConnection conn = new OleDbConnection(connString))
{
   conn.Open();
  //...any other instruction
}

Following code will help you to open *.xlsx files;

string connString = string.Format("Provider = Microsoft.ACE.OLEDB.12.0; Data Source ={0}; Extended Properties = 'EXCEL 12.0 XML;HDR=YES;IMEX=1'", fullPathToExcel);
using (OleDbConnection conn = new OleDbConnection(connString))
{
   conn.Open();
  //...any other instruction
}

Excel ADO.NET properties for different excel version are;

Excel 97-2003 Workbook (.XLS) –> “Excel 80”

Excel 2007-2010 Workbook (.XLSX) –> “Excel 12.0 XML”

Excel 2007-2010 Macro-enabled Workbook (.XLSM) –> “Excel 12.0 MACRO”

This is how you can run a select query;

"SELECT * FROM [StoreSheet$A13:I]"

If you are not able to solve External table is not in the expected format error, consider changing driver. try NPOI or ExcelDataReader.

ExcelDataReader Library

This library doesn’t work with .NET core.

EPPlus Library

EPPlus library works with .NET core.

Working with Excel Files with the help of Script Task

Integration Services provides the Excel connection manager, Excel source, and Excel destination for working with data stored in spreadsheets in the Microsoft Excel file format. The techniques described in this topic use the Script task to obtain information about available Excel databases (workbook files) and tables (worksheets and named ranges).

You can read more here.

How to unprotect the Excel sheet

https://techcommunity.microsoft.com/t5/excel/how-to-unprotect-the-excel-sheet-if-forgot-the-password/m-p/1574559

Common variables and Namespaces used to read Excel files

Variables. Open the Variables window and define the following variables:
1- ExcelFile, of type String. Enter the complete path and filename to an existing Excel workbook.
2 -ExcelTable, of type String. Enter the name of an existing worksheet or named range in the workbook
named in the value of the ExcelFile variable. This value is case-sensitive.
3 – ExcelFileExists, of type Boolean.
4 – ExcelTableExists, of type Boolean.
5 – ExcelFolder, of type String. Enter the complete path of a folder that contains at least one Excel workbook.
6 – ExcelFiles, of type Object.
7 – ExcelTables, of type Object.

Imports statements
You will need following .NET Framework namespaces;
1 – System.IO, for file system operations.
2- System.Data.OleDb, to open Excel files as data sources.

References.
The code that read schema information from Excel files require an additional reference in the script project to the System.Xml namespace.