Comparing tables is a frequent task in data analysis, especially when working with Power BI and Power Query M. The need to Auto Compare datasets and identify discrepancies efficiently is crucial for maintaining data integrity and accuracy. This article provides two custom Power Query M functions designed to automate table comparisons, offering solutions for both general use cases and scenarios where table structure is consistent.
General Table Comparison Function (fxCom2Tables)
This function, fxCom2Tables
, is designed for flexible table comparison, even when key column case sensitivity is a concern. It performs a left outer join and then compares each corresponding field value to identify matches.
```powerquery-m
let
fxCom2Tables = (lTable as table, lKeyColNum as number, rTable as table, rKeyColNum as number, optional Comparison as nullable number) as table =>
let
BaseTable = Table.Buffer(lTable),
lKeyColName = Table.ColumnNames(BaseTable){lKeyColNum},
CompareAs = if Comparison = 1 then Comparer.OrdinalIgnoreCase else Comparer.Equals,
AddKeyL = Table.AddColumn(BaseTable, "KeyColCase", each Text.Upper(Text.From(Record.FieldValues(_){lKeyColNum})), type text),
AddKeyR = Table.AddColumn(rTable, "KeyColCase", each Text.Upper(Text.From(Record.FieldValues(_){rKeyColNum})), type text),
Merge = if Comparison = 1 then Table.NestedJoin(AddKeyL, {"KeyColCase"}, AddKeyR, {"KeyColCase"}, "Temp", JoinKind.LeftOuter) else Table.NestedJoin(BaseTable, lKeyColName, rTable, Table.ColumnNames(rTable){rKeyColNum}, "Temp", JoinKind.LeftOuter),
CompareRecord = Table.AddColumn(Merge, "Compare", each
let
ToRemoveCols = {lKeyColName} & {"Temp", "KeyColCase"}
in
if Table.IsEmpty(_[Temp]) then
List.Repeat({0}, Record.FieldCount(Record.RemoveFields(_, ToRemoveCols, MissingField.Ignore)))
else
List.Transform(List.Zip({
Record.FieldValues(Record.ReorderFields(Record.RemoveFields(_, ToRemoveCols, MissingField.Ignore), Table.ColumnNames(Merge), MissingField.Ignore)),
Record.FieldValues(Record.ReorderFields(_[Temp]{0}, ToRemoveCols, MissingField.Ignore), Table.ColumnNames(Merge), MissingField.Ignore))}), each try if _{0} = _{1} then 1 else 0 otherwise 0)
),
tResult = Table.SplitColumn(
Table.TransformColumns(
Table.SelectColumns(CompareRecord, {lKeyColName} & {"Compare"}),
{"Compare", each Text.Combine(List.Transform(_, Text.From), "||"), type text}
),
"Compare",
Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv),
List.Select(List.Difference(Table.ColumnNames(Merge), {"Temp", "KeyColCase"}), each _ <> lKeyColName)
),
Documentation = [
Documentation.Name = "fxCompare2Tables",
Documentation.LongDescription = "Returns 0 for each field value that doesn't match and a 1 for each field value that does match.",
Documentation.Category = "Table",
Documentation.Author = "Melissa de Korte",
Documentation.Examples = {
[
Description = " ",
Code = "How variables should be used. #(lf) lKeyColNum and rKeyColNum as zero based position of the Key Column #(lf) optional Comparison as number if set to 1 the case of Key column values will be ignored",
Result = " "
]
}
]
in
Value.ReplaceType(fxCom2Tables, Value.ReplaceMetadata(Value.Type(fxCom2Tables), Documentation))
This function takes two tables (lTable
, rTable
), the key column index for each table (lKeyColNum
, rKeyColNum
), and an optional Comparison
parameter. Setting Comparison
to 1
disables case sensitivity for key column comparison. The output table includes the key column and additional columns showing ‘1’ for matching field values and ‘0’ for mismatches between the input tables.
Output of the fxCom2Tables Power Query M function, showing a comparison result with 1 for matches and 0 for mismatches between two tables.
Optimized Table Comparison Function for Identical Structures
If your tables consistently have the same key field value order and column order, a more performant approach utilizes list functions. The fxCompare2Tables
function below leverages this optimization.
```powerquery-m
let
fxCompare2Tables = (lTable as table, lKeyColNum as number, rTable as table, rKeyColNum as number) as table =>
let
lKeyColName = Table.ColumnNames(lTable){lKeyColNum},
rKeyColName = Table.ColumnNames(rTable){rKeyColNum},
tResult = Table.FromColumns(
{Table.ToColumns(lTable){lKeyColNum}} &
List.Transform(
List.Zip({
Table.ToColumns(Table.RemoveColumns(lTable, lKeyColName, MissingField.Ignore)),
Table.ToColumns(Table.RemoveColumns(rTable, rKeyColName, MissingField.Ignore))
}),
each List.Transform(List.Zip({_{0}, _{1}}), (i) => try if i{0} = i{1} then 1 else 0 otherwise 0)
),
{lKeyColName} & Table.ColumnNames(Table.RemoveColumns(lTable, lKeyColName, MissingField.Ignore))
),
Documentation = [
Documentation.Name = "fxCompare2Tables",
Documentation.LongDescription = "Returns 0 for each field value that doesn't match and a 1 for each field value that does match.",
Documentation.Category = "Table",
Documentation.Author = "Melissa de Korte",
Documentation.Examples = {
[
Description = " ",
Code = "How variables should be used. #(lf) lKeyColNum and rKeyColNum as zero based position of the Key Column",
Result = " "
]
}
]
in
Value.ReplaceType(fxCompare2Tables, Value.ReplaceMetadata(Value.Type(fxCompare2Tables), Documentation))
This streamlined function assumes consistent table structures and directly compares corresponding list elements after removing the key columns. It offers improved performance in scenarios where structural consistency is guaranteed.
By using these fxCompare2Tables
functions, you can effectively auto compare tables within Power Query M, gaining insights into data differences and ensuring data quality in your Power BI reports and data workflows. Choose the function that best suits your table structure and comparison needs for optimal results.