Comparing strings is a fundamental operation in programming, and C# offers a rich set of tools to perform these comparisons effectively. Whether you need to determine if two strings are equal or establish their sort order, understanding the nuances of string comparison in C# is crucial. This guide covers the main aspects of comparing strings in C#, so you can choose the most appropriate method for your specific needs.
When you compare strings, you are essentially trying to answer one of two core questions:
- Equality: “Are these two strings the same?”
- Sorting Order: “In what order should these strings be arranged when sorted?”
However, these seemingly simple questions become complex due to several factors that influence string comparisons:
- Ordinal vs. Linguistic Comparison: Do you want to compare strings based on their binary values (ordinal) or according to language-specific rules (linguistic)?
- Case Sensitivity: Should the comparison be case-sensitive (distinguishing between “string” and “String”) or case-insensitive (treating them as the same)?
- Culture-Specific Comparisons: Should the comparison be tailored to a specific culture (e.g., English, German, etc.) or use a culture-invariant approach?
- Platform and Culture Dependency (Linguistic Comparisons): Linguistic comparisons can behave differently across different cultures and platforms.
C# provides the System.StringComparison enumeration to address these choices, offering distinct comparison types:
- CurrentCulture: Uses culture-sensitive sort rules based on the current culture. This is suitable for displaying sorted lists to end-users in a localized application.
- CurrentCultureIgnoreCase: Similar to CurrentCulture, but ignores case differences during comparison.
- InvariantCulture: Employs culture-sensitive sort rules based on the invariant culture. The invariant culture is not tied to any particular language or region, which makes it useful when linguistically ordered results must not vary with the user’s locale settings.
- InvariantCultureIgnoreCase: Like InvariantCulture, but performs case-insensitive comparisons.
- Ordinal: Performs a fast, binary comparison, examining the numeric Unicode values of each char in the strings. It’s case-sensitive and doesn’t consider linguistic rules. Ordinal comparison is ideal for performance-critical operations or when linguistic relevance is not needed, such as comparing identifiers or file paths in a technical context.
- OrdinalIgnoreCase: Similar to Ordinal, but ignores case, using the casing rules of the invariant culture.
Understanding these StringComparison options is key to performing string comparisons in C# accurately and efficiently. Let’s explore these comparison types in detail with practical examples.
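As a quick orientation before the detailed sections, the following minimal sketch runs one pair of ASCII strings through every member of the enumeration. The variable names are illustrative only. For this pair, the three IgnoreCase members would be expected to report the strings as equal and the others as not equal.

```csharp
using System;

string lower = "hello";
string upper = "HELLO";

// Try every StringComparison member on the same pair of strings.
foreach (StringComparison kind in Enum.GetValues(typeof(StringComparison)))
{
    bool equal = string.Equals(lower, upper, kind);
    Console.WriteLine($"{kind,-26} -> {(equal ? "equal" : "not equal")}");
}
```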
Understanding Default Ordinal Comparisons in C#
By default, certain common string operations in C# utilize ordinal comparisons. Let’s examine this with an example:
string root = @"C:\users";
string root2 = @"C:\Users";

// Equals with no StringComparison argument performs an ordinal comparison.
bool result = root.Equals(root2);
Console.WriteLine($"`Equals` (default - Ordinal) comparison: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");

// Passing StringComparison.Ordinal states the same intent explicitly.
result = root.Equals(root2, StringComparison.Ordinal);
Console.WriteLine($"`Equals` (Ordinal explicit) comparison: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");

Console.WriteLine($"`==` operator comparison: <{root}> and <{root2}> are {(root == root2 ? "equal" : "not equal")}");
As the output demonstrates, the default Equals method and the == operator perform ordinal comparisons. Ordinal comparison operates by comparing the binary value of each Char object in the strings. Consequently, it is inherently case-sensitive. In the example above, "C:\users" and "C:\Users" are deemed not equal because the ‘u’ and ‘U’ characters have different binary representations.
It’s important to note a subtle distinction: while equality checks (Equals, ==, !=) default to ordinal comparison, methods like String.CompareTo and String.Compare(String, String) employ a culture-aware linguistic comparison using the current culture by default. This difference can lead to unexpected behavior if you assume all default string comparisons are handled the same way. To ensure clarity and prevent ambiguity, it’s best practice to explicitly specify the StringComparison type in your code, especially when using Equals or Compare methods.
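The difference between the two defaults can be sketched as follows. This is a minimal illustration with made-up variable names, not a complete test of either API. A linguistic comparison would usually be expected to place "a" before "B", while the ordinal comparison places the uppercase letter first because its code unit value is smaller.

```csharp
using System;

string a = "a";
string b = "B";

// Equality defaults to ordinal, so these two calls express the same comparison.
bool defaultEquals = a.Equals(b);
bool ordinalEquals = a.Equals(b, StringComparison.Ordinal);

// Ordering defaults to the current culture, so state the intent explicitly.
int cultureOrder = string.Compare(a, b, StringComparison.CurrentCulture);
int ordinalOrder = string.Compare(a, b, StringComparison.Ordinal);

Console.WriteLine($"Equals default/ordinal agree: {defaultEquals == ordinalEquals}");
Console.WriteLine($"CurrentCulture sign: {Math.Sign(cultureOrder)}");
Console.WriteLine($"Ordinal sign: {Math.Sign(ordinalOrder)}"); // 1, because 'a' (97) > 'B' (66)
```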
Performing Case-Insensitive Ordinal String Comparisons
For scenarios where case should be ignored during string comparison, C# provides the StringComparison.OrdinalIgnoreCase option. You can use this with methods like String.Equals(String, StringComparison) and String.Compare(String, String, StringComparison).
Consider this code example:
string root = @"C:\users";
string root2 = @"C:\Users";

bool result = root.Equals(root2, StringComparison.OrdinalIgnoreCase);
bool areEqual = String.Equals(root, root2, StringComparison.OrdinalIgnoreCase);
int comparison = String.Compare(root, root2, comparisonType: StringComparison.OrdinalIgnoreCase);

Console.WriteLine($"Ordinal ignore case `Equals` method: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");
Console.WriteLine($"Ordinal ignore case static `Equals` method: <{root}> and <{root2}> are {(areEqual ? "equal." : "not equal.")}");

if (comparison < 0)
    Console.WriteLine($"<{root}> is less than <{root2}>");
else if (comparison > 0)
    Console.WriteLine($"<{root}> is greater than <{root2}>");
else
    Console.WriteLine($"<{root}> and <{root2}> are equivalent in order");
As the output shows, using StringComparison.OrdinalIgnoreCase treats "C:\users" and "C:\Users" as equal. These methods utilize the casing rules defined by the invariant culture when performing case-insensitive ordinal comparisons. The invariant culture provides a consistent, culture-neutral casing behavior, ensuring predictable results regardless of the user’s locale settings.
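The same option is accepted by other string methods, such as String.EndsWith. A minimal sketch, assuming a hypothetical helper named HasExtension that is not part of any library:

```csharp
using System;

// Hypothetical helper: checks a file extension without regard to case.
static bool HasExtension(string path, string extension) =>
    path.EndsWith(extension, StringComparison.OrdinalIgnoreCase);

Console.WriteLine(HasExtension(@"C:\Users\report.TXT", ".txt")); // True
Console.WriteLine(HasExtension(@"C:\Users\report.pdf", ".txt")); // False
```

Ordinal case-insensitive matching is a reasonable fit here because file extensions are technical identifiers rather than text meant to be read in a particular language.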
Exploring Linguistic String Comparisons in C#
Linguistic comparisons go beyond simple binary comparisons and consider language-specific rules and cultural conventions. Many string methods in C# (like String.StartsWith, String.IndexOf, and default String.Compare overloads) use linguistic rules based on the current culture by default. This is often referred to as “word sort order.”
In linguistic comparisons, certain Unicode characters might have assigned weights that influence sorting order. For instance, a hyphen (“-”) might have a small weight, causing “co-op” and “coop” to be placed close together in a sorted list. Some control characters might be ignored entirely. Furthermore, some Unicode characters can be equivalent to a sequence of Char instances.
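The last point can be illustrated with an accented letter written two ways. This is a minimal sketch with illustrative variable names. The ordinal result is fixed by the char sequences, while the linguistic result depends on the globalization data available at run time, so no output is claimed for it here.

```csharp
using System;

string composed = "caf\u00E9";    // 'é' as a single char (U+00E9)
string decomposed = "cafe\u0301"; // 'e' followed by a combining acute accent (U+0301)

bool ordinalEqual = string.Equals(composed, decomposed, StringComparison.Ordinal);
bool linguisticEqual = string.Equals(composed, decomposed, StringComparison.InvariantCulture);

Console.WriteLine($"Lengths: {composed.Length} vs {decomposed.Length}"); // 4 vs 5
Console.WriteLine($"Ordinal equal: {ordinalEqual}");       // False: the char sequences differ
Console.WriteLine($"Linguistic equal: {linguisticEqual}"); // depends on platform globalization data
```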
Consider the German language example:
string first = "Sie tanzen auf der Straße.";
string second = "Sie tanzen auf der Strasse.";
Console.WriteLine($"First sentence is <{first}>");
Console.WriteLine($"Second sentence is <{second}>");
bool equalInvariant = String.Equals(first, second, StringComparison.InvariantCulture);
Console.WriteLine($"The two strings are {(equalInvariant ? "linguistically" : "not linguistically")} equal (InvariantCulture).");
bool equalOrdinal = String.Equals(first, second, StringComparison.Ordinal);
Console.WriteLine($"The two strings are {(equalOrdinal ? "ordinally" : "not ordinally")} equal (Ordinal).");
string word = "coop";
string words = "co-op";
string other = "cop";
ShowComparison(word, words);
ShowComparison(word, other);
ShowComparison(words, other);
void ShowComparison(string one, string two)
{
    int compareLinguistic = String.Compare(one, two, StringComparison.InvariantCulture);
    int compareOrdinal = String.Compare(one, two, StringComparison.Ordinal);

    if (compareLinguistic < 0)
        Console.WriteLine($"<{one}> is less than <{two}> using invariant culture (linguistic)");
    else if (compareLinguistic > 0)
        Console.WriteLine($"<{one}> is greater than <{two}> using invariant culture (linguistic)");
    else
        Console.WriteLine($"<{one}> and <{two}> are equivalent in order using invariant culture (linguistic)");

    if (compareOrdinal < 0)
        Console.WriteLine($"<{one}> is less than <{two}> using ordinal comparison");
    else if (compareOrdinal > 0)
        Console.WriteLine($"<{one}> is greater than <{two}> using ordinal comparison");
    else
        Console.WriteLine($"<{one}> and <{two}> are equivalent in order using ordinal comparison");
}
In this example, the German word “Straße” (street) is compared to “Strasse”. Whether the two compare as linguistically equal depends on the platform. On Windows with NLS (before .NET 5), “ss” is treated as equivalent to the German Eszett character ‘ß’ in both the “en-US” and “de-DE” cultures, so StringComparison.InvariantCulture reports the two sentences as equal. With ICU (.NET 5 and later), the same comparison typically reports them as not equal. With StringComparison.Ordinal, they are never equal, because the binary representations of ‘ß’ and “ss” are different.
Similarly, the words “cop”, “coop”, and “co-op” are sorted differently depending on the comparison type. Linguistic comparison (using InvariantCulture in this example) places “co-op” closer to “coop”, while ordinal comparison sorts them based purely on their binary values, resulting in a different order.
It’s crucial to recognize that linguistic comparison behavior can vary across platforms and .NET versions. Prior to .NET 5, .NET globalization APIs on Windows relied on National Language Support (NLS) libraries. However, .NET 5 and later versions utilize International Components for Unicode (ICU) libraries, ensuring more consistent globalization behavior across all supported operating systems. This change can affect linguistic comparison results, especially for complex scenarios and different cultures.
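One commonly cited case of this difference is searching for a newline inside a string that contains a carriage return and line feed pair. The sketch below is illustrative and uses made-up variable names. The default, culture-sensitive search may return different values under NLS and ICU, whereas the ordinal search is stable.

```csharp
using System;

string text = "Hello\r\nworld!";

// Culture-sensitive by default: the result may differ between NLS and ICU.
int linguisticIndex = text.IndexOf("\n");

// Ordinal: locates the '\n' char by its position in the char sequence.
int ordinalIndex = text.IndexOf("\n", StringComparison.Ordinal);

Console.WriteLine($"Default (current culture): {linguisticIndex}");
Console.WriteLine($"Ordinal: {ordinalIndex}"); // 6
```

Specifying StringComparison.Ordinal for technical searches like this one avoids depending on which globalization library is in use.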
Culture-Specific String Comparisons in C#
For applications targeting specific locales or dealing with user-generated content in particular languages, culture-specific comparisons are essential. C# allows you to perform comparisons using specific cultures by utilizing the System.Globalization.CultureInfo class.
Consider comparing the same German sentences using English (en-US) and German (de-DE) cultures:
string first = "Sie tanzen auf der Straße.";
string second = "Sie tanzen auf der Strasse.";
Console.WriteLine($"First sentence is <{first}>");
Console.WriteLine($"Second sentence is <{second}>");
var en = new System.Globalization.CultureInfo("en-US");
int iEn = String.Compare(first, second, en, System.Globalization.CompareOptions.None);
Console.WriteLine($"Comparing in {en.Name} (en-US) returns {iEn}.");
var de = new System.Globalization.CultureInfo("de-DE");
int iDe = String.Compare(first, second, de, System.Globalization.CompareOptions.None);
Console.WriteLine($"Comparing in {de.Name} (de-DE) returns {iDe}.");
bool bCurrentCulture = String.Equals(first, second, StringComparison.CurrentCulture);
Console.WriteLine($"The two strings are {(bCurrentCulture ? "linguistically" : "not linguistically")} equal (CurrentCulture).");
string word = "coop";
string words = "co-op";
string other = "cop";
ShowComparison(word, words, en);
ShowComparison(word, other, en);
ShowComparison(words, other, en);
void ShowComparison(string one, string two, System.Globalization.CultureInfo culture)
{
    int compareLinguistic = String.Compare(one, two, culture, System.Globalization.CompareOptions.None);
    int compareOrdinal = String.Compare(one, two, StringComparison.Ordinal);

    if (compareLinguistic < 0)
        Console.WriteLine($"<{one}> is less than <{two}> using {culture.Name} culture (linguistic)");
    else if (compareLinguistic > 0)
        Console.WriteLine($"<{one}> is greater than <{two}> using {culture.Name} culture (linguistic)");
    else
        Console.WriteLine($"<{one}> and <{two}> are equivalent in order using {culture.Name} culture (linguistic)");

    if (compareOrdinal < 0)
        Console.WriteLine($"<{one}> is less than <{two}> using ordinal comparison");
    else if (compareOrdinal > 0)
        Console.WriteLine($"<{one}> is greater than <{two}> using ordinal comparison");
    else
        Console.WriteLine($"<{one}> and <{two}> are equivalent in order using ordinal comparison");
}
The output shows the result of comparing the German sentences under the “en-US” and “de-DE” cultures. Depending on the platform and .NET version, the two cultures may well agree for this particular pair. In general, though, linguistic comparisons are culture-sensitive, and the chosen culture can change the outcome.
Culture-sensitive comparisons are generally employed when comparing strings entered by users, as their expected sorting and comparison behavior is often dependent on their locale. Even strings with identical characters might be sorted differently based on the current thread’s culture.
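When the culture is only known at run time, for example because it comes from a user preference, the comparison can go through that culture’s CompareInfo. The following is a minimal sketch. EqualsForCulture is a hypothetical helper name, and CompareOptions.IgnoreCase is just one possible choice of options.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: case-insensitive equality under a caller-supplied culture.
static bool EqualsForCulture(string left, string right, CultureInfo culture) =>
    culture.CompareInfo.Compare(left, right, CompareOptions.IgnoreCase) == 0;

CultureInfo userCulture = CultureInfo.CurrentCulture; // or a culture chosen by the user
Console.WriteLine(EqualsForCulture("Hello", "HELLO", userCulture));
Console.WriteLine(EqualsForCulture("Hello", "Help", userCulture));
```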
Linguistic Sorting and Searching of String Arrays in C#
When working with arrays of strings, you might need to sort or search them linguistically based on the current culture. C# provides static Array methods that accept a System.StringComparer parameter to facilitate this.
Here’s how to sort an array of strings using the current culture’s linguistic rules:
string[] lines = new string[]
{
    @"c:\public\textfile.txt",
    @"c:\public\textFile.TXT",
    @"c:\public\Text.txt",
    @"c:\public\testfile2.txt"
};

Console.WriteLine("Non-sorted order:");
foreach (string s in lines)
{
    Console.WriteLine($" {s}");
}

Console.WriteLine("\nSorted order (CurrentCulture):");
Array.Sort(lines, StringComparer.CurrentCulture);
foreach (string s in lines)
{
    Console.WriteLine($" {s}");
}

Console.WriteLine("\nSorted order (Ordinal):");
Array.Sort(lines, StringComparer.Ordinal);
foreach (string s in lines)
{
    Console.WriteLine($" {s}");
}
Once an array is linguistically sorted, you can efficiently search it using binary search. Array.BinarySearch also has overloads that accept a StringComparer. Remember, binary search requires the collection to be already sorted using the same comparison rules.
string[] lines = new string[]
{
    @"c:\public\textfile.txt",
    @"c:\public\textFile.TXT",
    @"c:\public\Text.txt",
    @"c:\public\testfile2.txt"
};
Array.Sort(lines, StringComparer.CurrentCulture); // Sort linguistically first

string searchString = @"c:\public\TEXTFILE.TXT";

Console.WriteLine($"\nBinary search for <{searchString}> (CurrentCulture):");
int resultCurrentCulture = Array.BinarySearch(lines, searchString, StringComparer.CurrentCulture);
ShowWhere(lines, resultCurrentCulture);
Console.WriteLine($"{(resultCurrentCulture >= 0 ? "Found" : "Did not find")} {searchString}");

// Shown for contrast only: this comparer does not match the sort above, so the result is unreliable.
Console.WriteLine($"\nBinary search for <{searchString}> (Ordinal):");
int resultOrdinal = Array.BinarySearch(lines, searchString, StringComparer.Ordinal);
ShowWhere(lines, resultOrdinal);
Console.WriteLine($"{(resultOrdinal >= 0 ? "Found" : "Did not find")} {searchString}");
void ShowWhere<T>(T[] array, int index)
{
    if (index < 0)
    {
        index = ~index; // Bitwise complement to get the index of the next larger element
        Console.Write("Not found. Sorts between: ");
        if (index == 0)
            Console.Write("beginning of sequence and ");
        else
            Console.Write($"{array[index - 1]} and ");
        if (index == array.Length)
            Console.WriteLine("end of sequence.");
        else
            Console.WriteLine($"{array[index]}.");
    }
    else
    {
        Console.WriteLine($"Found at index {index}.");
    }
}
The ShowWhere local function provides helpful information about the search result, indicating either the index where the string is found or where it would be inserted to maintain the sorted order if not found. Notice how searching with the CurrentCulture and Ordinal StringComparers yields different results, underscoring the importance of consistency between sorting and searching comparison types.
Ordinal Sorting and Searching in C# Collections
Similar to arrays, collections like List<string> can be sorted and searched ordinally or linguistically. The List<string>.Sort method can accept a delegate to define the comparison logic. Note that String.CompareTo is culture-sensitive (it uses the current culture), so for an ordinal, case-sensitive sort use String.CompareOrdinal or a String.Compare overload with StringComparison.Ordinal. To customize the comparison further, you can use String.Compare overloads with other StringComparison values.
Here’s an example of ordinal sorting a List<string>:
List<string> lines = new List<string>
{
    @"c:\public\textfile.txt",
    @"c:\public\textFile.TXT",
    @"c:\public\Text.txt",
    @"c:\public\testfile2.txt"
};

Console.WriteLine("Non-sorted order:");
foreach (string s in lines)
{
    Console.WriteLine($" {s}");
}

Console.WriteLine("\nSorted order (Ordinal):");
lines.Sort((left, right) => String.CompareOrdinal(left, right)); // Explicit ordinal sort
foreach (string s in lines)
{
    Console.WriteLine($" {s}");
}

Console.WriteLine("\nSorted order (CurrentCulture):");
lines.Sort((left, right) => String.Compare(left, right, StringComparison.CurrentCulture)); // Explicit CurrentCulture sort
foreach (string s in lines)
{
    Console.WriteLine($" {s}");
}
Once sorted, you can use List<string>.BinarySearch for efficient searching. Again, ensure you use a comparison consistent with the sorting method.
List<string> lines = new List<string>
{
    @"c:\public\textfile.txt",
    @"c:\public\textFile.TXT",
    @"c:\public\Text.txt",
    @"c:\public\testfile2.txt"
};
lines.Sort((left, right) => String.CompareOrdinal(left, right)); // Ordinal sort

string searchString = @"c:\public\TEXTFILE.TXT";

Console.WriteLine($"\nBinary search for <{searchString}> (Ordinal):");
int resultOrdinal = lines.BinarySearch(searchString, StringComparer.Ordinal); // Explicit ordinal search
ShowWhere(lines, resultOrdinal);
Console.WriteLine($"{(resultOrdinal >= 0 ? "Found" : "Did not find")} {searchString}");

// Shown for contrast only: this comparer does not match the sort above, so the result is unreliable.
Console.WriteLine($"\nBinary search for <{searchString}> (CurrentCulture):");
int resultCurrentCulture = lines.BinarySearch(searchString, StringComparer.CurrentCulture);
ShowWhere(lines, resultCurrentCulture);
Console.WriteLine($"{(resultCurrentCulture >= 0 ? "Found" : "Did not find")} {searchString}");
void ShowWhere<T>(IList<T> collection, int index)
{
    if (index < 0)
    {
        index = ~index; // Bitwise complement to get the index of the next larger element
        Console.Write("Not found. Sorts between: ");
        if (index == 0)
            Console.Write("beginning of sequence and ");
        else
            Console.Write($"{collection[index - 1]} and ");
        if (index == collection.Count)
            Console.WriteLine("end of sequence.");
        else
            Console.WriteLine($"{collection[index]}.");
    }
    else
    {
        Console.WriteLine($"Found at index {index}.");
    }
}
Crucially, always use the same comparison type for both sorting and searching. Mixing comparison types will lead to incorrect and unpredictable results.
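One simple way to keep the two operations consistent is to hold the comparer in a single variable and pass it to both calls. A minimal sketch with illustrative data:

```csharp
using System;
using System.Collections.Generic;

// One comparer instance drives both the sort and the search.
StringComparer comparer = StringComparer.Ordinal;

var names = new List<string> { "delta", "Alpha", "charlie", "Bravo" };
names.Sort(comparer);

int found = names.BinarySearch("charlie", comparer);
int missing = names.BinarySearch("echo", comparer);

Console.WriteLine(string.Join(", ", names));                    // Alpha, Bravo, charlie, delta
Console.WriteLine($"charlie -> {found}");                       // 2
Console.WriteLine($"echo -> {missing} (insert at {~missing})"); // -5 (insert at 4)
```

Changing the comparison later then means editing one line rather than hunting down every Sort and BinarySearch call.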
Collection classes such as Hashtable, Dictionary<TKey, TValue>, and HashSet<T> have constructors that accept a System.StringComparer when the key or element type is string. (List<T> has no such constructor; pass the comparer to its Sort and BinarySearch methods instead.) Whenever possible, use these constructors and specify either StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase for good performance and predictable behavior, especially in scenarios where linguistic sorting is not a requirement.
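As a brief illustration, the sketch below builds a dictionary whose string keys are matched without regard to case. The header names are made-up sample data.

```csharp
using System;
using System.Collections.Generic;

// Keys are compared with ordinal, case-insensitive rules.
var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["Content-Type"] = "text/plain",
};

Console.WriteLine(headers.ContainsKey("content-type")); // True

headers["CONTENT-TYPE"] = "application/json"; // overwrites the existing entry
Console.WriteLine(headers.Count);             // 1
```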
Conclusion
Mastering string comparison in C# involves understanding the different comparison types offered by the StringComparison enumeration and choosing the right type for your specific scenario. Ordinal comparisons are fast and suitable for technical contexts where linguistic rules are irrelevant, while linguistic comparisons are essential for user-facing applications and scenarios requiring culture-sensitive string handling. Always be explicit in specifying the StringComparison type to avoid ambiguity and ensure your code behaves as expected across different cultures and platforms. By carefully selecting the appropriate comparison method, you can write robust and efficient C# applications that handle strings effectively.
See also
- System.StringComparison Enumeration
- System.StringComparer Class
- System.Globalization.CultureInfo Class
- Handling Globalization in .NET Applications
- International Components for Unicode (ICU)
- National Language Support (NLS)