
How to Compare Strings in C: A Detailed Guide with `strcmp()`

In C, a string is not a distinct data type; it is a null-terminated array of char used to represent text. Comparing strings is a common operation in many applications, from sorting lists of names to validating user input. The C standard library provides a function called strcmp() to perform lexicographical comparison between two strings. This article explains the syntax of strcmp(), how it works, and how to use it, with practical examples.

Understanding strcmp() Function in C

The strcmp() function is part of the C standard library and is declared in the <string.h> header file. Its name stands for “string compare”: it compares two strings, character by character, to determine their lexicographical order.

Syntax of strcmp()

The syntax of the strcmp() function is straightforward:

int strcmp(const char *str1, const char *str2);

Parameters:

  • str1: A pointer to the first string to be compared. It is of type const char *, indicating that the function expects a null-terminated string and will not modify it.
  • str2: A pointer to the second string to be compared, also of type const char *.

Return Value:

The strcmp() function returns an integer whose sign reflects the comparison result. The exact nonzero value is not specified and can differ between implementations:

  • 0 (Zero): Returns zero if str1 is identical to str2. This means both strings have the same sequence of characters.
  • Greater than 0 (Positive): Returns a positive value if str1 is lexicographically greater than str2. This indicates that str1 would come after str2 in dictionary order.
  • Less than 0 (Negative): Returns a negative value if str1 is lexicographically less than str2. This indicates that str1 would come before str2 in dictionary order.

How strcmp() Works: Lexicographical Comparison

The strcmp() function operates by comparing the two input strings lexicographically. This means it compares the strings character by character, based on their character codes (each character is interpreted as an unsigned char; on most systems these are the ASCII values), until a difference is found or the end of either string is reached (indicated by the null terminator '\0').

Here’s a step-by-step breakdown of how strcmp() performs the comparison:

  1. Character-by-Character Comparison: strcmp() starts by comparing the first character of str1 with the first character of str2.
  2. ASCII Value Check: It compares the ASCII values of these characters.
  3. Equality: If the characters are the same, strcmp() proceeds to compare the next characters in both strings.
  4. Difference Found: If the characters are different, strcmp() determines which string is lexicographically greater based on the ASCII values.
    • If the character in str1 has a higher ASCII value than the character in str2, strcmp() returns a positive value.
    • If the character in str1 has a lower ASCII value than the character in str2, strcmp() returns a negative value.
  5. Null Terminator Encountered: If strcmp() reaches the null terminator in both strings simultaneously without finding any differences, it means the strings are identical, and it returns 0.
  6. String Length Difference: If one string is a prefix of another (e.g., “Geek” and “Geeks”), the shorter string is considered lexicographically smaller.

In essence, strcmp() resembles the way words are ordered in a dictionary, but it orders strictly by character code rather than by language rules. For example, ‘A’ comes before ‘B’, and ‘a’ comes before ‘b’. In ASCII, however, every uppercase letter comes before every lowercase letter, so “Zebra” compares as smaller than “apple”. Digits and special characters also have their own positions in the ASCII table, which influence the lexicographical order.

Examples of strcmp() in C

Let’s explore practical examples to understand the behavior of strcmp() in C programs.

Comparing Identical Strings

This example demonstrates how strcmp() behaves when comparing two identical strings.

#include <stdio.h>
#include <string.h>

int main(void) {
    char s1[] = "Hello";
    char s2[] = "Hello";

    // strcmp() returns 0 when both strings match exactly
    int result = strcmp(s1, s2);

    if (result == 0) {
        printf("Strings are Equal\n");
    } else {
        printf("Strings are Unequal\n");
    }

    return 0;
}

Output:

Strings are Equal

Explanation: In this code, s1 and s2 are initialized with the same string “Hello”. strcmp(s1, s2) compares them and, finding them identical, returns 0. The if condition evaluates to true, and the program prints “Strings are Equal”.

Comparing Lexicographically Greater String

This example shows how strcmp() identifies a lexicographically greater string.

#include <stdio.h>
#include <string.h>

int main(void) {
    char str1[] = "zebra";
    char str2[] = "apple";

    // The first characters differ ('z' vs 'a'), so the result is nonzero
    int result = strcmp(str1, str2);

    if (result == 0) {
        printf("Strings are Equal\n");
    } else if (result > 0) {
        printf("str1 is lexicographically greater than str2\n");
    } else {
        printf("str1 is lexicographically smaller than str2\n");
    }

    return 0;
}

Output:

str1 is lexicographically greater than str2

Explanation: Here, str1 is “zebra” and str2 is “apple”. When strcmp(str1, str2) is called, it compares ‘z’ with ‘a’. Since ‘z’ has a higher ASCII value than ‘a’, strcmp() returns a positive value. The program correctly identifies “zebra” as lexicographically greater than “apple”.

Comparing Lexicographically Smaller String

This example demonstrates the case where the first string is lexicographically smaller.

#include <stdio.h>
#include <string.h>

int main(void) {
    char string1[] = "Ball";
    char string2[] = "Cat";

    // 'B' has a lower character code than 'C', so the result is negative
    int res = strcmp(string1, string2);

    if (res == 0) {
        printf("Strings are Equal\n");
    } else if (res > 0) {
        printf("string1 is lexicographically greater than string2\n");
    } else {
        printf("string1 is lexicographically smaller than string2\n");
    }

    return 0;
}

Output:

string1 is lexicographically smaller than string2

Explanation: string1 is “Ball” and string2 is “Cat”. Comparing ‘B’ and ‘C’, ‘B’ has a lower ASCII value. Therefore, strcmp(string1, string2) returns a negative value, indicating “Ball” is lexicographically smaller than “Cat”.

Using strcmp() to Sort an Array of Strings

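One practical application of strcmp() is sorting arrays of strings. The standard library's qsort() function accepts a comparison function, and you can write one that wraps strcmp(). A wrapper is needed because qsort() passes pointers to the array elements, not the elements themselves. Despite its name, qsort() is not required by the C standard to use quicksort.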

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

// qsort() passes pointers to the array elements, so each argument
// is really a pointer to a const char * and must be dereferenced once.
int compareStrings(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

int main(void) {
    const char *names[] = {"Charlie", "Alice", "Bob", "David"};
    size_t n = sizeof(names) / sizeof(names[0]);

    qsort(names, n, sizeof(names[0]), compareStrings);

    printf("Sorted names:\n");
    for (size_t i = 0; i < n; i++) {
        printf("%s\n", names[i]);
    }

    return 0;
}

Output:

Sorted names:
Alice
Bob
Charlie
David

Explanation:

  • compareStrings function: This function is designed to be used with qsort(). It takes two pointers to const char * (string pointers), dereferences them to get the actual string pointers, and then uses strcmp() to compare the strings.
  • qsort(): This standard library function sorts an array in place. The C standard does not specify which sorting algorithm it uses. Its arguments are:
    • names: The array of strings to be sorted.
    • n: The number of elements in the array.
    • sizeof(names[0]): The size of each element (a const char * pointer).
    • compareStrings: The comparison function we defined.
  • The code sorts the names array lexicographically using strcmp() within the compareStrings function, resulting in the names being printed in alphabetical order.

C strcmp() – FAQs

When does strcmp() function return zero?

The strcmp() function returns zero when the two strings being compared are exactly identical. This means they have the same characters in the same order.

What does a positive return value from strcmp() mean?

A positive return value from strcmp() indicates that the first string (str1) is lexicographically greater than the second string (str2). In dictionary terms, str1 would appear after str2.

What does a negative return value from strcmp() mean?

A negative return value from strcmp() indicates that the first string (str1) is lexicographically smaller than the second string (str2). In dictionary terms, str1 would appear before str2.

Can strcmp() be used to compare non-string data types in C?

No, the strcmp() function is specifically designed to compare null-terminated strings (character arrays) in C. It is not intended for other data types such as integers, floats, or structures. For numeric types, use the standard comparison operators (e.g., ==, <, >, <=, >=). Structures cannot be compared with these operators in C, so they are compared member by member.

Conclusion

The strcmp() function is an essential tool for working with strings in C. It provides a simple way to compare strings lexicographically, which is needed for tasks like sorting, searching, and data validation. The key points are that both arguments must be null-terminated strings, that only the sign of the return value is meaningful, and that the ordering follows character codes rather than dictionary rules for case.
