How to search just the first two characters in a c-string?

  • Thread starter: yungman

Discussion Overview

The discussion revolves around searching for the first one or two characters in C-style strings (c-strings) within a vector of structures. Participants explore methods for comparing these characters to handle potential misspellings in last names, particularly in the context of Chinese surnames. The conversation includes technical challenges, coding exercises, and the use of specific string functions.

Discussion Character

  • Technical explanation
  • Exploratory
  • Debate/contested
  • Meta-discussion

Main Points Raised

  • One participant expresses a need to search only the first one or two characters of last names to avoid issues with misspellings.
  • Another participant suggests using the strncmp function or manually comparing characters as potential solutions.
  • A participant mentions the Soundex algorithm as an interesting coding exercise related to the problem.
  • Several participants discuss the availability and discoverability of the strncmp function, with some noting it was not included in their textbooks.
  • There are comments on the differences in search experiences on Google, with some participants suggesting that familiarity with terms can affect search results.
  • Discussion includes historical references to the origins of C string functions and their presence in programming literature.
  • Some participants reflect on the challenges of learning C++ and the usefulness of documentation versus web searches.

Areas of Agreement / Disagreement

There is no consensus on a single best approach to the problem, with multiple methods proposed and varying opinions on the discoverability of relevant functions. The discussion remains unresolved regarding the optimal solution for the initial character comparison.

Contextual Notes

Participants mention limitations in their textbooks and the evolution of string functions over time, indicating that some functions may not be covered in older materials.

yungman
I am working on the part of the program that searches through a vector of structures that have a c-string (last name) as a member. I decided NOT to search the complete string, because if someone spells the last name wrong the search will fail to find a match. So I want to match on only the first character or the first two characters.

Therein lies the problem. It's easy to do a string compare, strcmp(string1, string2) == 0, to find a match. But I need to get the first one or two characters out for comparison, and I don't know an easy way to extract them. I used to be able to use strncpy(str1, str2, 2) to copy the first two characters out, BUT sadly VS flags an error and insists on using strncpy_s(). strncpy_s() will not allow copying from a long string to a short one and truncating the rest.

Of course, I can use a loop to copy out one character at a time. But that seems stupid. Is there an elegant way to compare the first two characters?

For example, if I search for "ch", I want to find last names like "chan", "chang", "chen", "chin", "chu" etc. Don't laugh, these are all REAL Chinese last names. I don't want anyone to miss "chen" if they typed in "chan", because both sound exactly the same in Chinese. It's an easy mistake.

Thanks

Last resort is
C++:
#include <iostream>
#include <cstring>
#include <iomanip>
using namespace std;
int main()
{    int i;
    char shortA[3], longA[] = "chang";
    for (i = 0; i < 2; i++)
        shortA[i] = longA[i];    // copy the first two characters
    shortA[i] = '\0';            // terminate with the null character, not the NULL pointer macro
    cout << shortA << "\n\n";
    return 0;
}
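For anyone who wants the copy without writing the loop by hand, here is a minimal sketch of a helper that copies at most n characters and always null-terminates. The name copyPrefix and its parameters are made up for illustration; it is not a standard library function and not what VS offers.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a standard function): copies at most n characters
// of src into dest and always null-terminates when destSize > 0.
// Returns true if the whole requested prefix fit, false if it was cut short.
bool copyPrefix(char* dest, std::size_t destSize, const char* src, std::size_t n) {
    if (destSize == 0) return false;        // no room even for the terminator
    std::size_t len = std::strlen(src);
    if (len > n) len = n;                   // want at most n characters
    bool fits = len < destSize;             // need one extra byte for '\0'
    if (!fits) len = destSize - 1;          // truncate to what dest can hold
    std::memcpy(dest, src, len);
    dest[len] = '\0';
    return fits;
}
```

With char shortA[3], a call like copyPrefix(shortA, sizeof shortA, "chang", 2) should leave "ch" in shortA.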
 
yungman said:
I decided NOT to search the complete string, because if someone spells the last name wrong the search will fail to find a match.

For an interesting coding exercise in addressing this issue you may consider implementing the Soundex algorithm.
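
As a rough starting point for that exercise, here is a sketch of the common "American Soundex" rules as I understand them: keep the first letter, encode the following consonants as digits, skip vowels, collapse repeated codes, and pad to four characters. The function names are my own, and other Soundex variants differ in the details.

```cpp
#include <cctype>
#include <string>

// Soundex digit for a letter; '0' means "not coded" (vowels, h, w, y).
char soundexDigit(char c) {
    switch (std::tolower(static_cast<unsigned char>(c))) {
        case 'b': case 'f': case 'p': case 'v': return '1';
        case 'c': case 'g': case 'j': case 'k':
        case 'q': case 's': case 'x': case 'z': return '2';
        case 'd': case 't': return '3';
        case 'l': return '4';
        case 'm': case 'n': return '5';
        case 'r': return '6';
        default: return '0';
    }
}

// Returns a 4-character code such as "C500", or "" if name has no letters.
std::string soundex(const std::string& name) {
    std::string code;
    char prev = '0';                          // digit of the previous coded letter
    for (char ch : name) {
        unsigned char u = static_cast<unsigned char>(ch);
        if (!std::isalpha(u)) continue;       // ignore spaces, hyphens, etc.
        char digit = soundexDigit(ch);
        if (code.empty()) {                   // first letter is kept as-is
            code += static_cast<char>(std::toupper(u));
            prev = digit;
            continue;
        }
        char lower = static_cast<char>(std::tolower(u));
        if (lower == 'h' || lower == 'w') continue;   // h/w do not separate equal codes
        if (digit != '0' && digit != prev) code += digit;
        prev = digit;                         // a vowel resets prev to '0'
        if (code.size() == 4) break;
    }
    if (!code.empty()) code.append(4 - code.size(), '0');  // pad with zeros
    return code;
}
```

Under these rules "chan", "chen" and "chin" all come out with the same code, which is the kind of match the original question is after.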
 
  • Like
Likes   Reactions: jedishrfu
jedishrfu said:
there's two ways:

strncmp or manually via a simple if statement comparing each character.

https://www.geeksforgeeks.org/stdstrncmp-in-c/
Thank you so much. strncmp() works. It's NOT in the book! The book only has strcmp().
C++:
//experiment comparing the first two characters of c-strings
#include <cctype>
#include <iostream>
#include <cstring>
#include <iomanip>
using namespace std;
int main()
{    char shortA[80], longA[] = "chang";    // roomy buffer so extra typed characters don't put cin in a fail state
    char more = 'n';
    do
     {
        cout << " Type first two character of the name to search: ";
        cin.getline(shortA, 80); cout << "\n\n";
        int comp = strncmp(shortA, longA, 2);    // compares at most the first 2 characters
        if (comp > 0) cout << " The name you enter is greater than 'ch' ";
        if (comp == 0) cout << " The name you enter is equal to 'ch' ";
        if (comp < 0) cout << " The name you enter is smaller than 'ch' ";
        cout << "\n\n";
        cout << " You want to do it again? ";

        cin.get(more); cout << "\n\n";
        cin.ignore();    // discard the newline left after the answer
     }while(tolower(more) == 'y');    // accept 'y' or 'Y'
    return 0;
}
thanks
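
For the original goal (searching a vector of structures by the first one or two characters of the last name), the same strncmp call can be wrapped in a small search function. This is only a sketch: the Person struct, the lastName field and findByPrefix are hypothetical names, since the actual structure was not posted.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical record type; the real program's structure was not shown.
struct Person {
    char lastName[32];
};

// Returns the indices of all entries whose lastName starts with prefix.
// strncmp stops after strlen(prefix) characters, so nothing is copied.
std::vector<std::size_t> findByPrefix(const std::vector<Person>& people,
                                      const char* prefix) {
    std::vector<std::size_t> hits;
    const std::size_t n = std::strlen(prefix);
    for (std::size_t i = 0; i < people.size(); ++i) {
        if (std::strncmp(people[i].lastName, prefix, n) == 0)
            hits.push_back(i);
    }
    return hits;
}
```

Searching for "ch" in a list holding "chan", "lee", "chen" and "chu" should return the positions of the three "ch" names. Note that the comparison is case-sensitive, so "Chan" would not match "ch" unless the names are normalized first.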
 
Ahh but it’s in the book of Google. :-)
 
jedishrfu said:
Ahh but it’s in the book of Google. :-)
Only if you know what you are looking for. When I typed "compare first two characters of two c-strings" nothing came up that was relevant. I did not know strncmp() existed.
 
yungman said:
When I typed "compare first two characters of two c-strings" nothing came up that was relevant.
Come on. Use your imagination. What if you searched "C string functions"?

Or inventing and writing your own function that does what strncmp does?
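
A hand-written version might look roughly like the sketch below. The name myStrncmp is made up, and this is not how any particular library implements strncmp; it only follows the documented behavior (compare at most n characters as unsigned char, stop at the first difference or at a null character).

```cpp
#include <cstddef>

// Hypothetical stand-in for strncmp: negative, zero or positive result
// depending on how the first n characters of a and b compare.
int myStrncmp(const char* a, const char* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        unsigned char ca = static_cast<unsigned char>(a[i]);
        unsigned char cb = static_cast<unsigned char>(b[i]);
        if (ca != cb) return ca < cb ? -1 : 1;   // first difference decides
        if (ca == '\0') break;                   // both strings ended early
    }
    return 0;                                    // equal over the first n characters
}
```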
 
  • Like
Likes   Reactions: Klystron and Vanadium 50
yungman said:
Only if you know what you are looking for. When I typed "compare first two characters of two c-strings" nothing came up that was relevant. I did not know strncmp() existed.

strncmp is the second hit for me when I search for this (I did not include quotes).
 
  • Like
Likes   Reactions: Vanadium 50
yungman said:
Thank you so much. strncmp() works. It's NOT in the book!
Something better than doing a web search is looking at the documentation. If you're working with C-strings, you should look at the functions declared in the cstring header (AKA string.h).
strncmp is listed in http://cplusplus.com/reference/cstring/
 
  • Like
Likes   Reactions: yungman and Vanadium 50
  • #10
To be fair, your google search experience may be different from mine considering I search for software related stuff a lot. Google does optimize searches to favor your preferences while making sure you see what they want you to see in your region of the world and even country.

Also, string functions, while ubiquitous in C programs, are not the best at handling international character sets like UTF-8. For that kind of string processing there are other functions better suited to national language support.
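
To illustrate the UTF-8 point: strncmp counts bytes, not characters, so "the first two characters" of a name written in Chinese characters can span six bytes. The sketch below assumes the input is valid UTF-8, and the helper names are made up for illustration; real code would more likely use a proper Unicode library.

```cpp
#include <cstddef>
#include <cstring>

// Number of bytes occupied by the first nChars code points of s.
// Assumes s is valid UTF-8; continuation bytes look like 10xxxxxx.
std::size_t utf8PrefixBytes(const char* s, std::size_t nChars) {
    std::size_t bytes = 0;
    while (s[bytes] != '\0' && nChars > 0) {
        ++bytes;                                  // lead byte of this code point
        while ((static_cast<unsigned char>(s[bytes]) & 0xC0) == 0x80)
            ++bytes;                              // skip its continuation bytes
        --nChars;
    }
    return bytes;
}

// True if a and b agree on their first nChars code points.
bool samePrefixUtf8(const char* a, const char* b, std::size_t nChars) {
    const std::size_t n = utf8PrefixBytes(a, nChars);
    return n == utf8PrefixBytes(b, nChars) && std::strncmp(a, b, n) == 0;
}
```

For plain ASCII names like "chan" the byte count and the character count are the same, so the plain strncmp approach is unaffected.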
 
  • Like
Likes   Reactions: yungman
  • #11
jedishrfu said:
To be fair, your google search experience may be different from mine considering I search for software related stuff a lot. Google does optimize searches to favor your preferences while making sure you see what they want you to see in your region of the world and even country.

Also, string functions, while ubiquitous in C programs, are not the best at handling international character sets like UTF-8. For that kind of string processing there are other functions better suited to national language support.
Exactly. If I search on electronic design, it's SO EASY, because I am in all your shoes when it comes to electronics, and I was one of the people that advised in the EE forum here 10 years ago. That's why I racked up over 4000 posts here. I am sure I would tell students who ask about EE, "why don't you google?"

Even if I don't know a specific topic in EE, I know enough to know what to look for. The key is I am new to C++. That's why, at the beginning a few months ago, it was almost USELESS to google: even if I found something, it looked like Russian to me. AND I was being accused here of "why don't you google"! Believe me, I always did first. It's getting a lot better now that I have studied 13 chapters and know the terms a lot better than before.
 
  • #12
Mark44 said:
Something better than doing a web search is looking at the documentation. If you're working with C-strings, you should look at the functions declared in the cstring header (AKA string.h).
strncmp is listed in http://cplusplus.com/reference/cstring/
I keep forgetting the book I use is 10 years old! strncmp is not in it, and neither is strncpy_s and all that. That's the only version I can get as a pdf file. Not only is it free, more importantly, I can print it out and separate it into about 10 pages each, so I can lie down and read because of my neck. It's very hard to lie down and hold the big book to read! Now I have to use another Gaddis book soon (not the brief edition) to study chapters 16 to 20, and I can't get the pdf. I am actually thinking about cutting the book apart into very thin sections so I can read lying down.
 
  • #13
I believe all the C string functions were there since like forever ie circa 1980 and earlier.

This is because C++ folks had to convince C folks to use the language. The earliest C++ was a translator on top of C ie C++ got translated to C and then compiled by C and then loaded by the loader.
 
  • Like
Likes   Reactions: Klystron
  • #14
jedishrfu said:
I believe all the C string functions were there since like forever ie circa 1980 and earlier.

Indeed. I used strncmp and other length-guarded string functions in the late 80-ties. A look at the old strncmp(3) man page indicates that, standard-wise, it was around at least in C89 (ANSI C), and the function is also mentioned in my copy of "The UNIX Programming Environment" by Kernighan and Pike from 1984, that is, before ANSI C. However, I don't see it mentioned in the 1978 "The C Programming Language" by Kernighan and Ritchie.
 
  • #15
Filip Larsen said:
I used strncmp and other length-guarded string functions in the late 80-ties.

I checked - it's not in K&R. (1st edition - but it is in the 2nd)

I have code from December of 1984 that used strncmp.
 
  • Like
Likes   Reactions: Klystron and Filip Larsen
  • #16
jedishrfu said:
I believe all the C string functions were there since like forever ie circa 1980 and earlier.
That's not the case with strcpy_s, strcat_s, strncpy_s, and a number of other string functions that are more secure (reflected by the appended _s in their names). These were added in the C11 standard, I believe.
 
  • Like
Likes   Reactions: jedishrfu
  • #17
Mark44 said:
That's not the case with strcpy_s, strcat_s, strncpy_s, and a number of other string functions that are more secure (reflected by the appended _s in their names). These were added in the C11 standard, I believe.
I think those functions are still "optional", in that they are not required to be implemented. There was talk about adding them as a core feature in C++17, but I guess not. They still aren't part of glibc. I found this out when I tried to compile one of yungman's programs. It's been a while since I googled the best way to copy c-strings. It's amazing how much disagreement there is about it. Seems snprintf is popular. I think you can securely use any of them.
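
For reference, the snprintf approach usually looks something like the sketch below. The wrapper name copyTruncated is hypothetical; std::snprintf itself is standard, writes at most destSize - 1 characters plus the terminator, and returns the length the full string would have needed.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical wrapper: copies src into dest, truncating if necessary.
// dest is always null-terminated when destSize > 0.
// Returns true only if the whole string fit without truncation.
bool copyTruncated(char* dest, std::size_t destSize, const char* src) {
    int needed = std::snprintf(dest, destSize, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < destSize;
}
```

Copying "chang" into a 3-byte buffer this way should leave "ch" in the buffer and report that truncation happened, and it compiles with glibc as well as Visual Studio.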
 
  • #18
I checked and strncmp is not in Lattice C v1.0 (1982) but was available (probably from Lattice as well as others) by late 1984. I think we can say it's been around for 37 years, +/- 3%.

And while Googling is fine, the way to check is to look at what's in <string.h>. It's not like it's a deep dark secret. It's right there on disk.

Use The Source, Luke.
 
  • Like
  • Haha
Likes   Reactions: Klystron, Jarvis323 and Mark44
  • #19
Jarvis323 said:
I think those functions are still "optional", in that they are not required to be implemented.
All the ones that I mentioned (strcpy_s et al) have been present in Visual Studio for at least the past three editions - VS 2015, VS 2017, VS 2019 - and possibly earlier. yungman is using VS 2019.

Back in about 2004, the entire Windows division at MSFT (all 7500 people) were given a two-day training session aimed at increasing security in Windows. Part of the training involved awareness of hacking vulnerabilities coming from buffer overruns. The xxx_s functions in string.h were an effort to reduce or eliminate this particular vulnerability.
Vanadium 50 said:
And while Googling is fine, the way to check is to look at what's in <string.h>.
Or <cstring> which includes string.h. I couldn't agree more about doing quasi-random web searches. First place to look is in the header documentation.
 
  • Like
  • Informative
Likes   Reactions: Vanadium 50, Klystron and Jarvis323
