A question from my data structure project

  • Thread starter: kant
  • Tags: Data, Project
AI Thread Summary
The discussion centers around how to extract words from a text input while ignoring symbols and punctuation. The original poster is attempting to read words from a file and save them into a data structure, but they face challenges with symbols like commas, periods, and hyphens that interfere with word separation. They provide a code snippet that attempts to handle punctuation but acknowledges its limitations, particularly with complex cases like words connected by hyphens. Participants suggest using string manipulation functions such as strpbrk, strcspn, and strtok, while noting that string parsing can be complicated and often lacks elegance. They recommend exploring string manipulation routines and consulting documentation for better understanding and solutions. The overall consensus is that while string parsing can be challenging, there are various tools available to assist in the process.
kant
I have an input file that contains words. I am supposed to scan this text and save each word into some data structure (not part of my question). My question is: how do I get the words but ignore the symbols? The text contains symbols like - , : ; ( ) _ + etc. Here is what I have so far:

while (fscanf(fpname, "%s", stun) == 1)
{
******
****
***
pname= AllocateName( stun);
*******
*****
***
*

}
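/* here is my allocate name function */

char *AllocateName(char *stun)
{
    char *name;
    char let;
    size_t num;

    num = strlen(stun);
    /* strip one trailing '.', ',' or ':' (skip this for an empty string) */
    if (num > 0)
    {
        let = stun[num - 1];
        if (let == '.' || let == ',' || let == ':')
        {
            stun[num - 1] = '\0';
        }
    }
    /* room for the characters plus the terminating '\0' */
    if (!(name = malloc(strlen(stun) + 1)))
    {
        printf("problem allocating name\n");
        exit(2);
    }
    strcpy(name, stun); /* copy the word; strcmp would only compare */

    return name;
}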

Yes, yes... It only handles words that end with a comma, a period, or a colon.
So it is not a perfect solution. There can be words like:

(log)(base4)(12) <-- considered one word

but not this:

I have the utter-most-hatred-this-lab, where "utter", "most", "hatred", "this", "lab" are considered individual words, without the god damn '-'. In other words, if I save "utter-most-hatred-this-lab" as a string in stun, then it must be broken up into some unknown number of pieces, each individually allocated on the heap! Unless I am to use [^!@#$%^&*()]. What is an elegant way to god damn do this??
 
Well, to be elegant you have to first stop swearing. :-p

Do you know the strpbrk function? Or strcspn? There's also strtok, but it has issues.
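To make that concrete, here is a rough sketch of the strspn/strcspn approach (strspn is the companion of strcspn). The helper name split_words and the DELIMS set are made up for illustration. Parentheses are deliberately left out of DELIMS so that "(log)(base4)(12)" stays one word, which is only my reading of the assignment, so adjust the set as needed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed delimiter set; change it to match what the lab counts as a symbol. */
static const char DELIMS[] = " \t\r\n-,.:;_+!?\"";

/* Hypothetical helper: stores up to max_words heap-allocated copies of the
 * words found in line into words[] and returns how many were stored.
 * The caller is responsible for freeing each word. */
size_t split_words(const char *line, char *words[], size_t max_words)
{
    size_t count = 0;
    const char *p = line;

    assert(line != NULL && words != NULL);

    while (count < max_words) {
        p += strspn(p, DELIMS);            /* skip any run of delimiters */
        if (*p == '\0')
            break;                         /* nothing left but delimiters */

        size_t len = strcspn(p, DELIMS);   /* length of the next word */
        char *w = malloc(len + 1);
        if (w == NULL) {
            fprintf(stderr, "problem allocating name\n");
            exit(2);
        }
        memcpy(w, p, len);
        w[len] = '\0';
        words[count++] = w;
        p += len;                          /* continue after this word */
    }
    return count;
}
```

Calling split_words on the string read by fscanf (or on a whole line read with fgets) gives one allocation per piece, so "utter-most-hatred-this-lab," comes back as five separate words with the hyphens and the comma dropped.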

There's a whole slew of useful string manipulation routines -- you should find a chapter in a book on them and just read it, or scan through all the manpages for them (they usually contain references to each other) to learn what they all do.
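For comparison, a small sketch of the strtok route and why it "has issues". The helper name split_in_place is hypothetical. strtok overwrites the delimiters in the buffer with '\0' and keeps hidden state between calls, so it cannot be used on string literals or const data, and two tokenizing loops cannot be interleaved.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits buf in place and stores pointers INTO buf
 * in words[]. Nothing is allocated, but buf is modified, and the pointers
 * are only valid for as long as buf is. Returns the number of words. */
size_t split_in_place(char *buf, const char *delims,
                      char *words[], size_t max_words)
{
    size_t count = 0;
    char *tok;

    assert(buf != NULL && delims != NULL && words != NULL);

    tok = strtok(buf, delims);          /* first call passes the buffer */
    while (tok != NULL && count < max_words) {
        words[count++] = tok;
        tok = strtok(NULL, delims);     /* later calls pass NULL */
    }
    return count;
}
```

Since the words point into the original buffer, each one still has to be copied (for example with AllocateName) before the buffer is reused for the next fscanf.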

You can even do a google search for "man strpbrk" :biggrin:


But, the whole lesson is: string parsing is not elegant. Deal with it. :smile:
 