How to search for a string in a text file using wildcards (*, ?)
Hey,
I am trying to write a C++ program that reads a text file and displays all valid e-mail addresses in it (for example, tj@yahoo.com or tn.hall@auburn.edu). At the moment I am reading the file line by line, but I don't have a clue how to search for a valid e-mail address. I can search for a particular phrase such as "a", and the program will tell me whether "a" is in the file. What I am unsure about is how to search for an e-mail address, because I don't know the exact text I am looking for; I only know that something like *@.com should match. With this in mind, how do I search a file using wildcards in C++? Any help will be appreciated.
You might want to look up regular expressions. The following pattern may help:
[\w]+@[\w]+\.[a-zA-Z]{2,4}
The above pattern (untested) says:
[\w] any word character
+ one or more times
@ followed by an at sign
[\w] any word character
+ one or more times
\. followed by a dot
[a-zA-Z] any letter (lower case or upper case)
{2,4} 2, 3 or 4 times
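For anyone who wants to try the pattern above from C++, here is a minimal sketch. It assumes a C++11 compiler and uses the standard <regex> header rather than any particular third-party library; the function name find_emails is made up for illustration. It reads a stream line by line, as the original poster already does, and collects every substring that matches the pattern.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): returns every substring of the
// input stream that matches the pattern suggested above.
std::vector<std::string> find_emails(std::istream& in) {
    // Raw string literal, so the backslashes are passed to the regex as written.
    static const std::regex pattern(R"([\w]+@[\w]+\.[a-zA-Z]{2,4})");

    std::vector<std::string> found;
    std::string line;
    while (std::getline(in, line)) {
        // Walk over all non-overlapping matches in this line.
        std::sregex_iterator it(line.cbegin(), line.cend(), pattern);
        const std::sregex_iterator end;
        for (; it != end; ++it) {
            found.push_back(it->str());
        }
    }
    return found;
}
```

To run it on a file, open a std::ifstream and pass it to find_emails. Note that [\w] does not match a dot, so for tn.hall@auburn.edu this pattern only picks up hall@auburn.edu. Widening the character classes (for example to [\w.]) is one possible adjustment, but that is a guess at what the poster needs, not a complete e-mail validator.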
When graemef says regular expressions, he specifically means Perl Compatible Regular Expressions here. In C and C++ these can be used from the pcre library.
I'm not sure the expression he gave covers all legal email addresses, but it does cover quite a lot. There is a Perl module called Mail::CheckUser, which you should be able to dig through to find out how they do it. I dare say that is quite a robust method. There is also an RFC which describes how to properly validate an email address using pcre's with PHP. You can have a look at the source for that as well and see how it should be done. In either case, the only gotcha is that Perl and PHP have different quoting / escaping rules from literal strings in C/C++, so you'll need some additional/different \ escapes in your regular expressions. In any case, I highly recommend you do some tutorials on regular expressions in general, and pcre's in particular. I consider them an essential tool in any competent programmer's toolbox. Goodness knows they've saved me hundreds of hours of writing painful string parsing code myself.
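To illustrate the escaping gotcha, here is a small sketch. It assumes C++11 and uses std::regex as a stand-in for pcre, since the string literal rules are the same whichever regex library receives the string; the names escaped_pattern, raw_pattern and looks_like_email are invented for this example. In an ordinary C++ string literal every backslash in the pattern has to be doubled, while a raw string literal lets you write the pattern as you would in Perl.

```cpp
#include <regex>
#include <string>

// The pattern from earlier in the thread, written two ways.
// Ordinary literal: each regex backslash must be written as "\\".
const std::string escaped_pattern = "[\\w]+@[\\w]+\\.[a-zA-Z]{2,4}";
// Raw literal (C++11): the text between R"( and )" is taken verbatim.
const std::string raw_pattern = R"([\w]+@[\w]+\.[a-zA-Z]{2,4})";

// Hypothetical helper: true if the whole of `text` matches `pattern`.
bool looks_like_email(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}
```

Both strings hold exactly the same characters, so either can be handed to the regex engine. Forgetting to double a backslash in the ordinary form (writing "\w" instead of "\\w") is the usual mistake when copying a pattern over from Perl or PHP.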