LinuxQuestions.org
Old 04-30-2012, 12:28 AM   #1
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,203

Rep: Reputation: 45
C libraries for strings


Dear all,
I would like to ask whether you know of any web site that has a list of C libraries/functions, especially for strings.

I would like to have a good reference for when I want to do something with strings, so that I can answer a question of the form "I need a function that finds, in a huge string, a filename of the format *.html".

Do you know anything like that?

I would like to thank you in advance for your help

B.R
Alex
 
Old 04-30-2012, 01:43 AM   #2
flamelord
Member
 
Registered: Jun 2011
Distribution: Arch Linux
Posts: 151

Rep: Reputation: 34
http://www.cplusplus.com/reference/clibrary/cstring/
http://www.utas.edu.au/infosys/info/....html#string.h
http://en.wikipedia.org/wiki/String.h

man page for string.h

for regular expressions:

man regex.h
http://www.gnu.org/software/libc/man...pressions.html

Search Google for things like "c string", "c string.h", "c regular expression", "c regex.h", and so on. That's how I found the links above.
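As a rough illustration of what string.h alone can do for the question above, here is a minimal sketch built on strstr. The function name find_html_name is made up for this example, and it assumes names are separated by whitespace. It also only looks for the literal text ".html", so it would accept "page.htmlx" as well.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the first whitespace-delimited word that
 * contains ".html" inside text. On success it returns a pointer to the
 * start of the name and stores the name's length in *len; it returns
 * NULL if ".html" does not occur at all. */
const char *find_html_name(const char *text, size_t *len)
{
    static const char suffix[] = ".html";
    const char *dot = strstr(text, suffix);
    if (dot == NULL)
        return NULL;

    /* Walk back from the dot to the start of the name. */
    const char *start = dot;
    while (start > text && !isspace((unsigned char)start[-1]))
        start--;

    *len = (size_t)(dot - start) + strlen(suffix);
    return start;
}
```

The result is a pointer into the original string plus a length, so it can be printed with printf("%.*s\n", (int)len, p) without copying.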
 
Old 04-30-2012, 03:20 PM   #3
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by alaios View Post
... I need a function that finds, in a huge string, a filename of the format *.html ...
I am not sure I understand you, but if you need to query whether something is present in an HTML file, you need an HTML parser, not a "C" string library.
 
Old 05-01-2012, 07:24 AM   #4
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,203

Original Poster
Rep: Reputation: 45
That's a small part of my code. I want to extract index.html and/or test.html from that string.

Unfortunately it returns no match...

What might I be doing wrong here?

Regards
Alex
Code:
/*
** talker.c -- a datagram "client" demo
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <regex.h>

int main(int argc, char *argv[])
{
    int i, reti;
    regex_t regex;
    char errormsg[100];

    reti = regcomp(&regex, ".*\.\(html\|htm\)", 0);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        exit(1);
    }

    reti = regexec(&regex, "I am looking inside this string the index.html or any other files in this format like test.html", 0, NULL, 0);
    if (!reti) {
        printf("Match\n");
    } else if (reti == REG_NOMATCH) {
        printf("No match\n");
        regerror(reti, &regex, errormsg, sizeof(errormsg));
        printf("error text is: %s\n", errormsg);
    }

    return 0;
}
 
Old 05-01-2012, 08:15 AM   #5
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by alaios View Post
That's a small part of my code. I want to extract index.html and/or test.html from that string.

Unfortunately it returns no match...

What might I be doing wrong here?

Regards
Alex
Start from a simpler case, i.e. try to match 'a' in "a".
...
Maybe you need double backslashes.
...
You are apparently going about this the wrong way: as I wrote earlier in this thread, you need an HTML parser.
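The two suggestions above (start with the simplest case, then check the backslashes) can be tried with a small wrapper like the one below. This is only a sketch; regex_matches is a made-up helper name, and it treats any regexec failure as "no match". The point is that a backslash meant for the regex engine has to be written as \\ in a C string literal, otherwise the compiler consumes it first.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if pattern matches somewhere in text,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * cflags is passed to regcomp (0 for basic, REG_EXTENDED for extended). */
int regex_matches(const char *pattern, const char *text, int cflags)
{
    regex_t re;

    /* REG_NOSUB: we only want a yes/no answer, not match offsets. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this, regex_matches("a", "a", 0) gives 1. The pattern "\\.html" matches "index.html" but not "indexxhtml", while the unescaped ".html" matches both, because a bare dot stands for any character.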
 
Old 05-01-2012, 10:00 AM   #6
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,203

Original Poster
Rep: Reputation: 45
I do not want an HTML parser. This is just an example: given the string

I am looking inside this string the index.html or any other files in this format like test.html

I want to get back only index.html.
 
Old 05-01-2012, 10:19 AM   #7
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by alaios View Post
I do not want an HTML parser. This is just an example: given the string

I am looking inside this string the index.html or any other files in this format like test.html

I want to get back only index.html.
You have probably modified this: http://www.peope.net/old/regex.html :


Code:
#include <sys/types.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>     /* for exit() */

int main(int argc, char *argv[]){
        regex_t regex;
        int reti;
        char msgbuf[100];

/* Compile regular expression */
        reti = regcomp(&regex, "^a[[:alnum:]]", 0);
        if( reti ){ fprintf(stderr, "Could not compile regex\n"); exit(1); }

/* Execute regular expression */
        reti = regexec(&regex, "abc", 0, NULL, 0);
        if( !reti ){
                puts("Match");
        }
        else if( reti == REG_NOMATCH ){
                puts("No match");
        }
        else{
                regerror(reti, &regex, msgbuf, sizeof(msgbuf));
                fprintf(stderr, "Regex match failed: %s\n", msgbuf);
                exit(1);
        }

/* Free compiled regular expression if you want to use the regex_t again */
	regfree(&regex);

        return 0;
}
example.

Does the original, unmodified example work?
 
Old 05-01-2012, 11:55 AM   #8
flamelord
Member
 
Registered: Jun 2011
Distribution: Arch Linux
Posts: 151

Rep: Reputation: 34
I'm not entirely sure why what you have doesn't work, but if you change the regcomp line to

Code:
	reti = regcomp(&regex, ".*\\.(html|htm)", REG_EXTENDED);
it seems to work. The trick is the REG_EXTENDED option; the backslash is doubled so that the C compiler passes a literal \. through to the regex.
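To see what REG_EXTENDED changes, the same pattern can be compiled both ways. The sketch below assumes standard POSIX behaviour, where ( ) and | are ordinary characters in a basic regular expression and only become grouping and alternation in an extended one. The function name has_html_name is invented for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compiles the pattern \.(html|htm) with the given
 * regcomp flags and reports whether it matches anywhere in text.
 * Returns 1 on a match, 0 on no match, -1 if compilation fails. */
int has_html_name(const char *text, int cflags)
{
    regex_t re;

    /* "\\." in C source becomes \. for the regex engine: a literal dot. */
    if (regcomp(&re, "\\.(html|htm)", cflags | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Called with REG_EXTENDED, this finds a match in the sentence containing index.html. Called with 0, it looks for the literal text ".(html|htm)" and so finds nothing in that sentence.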
 
Old 05-01-2012, 03:06 PM   #9
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,203

Original Poster
Rep: Reputation: 45
The following brought me slightly closer


Code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <regex.h>

int main(int argc, char *argv[])
{
	int i,reti;
	size_t nmatch=10;
	regex_t regex;
	reti = regcomp(&regex, "[^\.]*", REG_EXTENDED);
	char errormsg[100];
	regmatch_t pmatch[nmatch];
	char *string="I am looking inside this string the index.html or any other files in this format like test.htm";
	if( reti )
		{ fprintf(stderr, "Could not compile regex\n");
		   exit(1);
		}

    else{
    	printf("socket created %d",i);


        reti = regexec(&regex, string, nmatch, pmatch, 0);
        if( !reti ){
                       printf("Match\n");
                       printf("With the whole expression, "
                                    "a matched substring \"%.*s\" is found at position %d to %d.\n",
                                    pmatch[0].rm_eo - pmatch[0].rm_so, &string[pmatch[0].rm_so],
                                    pmatch[0].rm_so, pmatch[0].rm_eo - 1);
                             printf("With the sub-expression, "
                                    "a matched substring \"%.*s\" is found at position %d to %d.\n",
                                    pmatch[1].rm_eo - pmatch[1].rm_so, &string[pmatch[1].rm_so],
                                    pmatch[1].rm_so, pmatch[1].rm_eo - 1);
               	   }
       else if( reti == REG_NOMATCH ){
    	   	   printf("No match\n");
    	   	   regerror(reti,&regex,errormsg,sizeof(errormsg));
    	   	   printf("error text is: %s\n",errormsg);
    	   }
       }


    regfree(&regex);

    return 0;
}
but it returns:
Code:
socket created 32767Match
With the whole expression, a matched substring "I am looking inside this string the index" is found at position 0 to 40.
With the sub-expression, a matched substring "" is found at position -1 to -2.
With the given regular expression I am trying to get only "index" (from index.html). Ideally I want "index.html", but I have simplified it a bit.

As you can see, it finds something, but the returned positions look weird...
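A note on the -1 positions: the pattern above contains no parenthesised group, so pmatch[1] has nothing to report, and regexec fills unused pmatch entries with -1. A minimal sketch of a guard, with copy_match as a made-up helper name:

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text covered by one regmatch_t into out
 * as a NUL-terminated string. Returns 1 on success, 0 if the entry is
 * unused (rm_so == -1) or if out is too small. */
int copy_match(const char *text, const regmatch_t *m, char *out, size_t outsz)
{
    if (m->rm_so == -1)
        return 0;                     /* group did not take part in the match */

    size_t n = (size_t)(m->rm_eo - m->rm_so);
    if (n + 1 > outsz)
        return 0;                     /* no room for the text plus the NUL */

    memcpy(out, text + m->rm_so, n);
    out[n] = '\0';
    return 1;
}
```

Checking the return value before printing avoids the odd "position -1 to -2" line; rm_eo is one past the last matched character, which is why the code above it prints rm_eo - 1.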
 
Old 05-01-2012, 03:16 PM   #10
flamelord
Member
 
Registered: Jun 2011
Distribution: Arch Linux
Posts: 151

Rep: Reputation: 34
Your current regular expression matches the leading run of characters up to the first period, which is why the match stops just before ".html". I think you want a regular expression like "(\\w*)\\.htm": it matches the whole string "index.htm", and the first subexpression is just "index". The \\'s escape the backslashes so that C doesn't interpret them, and "\w" is the character class for word characters (i.e. [A-Za-z0-9_]).
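Putting this together, here is a sketch that lists every such name in the string rather than only the first. It is one possible approach, not the only one: print_html_names is a made-up name, and [[:alnum:]_] is used in place of \w because \w is an extension that not every regcomp implementation accepts.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints every name ending in .htm or .html found in
 * text, one per line, and returns how many were found (-1 if the pattern
 * does not compile). */
int print_html_names(const char *text)
{
    regex_t re;
    regmatch_t m[2];    /* m[0]: whole name, m[1]: the part before the dot */
    int count = 0;

    if (regcomp(&re, "([[:alnum:]_]+)\\.html?", REG_EXTENDED) != 0)
        return -1;

    const char *p = text;
    while (regexec(&re, p, 2, m, 0) == 0) {
        /* Offsets are relative to p, not to the start of text. */
        printf("%.*s (base name: %.*s)\n",
               (int)(m[0].rm_eo - m[0].rm_so), p + m[0].rm_so,
               (int)(m[1].rm_eo - m[1].rm_so), p + m[1].rm_so);
        count++;
        p += m[0].rm_eo;    /* resume the search just after this match */
    }

    regfree(&re);
    return count;
}
```

The + in the group guarantees that each match is at least one character long, so the loop always moves forward.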
 
Old 05-02-2012, 01:06 AM   #11
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,203

Original Poster
Rep: Reputation: 45
That solved my problem. I only changed the regex... kaboom finished

Great, now I will try to improve it further.
 
  

