LinuxQuestions.org
Old 06-20-2008, 10:42 AM   #1
student04
Member
 
Registered: Jan 2004
Location: Georgia
Distribution: OS X, CentOS
Posts: 669

Rep: Reputation: 34
Character Literals


While trying to understand some code for work, I came across something that initially puzzled me, though I now have a rough idea of what the original author intended:
Code:
#include <string.h>

     ...
     char *ptr;
     ptr = strrchr( some_string, '.' );
     if( ptr ) *ptr = '\000';
     ...
I think the intention is to artificially end the specified string at the last period, if one is found.
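For reference, that intent can be sketched as a small helper. The name strip_extension is made up for illustration and is not from the original code, and the sketch assumes the caller passes a writable buffer, never a string literal:

```c
#include <string.h>

/* Hypothetical helper: cut `s` off at its last '.', if there is one.
 * `s` must point to writable storage (an array or allocated memory).
 * Returns 1 if the string was shortened, 0 if it had no '.'. */
int strip_extension(char *s)
{
    char *dot = strrchr(s, '.');   /* last '.' in s, or NULL if none */

    if (dot == NULL)
        return 0;

    *dot = '\0';                   /* one byte of value 0 ends the string here */
    return 1;
}
```

Called on an array such as `char name[] = "archive.tar.gz";`, this leaves `name` holding "archive.tar".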

However, what can possibly be going wrong with the '\000' literal? The code compiles without warnings (even with -Wall), but when I run it, it segfaults on the assignment of that literal (on a Solaris 10 box with gcc 3.4.3). Is this a NUL byte (value 0) followed by two '0' characters (value 0x30 each)? From what I can tell, the code is pumping 3 bytes into a space allocated to hold only one. The compiler seems to accept the literal, but the assignment to the address pointed to by 'ptr' fails.

I think it should be rewritten as
Code:
#include <string.h>

     ...
     char *ptr;
     ptr = strrchr( some_string, '.' );
     if( ptr ) *ptr = '\0';
     ...
This code probably never got executed in the original file as the strings given to it probably never had periods in them...

-AM
 
Old 06-20-2008, 11:02 AM   #2
netdog
LQ Newbie
 
Registered: Dec 2007
Location: Richmond, VA
Distribution: Fedora, Slackware
Posts: 5

Rep: Reputation: 0
Ok, somebody check me on this, it's been a while since C...

Your assumption is good: \000 is an octal escape for a single byte with value 0, which will end the string. The quote marks are not the problem. Inside single quotes the escape is still one character, not three, and the backslash form is only valid inside quotes. Any one of the following assigns the same value...

- if(ptr) *ptr = '\000'; // octal escape
- if(ptr) *ptr = '\x00'; // hex escape
- if(ptr) *ptr = 0; // plain integer zero


Each of these sets the character at *ptr to 0 (NUL) and ends the string there, so the segfault has to come from somewhere other than the literal.

Hope that helps

Ciao
 
Old 06-20-2008, 11:39 AM   #3
student04
Member
 
Registered: Jan 2004
Location: Georgia
Distribution: OS X, CentOS
Posts: 669

Original Poster
Rep: Reputation: 34
Oh, that's interesting. I hadn't thought of octals for this... See, I thought '\000' would translate to 0x003030 (NUL byte, zero, zero). My task wasn't to clean up the code but to band-aid it. It is also 8 years old and the original author is gone, so I didn't want to touch the rest of it.

So,
Code:
*ptr = '\0'; // and
*ptr = 0;
are the same, correct? Since the decimal 0 is converted to a single-byte 0 before being assigned to the location at ptr?

I can't really ask the person who wrote this what the heck they intended to do.. either use an octal value like you mentioned, or write two more zeros, or...

What still confuses me is that
Code:
'\000';
looks like it should not even be valid syntax, semantics aside. Am I wrong?
 
Old 06-20-2008, 01:22 PM   #4
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,387

Rep: Reputation: 1553
See http://www.cppreference.com/escape_sequences.html

Code:
\nnn  	Octal number (nnn)
\0 	Null character (really just the octal number zero)
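To make the table concrete, here is a minimal sketch of what a few escapes evaluate to. The function names are invented for illustration, and the '\060' case assumes an ASCII execution character set:

```c
#include <string.h>

/* Each character constant below is a single value, however many
 * digits the escape is written with. */
int octal_nul(void)        { return '\000'; }  /* octal escape, value 0 */
int short_nul(void)        { return '\0'; }    /* same escape, fewer digits */
int hex_nul(void)          { return '\x00'; }  /* hex escape, value 0 */
int octal_zero_digit(void) { return '\060'; }  /* octal 60 = 48, '0' in ASCII */

/* The literal "ab\000cd" occupies 6 bytes (a, b, NUL, c, d and the
 * terminating NUL), but strlen stops at the first NUL byte. */
size_t visible_length(void)
{
    return strlen("ab\000cd");
}
```

So '\000' and '\0' are the same single byte, and the character '0' (0x30 in ASCII) only appears if it is written as '0' or '\060'.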
 
Old 06-20-2008, 01:59 PM   #5
student04
Member
 
Registered: Jan 2004
Location: Georgia
Distribution: OS X, CentOS
Posts: 669

Original Poster
Rep: Reputation: 34
Quote:
Originally Posted by ntubski
See http://www.cppreference.com/escape_sequences.html

Code:
\nnn  	Octal number (nnn)
\0 	Null character (really just the octal number zero)
Okay; just thinking too hard about this.

Thanks!
 
Old 06-20-2008, 07:26 PM   #6
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 76
Quote:
Originally Posted by student04
Okay; just thinking too hard about this.
Yes. Whenever you have single quote-marks around one character or one escape sequence, you get exactly one character, period (the constant has type int in C and type char in C++).
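A small sketch of the C side of that point, with illustrative function names. It is only meaningful when compiled as C, since in C++ the first size would be 1:

```c
#include <stddef.h>

/* In C, a character constant such as '\000' has type int. */
size_t literal_size(void)
{
    return sizeof('\000');
}

/* Once stored in a char object it occupies exactly one byte,
 * so assigning it through a char pointer writes one byte only. */
size_t stored_size(void)
{
    char c = '\000';
    return sizeof c;
}
```

Either way, `*ptr = '\000';` writes a single byte, never three.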
Quote:
Originally Posted by student04
When I run this it segfaults assigning that literal (run on a Solaris 10 box with gcc 3.4.3).
If the segfault does indeed occur on the assignment (*ptr = '\000';), it is most likely because the original string (some_string) is in read-only memory. This is where you should direct your debugging efforts. In particular, check whether some_string is a (pointer to a) string literal rather than an array initialized from a string literal. Modifying a string literal is undefined behavior in C: many older compilers let it work by keeping literals in writable memory, whereas most modern compilers put them in read-only memory. For example, the following program would have run on such an older compiler, but will typically segfault when built with a newer one:
Code:
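#include <stdio.h>

int main(void)
{
        /* Pointer to a string literal, which usually lives in read-only memory. */
        char *string = "safe";
        string[2] = 'v';        /* undefined behavior: typically a segfault */

        return 0;
}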
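For contrast, here is a sketch of the array form, which is safe to modify. The helper name safe_edit and its out parameter are made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: edits a writable copy of the literal and
 * copies the result into `out`, which must hold at least 5 bytes. */
void safe_edit(char out[5])
{
    char string[] = "safe";   /* array initialized from the literal: a writable copy */

    string[2] = 'v';          /* fine: modifies the array, not the literal */
    memcpy(out, string, sizeof string);
}
```

If some_string in the original code turns out to be a pointer to a literal, copying it into an array (or allocated memory) before the strrchr call is one possible band-aid.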
 
  

