LinuxQuestions.org
Old 04-27-2010, 09:00 PM   #1
Ronin-8
LQ Newbie
 
Registered: Apr 2010
Distribution: Ubuntu 9.04
Posts: 13

Rep: Reputation: 0
ASCII clarification


Hello all,

I'm not sure if this is an appropriate question to be posting on this site. It's not a Linux-specific question, but since I use Linux I thought it would be okay.

I'm reading up on ASCII and was wondering if someone would be able to tell me if I have it right. I'm not sure if I'm 100% correct but this is what I've picked up so far:

A file is stored as a long list of bytes.

A byte has 256 combinations

A single byte represents any one of the 256 characters

Each character is a byte, so:

"A" 01000001 is a byte
"?" 00111111 is a byte
"3" 00000011 is a byte

A file name can be up to 256 characters, so that means up to 256 bytes.

If a file contained only one word like "Hello" then the size of that file would be 5 bytes.

If I'm right about what I've learned so far, I guess what I would like to know is where do Decimal, Hexadecimal and Octal numbers come into the picture?

Sorry if this is obvious but I'm just starting to learn both Linux and computers and would like to have a clear understanding of how this works.

Thank You.
 
Old 04-27-2010, 11:07 PM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
AIUI, the octal, decimal and hexadecimal entries are simply alternate ways to represent the byte sequences in more human-accessible forms. Depending on the programming environment, one base can be more convenient to use than another; otherwise they are all equivalent. The Wikipedia entry on hexadecimal explains it like this:

Quote:
Each hexadecimal digit represents four binary digits (bits) (also called a "nibble"), and the primary use of hexadecimal notation is as a human-friendly representation of binary coded values in computing and digital electronics. For example, byte values can range from 0 to 255 (decimal) but may be more conveniently represented as two hexadecimal digits in the range 00 through FF. Hexadecimal is also commonly used to represent computer memory addresses.
It's fairly easy to convert byte sequences between binary, octal, and hexadecimal bases, which is why they're all commonly used in programming. But converting to and from decimal is a bit trickier, and it's mostly used when something needs to be human-readable.
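
To illustrate why binary-to-hex conversion is called easy above, here is a minimal sketch in plain shell. The function names `nibble_to_hex` and `bin_to_hex` are made up for this example; the only point is that each group of four bits maps to exactly one hex digit, so no arithmetic is needed.

```shell
# Map one group of four bits (a nibble) to its single hexadecimal digit.
nibble_to_hex() {
    case $1 in
        0000) printf 0 ;; 0001) printf 1 ;; 0010) printf 2 ;; 0011) printf 3 ;;
        0100) printf 4 ;; 0101) printf 5 ;; 0110) printf 6 ;; 0111) printf 7 ;;
        1000) printf 8 ;; 1001) printf 9 ;; 1010) printf a ;; 1011) printf b ;;
        1100) printf c ;; 1101) printf d ;; 1110) printf e ;; 1111) printf f ;;
        *) return 1 ;;
    esac
}

# Convert a bit string whose length is a multiple of four, one nibble at a time.
bin_to_hex() {
    bits=$1
    while [ -n "$bits" ]; do
        rest=${bits#????}                 # everything after the first four bits
        nibble=${bits%"$rest"}            # the first four bits themselves
        nibble_to_hex "$nibble" || return 1
        bits=$rest
    done
    printf '\n'
}

bin_to_hex 01000001               # 41  (the byte for "A")
bin_to_hex 00111111               # 3f  (the byte for "?")
printf '%d %o %x\n' 255 255 255   # 255 377 ff: one value, three notations
```

Decimal has no such digit-for-digit shortcut, which is the "trickier" part: the whole number has to be divided or multiplied out.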

Last edited by David the H.; 04-27-2010 at 11:19 PM. Reason: small word change for accuracy
 
Old 04-27-2010, 11:31 PM   #3
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
As you learn things, keep focussed on the groupings of definitions. For example, these are all names for number systems:
binary
octal
decimal
hexadecimal

None of these has anything to do with the definitions of
bit
nibble
byte

And neither group has anything to do with:
ascii
unicode
ebcdic
and other character encoding schemes

To take one cut thru this, let's first define a "byte" by its number of bits:
1000 in binary
10 in octal
8 in decimal or hex

but its **meaning** may be different in ascii, unicode, or ebcdic

So there are at least 3 ways to define something:
What is it?
How is it measured?
What does it do?
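
The "number of bits in a byte" example above can be checked from a shell. This is only a sketch: `bin_to_dec` is a hypothetical helper written for this post, while the octal and hex cases rely on `printf` accepting C-style constants (a leading `0` for octal, `0x` for hex).

```shell
# Read a string of 0s and 1s as a base-2 number and print its decimal value.
bin_to_dec() {
    bits=$1
    value=0
    while [ -n "$bits" ]; do
        rest=${bits#?}                # drop the leading digit
        digit=${bits%"$rest"}         # the leading digit itself
        value=$(( value * 2 + digit ))
        bits=$rest
    done
    printf '%d\n' "$value"
}

bin_to_dec 1000      # binary 1000      -> 8
printf '%d\n' 010    # octal 10         -> 8
printf '%d\n' 0x8    # hexadecimal 8    -> 8
printf '%d\n' 8      # decimal 8        -> 8
```

Four different spellings, one quantity: the notation changes, the number of bits in a byte does not.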
 
Old 04-28-2010, 07:17 AM   #4
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
binary, hex, etc. are just ways of representing numbers. Remember, it's still the same value, just represented in a different way.

bit, byte, etc. have to do with the way computers store numbers (computers use binary):
a "bit" is a binary digit, a 1 or 0.
a "byte" is an 8-bit binary number. A lot of the computer's design is byte-centric. The RAM is basically an array of byte-size storage cells. Your hard drive stores data in units of bytes. Your CPU's word size is a multiple of 8, to make it easier to process bytes.

A file on the hard drive is an array of bytes.
 
Old 04-28-2010, 01:12 PM   #5
H_TeXMeX_H
LQ Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1301
Quote:
Originally Posted by Ronin-8
A file name can be up to 256 characters, so that means up to 256 bytes.

If a file contained only one word like "Hello" then the size of that file would be 5 bytes.

If I'm right about what I've learned so far, I guess what I would like to know is where do Decimal, Hexadecimal and Octal numbers come into the picture?

Sorry if this is obvious but I'm just starting to learn both Linux and computers and would like to have a clear understanding of how this works.

Thank You.
Yes, try:

Code:
bash-3.1$ printf Hello > te
bash-3.1$ stat -c %s te
5
5 bytes in size; if you add a newline it will be 6.
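
The newline case can be checked the same way without creating a file. A small sketch, where `count_bytes` is just a name picked for this example and its argument is used as a `printf` format string so that `\n` becomes a real newline:

```shell
# Count the bytes a printf format string produces.
# tr strips the padding some wc implementations print before the number.
count_bytes() {
    printf "$1" | wc -c | tr -d ' '
}

count_bytes 'Hello'      # 5
count_bytes 'Hello\n'    # 6: the newline is one more byte (decimal 10)
```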

This table is useful in understanding ASCII:
http://www.cs.utk.edu/~pham/ascii.html
 
Old 04-28-2010, 01:59 PM   #6
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by Ronin-8
A file is stored as a long list of bytes.
Basically yes, but there are some definitional quibbles possible.

Quote:
A byte has 256 combinations
Yes.

Quote:
A single byte represents any one of the 256 characters
Larger definitional quibbles on that one.

Quote:
Each character is a byte
In some representations of some character sets that is true (except that it still depends on what the meaning of "is" is).

Quote:
"A" 01000001 is a byte
"?" 00111111 is a byte
Yes.

Quote:
"3" 00000011 is a byte
No. ASCII '3' is not binary 3.

Quote:
A file name can be up to 256 characters, so that means up to 256 bytes.
Depends on the filesystem. I don't know the limit for common filesystems in Linux.

Quote:
If a file contained only one word like "Hello" then the size of that file would be 5 bytes.
The filesystem might keep track of 5 as the nominal size of the file, but the physical size of the file would be rounded up to some allocation unit.
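
For the file name limit and the allocation-unit point above, a rough sketch of how one might look these up on a given system. `name_limit` is a hypothetical wrapper; `getconf NAME_MAX` is the standard query, and the `du` figure is deliberately left without an expected value because it depends on the filesystem.

```shell
# Largest file name, in bytes, allowed on the filesystem holding the given directory.
name_limit() {
    getconf NAME_MAX "${1:-/}"
}

name_limit /     # commonly 255 on Linux filesystems such as ext4

# Nominal size versus allocated space for a 5-byte file.
tmp=$(mktemp)
printf 'Hello' > "$tmp"
wc -c < "$tmp"   # nominal size in bytes
du -k "$tmp"     # allocated space in KiB; varies with the filesystem
rm -f "$tmp"
```

Note that the limit is counted in bytes, not characters, so names using multi-byte encodings hit it sooner.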

Last edited by johnsfine; 04-28-2010 at 02:02 PM.
 
Old 04-28-2010, 03:12 PM   #7
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
Quote:
Originally Posted by johnsfine
No. ASCII '3' is not binary 3.
Yes, good point.

Remember, the number 3 and the ASCII character "3" are completely different.
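
A quick way to see the difference from a shell, offered as a sketch rather than the only way to do it. `to_binary` is a helper invented for this example; the leading-quote argument to `printf` is standard and yields the character's numeric code.

```shell
# Print the low eight bits of a number as a string of 0s and 1s.
to_binary() {
    n=$1
    bits=""
    i=0
    while [ "$i" -lt 8 ]; do
        bits="$(( n % 2 ))$bits"   # prepend the current lowest bit
        n=$(( n / 2 ))
        i=$(( i + 1 ))
    done
    printf '%s\n' "$bits"
}

# A leading quote makes printf report the character's code
# instead of parsing the argument as a number.
code=$(printf '%d' "'3")

to_binary 3          # the number 3       -> 00000011
to_binary "$code"    # the character "3"  -> 00110011 (code 51, hex 33)
```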
 
Old 04-28-2010, 05:11 PM   #8
Ronin-8
LQ Newbie
 
Registered: Apr 2010
Distribution: Ubuntu 9.04
Posts: 13

Original Poster
Rep: Reputation: 0
Hey everyone, thank you for all your replies.

Okay I can now see how using hex is easier than dealing directly with binary. So in what situation would you be writing or reading hex? Are there specific files that have to be written in hex?

Whoops, ASCII '3' in binary is '00110011'.
 
Old 04-28-2010, 05:13 PM   #9
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
Remember, files are not "stored in hex" or "stored in binary". Do you understand that?
 
Old 04-28-2010, 05:53 PM   #10
Ronin-8
LQ Newbie
 
Registered: Apr 2010
Distribution: Ubuntu 9.04
Posts: 13

Original Poster
Rep: Reputation: 0
No, I'm sorry, I don't understand. I thought that hardware can only understand binary, or "sequences of ons and offs", and since files are stored in hardware they would have to be in binary?

Do you mean that files are not "stored in binary" in the sense that binary is just a human interpretation of the on and off sequences?

Last edited by Ronin-8; 04-28-2010 at 06:14 PM. Reason: Thought about it some more, lol.
 
Old 04-28-2010, 06:14 PM   #11
jiml8
Senior Member
 
Registered: Sep 2003
Posts: 3,171

Rep: Reputation: 116
Try man ascii.

And files ARE stored in binary.
 
Old 04-28-2010, 07:06 PM   #12
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Files are not stored in hex, or binary (but see the discussion to follow), or decimal, or octal. Those are all number systems.

Digital storage is in bits. We have seen some definitions here, and you can look it up also. A "bit" is a way of describing an element which can have two states. To be sure, the word "binary" is sometimes used in reference to this 2-state paradigm. Personally, I think it is better to make the distinction between:
Analog data: stored or transmitted as a continuum of voltage or current states
Digital data: stored or transmitted as a series of bits (or bytes, where 1 byte = 8 bits)

Isn't semantics fun?.......
 
Old 04-28-2010, 07:23 PM   #13
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
Technically files are stored in binary, but that is intrinsic to the file system. It is transparent to the user. It's impossible to have "a file in hex" or "a file in decimal".

Imagine a file as an array of numbers, each of which can be an integer from 0 to 255 (inclusive).
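
That picture can be made concrete with `od`. In this sketch `byte_values` is a made-up helper name; it prints each byte of a file as a number from 0 to 255, and the second half shows what happens if you try to "write a file in hex" by typing hex digits.

```shell
# Print every byte of a file as a decimal number from 0 to 255.
# The unquoted $(...) lets the shell collapse od's column padding.
byte_values() {
    echo $(od -An -v -tu1 "$1")
}

tmp=$(mktemp)

printf 'Hello' > "$tmp"
byte_values "$tmp"   # 72 101 108 108 111

# Typing hex digits into a file stores the characters "4" and "8",
# not the single byte 0x48.
printf '48' > "$tmp"
byte_values "$tmp"   # 52 56

rm -f "$tmp"
```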
 
Old 04-28-2010, 07:24 PM   #14
Ronin-8
LQ Newbie
 
Registered: Apr 2010
Distribution: Ubuntu 9.04
Posts: 13

Original Poster
Rep: Reputation: 0
Lol, yes semantics is very fun!

Okay so then files are stored in bits. And number systems such as binary, hex, oct and decimal are ways to represent the state of these bits.
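
As a sketch of that summary: the same five bytes can be displayed in hex, octal or decimal just by asking `od` for a different output type, and the file is never touched. `view_bytes` is a hypothetical helper name; the type strings `x1`, `o1` and `u1` are standard `od -t` arguments.

```shell
# Show the bytes of a file in the requested od format without altering the file.
# $1 is an od type string: x1 = hex, o1 = octal, u1 = decimal, one byte each.
view_bytes() {
    echo $(od -An -v -t"$1" "$2")
}

tmp=$(mktemp)
printf 'Hello' > "$tmp"

view_bytes x1 "$tmp"   # 48 65 6c 6c 6f
view_bytes o1 "$tmp"   # 110 145 154 154 157
view_bytes u1 "$tmp"   # 72 101 108 108 111
cat "$tmp"; echo       # Hello (the file itself never changed)

rm -f "$tmp"
```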
 
Old 04-28-2010, 07:26 PM   #15
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
Quote:
Originally Posted by Ronin-8
Okay so then files are stored in bits. And number systems such as binary, hex, oct and decimal are ways to represent the state of these bits.
Exactly.

I recommend you try programming, especially working with files; that will clear up many things.

(but I still can't imagine the amount of confusion it takes to think that files can be stored in "different number systems"...)
 
  

