LinuxQuestions.org
Old 12-24-2013, 01:53 AM   #1
sam@
Member
 
Registered: Sep 2013
Posts: 31

Rep: Reputation: Disabled
finding sequence size


Hi All,

I have a file containing sequences that look like this:

>String1
aqwertrtrytytyuuijhgddfghhhghhgjhjhhsswekrkmygppdslxmvbnhkwqalldrtjbllnlnlnnnvc
>String2
qwwerrtyuiopasdfghjmnbvfklzxerbvcwghjjkoowwqerrtggbddqsdfgaqwcxzakjtyugfsdefrtgyhujiknbbbbcdcdcxsxsx zxzxcvcfcdcg
>String3
rtyhujrfedwsqavfbgnhmjklopoiuytiuytrewqxszavfbgnhmjkjgdaaarftgwqwqsxddfcazxshjklopute
...

>String120..

so on

Whenever I wanted to find the size of a particular sequence (say String1), I used to delete the rest of the file (i.e. from String2 to the end) and check the file size with du -h. This gave me the size of String1.

However, is there an easier way to find the size of, say, String1 or String2, without deleting the rest of the file to make a new one?
 
Old 12-24-2013, 03:49 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
@sam@: Can you be more specific about what it is you are after?

You talk about the size of a string and using du to determine it. du reports the size of a file, normally as the number of blocks needed to store it on disk.

Using your string1 as an example, du will show:
Code:
$ du test.file
4       test.file
This tells you that 4.0k is needed to store this file on disk. The reported size stays the same if you add a few characters to the string (up to a point, after which it becomes 8.0k), because du counts whole disk blocks.

The actual file itself isn't 4.0k:
Code:
$ du -b test.file
80      test.file

# or using stat:
$ stat test.file 
  File: `test.file'
  Size: 80              Blocks: 8          IO Block: 4096   regular file
80 bytes are needed.

The string itself is 79 characters ("bytes") long; the extra byte is the newline at the end of the line, which is part of the file but not of the string itself.
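A quick way to see the difference between file size and string length for yourself. This is only a sketch on a throwaway file with made-up contents (abcde), not the test.file used above:

```shell
# Create a throwaway file holding one 5-character line (made-up data).
f=$(mktemp)
printf 'abcde\n' > "$f"

# wc -c counts every byte in the file, including the trailing newline.
bytes=$(wc -c < "$f")

# Command substitution strips the trailing newline, so ${#line}
# is the length of the string alone.
line=$(cat "$f")

echo "file: $((bytes)) bytes, string: ${#line} characters"
# prints: file: 6 bytes, string: 5 characters

rm -f "$f"
```

The one-byte gap is the newline, the same gap as between 80 and 79 above.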

So, what is it you are actually trying to determine?
 
Old 12-24-2013, 04:34 AM   #3
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
You can also always use awk to determine the line length...
or perl...
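For example, a minimal awk sketch along those lines. It assumes every header line starts with > and adds up the lengths of the lines that follow, so a sequence wrapped over several lines is still counted as one. The function name seq_lengths and the sample data are made up for illustration.

```shell
# seq_lengths FILE
# Print "<header> <length>" for every record in FILE.
# (seq_lengths is a hypothetical helper name, not something from this thread.)
seq_lengths() {
    awk '
        # Header line: report the previous record, then start a new one.
        /^>/ { if (name != "") print name, len; name = substr($0, 2); len = 0; next }
        # Any other line belongs to the current sequence.
             { len += length($0) }
        # Report the last record.
        END  { if (name != "") print name, len }
    ' "$1"
}

# Try it on a small sample file (made-up data).
sample=$(mktemp)
printf '>String1\nabcde\n>String2\nabc def\nghi\n' > "$sample"
seq_lengths "$sample"
# prints:
# String1 5
# String2 10
rm -f "$sample"
```

Spaces inside a sequence are counted as characters; newlines are not.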
 
Old 12-24-2013, 06:25 AM   #4
sam@
Member
 
Registered: Sep 2013
Posts: 31

Original Poster
Rep: Reputation: Disabled

Basically, what I meant is that whenever I wanted to find the length of a string (take String1 as an example),
I used to create a new file containing only

>String1
aqwertrtrytytyuuijhgddfghhhghhgjhjhhsswekrkmygppdslxmvbnhkwqalldrtjbllnlnlnnnvc

and delete the rest of the original file.
Then I used ls -la to find the size of this newly created file.

My question: is there a way to find the length of String1, String2, etc. without having to split the file?
 
Old 12-24-2013, 06:28 AM   #5
sam@
Member
 
Registered: Sep 2013
Posts: 31

Original Poster
Rep: Reputation: Disabled
Intuitively, to find the size of String1, it would count from >String1 up to >String2 and give me the number of bytes (the file size) between them.

To find the size of String2, it would start at >String2, stop at >String3, count the bytes between them, and so on.
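The idea described above could be sketched with awk roughly like this. It counts the header line, the sequence lines and their newlines, so the result should match what ls -la shows for a file split out by hand. The name record_bytes is hypothetical, and the sketch assumes Unix line endings and a newline at the end of the file.

```shell
# record_bytes FILE NAME
# Print the number of bytes from the ">NAME" line up to (not including)
# the next header line. (record_bytes is a made-up helper name.)
record_bytes() {
    # LC_ALL=C so that length() counts bytes rather than characters.
    LC_ALL=C awk -v want=">$2" '
        # On every header, switch counting on only for an exact match,
        # so String1 does not also pick up String10 or String100.
        /^>/  { inrec = ($0 == want) }
        inrec { bytes += length($0) + 1 }   # +1 for the newline
        END   { print bytes + 0 }           # prints 0 if NAME was not found
    ' "$1"
}

# Small sample file (made-up data).
sample=$(mktemp)
printf '>String1\nabcde\n>String2\nabc def\nghi\n' > "$sample"
record_bytes "$sample" String1    # prints 15
record_bytes "$sample" String2    # prints 21
rm -f "$sample"
```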
 
Old 12-24-2013, 06:52 AM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
You are still mixing length, bytes and (file) size...

Ok, have a look at this:
Code:
$ cat  test.file
>String1
aqwertrtrytytyuuijhgddfghhhghhgjhjhhsswekrkmygppdslxmvbnhkwqalldrtjbllnlnlnnnvc
>String2
qwwerrtyuiopasdfghjmnbvfklzxerbvcwghjjkoowwqerrtggbddqsdfgaqwcxzakjtyugfsdefrtgyhujiknbbbbcdcdcxsxsx zxzxcvcfcdcg
>String3
rtyhujrfedwsqavfbgnhmjklopoiuytiuytrewqxszavfbgnhmjkjgdaaarftgwqwqsxddfcazxshjklopute
With that in mind:
Code:
# The patterns are anchored (^>...$) so that, in a file with many
# sequences, String1 does not also match String10, String100, ...

# get length of String2
$ thisString=$(sed -n '/^>String2$/{n;p}' test.file)
$ echo "${#thisString}"
113

# get length of String1
$ thisString=$(sed -n '/^>String1$/{n;p}' test.file)
$ echo "${#thisString}"
79

# get length of String3
$ thisString=$(sed -n '/^>String3$/{n;p}' test.file)
$ echo "${#thisString}"
85
The sed command looks for StringX and prints the next line (the actual string), which is stored in a variable named thisString. The echo "${#thisString}" prints the length of the found string.
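If you need this often, the two steps can be wrapped in a small function. This is only a sketch: the name seq_len is made up, it assumes each sequence sits on the single line right after its header, and the name is used as a sed pattern, so characters such as dots are treated as regular-expression characters.

```shell
# seq_len FILE NAME
# Print the length of the line that follows the ">NAME" header.
# (seq_len is a hypothetical helper name.)
seq_len() (
    # The pattern is anchored so that String1 does not match String10.
    s=$(sed -n "/^>$2\$/{n;p;}" "$1")
    echo "${#s}"
)

# Small sample file (made-up data).
sample=$(mktemp)
printf '>String1\nabcde\n>String10\nabcdefgh\n' > "$sample"
seq_len "$sample" String1     # prints 5
seq_len "$sample" String10    # prints 8
rm -f "$sample"
```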

If this is not what you are after then please provide more details (also provide an input and expected output example).

EDIT: I just noticed that there is a space in string number 2, which makes it 2 strings. The above sed solution ignores this (the space is counted as 1 character).
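If the space should not be counted, one option is to strip whitespace before taking the length. A small sketch with a made-up string:

```shell
s='abc def'

# Remove all whitespace (spaces, tabs, newlines) before measuring.
stripped=$(printf '%s' "$s" | tr -d '[:space:]')

echo "${#s} ${#stripped}"    # prints: 7 6
```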

Last edited by druuna; 12-24-2013 at 07:08 AM.
 
Old 12-24-2013, 06:53 AM   #7
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Read the bash manpage.

You will find the answer there. If the string is in a shell variable s, then ${#s} will be its length.
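As a rough illustration, here is a shell-only loop that uses ${#line} on each sequence line, without sed or awk. The function name print_lengths and the sample data are made up, and it assumes each sequence is on one line, as in the example file earlier in the thread.

```shell
# print_lengths FILE
# For every non-header line, print the most recent header name and the
# length of that line. (print_lengths is a hypothetical helper name.)
print_lengths() (
    name=""
    # "|| [ -n "$line" ]" also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*) name=${line#'>'} ;;    # header: remember the name
            *)    if [ -n "$name" ]; then
                      printf '%s %d\n' "$name" "${#line}"
                  fi ;;
        esac
    done < "$1"
)

# Small sample file (made-up data).
sample=$(mktemp)
printf '>String1\nabcde\n>String2\nabc def\n' > "$sample"
print_lengths "$sample"
# prints:
# String1 5
# String2 7
rm -f "$sample"
```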
 
  

