LinuxQuestions.org
Old 04-17-2013, 04:10 AM   #1
ravisingh1
Member
 
Registered: Apr 2013
Location: Mumbai
Distribution: Ubuntu13.10
Posts: 291

Rep: Reputation: Disabled
How to select unique hard link files?


Calculate the total size of the files recursively from the current directory. Hard-linked files are to be counted only once.

Please use awk also.
 
Old 04-17-2013, 04:36 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Homework? We are willing to help, but we won't do it for you....

What have you tried so far, and what problems did you run into?
 
1 members found this post helpful.
Old 04-17-2013, 09:14 AM   #3
ravisingh1
Member
 
Registered: Apr 2013
Location: Mumbai
Distribution: Ubuntu13.10
Posts: 291

Original Poster
Rep: Reputation: Disabled
Code:
find . -type f -ls | awk '{ total += $7 } END { print total }'
This command is meant to print the total size of all files found recursively from the current directory.
It may count hard-linked files more than once, though. So how do I write a program that sums the size of all the files but includes each hard-linked file only once?
To explain a bit more: suppose two files have the same inode number and both are located in the directory tree below the current directory. When the program calculates the total size of all the files, it should count only one of the two hard-linked files.
 
Old 04-17-2013, 09:29 AM   #4
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Quote:
Originally Posted by ravisingh1 View Post
Calculate the total size of the files recursively from the current directory. Hard-linked files are to be counted only once.

Please use awk also.
But you should know that a single file could be hard-linked twice or more. How do you intend to recognize them as one?
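
One way to see how several names can be recognized as one file: every hard link to a file reports the same inode number (on the same device) and the same link count. The sketch below is only an illustration; it assumes GNU stat (for the -c format option), and the function name and file names are made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU stat (-c FORMAT). All names here are hypothetical.
same_file_demo() {
    local dir
    dir=$(mktemp -d) || return 1
    printf 'data' > "$dir/original"
    ln "$dir/original" "$dir/link1"      # second name for the same inode
    ln "$dir/original" "$dir/link2"      # third name for the same inode
    # %h = hard link count, %i = inode number, %n = file name
    stat -c '%h %i %n' "$dir"/*
    # -ef is true when both names resolve to the same device and inode
    [[ "$dir/link1" -ef "$dir/link2" ]] && echo "same file"
    rm -rf "$dir"
}
```

Calling same_file_demo prints three lines that share one inode number and a link count of 3, followed by "same file".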
 
Old 04-18-2013, 03:03 AM   #5
Ramurd
Member
 
Registered: Mar 2009
Location: Rotterdam, the Netherlands
Distribution: Slackwarelinux
Posts: 703

Rep: Reputation: 111
Hint: ls can show the number of links; for hard-linked files this counter is greater than 1. If it is > 1, fetch the inode (ls can show that as well) and store it somehow; if the inode is already in your array (and thus already counted once), skip the file. Otherwise add the size to the total size and store the inode in your array.

Somewhat abstract:
Code:
ls -li # shows inode - accessbits - link count - user - group - size - date - time - filename
if link count > 1
   if inode is not in array of inodes
      totalsize = totalsize + size
      add inode to array of inodes
   end if
else
   totalsize = totalsize + size
end if
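
As a rough illustration, the logic above could be sketched with find and awk as follows. This is not a tested, definitive answer: it assumes GNU find (for -printf), the function name total_unique_size is made up, and it swaps the explicit link-count check for a simpler rule, remembering every device:inode pair, since a file with a link count of 1 has a unique inode anyway.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU find (-printf). Function name is hypothetical.
# Sum file sizes below a directory, counting each inode only once.
total_unique_size() {
    # %D = device number, %i = inode number, %s = size in bytes
    find "${1:-.}" -type f -printf '%D:%i %s\n' |
        # seen[] is the "array of inodes": add the size only on first sight
        awk '!seen[$1]++ { total += $2 } END { print total + 0 }'
}
```

Running total_unique_size . prints the total in bytes for the current directory. File names never reach awk, so spaces or newlines in names cannot confuse the field splitting.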
 
Old 04-18-2013, 04:25 PM   #6
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
It's not really a good idea to parse ls for filenames or metadata.

I suggest instead looping through the file names and using the -ef flag in the test construct, or extracting the inode numbers with stat, to remove any that evaluate as the same.

If this can be a bash script specifically (or ksh) then I also recommend using arrays to keep track of everything.
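
A minimal sketch of the stat-plus-array idea, under stated assumptions: bash 4 or later (for associative arrays) and GNU stat (for the -c format option). The function name unique_size_stat is made up for illustration. Comparing every pair of names with -ef would also work, but needs a nested loop; keying an array on device:inode needs only one pass.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash 4+ and GNU stat. Function name is hypothetical.
unique_size_stat() {
    local dir=${1:-.} file key size total=0
    local -A seen=()                      # keys are "device:inode"
    # NUL-delimited names survive spaces and newlines in file names
    while IFS= read -r -d '' file; do
        # %d = device number, %i = inode number, %s = size in bytes
        read -r key size < <(stat -c '%d:%i %s' -- "$file") || continue
        if [[ -z ${seen[$key]:-} ]]; then
            seen[$key]=1                  # first time this inode is seen
            total=$(( total + size ))
        fi
    done < <(find "$dir" -type f -print0)
    printf '%s\n' "$total"
}
```

Running unique_size_stat . prints the total size in bytes, with each hard-linked file counted once.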
 
Old 04-18-2013, 08:12 PM   #7
ravisingh1
Member
 
Registered: Apr 2013
Location: Mumbai
Distribution: Ubuntu13.10
Posts: 291

Original Poster
Rep: Reputation: Disabled
Ramurd, I know the logic, but I need the code.
 
Old 04-18-2013, 08:25 PM   #8
TobiSGD
Moderator
 
Registered: Dec 2009
Location: Germany
Distribution: Whatever fits the task best
Posts: 17,148
Blog Entries: 2

Rep: Reputation: 4886
If you know the logic coming up with the code should be the easy part. Try it and when you have problems with your code post it here, so that we can see possible errors in the code.
We won't deliver you ready-made code, doing your homework is not something we do on this forum.
 
1 members found this post helpful.
Old 04-18-2013, 09:02 PM   #9
ravisingh1
Member
 
Registered: Apr 2013
Location: Mumbai
Distribution: Ubuntu13.10
Posts: 291

Original Poster
Rep: Reputation: Disabled
Mr. Moderator, please make it a welcoming forum for all. Don't take it as a homework. It's common to come up with such an issue.
Please see my earlier post, where I posted my code. Now let the viewers reply.

Quote:
Originally Posted by TobiSGD View Post
If you know the logic coming up with the code should be the easy part. Try it and when you have problems with your code post it here, so that we can see possible errors in the code.
We won't deliver you ready-made code, doing your homework is not something we do on this forum.
The way you responded simply blocks communication. Why do you say that you don't deliver ready-made code? This is a service you provide so that all viewers can communicate and discuss each other's problems. Please make it a joy for everyone to be in this thread. Since this talk is off-topic, let us stop any further discussion of such things. Let everyone enjoy sharing their knowledge.

THANKS AND HOPE YOU WILL ALWAYS LOOK FORWARD TO HELP OTHERS AND ENCOURAGE OTHERS AND GIVE EVERYBODY A FEELING THAT THEIR SOLUTION IS IN THIS FORUM.
 
Old 04-19-2013, 04:30 AM   #10
TobiSGD
Moderator
 
Registered: Dec 2009
Location: Germany
Distribution: Whatever fits the task best
Posts: 17,148
Blog Entries: 2

Rep: Reputation: 4886
There is a difference between sharing knowledge and doing your work for you. Your initial question is clearly phrased in a way that it resembles a homework question, so we have to assume that it is one, and on this forum we won't do your work for you.
You were given pseudo-code pointing out what you have to do to make it work, Ramurd has done a good job with that. Try to make real code from it and when you have difficulties with that come back with what you have already and point out where it is not working or where you have difficulties. If you want to have ready-made code hire a programmer.
 
3 members found this post helpful.
Old 04-19-2013, 04:39 AM   #11
Ramurd
Member
 
Registered: Mar 2009
Location: Rotterdam, the Netherlands
Distribution: Slackwarelinux
Posts: 703

Rep: Reputation: 111
Quote:
Originally Posted by David the H. View Post
It's not really a good idea to parse ls for filenames or metadata.

I suggest instead looping through the file names and using the -ef flag in the test construct, or extracting the inode numbers with stat, to remove any that evaluate as the same.

If this can be a bash script specifically (or ksh) then I also recommend using arrays to keep track of everything.
I read that link, and I already knew most of those weaknesses. Filenames nowadays often contain spaces, but that is easy to work around by using "read" and putting quotes around the variable. As for "\n" characters: yes, I know filenames can contain them, but in all of my Unix career I have never seen files created with one. I also "missed" in the link the solution that most implementations of ls offer:
- "-b, --escape print C-style escapes for nongraphic characters".
- " -Q, --quote-name enclose entry names in double quotes
--quoting-style=WORD use quoting style WORD for entry names:
literal, locale, shell, shell-always, c, escape"
(both taken from ls --help :-) )

That said, the OP asked for a full 'awk' solution. I'm not (that) fluent in awk, so I'd work around that and use a bash or ksh solution instead. But that's not what was requested, so I offered a sort of workaround. There is one thing to keep in mind about this:
One has to make sure ls provides consistent behavior, both regarding date and time and filenames.

So this would be a first-step approach to the problem; it's not the solution that was requested, but it is a solution to the problem.
And if this is homework, then it really should be done in awk, but in that case I leave it to the student :-)
Code:
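declare -a INODES          # inodes of multiply-linked files already counted
TOTALSIZE=0
# ls fields: inode, mode, link count, owner, group, size, date, time, UTC offset, name
while read inodenum access lc owner ogroup size date time offset filename
do
   if [ -f "${filename}" ]
   then
      if (( lc > 1 ))
      then
         INODEFOUND=0
         for((i=0;i< ${#INODES[*]};i++))
         do
           if [[ ${INODES[${i}]} == ${inodenum} ]]
           then
               ((INODEFOUND++))
           fi
         done
         if((INODEFOUND>1))
         then
            printf "ERROR: I should not have been able to find %s more than once.\n" "${inodenum}"
         elif (( INODEFOUND > 0 ))
         then
            printf "Skipping Inode %s as it has already been counted.\n" "${inodenum}"
         else
            ((TOTALSIZE+=size))
            # append the inode at the next free array index
            INODES[${#INODES[*]}]=${inodenum}
         fi
      else
         ((TOTALSIZE+=size))
      fi
   fi
# feed the loop by process substitution so TOTALSIZE survives (a pipe would run it in a subshell)
done < <(\ls -l -i -b --time-style=full-iso)
printf "Total size: %s bytes\n" "${TOTALSIZE}"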
This code has not been tested at all, I've only written it in this post; so there may be a few glitches :-)
Also mind that there are implementations of ls that offer the possibility to provide a list separated by commas with the "-m" parameter. This may help you in your awk script. Mind that filenames might also contain commas, but those should be listed as escaped characters. I leave that exercise to someone else.

Last edited by Ramurd; 04-19-2013 at 11:58 AM. Reason: variables in "... listed as escaped variables" ought to have been characters of course.
 
Old 04-19-2013, 05:10 AM   #12
salasi
Senior Member
 
Registered: Jul 2007
Location: Directly above centre of the earth, UK
Distribution: SuSE, plus some hopping
Posts: 4,070

Rep: Reputation: 897
Quote:
Originally Posted by ravisingh1 View Post
Calculate the total size of the files recursively from the current directory. Hard-linked files are to be counted only once.

Please use awk also.
We get a lot of homework questions, and that is phrased exactly like a homework question. Why shouldn't we consider it to be a homework question? For example, "Please use awk also." is not a usual 'real world' requirement. "Please do something that works" might be, or "Please do not use Ruby, as it isn't installed on all of the platforms" might just be, but that isn't.

In fact, it looks like one of the many 'just cut and paste' questions, where the poster hasn't even re-phrased the question that they have been asked by their course tutor. And, by the way, what is your intention with respect to soft linked files?

Quote:
Mr. Moderator, please make it a welcoming forum for all. Don't take it as a homework. It's common to come up with such an issue.
In this instance, the moderator is going straight down the line with the site rules. If you object to the site rules, and think that there would be some advantage if the site rules were different, you could argue that case, but this would not be the appropriate sub-forum for that (and you don't seem to be trying that, just objecting to the effect of applying the site rules to your query, so far).

Quote:
AND GIVE EVERYBODY A FEELING THAT THEIR SOLUTION IS IN THIS FORUM.
Under the circumstances, choosing to shout was probably quite a bad decision. In any case, assuming that there is a solution in this forum (and that is probably correct), the only way in which that solution got there is that the people who have this solution got it by doing their own coding and learning from the experience. It would be unfair to deprive you of the opportunity to get to the same position.

HTH
 
3 members found this post helpful.
Old 04-19-2013, 09:11 AM   #13
ravisingh1
Member
 
Registered: Apr 2013
Location: Mumbai
Distribution: Ubuntu13.10
Posts: 291

Original Poster
Rep: Reputation: Disabled
Ramurd, I thank you very much
 
  



All times are GMT -5. The time now is 08:09 PM.
