find . -type f -ls | awk '{ total += $7 } END { print total }'
This program outputs the total size of all the files found recursively from the current directory.
It may also count the same hard-linked file more than once. So, how do I write a program that totals the size of all the files but includes hard-linked ones only once?
Let me explain a bit more. Suppose there are 2 files with the same inode number, and both are located in the directory tree below the current directory. When the program calculates the total size of all the files, it should count only 1 of the 2 hard-linked files.
Hint: ls can show the number of links; each hard link increases this counter. If it's > 1, fetch the inode (ls can show that as well) and store it somehow. If the inode is already in your array (and thus already counted once), skip it. Otherwise add the size to the total size and store the inode in your array.
Somewhat abstract:
Code:
ls -li # shows inode - accessbits - link count - user - group - size - date - time - filename
if link count > 1
    if inode is not in array of inodes
        totalsize = totalsize + size
        add inode to array of inodes
    end if
else
    totalsize = totalsize + size
end if
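For comparison, the pseudocode above could be sketched in awk roughly as follows. This is a minimal sketch, not an answer given in the thread: it assumes GNU find (for -printf), and the function name total_size_unique is made up for illustration. Instead of checking the link count first, it simply remembers every device:inode pair, so each file is counted once however many links it has.

```shell
# Sketch only: assumes GNU find, whose -printf supports
# %D (device number), %i (inode number) and %s (size in bytes).
# total_size_unique is a hypothetical helper name.
total_size_unique() {
    find "${1:-.}" -type f -printf '%D:%i %s\n' |
        awk '!seen[$1]++ { total += $2 }          # first time this inode is seen
             END { printf "%.0f\n", total }'      # print the sum as a plain integer
}
```

Calling total_size_unique . prints the total in bytes for the current directory tree. Because awk only ever reads the numeric fields, odd characters in filenames never reach it.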
It's not really a good idea to parse ls for filenames or metadata.
I suggest instead looping through the file names and using the -ef operator in the test construct, or extracting the inode numbers with stat, to remove any that evaluate as the same file.
If this can be a bash script specifically (or ksh), then I also recommend using arrays to keep track of everything.
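The stat-plus-array approach suggested above might look something like this. It is a hedged sketch, not code from the thread: it assumes bash 4 or later (for associative arrays) and GNU stat (for -c), and the function name total_size_stat is invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash 4+ and GNU stat, whose -c format supports
# %d (device number), %i (inode number) and %s (size in bytes).
# total_size_stat is a hypothetical helper name.
total_size_stat() {
    local dir=${1:-.} file info key size total=0
    local -A seen=()
    # NUL-delimited names survive spaces and newlines in filenames.
    while IFS= read -r -d '' file; do
        info=$(stat -c '%d:%i %s' -- "$file") || continue
        key=${info% *}      # "device:inode"
        size=${info#* }     # size in bytes
        # Add the size only the first time this device:inode pair shows up.
        if [[ -z ${seen[$key]+x} ]]; then
            seen[$key]=1
            total=$(( total + size ))
        fi
    done < <(find "$dir" -type f -print0)
    printf '%s\n' "$total"
}
```

Running total_size_stat . prints the total in bytes. The associative array makes the "already counted?" check a single lookup rather than a loop over all stored inodes.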
If you know the logic, coming up with the code should be the easy part. Try it, and when you have problems with your code, post it here so that we can see possible errors in the code.
We won't deliver you ready-made code; doing your homework is not something we do on this forum.
Mr. Moderator, please make it a welcoming forum for all. Don't take it as a homework. It's common to come up with such an issue.
Please see my earlier post, where I have posted my code. Now let the viewers reply.
Quote:
Originally Posted by TobiSGD
If you know the logic, coming up with the code should be the easy part. Try it, and when you have problems with your code, post it here so that we can see possible errors in the code.
We won't deliver you ready-made code; doing your homework is not something we do on this forum.
The way you responded simply blocks communication. Why do you say that you don't deliver ready-made code? This forum is a service you provide so that all viewers can communicate and discuss each other's problems. Please make it a joy for all to be in this thread. Since this talk is off-topic, let us stop any further discussion of such things. Let everyone enjoy their knowledge sharing.
THANKS AND HOPE YOU WILL ALWAYS LOOK FORWARD TO HELP OTHERS AND ENCOURAGE OTHERS AND GIVE EVERYBODY A FEELING THAT THERE SOLUTION IS IN THIS FORUM.
There is a difference between sharing knowledge and doing your work for you. Your initial question is clearly phrased in a way that resembles a homework question, so we have to assume that it is one, and on this forum we won't do your work for you.
You were given pseudo-code pointing out what you have to do to make it work; Ramurd has done a good job with that. Try to make real code from it, and when you have difficulties, come back with what you already have and point out where it is not working or where you are stuck. If you want ready-made code, hire a programmer.
Quote:
It's not really a good idea to parse ls for filenames or metadata.
I suggest instead looping through the file names and using the -ef operator in the test construct, or extracting the inode numbers with stat, to remove any that evaluate as the same file.
If this can be a bash script specifically (or ksh), then I also recommend using arrays to keep track of everything.
I read that link, and I already knew most of those weaknesses. Filenames nowadays often contain spaces, but it's easy to work around that by using "read" and putting quotes around the variable. As for "\n" characters: yes, I know filenames can contain them, but in all of my Unix career I've never seen a file created with one. Also, in the link I "missed" the solution that most implementations of ls offer:
- "-b, --escape  print C-style escapes for nongraphic characters"
- "-Q, --quote-name  enclose entry names in double quotes"
- "--quoting-style=WORD  use quoting style WORD for entry names: literal, locale, shell, shell-always, c, escape"
(all taken from ls --help :-) )
That said, the OP asked for a full 'awk' solution. I'm not (that) fluent in awk, so I'd work around that and use a bash or ksh solution instead. But that's not what was requested, so I offer a sort of workaround. There is one thing to keep in mind about it: one has to make sure ls provides consistent behavior, both regarding date and time and regarding filenames.
So this would be a first-step approach to the problem. It's not the solution that was requested; it's a solution to the problem.
And if this is homework, then it really should be done in awk, but in that case I leave it to the student :-)
Code:
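declare -a INODES
TOTALSIZE=0
# \ls bypasses any alias; --time-style=full-iso gives fixed date, time and offset fields.
# read without -r turns the "\ " that -b writes for a space back into a plain space;
# other escapes (such as \n) are not decoded. Only the current directory is listed.
while read inodenum access lc owner ogroup size date time offset filename
do
    if [ -f "${filename}" ]
    then
        if (( lc > 1 ))
        then
            # Count how often this inode is already stored.
            INODEFOUND=0
            for ((i = 0; i < ${#INODES[*]}; i++))
            do
                if [[ ${INODES[${i}]} == "${inodenum}" ]]
                then
                    ((INODEFOUND++))
                fi
            done
            if (( INODEFOUND > 1 ))
            then
                printf 'ERROR: I should not have been able to find %s more than once.\n' "${inodenum}"
            elif (( INODEFOUND > 0 ))
            then
                printf 'Skipping inode %s as it has already been counted.\n' "${inodenum}"
            else
                ((TOTALSIZE += size))
                # Append the inode; the array length is the next free index.
                INODES[${#INODES[*]}]=${inodenum}
            fi
        else
            ((TOTALSIZE += size))
        fi
    fi
done < <(\ls -l -i -b --time-style=full-iso)
# Process substitution instead of a pipe keeps the loop in the current shell,
# so TOTALSIZE is still set here.
printf 'Total size: %s bytes\n' "${TOTALSIZE}"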
This code has not been tested at all; I've only written it in this post, so there may be a few glitches :-)
Also mind that there are implementations of ls that offer the possibility to provide a list separated by commas with the "-m" parameter. This may help you in your awk script. Mind that filenames might also contain commas, but those should be listed as escaped characters. I leave that exercise to someone else.
Last edited by Ramurd; 04-19-2013 at 11:58 AM.
Reason: variables in "... listed as escaped variables" ought to have been characters of course.
Calculate the total size of the files recursively from the current directory. Hard linked files are to be considered only once.
Please use awk also.
We get a lot of homework questions, and that is phrased exactly like a homework question. Why shouldn't we consider it to be a homework question? For example, "Please use awk also." is not a usual 'real world' requirement. "Please do something that works" might be, or "Please do not use Ruby, as it isn't installed on all of the platforms" might just be, but that isn't.
In fact, it looks like one of the many 'just cut and paste' questions, where the poster hasn't even re-phrased the question that they have been asked by their course tutor. And, by the way, what is your intention with respect to soft linked files?
Quote:
Mr. Moderator, please make it a welcoming forum for all. Don't take it as a homework. It's common to come up with such an issue.
In this instance, the moderator is going straight down the line with the site rules. If you object to the site rules, and think that there would be some advantage if the site rules were different, you could argue that case, but this would not be the appropriate sub-forum for that (and you don't seem to be trying that, just objecting to the effect of applying the site rules to your query, so far).
Quote:
AND GIVE EVERYBODY A FEELING THAT THERE SOLUTION IS IN THIS FORUM.
Under the circumstances, choosing to shout was probably quite a bad decision. In any case, assuming that there is a solution in this forum (and that is probably correct), the only way in which that solution got there is that the people who have this solution got it by doing their own coding and learning from the experience. It would be unfair to deprive you of the opportunity to get to the same position.