LinuxQuestions.org
Review your favorite Linux distribution.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - General
User Name
Password
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

Notices


Reply
  Search this Thread
Old 10-10-2010, 11:22 PM   #1
Anoop Madhusoodhanan P
LQ Newbie
 
Registered: Jan 2010
Posts: 2

Rep: Reputation: 0
Why RB trees are used in Linux’s page cache?


Why RB trees are used in Linux’s page cache for a chunk of a file (represented by an inode and offset pair) ? Self balancing can be attained using AVL trees also. So whats the significance of red and black nodes here? Can someone throw light on this..
 
Old 10-10-2010, 11:49 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120Reputation: 4120
Have a look in the kernel source tree ../Documentation/rbtree.txt. The lwn reference might be of interest from a specifically Linux perspective.
 
Old 10-11-2010, 12:12 PM   #3
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197Reputation: 1197
Quote:
Originally Posted by Anoop Madhusoodhanan P View Post
Why RB trees are used in Linux’s page cache for a chunk of a file (represented by an inode and offset pair) ? Self balancing can be attained using AVL trees also.
If I flipped your question around: Why would you prefer to use AVL trees?

I wrote my own tree code, which I use in place of std::set and std::map in many situations in which almost anyone else would have chosen std:set or std::map. Most implementations of std::set and std::map I have seen use RB trees. I chose to use AVL trees.

I only roughly understand the performance math of either, so I cannot say I chose AVL instead of RB based on any good understanding of the theoretical performance difference. I just felt more comfortable with AVL.

My primary motivation in writing my own was hitting a large number of situations in which the inflexibility of the interface of set and map led to inefficiencies in the calling code. So I wrote my own with a more flexible interface to allow for more efficient calling code, but lost some of the cleanliness and encapsulation of the design both necessarily because of the flexibility requirement and unnecessarily because I didn't have time to think through a fully polished design. So I have set and map replacements that are harder to use, even myself. But produce better results.

While I wanted flexibilty of interface for more efficient calling code, I also used my own sets in a few places where I had no need of that flexibility and it produced no extra efficiencies in the calling code. It still produced better overall efficiency than std::set. I don't know enough of the performance theory to say whether that improved performance indicates AVL trees were better for those problems than RB trees, or whether it just means I write tighter code than the implementers of either GNU or Microsoft std::set.

While coding it, I had the rough feel that AVL trees were giving me performance advantage vs RB trees in inserting elements that would be paid back in performance disadvantage while deleting elements. For my uses of sets and maps, that would be a net win for AVL trees. Most of my sets and maps are bulk discarded when fairly full. Over the life of the container, elements are both inserted and deleted. But significantly more are inserted than deleted because the container starts out empty and is ultimately bulk discarded fairly full. For many of these containers less than half of all elements ever inserted are individually deleted before the bulk discard.

I'm not certain (lack of the right math skills again) that AVL actually has a performance bias for individual insert and bulk discard as compared to RB. Maybe it just felt that way in coding.

Quote:
Originally Posted by syg00 View Post
../Documentation/rbtree.txt.
The relevant part (that I can find) of that document says:
Quote:
Red-black trees are similar to AVL trees, but provide faster real-time bounded worst case performance for insertion and deletion
I would prefer to know more than that. I don't think many uses of RB in Linux are actually "real-time". As long as "worst case" isn't really terrible, I think one would rather worry about average performance for cache look-up rather than worst case.

So I had read that before, and still never felt I understood why RB is so often used in situations where I would use AVL. My choice was not better informed than the choice made by GNU or Microsoft developers. So most likely I'm just failing to understand some reason that RB is a better choice. But so far, I think I'm getting better results from AVL than I would have gotten from RB.

From that lwn article at
http://lwn.net/Articles/184495/
Quote:
The high-resolution timer code uses an rbtree to organize outstanding timer requests.
It seems pretty clear RB is a better choice than AVL for that use. But (without having looked at the code) it is hard to imagine that RB is better than some other form of priority queue for such a purpose.

Maybe RB is used for a wide range of purposes because it only needed to be coded once for each of those purposes and is near best for each, when methods that are better for each would require many different methods individually coded.

Last edited by johnsfine; 10-11-2010 at 12:41 PM.
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
Difference between page cache and buffer cache Nayaka Linux - Kernel 5 09-28-2011 08:23 AM
Tweak Page Cache in Ubuntu Linux sulekha Ubuntu 4 12-10-2008 02:36 AM
Memory leak in RHEL4 in the page cache sree1972 Linux - Server 1 08-03-2007 09:23 PM
better use of page cache with 64 bit processor panandsapphire Linux - Kernel 1 09-19-2006 10:05 AM
Where is page cache? zdz97 Linux - Hardware 2 09-16-2003 03:39 PM

LinuxQuestions.org > Forums > Linux Forums > Linux - General

All times are GMT -5. The time now is 07:13 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration