Search in Linux
Hello ..
I have a website, and in its code there is a link (e.g. ....../myfolder/file001.pdf) that is used to access a file on the Linux system. My question is: how does Linux search for a file in a folder, especially if there are millions of files inside it? I want to know the search method. Is there a special algorithm, or does it go through all the files one by one and check each file name? Or does it use the find command in Linux? Thanks in advance. |
It depends on what you are referring to.
Are you searching for a specific file, or for a file with a partial name? If the question is "how does Linux locate the file to read", then the answer depends on the filesystem used. Most current filesystems (ext4, for instance) use trees to store the directory, and the file name itself is hashed to make a short key for quick matches; if the short key matches, the lookup is then verified against the full name. It can still be a bit slow if the directory has millions of files. A directory that large also causes other problems (slower backups, file names that are harder for people to scan, longer searches when you don't know the exact name, too many names when performing file maintenance...) |
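To make the "short key, then verify the full name" idea concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the class name `HashedDirectory`, the 16-bit CRC32 key and the plain dictionary are all made up for this example. A real filesystem such as ext4 uses its own hash function and an on-disk tree, not a Python dict.

```python
import zlib


class HashedDirectory:
    """Toy directory index (hypothetical): short hash key -> (name, inode) entries."""

    def __init__(self):
        self._buckets = {}

    @staticmethod
    def _short_key(name):
        # Assumption: CRC32 cut down to 16 bits stands in for the real hash.
        return zlib.crc32(name.encode("utf-8")) & 0xFFFF

    def add(self, name, inode):
        self._buckets.setdefault(self._short_key(name), []).append((name, inode))

    def lookup(self, name):
        # Step 1: go straight to the entries that share the short key.
        for entry_name, inode in self._buckets.get(self._short_key(name), []):
            # Step 2: different names can share a key, so compare the full name.
            if entry_name == name:
                return inode
        return None


directory = HashedDirectory()
for i in range(1, 10001):
    # Made-up file names and inode numbers, just to fill the directory.
    directory.add(f"file{i:03d}.pdf", i)

found = directory.lookup("file001.pdf")
missing = directory.lookup("no_such_file.pdf")
```

The point is that `lookup` only compares full names inside one small bucket instead of walking all 10,000 entries one by one.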
Quote:
------------------------------------- Thank you very much. Yes, I mean searching for a specific file in the system. Does each Linux system support one filesystem or more than one? Do you have a resource (website or document) that explains this in detail, especially the delay (slowness) part? |
Quote:
what does the "......" stand for? |
Quote:
I'll start the system next week. |
Quote:
xfs is another filesystem (from SGI) that is also designed to handle large files and large filesystems. It is also used as a base for a cluster filesystem from SGI (cxfs) that has proprietary parts. There is jfs from IBM, reiserfs (not used as much now) with alternate data segments like HFS from Apple.. btrfs is from Oracle; it has some really nice features such as builtin support for raid 1/5/6, but also some really bad errors (it is still in testing, raid 5/6 especially, but raid 1 also has some issues). ISO9660 filesystems are there for DVD/CD usage... NTFS is supported (but you have to be careful if the system is also booted to Windows: Windows has to be fully shut down before Linux can use the filesystem, or it shows up as corrupted). Another problem is that NTFS doesn't have quite the same definition of user identification... and has some security weaknesses. FAT16/32 is available for compatibility use but has no user identification at all, and its file handling is a bit peculiar from the point of view of Linux (text files, for instance, should have <cr><lf> line endings for compatibility, but Linux files don't). Quote:
To know which limitations apply to which filesystem, you have to search for design documents... For the Linux native ones that isn't too hard, but for those from outside, it is harder to find. |
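As a rough illustration of one system using several filesystems at once, the sketch below maps each mount point to its filesystem type from text in the /proc/mounts layout (device, mount point, type, options, ...). The function name `filesystem_types` and the sample devices and mount points are made up for the example.

```python
def filesystem_types(mounts_text):
    """Map mount point -> filesystem type from /proc/mounts-style text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3:
            result[fields[1]] = fields[2]
    return result


# Hypothetical sample; a real system will list different devices.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /data xfs rw,relatime 0 0
/dev/sdc1 /mnt/usb vfat rw,nosuid 0 0
"""

types = filesystem_types(SAMPLE)
```

On a running Linux box you could pass in `open("/proc/mounts").read()` instead of the sample text.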
Just a thought... Have you tried
Code:
# updatedb
According to the man pages, it's generally run as a cron job overnight to update with any changes; new files, deletions... Finding files is then by using the command
Code:
$ locate <filename>
Anyway... My :twocents: Play Bonny! :hattip: |
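The idea behind that pair of commands (build a list of names ahead of time, then search the list instead of the disk) can be sketched roughly like this. It is not how updatedb or locate are implemented; `build_index` and `search_index` are hypothetical helpers, the files are throwaway ones in a temporary directory, and paths are stored relative to the root just to keep the example small.

```python
import os
import tempfile


def build_index(root):
    """Walk the tree once and record every file path (the 'updatedb' step)."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # Simplification: keep paths relative to the root.
            paths.append(os.path.relpath(full_path, root))
    return sorted(paths)


def search_index(index, pattern):
    """Return the indexed paths that contain the pattern (the 'locate' step)."""
    return [path for path in index if pattern in path]


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "myfolder"))
    for name in ("file001.pdf", "file002.pdf", "notes.txt"):
        # Create empty placeholder files to index.
        open(os.path.join(root, "myfolder", name), "w").close()
    index = build_index(root)

matches = search_index(index, "file001")
```

Because the search only touches the saved list, it will miss files created after the index was built, which is why the index gets refreshed regularly.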
Quote:
---------------------------- Thank you very, very much for the answer and the prompt reply. Question: directory indexing with dir_index (which uses hashed b-trees to speed up name lookups in large directories). Is this feature just for directories, or is it also used to index the files in a single directory? |
Quote:
--------------- Thank you so much, I will try it. |
Quote:
I don't mean the content of the files. I mean the content of the directory (many files in a single directory). How does the kernel index these files? How does the kernel search for a file in a directory containing many files? What method does the kernel use for the search: hashing, a linked list...? Or is it the same as for directories (dir_index hashed b-tree), or is there a different way? Thank you so much again. |
Quote:
You can find out some from https://ext4.wiki.kernel.org/index.php/Main_Page but for exact details you have to look at the source code. |
Quote:
Thank you sooo much dear. |
somehow i can't shake the feeling that op is actually talking about a server, and something like HTML_DOCUMENT_ROOT.
and not really about filesystems at all. |