Time complexity for an OS to find a file/directory?

In summary, the time complexity for an OS to find a file can vary depending on the indexing system used. In older OS versions, the time complexity is ~ln(N) due to indexing, while in most Unix/Linux versions, it is O(log n) due to a binary tree search. Indexing is done by hashing the filename and searching a binary tree of hashes, which is faster than searching by full-path filenames. However, this method may not work for templates and only works for exact file names. The science of sorting and searching is highly advanced and there are many ways to approach it. Donald Knuth's book "The Art of Computer Programming, Volume 3" is a recommended resource on the subject.
  • #1
kolleamm
477
44
What is the time complexity for an OS to find a file? Is it O(1) time?

Let's say for example, you had a billion files in a single folder, and you wanted to load a file into a string in your program, would the system find a specific file right away, or would there be a longer wait?

Let's assume an OS like Windows.
 
Technology news on Phys.org
  • #2
kolleamm said:
What is the time complexity for an OS to find a file? Is it O(1) time?

Let's say for example, you had a billion files in a single folder, and you wanted to load a file into a string in your program, would the system find a specific file right away, or would there be a longer wait?

Let's assume an OS like Windows.
Older OS (since Windows 2000 or NT 4) have the utility for files to be indexed, therefore time to find a file in indexed database is ~ln(N), which is slightly slower than O(0). The indexing is active by default and run fully automatically in Windows 7 or Windows 10, but is absent in Windows 8.
Most of Unix/Linux versions also use a sort of indexing/hashing of file tree.
 
  • Like
Likes kolleamm
  • #3
trurle said:
Older OS (since Windows 2000 or NT 4) have the utility for files to be indexed, therefore time to find a file in indexed database is ~ln(N), which is slightly slower than O(0). The indexing is active by default and run fully automatically since Windows 7.
Do you mean O (log n)? Which would be the time complexity for a binary tree search.
 
  • Like
Likes trurle
  • #4
kolleamm said:
Do you mean O (log n)? Which would be the time complexity for a binary tree search.
Yes, exactly
 
  • #5
trurle said:
Yes, exactly
Ah I see, and by indexing do you mean alpha numerical order? I'm guessing by the ascii values of the characters?
 
  • #6
kolleamm said:
Ah I see, and by indexing do you mean alpha numerical order? I'm guessing by the ascii values of the characters?
As i know, indexing is done by hashing the filename, and then by searching binary tree of hashes. Because hashes are shorter than full-path filenames, (4 to 16 bytes), the search is faster. Of course, this way do not work for templates, only for exact file names.
 
  • Like
Likes kolleamm
  • #7
trurle said:
As i know, indexing is done by hashing the filename, and then by searching binary tree of hashes. Because hashes are shorter than filenames, (4 to 16 bytes), the search is faster. Of course, this way do not work for templates, only for exact file names.
Wow I never knew, thanks for the info!
 
  • #8
@kolleamm , The science of sorting and searching is very advanced. I recommend the classical book on the subject.

Donald Knuth's "The Art and Science of Computer Programming, Volume 3"
 
  • Like
Likes Klystron, QuantumQuest, FactChecker and 1 other person
  • #9
anorlunda said:
Donald Knuth's "The Art and Science of Computer Programming, Volume 3"

"THE ART OF COMPUTER PROGRAMMING, Volume 3/Sorting and Searching" Donald Knuth, ADDISON-WESLEY PUBLISHING COMPANY, Reading, Massachusetts. 1973, ISBN 0-201-03803-X

There are later editions.

'The Bible.' Highly recommended!

Cheers,
Tom
 
  • Like
Likes Klystron, FactChecker and anorlunda
  • #10
It's a huge text tree search. Lots of possible searches, and even if indexing, lots of ways to index (and keep an index updated). Worst case can be quite bad, especially if a file was just added and hasn't made it into the index yet, or if you are using regular expressions and there are a gazillion candidates for consideration.
 
  • #11
But wouldn't a hash table lookup be O(1)?
 
  • #12
Well, yeah, in the 'ideal' implementations. Once you hit the 'real' world there are either collisions or you are over provisioning the algorithm, the storage space, or both.
 
  • Like
Likes Vanadium 50 and DanielChin

1. What is time complexity for an OS to find a file/directory?

The time complexity for an OS to find a file/directory depends on the algorithm used by the OS. It can range from O(1) for a simple linear search to O(log n) for a binary search, where n is the number of files/directories in the system.

2. How does the number of files/directories in the system affect the time complexity?

The time complexity generally increases with the number of files/directories in the system. This is because the OS has to search through a larger pool of data to find the desired file/directory.

3. What factors other than algorithm and number of files/directories can affect the time complexity?

Other factors that can affect the time complexity include the size and structure of the storage device, the efficiency of the file system, and the hardware resources available to the OS.

4. Can the time complexity for finding a file/directory differ between different operating systems?

Yes, the time complexity can differ between operating systems depending on their design and implementation. Some OS may have more efficient algorithms for file/directory search, resulting in a lower time complexity.

5. Is there a way to improve the time complexity for finding a file/directory?

Yes, there are various techniques that can be used to improve the time complexity for finding a file/directory. These include optimizing the algorithm used, implementing caching mechanisms, and using efficient data structures for file system organization.

Similar threads

  • Programming and Computer Science
Replies
17
Views
1K
  • Programming and Computer Science
3
Replies
81
Views
5K
  • Programming and Computer Science
Replies
19
Views
1K
  • Programming and Computer Science
Replies
18
Views
1K
  • Programming and Computer Science
Replies
0
Views
413
  • Programming and Computer Science
Replies
3
Views
2K
  • Programming and Computer Science
Replies
7
Views
2K
  • Programming and Computer Science
Replies
19
Views
3K
  • Computing and Technology
Replies
24
Views
7K
  • Programming and Computer Science
3
Replies
75
Views
4K
Back
Top