Time complexity for an OS to find a file/directory?


Discussion Overview

The discussion revolves around the time complexity for an operating system to locate a file or directory, particularly in the context of systems like Windows and Unix/Linux. Participants explore various indexing methods and their implications on search times, considering both theoretical and practical aspects.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants propose that the time complexity for finding a file could be O(1), particularly in ideal scenarios.
  • Others argue that operating systems have long used indexing, leading to a time complexity of roughly O(log N), which is slower than O(1).
  • A participant clarifies that indexing is done by hashing filenames and searching a binary tree of hashes, which allows for faster searches due to shorter hash lengths.
  • There is a mention of the potential for worst-case scenarios, such as when files are newly added and not yet indexed, or when using complex search patterns like regular expressions.
  • Some participants question whether the time complexity for binary tree searches is O(log n), and this is affirmed by others.
  • One participant notes that while hash table lookups can be O(1) in ideal conditions, real-world factors such as collisions can affect performance.

Areas of Agreement / Disagreement

Participants express differing views on the time complexity of file searches, with no consensus reached on a definitive answer. The discussion includes multiple competing models and acknowledges the complexity of real-world implementations.

Contextual Notes

The discussion highlights limitations such as the dependence on specific operating system implementations, the variability in indexing methods, and the impact of real-world conditions on theoretical performance metrics.

kolleamm
What is the time complexity for an OS to find a file? Is it O(1) time?

Let's say for example, you had a billion files in a single folder, and you wanted to load a file into a string in your program, would the system find a specific file right away, or would there be a longer wait?

Let's assume an OS like Windows.
 
kolleamm said:
What is the time complexity for an OS to find a file? Is it O(1) time?

Let's say for example, you had a billion files in a single folder, and you wanted to load a file into a string in your program, would the system find a specific file right away, or would there be a longer wait?

Let's assume an OS like Windows.
Even older OSes (Windows 2000 and NT 4 onward) have a utility to index files, so the time to find a file in the indexed database is ~O(log N), which is only slightly slower than O(1). Indexing is active by default and runs fully automatically in Windows 7 and Windows 10, but is absent in Windows 8.
Most Unix/Linux versions also use some form of indexing/hashing of the file tree.
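To illustrate that O(log N) behavior, here is a minimal sketch (toy data, not any actual filesystem's code) of looking up a name in a sorted index with binary search:

```python
import bisect

# Hypothetical sorted index of names; binary search costs O(log N)
# comparisons, versus O(N) for scanning an unsorted directory.
index = sorted(["alpha.txt", "beta.txt", "delta.txt", "gamma.txt"])

def find(name):
    i = bisect.bisect_left(index, name)  # O(log N) probe
    return i < len(index) and index[i] == name

print(find("beta.txt"))   # True
print(find("omega.txt"))  # False
```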
 
Likes: kolleamm
trurle said:
Even older OSes (Windows 2000 and NT 4 onward) have a utility to index files, so the time to find a file in the indexed database is ~O(log N), which is only slightly slower than O(1). Indexing is active by default and runs fully automatically since Windows 7.
Do you mean O(log n), which would be the time complexity for a binary tree search?
 
Likes: trurle
kolleamm said:
Do you mean O(log n), which would be the time complexity for a binary tree search?
Yes, exactly
 
trurle said:
Yes, exactly
Ah I see, and by indexing do you mean alphanumerical order? I'm guessing by the ASCII values of the characters?
 
kolleamm said:
Ah I see, and by indexing do you mean alphanumerical order? I'm guessing by the ASCII values of the characters?
As far as I know, indexing is done by hashing the filename and then searching a binary tree of hashes. Because hashes (4 to 16 bytes) are shorter than full-path filenames, the search is faster. Of course, this approach does not work for wildcard patterns, only for exact file names.
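A toy sketch of that scheme (hypothetical names and locations, with a sorted list standing in for the on-disk tree of hashes): each filename is reduced to a short fixed-width hash, and lookups binary-search the hashes rather than comparing long path strings:

```python
import bisect
import hashlib

def short_hash(name):
    # Fixed-width 8-byte key: cheaper to compare than a long path string.
    return hashlib.sha1(name.encode()).digest()[:8]

# A sorted list of (hash, location) pairs stands in for the on-disk tree.
index = []

def add(name, location):
    bisect.insort(index, (short_hash(name), location))

def lookup(name):
    h = short_hash(name)
    i = bisect.bisect_left(index, (h, ""))
    if i < len(index) and index[i][0] == h:
        return index[i][1]
    return None  # exact names only; a wildcard pattern cannot be hashed

add("report.txt", "inode 42")
add("notes.txt", "inode 7")
print(lookup("report.txt"))  # inode 42
```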
 
Likes: kolleamm
trurle said:
As far as I know, indexing is done by hashing the filename and then searching a binary tree of hashes. Because hashes (4 to 16 bytes) are shorter than filenames, the search is faster. Of course, this approach does not work for wildcard patterns, only for exact file names.
Wow I never knew, thanks for the info!
 
@kolleamm, the science of sorting and searching is very advanced. I recommend the classic book on the subject.

Donald Knuth's "The Art and Science of Computer Programming, Volume 3"
 
Likes: Klystron, QuantumQuest, FactChecker and 1 other person
anorlunda said:
Donald Knuth's "The Art and Science of Computer Programming, Volume 3"

"The Art of Computer Programming, Volume 3: Sorting and Searching", Donald Knuth, Addison-Wesley, Reading, Massachusetts, 1973, ISBN 0-201-03803-X

There are later editions.

'The Bible.' Highly recommended!

Cheers,
Tom
 
Likes: Klystron, FactChecker and anorlunda
  • #10
It's a huge text tree search. There are lots of possible searches, and even with indexing, lots of ways to index (and to keep an index updated). The worst case can be quite bad, especially if a file was just added and hasn't made it into the index yet, or if you are using regular expressions and there are a gazillion candidates for consideration.
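To illustrate why pattern searches hit that worst case (toy data, hypothetical names): a regular expression cannot be hashed, so the index is bypassed and every entry must be tested, an O(N) linear scan:

```python
import re

# Unindexed pattern search over hypothetical names: the hash index is
# useless here, so every entry must be tested -- an O(N) linear scan.
names = ["notes1.txt", "notes2.txt", "image.png", "draft_final.txt"]
pattern = re.compile(r"notes\d+\.txt")
matches = [n for n in names if pattern.fullmatch(n)]
print(matches)  # ['notes1.txt', 'notes2.txt']
```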
 
  • #11
But wouldn't a hash table lookup be O(1)?
 
  • #12
Well, yeah, in the 'ideal' implementations. Once you hit the real world, there are either collisions, or you are over-provisioning the algorithm, the storage space, or both.
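A toy chained hash table (illustrative only, not any OS's implementation) shows the degradation: average lookups are O(1), but keys that collide into the same bucket must be scanned linearly:

```python
# Toy hash table with separate chaining (illustrative only).
# Average-case lookup is O(1), but every key that collides into the
# same bucket adds one more step to the linear scan inside it.
class ChainedTable:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:          # overwrite existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for k, v in bucket:       # linear scan within one bucket
            if k == key:
                return v
        return None

# Forcing a single bucket makes every insertion collide:
t = ChainedTable(nbuckets=1)
t.put("a.txt", 1)
t.put("b.txt", 2)
print(t.get("b.txt"))  # 2, found only after scanning the chain
```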
 
Likes: Vanadium 50 and DanielChin
