Are there any cases where you would prefer `O(log n)` time complexity to `O(1)` time complexity? Or `O(n)` to `O(log n)`?

Do you have any examples?

I would prefer an O(log n) algorithm to an O(1) algorithm if I understand the former, but not the latter...

There are tons of impractical data structures with O(1) operations from theoretical computer science. One example would be select() on bitvectors, which can be supported in o(n) extra space and O(1) per operation, using 5 layers of indirection. Simple binary search combined with O(1) rank() turns out to be faster in practice, according to the author of the Succinct Data Structure Library.
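
To make the binary-search-over-rank() idea concrete, here is a minimal Python sketch (the class and names are hypothetical, and the rank table here uses O(n) extra words rather than the o(n) extra bits a real succinct bitvector such as the SDSL's would use):

```python
import bisect

class ToyBitVector:
    """Toy bitvector: O(1) rank() via a prefix-sum table, O(log n) select() via binary search."""

    def __init__(self, bits):
        self.bits = bits
        # rank_table[i] = number of 1-bits in bits[:i]
        self.rank_table = [0]
        for b in bits:
            self.rank_table.append(self.rank_table[-1] + b)

    def rank(self, i):
        """Number of 1-bits among the first i bits, in O(1)."""
        return self.rank_table[i]

    def select(self, k):
        """Index of the k-th 1-bit (1-based), found by binary-searching rank in O(log n)."""
        return bisect.bisect_left(self.rank_table, k) - 1

bv = ToyBitVector([0, 1, 1, 0, 1, 0, 0, 1])
print(bv.rank(4))    # 2
print(bv.select(3))  # 4
```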

Lower asymptotic complexity does not guarantee faster runtimes. Research matrix multiplication for a concrete example.

Also...any algorithm can be converted to O(1), given a sufficiently large table lookup ;)
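
For what it's worth, a small sketch of that trick, trading an expensive computation for a precomputed table (assuming the query domain is small enough to enumerate up front):

```python
# Naive Fibonacci: exponential time per call.
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# "Convert" it to O(1) per query by precomputing every answer we will ever need.
MAX_N = 30  # assumption: no query will ever exceed this bound
FIB_TABLE = [fib(n) for n in range(MAX_N + 1)]

def fib_fast(n):
    return FIB_TABLE[n]  # O(1) lookup, paid for with setup time and table memory

print(fib_fast(25))  # 75025
```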

Hoten -- That's assuming the table lookup is O(1), which isn't a given at all for the size of the tables you're talking about! :)

Also, I've seen a few times that people focused on the big-O of their central algorithm while ignoring the setup costs. Building a hash table, for example, can be more expensive than going through an array linearly if you don't need to do the lookups over and over again. In fact, due to the way modern CPUs are built, even something like binary search can be just as fast on sorted arrays as a linear search - profiling is a necessity.
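
To illustrate the setup-cost point, here is a hypothetical timing sketch: for a one-off membership test, building a hash set first can cost more than simply scanning the list (the actual numbers depend on your data and machine, which is exactly why profiling matters):

```python
import timeit

data = list(range(100_000))
needle = 54_321

# One-shot query that builds a set first: pays O(n) setup for a single O(1) lookup.
build_then_lookup = timeit.timeit(lambda: needle in set(data), number=100)

# One-shot query as a plain linear scan: no setup at all.
linear_scan = timeit.timeit(lambda: needle in data, number=100)

print(f"build set + lookup: {build_then_lookup:.4f}s")
print(f"linear scan:        {linear_scan:.4f}s")
```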

Luaan "In fact, due to the way modern CPUs are built, even something like binary search can be just as fast on sorted arrays as a linear search - profiling is a necessity." Interesting! Can you explain how binary search and linear search could take the same amount of time on a modern CPU?

Luaan - Never mind, I found this: schani.wordpress.com/2010/04/30/linear-vs-binary-search

DenisdeBernardy: No, actually it doesn't. They could be algorithms in P. And even if they weren't, under reasonable definitions of what it means to parallelize, that wouldn't imply P != NP either. Also remember that searching the space of possible runs of a non-deterministic Turing machine is quite parallelizable.

This answer could benefit from some examples...

One example I can think of: for a small sorted array, it'd be easier and more compact for the programmer to implement a binary search function than to write a complete hash map implementation and use it instead.
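
A sketch of that approach: with a small sorted list and the standard bisect module, the whole lookup is a few lines, with no hash-map machinery to write or maintain:

```python
import bisect

def contains(sorted_items, key):
    """O(log n) membership test on an already-sorted list."""
    i = bisect.bisect_left(sorted_items, key)
    return i < len(sorted_items) and sorted_items[i] == key

codes = [3, 8, 15, 21, 42, 77]  # small and already sorted
print(contains(codes, 21))  # True
print(contains(codes, 22))  # False
```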

An example of complexity: finding the median of an unsorted list is easy to do in O(n * log n) but hard to do in O(n).
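
Both routes in a short sketch: the sort-based median is trivial to write, while the average-case O(n) route needs a selection algorithm such as quickselect (shown here in its plain randomized form, not the more involved worst-case-linear median-of-medians):

```python
import random

def median_by_sorting(xs):
    """Sort and take the middle element: O(n log n), easy to get right."""
    return sorted(xs)[len(xs) // 2]

def quickselect(xs, k):
    """Return the k-th smallest element (0-based): O(n) on average."""
    pivot = random.choice(xs)
    lows = [x for x in xs if x < pivot]
    pivots = [x for x in xs if x == pivot]
    highs = [x for x in xs if x > pivot]
    if k < len(lows):
        return quickselect(lows, k)
    if k < len(lows) + len(pivots):
        return pivot
    return quickselect(highs, k - len(lows) - len(pivots))

def median_by_selection(xs):
    return quickselect(xs, len(xs) // 2)

data = [9, 1, 7, 3, 5, 8, 2]
print(median_by_sorting(data), median_by_selection(data))  # 5 5
```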

-1, don't put logs in your toaster... Kidding aside, this is spot on. lg n is so, so, so close to k for large n that most operations would never notice the difference.

There's also the fact that the algorithmic complexities most people are familiar with don't take cache effects into account. Looking up something in a binary tree is O(log2(n)) according to most people but in reality it's much worse because binary trees have bad locality.

Alistra addressed this indirectly when talking about "space concerns".

A huge number of cache misses only multiplies the final execution time by a constant factor (which is not bigger than 8 for a 4-core 3.2 GHz CPU with 1.6 GHz RAM, and usually it is much lower), so it is counted as a fixed constant in big-O notation. Thus the only thing the cache misses cause is moving the threshold of n where the O(n) solution starts to be slower than the O(1) solution.

MarianSpanik You are of course correct. But this question asked for a situation where we would prefer O(log n) over O(1). You could very easily imagine a situation where, for all your feasible n, the less memory-bound application would run in less wall-clock time, even at a higher complexity.

MarianSpanik Isn't a cache miss up to 300 clock cycles? Where is the 8 coming from?

While what you said is true, in that case, an algorithm executing in O(1) is by definition invulnerable to timing attacks.

JustinLessard: Being O(1) means that there is some input size after which the algorithm's runtime is bounded by a constant. What happens below this threshold is unknown. Also, the threshold may not even be met for any real-world use of the algorithm. The algorithm might be linear and thus leak information about the length of the input, for example.

The runtime might also fluctuate in different ways, while still being bounded. If the runtime is proportional to (n mod 5) + 1, it is still O(1), yet reveals information about n. So a more complex algorithm with smoother runtime may be preferable, even if it may be asymptotically (and possibly even in practice) slower.
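
A classic concrete case of this trade-off (my own example, not one from the thread): an early-exit comparison stops at the first mismatch, so its timing leaks how much of a secret matched, while a comparison that always touches every byte is slower yet reveals nothing through its timing. Python's standard library provides the latter as hmac.compare_digest:

```python
import hmac

def leaky_equals(secret, guess):
    """Returns at the first mismatch, so timing reveals the length of the matching prefix."""
    if len(secret) != len(guess):
        return False
    for a, b in zip(secret, guess):
        if a != b:
            return False
    return True

def constant_time_equals(secret, guess):
    """Does the same amount of work regardless of where the inputs differ."""
    return hmac.compare_digest(secret, guess)

print(leaky_equals(b"s3cret", b"s3cret"))          # True, but timing-leaky
print(constant_time_equals(b"s3cret", b"guess!"))  # False, with data-independent timing
```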

This is basically why bcrypt is considered good; it makes things slower.

DavidGrinberg That's the reason why bcrypt is used, and it fits the question. But that is unrelated to this answer, which talks about timing attacks.

I'm a little confused here. Are you talking about creating a 10-billion-entry array (most of which will be undefined) and treating the UPC as an index into that array?

DavidZ Yes. If you use a sparse array, you may not get O(1), but it will use only 1 MB of memory. If you use an actual array, you're guaranteed O(1) access, but it will use 1/3 TB of memory.
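
A rough sketch of the two options being contrasted (the UPC values and sizes here are made up for illustration): a dict keyed by UPC stores only the products you actually carry, while a dense array indexed by UPC would have to span the whole code space:

```python
# Sparse option: hash table keyed directly on the UPC.
# Average O(1), memory proportional to the number of products actually stored.
inventory = {
    36000291452: "paper towels",
    41789002394: "instant noodles",
    # ...thousands more entries...
}
print(inventory.get(36000291452, "unknown"))

# Dense option: upc_table[upc] gives guaranteed O(1) access, but the table
# must cover every possible code, e.g. 10**10 slots at ~32 bytes each.
slots, bytes_per_slot = 10**10, 32
print(f"dense table: ~{slots * bytes_per_slot / 10**12:.2f} TB")  # ~0.32 TB
```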

On a modern system, it will use 1/3 TB of address space, but that doesn't mean it will come anywhere close to that much allocated backing memory. Most modern OSes don't commit storage for allocations until they need to. When doing this, you're essentially hiding an associative lookup structure for your data inside the OS/hardware virtual memory system.

Novelocrat True, but if you're doing it at RAM speeds the lookup time won't matter, so there's no reason to use 40 MB instead of 1 MB. The array version only makes sense when storage access is expensive, i.e. you're going off to disk.

Or when this isn't a performance-critical operation, and developer time is expensive - saying malloc(search_space_size) and subscripting into what it returns is as easy as it gets.

But all that parallelization does is reduce the constant factor that others have talked about, right?

Yes, but a parallel algorithm can divide the constant factor by 2 every time you double the number of executing machines, whereas a single-threaded algorithm can only reduce the constant factor once, by a fixed amount. So with a parallel algorithm you can dynamically react to the size of n and be faster in wall-clock execution time.
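
A toy illustration of that, assuming an embarrassingly parallel workload (for something as cheap as summing, inter-process overhead may well dominate in practice, so treat this as a sketch of the principle rather than a benchmark):

```python
from multiprocessing import Pool

def chunk_sum(chunk):
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(10_000_000))
    workers = 4  # doubling this roughly halves the per-worker share of the O(n) work
    chunks = [data[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        print(sum(pool.map(chunk_sum, chunks)))
```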

I might add here that on a sequential-access drive or tape, the O(1) algorithm instead becomes O(n), which is why the sequential solution becomes more favorable. Many O(1) operations depend on addition and indexed lookup being constant-time operations, which they are not in a sequential-access space.

- When is it justified to use a O(2^n) algorithm?
- Is it sometimes better to write solution in O(n^2) than in O(n)?
- Why is it faster to process a sorted array than an unsorted array?
- Fast random weighted selection across all rows of a stochastic matrix
- Is an algorithm with a worst-case time complexity of O(n) always faster than an algorithm with a worst-case time complexity of O(n^2)?
- Big O(n logn) is not preferable over the O(n^2)
- Is there any practical application of Tango Trees?
- Quickly convert numpy arrays with index to dict of numpy arrays keyed on that index
- In which case(s) would an algorithm of complexity 2^n be used over a n^5?
