Are there any cases where you would prefer O(log n) time complexity to O(1) time complexity? Or O(n) to O(log n)?

Do you have any examples?


I would prefer an O(log n) algorithm to an O(1) algorithm if I understand the former, but not the latter...

There are tons of impractical data structures with O(1) operations from theoretical computer science. One example would be select() on bitvectors, which can be supported in o(n) extra space and O(1) per operation, using 5 layers of indirection. Simple binary search combined with O(1) rank() turns out to be faster in practice according to the author of the Succinct Data Structure Library.
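
To make the rank/select idea concrete, here is a minimal Python sketch (not the SDSL implementation; the prefix table uses O(n) words rather than the o(n) bits of the real structure) of select() answered by binary search over precomputed rank values:

```python
from bisect import bisect_left

class BitVector:
    def __init__(self, bits):
        self.bits = bits
        # prefix[k] = number of 1s in bits[:k], giving O(1) rank()
        self.prefix = [0]
        for b in bits:
            self.prefix.append(self.prefix[-1] + b)

    def rank(self, k):
        """Number of 1s among the first k bits."""
        return self.prefix[k]

    def select(self, i):
        """Index of the i-th 1 (1-based), by binary search over rank."""
        return bisect_left(self.prefix, i) - 1

bv = BitVector([0, 1, 1, 0, 1])
assert bv.rank(3) == 2      # bits[:3] contains two 1s
assert bv.select(3) == 4    # the third 1 sits at index 4
```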

Lower asymptotic complexity does not guarantee faster runtimes. Research matrix multiplication for a concrete example: the asymptotically fast algorithms (Coppersmith-Winograd and its successors) carry such enormous constant factors that they are never used in practice.

Also...any algorithm can be converted to O(1), given a sufficiently large table lookup ;)
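
In that spirit, a toy sketch of the lookup-table joke, assuming a bounded input domain (16-bit integers here): the per-query work becomes a single O(1) table read, with the real cost shifted into a one-off precomputation.

```python
# Precompute every answer for a bounded domain (16-bit ints here).
POPCOUNT = [bin(i).count("1") for i in range(1 << 16)]  # setup: 2^16 entries

def popcount16(x):
    """O(1) per query for any 16-bit value, by pure table lookup."""
    return POPCOUNT[x & 0xFFFF]

assert popcount16(0b1011) == 3
```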

Hoten -- That's assuming the table lookup is O(1), which isn't a given at all for the size of the tables you're talking about! :)

Also, I've seen a few times that people focus on the big-O of their central algorithm while ignoring the setup costs. Building a hash table, for example, can be more expensive than going through an array linearly if you don't need to do the lookup over and over again. In fact, due to the way modern CPUs are built, even something like binary search can be just as fast on sorted arrays as a linear search - profiling is a necessity.
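
A rough, hedged benchmark of that setup-cost point (the helper names are made up for illustration, and the numbers are machine-dependent; the point is to measure rather than assume):

```python
import timeit

data = list(range(1_000))
keys = [3, 500, 999]          # only a handful of lookups

def with_dict():
    table = {v: True for v in data}    # O(n) setup paid on every call
    return [k in table for k in keys]  # then O(1) per lookup

def with_scan():
    return [k in data for k in keys]   # O(n) per lookup, zero setup

print("dict:", timeit.timeit(with_dict, number=1_000))
print("scan:", timeit.timeit(with_scan, number=1_000))
```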

Luaan "In fact, due to the way modern CPUs are built, even something like binary search can be just as fast on sorted arrays as a linear search - profiling is a necessity." Interesting! Can you explain how binary search and linear search could take the same amount of time on a modern cpu?

Luaan - Never mind, I found this: schani.wordpress.com/2010/04/30/linear-vs-binary-search
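
For anyone who wants to try the linked comparison themselves, here is a small measuring harness sketch. Note that in Python the interpreter overhead dominates and bisect is implemented in C, so this mainly shows how to measure; the CPU-level effect the post describes is clearest in compiled code.

```python
import timeit
from bisect import bisect_left

arr = list(range(64))             # a small sorted array
targets = list(range(0, 64, 7))

def linear_scan():
    for t in targets:
        for v in arr:             # sequential, branch-predictor friendly
            if v >= t:
                break

def binary_search():
    for t in targets:
        bisect_left(arr, t)       # O(log n), less predictable accesses

print("linear:", timeit.timeit(linear_scan, number=10_000))
print("binary:", timeit.timeit(binary_search, number=10_000))
```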

DenisdeBernardy: No, actually it doesn't. They could be algorithms in P. And even if they weren't, under reasonable definitions of what it means to parallelize, that wouldn't imply P != NP either. Also remember that searching the space of possible runs of a non-deterministic Turing machine is quite parallelizable.

This answer could benefit from some examples...

One example I can think of: for a small sorted array, it'd be easier and more compact for the programmer to implement a binary search function than to write a complete hash map implementation and use it instead.
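
For reference, the kind of hand-rolled binary search that comment has in mind: a few lines against a small sorted array, instead of a full hash map implementation.

```python
def binary_search(arr, target):
    """Return the index of target in sorted arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

assert binary_search([2, 3, 5, 7, 11], 7) == 3
assert binary_search([2, 3, 5, 7, 11], 4) == -1
```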

An example of complexity: finding the median of an unsorted list is easy to do in O(n * log n) but hard to do in O(n).
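
Both halves of that median example in sketch form: the easy O(n log n) route via sorting, and an average-case O(n) quickselect (the worst-case O(n) guarantee requires the trickier median-of-medians algorithm):

```python
import random

def median_sorting(xs):
    return sorted(xs)[len(xs) // 2]      # O(n log n), trivially correct

def quickselect(xs, k):
    """k-th smallest element of xs, O(n) expected time."""
    pivot = random.choice(xs)
    lows   = [x for x in xs if x < pivot]
    highs  = [x for x in xs if x > pivot]
    pivots = [x for x in xs if x == pivot]
    if k < len(lows):
        return quickselect(lows, k)
    if k < len(lows) + len(pivots):
        return pivot
    return quickselect(highs, k - len(lows) - len(pivots))

xs = [7, 1, 9, 4, 3, 8, 2]
assert median_sorting(xs) == quickselect(xs, len(xs) // 2) == 4
```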

-1, don't put logs in your toaster... Kidding aside, this is spot on. lg n is so, so, so close to k for large n that most operations would never notice the difference.

There's also the fact that the algorithmic complexities most people are familiar with don't take cache effects into account. Looking up something in a binary tree is O(log2(n)) according to most people but in reality it's much worse because binary trees have bad locality.

Alistra addressed this indirectly when talking about "space concerns".

A huge number of cache misses only multiplies the final execution time by a constant factor (which is not bigger than 8 for a 4-core 3.2 GHz CPU with 1.6 GHz RAM, and usually much lower), so it is counted as a fixed constant in big-O notation. Thus the only thing cache misses cause is moving the threshold of n at which the O(n) solution starts to be slower than the O(1) solution.

MarianSpanik You are of course correct. But this question asked for a situation where we would prefer O(log n) over O(1). You could very easily imagine a situation where, for all feasible n, the less memory-bound application would run in faster wall time, even at a higher complexity.

MarianSpanik Isn't a cache miss up to 300 clock cycles? Where is the 8 coming from?

While what you said is true, in that case, an algorithm executing in O(1) is by definition invulnerable to timing attacks.

JustinLessard: Being O(1) means that there is some input size after which the algorithm's runtime is bounded by a constant. What happens below this threshold is unknown. Also, the threshold may not even be met for any real-world use of the algorithm. The algorithm might be linear and thus leak information about the length of the input, for example.

The runtime might also fluctuate in different ways, while still being bounded. If the runtime is proportional to (n mod 5) + 1, it is still O(1), yet reveals information about n. So a more complex algorithm with smoother runtime may be preferable, even if it may be asymptotically (and possibly even in practice) slower.
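
This is what "constant-time" means in the crypto sense, as opposed to big-O O(1): no data-dependent early exits. A minimal sketch; Python's stdlib ships the real thing as hmac.compare_digest.

```python
import hmac

def constant_time_eq(a: bytes, b: bytes) -> bool:
    if len(a) != len(b):          # note: this still leaks the length
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y             # accumulate differences, never exit early
    return diff == 0

assert constant_time_eq(b"secret", b"secret")
assert hmac.compare_digest(b"secret", b"secret")  # the stdlib equivalent
```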

This is basically why bcrypt is considered good; it makes things slower.
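
The same deliberately-slow idea, shown with the stdlib's PBKDF2 as a stand-in (bcrypt itself is a third-party package): the iteration count is a tunable work factor that makes every attacker guess proportionally more expensive, on purpose.

```python
import hashlib, os

salt = os.urandom(16)
# Raising the iteration count deliberately slows down each hash.
digest = hashlib.pbkdf2_hmac("sha256", b"hunter2", salt, 600_000)
print(digest.hex())
```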

DavidGrinberg That's the reason why bcrypt is used, and it fits the question. But that is unrelated to this answer, which talks about timing attacks.

I'm a little confused here. Are you talking about creating a 10-billion-entry array (most of which will be undefined) and treating the UPC as an index into that array?

DavidZ Yes. If you use a sparse array, you may not get O(1), but it will use only 1 MB of memory. If you use an actual array, you're guaranteed O(1) access, but it will use 1/3 TB of memory.
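
A toy Python version of that trade-off (the UPC keys are made up for illustration, and a dict plays the sparse array):

```python
# Sparse: a dict keyed by UPC -> small footprint, but O(1) only on average.
products = {
    3600029145: "Widget",     # hypothetical UPC keys
    4157005416: "Gadget",
}

def lookup(upc):
    return products.get(upc)  # average O(1); worst case can degrade

assert lookup(3600029145) == "Widget"

# Dense: index straight into a 10-billion-slot array, guaranteed O(1)
# but with the enormous footprint discussed above:
#   products_dense = [None] * 10_000_000_000
#   products_dense[3600029145] = "Widget"
```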

On a modern system, it will use 1/3 TB of address space, but that doesn't mean it will come anywhere close to that much allocated backing memory. Most modern OSes don't commit storage for allocations until they need to. When doing this, you're essentially hiding an associative lookup structure for your data inside the OS/hardware virtual memory system.

Novelocrat True, but if you're doing it at RAM speeds the lookup time won't matter; there's no reason to use 40 MB instead of 1 MB. The array version only makes sense when storage access is expensive and you're going off to disk.

Or when this isn't a performance-critical operation, and developer time is expensive - saying malloc(search_space_size) and subscripting into what it returns is as easy as it gets.

But all that parallelization does is reduce the constant factor that others have talked about, right?

Yes, but a parallel algorithm can halve the constant factor every time you double the number of executing machines, whereas a single-threaded algorithm can reduce the constant factor only once, by a fixed amount. So with a parallel algorithm you can react dynamically to the size of n and be faster in wall-clock execution time.
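
A sketch of that halving-the-constant effect using a process pool (the chunking is a naive even split, and real speedups depend on core count and overhead): the same O(n) sum, divided across workers, shrinks in wall time even though the asymptotic complexity is unchanged.

```python
from concurrent.futures import ProcessPoolExecutor

def partial_sum(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))

def parallel_sum(n, workers):
    step = n // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":  # guard required for process pools on spawn platforms
    assert parallel_sum(1_000_000, 4) == sum(range(1_000_000))
```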

I might add here that on a sequential-access drive or tape, the O(1) algorithm instead becomes O(n), which is why the sequential solution becomes more favorable. Many O(1) operations depend on addressing and indexed lookup being constant-time operations, which they are not on a sequential-access medium.
