标签云

微信群

扫码加入我们

WeChat QR Code

Yesterday I was pairing the socks from the clean laundry and figured out the way I was doing it is not very efficient. I was doing a naive search — picking one sock and "iterating" the pile in order to find its pair. This requires iterating over n/2 * n/4 = n2/8 socks on average.

As a computer scientist I was thinking what I could do? Sorting (according to size/color/...) of course came to mind to achieve an O(NlogN) solution.

Hashing or other not-in-place solutions are not an option, because I am not able to duplicate my socks (though it could be nice if I could).

So, the question is basically:

Given a pile of n pairs of socks, containing 2n elements (assume each sock has exactly one matching pair), what is the best way to pair them up efficiently with up to logarithmic extra space? (I believe I can remember that amount of info if needed.)

I will appreciate an answer that addresses the following aspects:

  • A general theoretical solution for a huge number of socks.
  • The actual number of socks is not that large, I don't believe my spouse and I have more than 30 pairs. (And it is fairly easy to distinguish between my socks and hers; can this be used as well?)
  • Is it equivalent to the element distinctness problem?


I use pigeon hole principle to pair exactly one from the laundry pile. I have 3 different colors of socks (Red,Blue and Green) and 2 pairs of each color. I pick up 4 number of socks each time and I always make up a pair and get to work.

2018年07月23日26分03秒

Yet another pigeon hole principle: if you take a subset of n/2 +1 socks, there must be at least one pair in this subset.

2018年07月23日26分03秒

Great question! You might be interested in my article on a related problem, which is a discussion of the probability of pulling two matched socks out of the pile: blogs.msdn.com/b/ericlippert/archive/2010/03/22/…

2018年07月23日26分03秒

Why not spawn a child and waitpid so that, as the parent, you're not even sorting any socks yourself?

2018年07月23日26分03秒

I solved this problem by only owning white knee-high socks. They all match. I could simply grab any two socks at random from the pile and they would match. I further simplify the problem by NOT pairing the socks. I have a sock drawer that I simply throw all my socks into, unpaired. I grab two at random from the drawer every morning. I've simplified it down to O(0). Can't get any simpler than that. :)

2018年07月23日26分03秒

This is exactly what I do! I make piles dependent on the style of the opening of the sock (I only have white), that gives me enough "buckets" to quickly match each of those up.

2018年07月23日26分03秒

I've tried this with my socks (I've got easily 30+ pairs) and man it is FAST. One problem I've found is when I can't have a good enough hash algorithm (I've got lots of white socks without any pattern) so it becomes hard. In that case, what would be the optimal way to do it?

1970年01月01日00分03秒

NothingsImpossible that's how hash collision attacks feel like for a poor web-server! Are the white socks distinguishable by some attribute? There must be something you can distribute them on. Otherwise, you could just form pairs arbitrarily.

2018年07月23日26分03秒

This is a Radix Sort, which I agree is the right answer. MarkPeters I don't think you need a lookup table. A single linear pass over the socks can convert the socks to number vectors, making the mapping of "sock segment" to bucket trivial. The socks can be tied to the vectors with string so that you don't need another linear pass at the end.

2018年07月23日26分03秒

A guy I went to college with actually had PairIDs. It was sewn on each pair of socks with thread: 1, 2, 3, 4...

2018年07月23日26分03秒

as the number of socks increases, human's SIMD become no better than a CPU.

2018年07月23日26分03秒

The best answer, IMO. While it's fun and clever (and appropriate for SO) to reduce a day-to-day problem to a computer algorithm, it makes much more sense to use the resolution power of man's eye/brain for a set as small as ~60 socks.

2018年07月23日26分03秒

LieRyan If the socks are uniformily distributed, you will end up noticing a pair in any sufficiently small set of socks due to the birthday paradox (unless you can distinguish colors to arbitrary precision, which I doubt) so the bottleneck here wouldn't be the human color matching algorithm but the spreading step.

2018年07月23日26分03秒

dpc.ucore.info No, because they have different woven cuff patterns, cuff lengths, overall lengths and shades of black (my wife would probably physically hurt me for that last one).

2018年07月23日26分03秒

You had better hope you have an even number of socks, otherwise you are going to be folding socks for a long time...

2018年07月23日26分03秒

> It might take even more time than sequential search, which requires quadratic time in theory. Yeah that is why I hate doing this, maybe I should throw away all my socks and start with case 1.

2018年07月23日26分03秒

I find it easier to have all the same socks as well. Every couple years I buy 10 of those 6 packs of socks when they are on sale and throw out all my old socks. Just easier to match identical socks and they look better then old holy socks. With this, just a simple first off the top of the pile is the fastest for me.

2018年07月23日26分03秒

the down side of having all identical socks is that they tend to age at different rates. So you still end up trying to match them based on how worn they are. (which is harder than simply matching by pattern)

2018年07月23日26分03秒

The problem with having 60 pairs of identical socks "because it makes pairing easier" is that it gives people the impression you work with computers.

2018年07月23日26分03秒

Case 1 is not constant time when there's an operation involved, such as folding pairs together. In this case, it's linear time with the smallest constant factor (the proof of which is left as an exercise for the reader). One can't possibly take the same time folding one pair and a bucket full of socks. However, it scales linearly. By Amdahl's law, it has unlimited speedup, ignoring overhead. By Gustafson's law, you can fold as many pairs as it takes to fold one pair given enough workers (the amount of which is left as an exercise for the reader), ignoring overhead.

2018年07月23日26分03秒

Upvote for 'non-algorithmic' answer. This is exactly what I do and it works wonderfully. The replacement issue is not a problem if you 'rotate' your sock stock by placing washed socks in back and pulling from the front of the drawer in the morning. All socks wear evenly. When I start noticing some wear on one, I put on the shopping list to completely replace that entire class of socks. For the old socks, I give the best 20% to Goodwill (tied in a grocery sac so they don't get mixed back in) and pitch the rest. You're not wasting socks, at this point, the 80% only have 6 months left anyway.

2018年07月23日26分03秒

BTW (1) Binding your socks results in the elastic one one being stored stretched and the will fail much more quickly. Limiting the kinds of unique socks you have makes binding unneded. (2) A disadvantage of limiting unique socks is that for people with certain fashion concerns, the method may be unsuitable.

2018年07月23日26分03秒

I came here specifically to post your "non-algorithmic" answer. As in true computer science, most people never pay enough attention to the data and its structure.

2018年07月23日26分03秒

I use this algorithmic approach every morning and it works like a charm! Additionally, I put worn out socks to a different pile to throw away later (unfortunately they manage to get to the original pile again before I find the time to trash it).

2018年07月23日26分03秒

«n packet of white and m packets of black. No need for other colors in everyday's life» A good standard rule for easy sock selection is actually that they should match either the color of your trousers or the color of your belt. For this reason, the most commonly used colors will likely be black, blue, gray and some brown. It's hard to believe one needs many white socks.

1970年01月01日00分07秒

This is also what I do, (note that if you simply leave spaces then inserts are also O(1) ), but it scales poorly with theoretically large numbers of socks.

2018年07月23日26分03秒

scales poorly with theoretically large numbers of types of socks

2018年07月23日26分03秒

StevenLu - as I said - it's n*n or nLogn, depending on whether you sort it or not. So it scales about as poorly as any sorting algorithm. If you want faster, number them and use radix sort.

2018年07月23日26分03秒

This is essentially storing found-but-not-matched socks in a hash-based lookup. With an ideal hash it is O(n), but if you've enough socks stored that the hash begins to degenerate, it becomes more complex accordingly.

2018年07月23日26分03秒

what value does inserting a sock between 2 other socks provide to the goal of pairing socks? there is no cardinality of socks. :-x

2018年07月23日26分03秒

A lifetime (assuming 75 years) supply of socks (assuming you exhaust 4 pairs/month, that makes 3600 pairs) would take up (assuming a new pair of socks takes up 20 cubic inches) a total of 1 1/2 cubic yards. That is an enormous amount of space. Assuming they deliver it to you in a box that is roughly a cube, that crate will be about 3 feet 4 inches on a side.

2018年07月23日26分03秒

AJMansfield valid concern. However, I disagree with a few of your numbers. I'd take a timespan of just 40 years (25...65) (time between not living at parents/dorm/etc and retiring, see above). Also, I think one pair takes more like 0,5x4x6 inches in original packaging. These numbers bring your space estime down quite a bit!

2018年07月23日26分03秒

Step 4 is unnecessarily wasteful, -1.

2018年07月23日26分03秒

Guide for others who might be confused by AJMansfield's measurements, a translation into metric: »would take up (assuming a new pair of socks takes up 327 cm³) a total of 1.14 m³. That is an enormous amount of space. Assuming they deliver it to you in a box that is roughly a cube, that crate will be about 1.04 m on a side.«

2018年07月23日26分03秒

Socks often come in 4-packs and larger, since that is cheaper, but that also makes them indistinguishable. To counter this, my wife sews a tiny mark onto each new pair of socks I buy. The mark is of a different color for each pair, or of a different shape, if she runs out of colors. With this approach you don't even need a limited set of attributes. Just sew a unique number on each pair. :) For extra points, use binary.

2018年07月23日26分03秒

Vilx- WHY?!? Isn't the whole point that they be indistinguishable?

2018年07月24日26分03秒

flup - I think the whole point is to sell in larger bundles. :) As for me, this helps to wear them down in pairs. Otherwise I can end up with three very worn socks and one brand new one. Kinda silly.

2018年07月23日26分03秒

I disagree with the calculation of O(n). What is $k$? $k$ is the number of attributes. I would argue $k$ is $O(log n)$ because it has to be enough to uniquely identify each pair. If you have 2 pairs (black and white), then color ($k=1, n=2$) is enough. If you have one pair of black, short; one pair of black, long; one pair of white, short; and one pair of white, long - then $k=2, n=4$. Then if we limit $k$, we at the same time limit $n$. If we are going to limit $n$ then order calculation does not make sense anymore.

2018年07月23日26分03秒

emory, I think that you're looking for the backtick, not the $ character, to make your stuff look code-y.

2018年07月23日26分03秒

Human optimisation: I'd argue that as a human, for step 2, you should plonk the socks down in roughly ascending order, then repeat with finer and finer granularity until sorted, a bit like shell sort. This would be much faster for a human (visual estimation) than a comparison-swap based approach.

2018年07月23日26分03秒

Ok, that's better, although it's still quite wrong ... this question is not about that. Whether or not the Church-Turing thesis is correct, both humans and our computers can sort socks. (The reality is that, humans, being highly finite entities, have far less computational power than Turing Machines ... and the same is true of our computers, but the limitations are different.)

2018年07月23日26分03秒

I disagree. Of course any of our current computers is essentially and enormous DFA (modulo i/o differences) rather than a TM. Any analog device, however, such as our bodies, is capable of emulating an infinite tape. We don't yet have a useful characterization of the way our minds compute.

2018年07月23日26分03秒

No infinite tape for humans or other physical devices because nothing in the human brain has infinite resolution, nor could it. It would also help to learn some neuroscience. In any case, there was no deep philosophical question here, regardless of your desire to inject one. But believe what you will ... this isn't the place for this sort of debate and I've had it too many times before. But I'm always amused by people who can barely solve the simplest problems (that's all of us) imagining that they are TM-equivalent.

2018年07月23日26分03秒

Tying socks (or any clothes) in a knot reduces the capability of the washer to wash the clothes, and makes untying them to wear much more difficult. Solution 2 makes maintenance more difficult the longer the state of affairs progresses; after 6 months, when you need two black ankle socks to wear with a pair of shorts and sneakers, 6 months of doing whatever works is going to make finding that pair in the same condition (dirty/clean, similar wear) much less likely. Solution 3 is less "asynchronous" and more straight-up "lazy"; do the minimum work you need exactly when you need to.

2018年07月23日26分03秒

This is probably pretty close to my mental process. I have an added layer of pre-sort optimization. My athletic socks get washed with the whites and my dress socks get washed with colors. This means that as long as I don't dump two loads of laundry together, my socks are already grouped by type. The white load goes really fast (many identical socks) but the dress socks take longer. Other key tip--make more available memory for the sort (fold and remove all non-socks first and THEN run the pairing algorithm)

2018年07月23日26分03秒

That is usually how I do it. Works much better than iterating through all the remaining socks each time.

2018年07月23日26分03秒

Nice approach and I think it can be applied to some real CS problems as well. Can you please add an example of such (a CS problem where we could use a similar approach to solve problems)? Also, how does this solution scales for millions of socks?

2018年07月23日26分03秒

I think this is basically th same as the other answer here, stackoverflow.com/a/14423956, from Jan 20. Both +1. Human vision system is massively parallel.

2018年07月23日26分03秒

This is the only valid answer. All the others disregard the fact that the most time is spent distinguishing between similar socks (so lumping them all together by physical appearance makes it even worse).

2018年07月23日26分03秒

For fun I wrote this method of piling socks up into a little python program gist.github.com/justinfay/53b574cf0a492f6795ef

2018年07月23日26分03秒

this is the same as stackoverflow.com/a/14423956 and stackoverflow.com/a/14468913 I think.

2018年07月23日26分03秒

WillNess Yep, with a bit more of explanation

2018年07月23日26分03秒

After I wrote this, I use it every time. It helped me to became a bit more efficient and the job is less boring now.

2018年07月23日26分03秒

It does not answer the question, because handling with already paired data is easy, the question is what to do when the data is UNPAIRED and you want to pair it.

2018年07月23日26分03秒

I didn't answer, just edited the answer. Please correct.

2018年07月23日26分03秒

This is what I do ... O(n)

2018年07月23日26分03秒

Pykler - It's O(n) in the best case and O(n*n) in the worst case.

2018年07月23日26分03秒

Thats assuming that you cannot create a fully unique hash in your mind of all the socks you already seen, which for me is a O(1) to match a sock that I have seen and previously and placed in the waiting for matching hash

2018年07月23日26分03秒

This approach has two other advantages: line-drying loses many fewer socks IME than the tumble dryer does, and the sort process can be extended to the rest of the laundry, so (e.g.) all towels are near each other to be folded off the line and binned and taken straight to their storage. It also works in two low-effort passes, putting the clothes up and taking them down again.

2018年07月23日26分03秒

Socks are only metaphor for pairing arbitrary objects in some database.

2018年07月23日26分03秒

Got it, didn't see that you are the author. If you wanted a generic solution, you should really have said so. Anyway, there is nothing wrong with taking any information you have into account, unless you have to come up with a general solution - giving up the reusability of the solution could result in considerably better performance. In this case, considering the use case and the available data base as a whole is be beneficial. However, this special answer to your special question has issues with similar-looking socks, e.g. black socks in different sizes, so it's not applicable in some cases.

2018年07月23日26分03秒

Also, you did not get >2k upvotes because you asked a question about pairing arbitrary objects in the database. You specifically constrained the question due to the very nature of socks (which you cannot duplicate, as opposed to data), you even encouraged to use the fact that you can easily distinguish your socks from the socks of your spouse. If you ask a question about socks, don't expect the answers to be about databases ;-)

2018年07月23日26分03秒

There are a few assumptions: a normal washing mashine, a, normal clothesline, and the fact that you toss both socks in the bin at the same time, which means that in most cases both socks are in the same machine, and the number of leftover socks to be sorted is therefore small. But since you really wanted an answer about storing arbitrary objects in the database, is it really useful discussing my solution any futher?

2018年07月23日26分03秒

As I said, I think that I addressed everything you asked for, except for the element distinctness problem, which has been answered by other people. I'm not trying to be a douche here, but I have put a lot of effort in this answer a while back, and am mildly disappointed that you now go through some of the answers and claim that they didn't answer the original question. Why don't you just leave the whole thread alone - it's still an interesting read, over 2 years after you asked it?

2018年07月23日26分03秒

How to do it not in-place, as specifically mentioned in the question?

2018年07月23日26分03秒

This does not answer the question, where the socks are only a metaphor.

2018年07月23日26分03秒

The question was how to pair the socks from an unpaired pile, not how to avoid needing to pair.

2018年07月23日26分03秒

This is still an O(n^2) algorithm since you have to iterate over each bucket whenever you pull out a new sock. But, considering the fact that even the socks bought within the same batch have minor differences which render them effectively pair-unique (or even single-unique), there's no better way anyway

2018年07月23日26分03秒

Agree, but my algorithm is assuming that human is doing the pairing. Therefore, there will be a kind-of cache in your mind when you are searching for the matching bucket, so you don't really need to iterate over the buckets anyway. Not sure what kind of data structure is built for this caching mechanism in my head during the pairing.

2018年07月23日26分03秒