How to pair socks from a pile efficiently?
Yesterday I was pairing the socks from the clean laundry, and figured out the way I was doing it is not very efficient. I was doing a naive search â€” picking one sock and "iterating" the pile in order to find its pair. This requires iterating over n/2 * n/4 = n2/8 socks on average.

As a computer scientist I was thinking what I could do? Sorting (according to size/color/...) of course came into mind to achieve an O(NlogN) solution.

Hashing or other not-in-place solutions are not an option, because I am not able to duplicate my socks (though it could be nice if I could).

So, the question is basically:

Given a pile of n pairs of socks, containing 2n elements (assume each sock has exactly one matching pair), what is the best way to pair them up efficiently with up to logarithmic extra space? (I believe I can remember that amount of info if needed.)

I will appreciate an answer that addresses the following aspects:
improve this question | comment
Quentin Jerde Created at: 2013-11-13 17:07:39 UTC By Quentin Jerde
I use pigeon hole principle to pair exactly one from the laundry pile. I have 3 different colors of socks (Red,Blue and Green) and 2 pairs of each color. I pick up 4 number of socks each time and I always make up a pair and get to work. - Ena Bins DVM
Yet another pigeon hole principle: if you take a subset of n/2 +1 socks, there must be at least one pair in this subset. - Trycia Eichmann
remark that if you had an infinite number of different symmetric pairs of socks (i.e. left=right), then, in fact, just picking out one of each is not possible unless you accept the axiom of choice (  What consequence does this have on pairing?  how is this relevant to computability?… - Jaquan Smith
Great question! You might be interested in my article on a related problem, which is a discussion of the probability of pulling two matched socks out of the pile:… - Ryan Dare
Stopped reading at "assume each sock has exactly one matching [pair]". Everyone knows this is an impossible assumption to make. - Alek Larson
28 Answers
We can use hashing to our leverage if you can abstract a single pair of socks as the key itself and its other pair as value.

Make two imaginary sections behind you on the floor, one for you and another for your spouse. 
Take one from the pile of socks.
Now place the socks on the floor, one by one, following the below rule.

Identify the socks as yours or hers and look at the relevant section on floor.
If you can spot the pair on the floor pick it up and knot them up or clip them up or do whatever you would do after you find a pair and place it in a basket (remove it from the floor).
Place it in the relevant section.

Repeat 3 until all socks are over from the pile.


Hashing and Abstraction

Abstraction is a very powerful concept that has been used to improve user experience (UX). Examples of abstraction in real life interactions with computers includes:

Folder icons used for navigation in a GUI (graphical user interface) to access an address instead of typing the actual address to navigate to a location.
GUI sliders used to control various levels like volume, document scroll position, etc..

  Hashing or other not-in-place solutions are not an option, because I am not able to duplicate my socks (though it could be nice if I could).

I believe the asker was thinking of applying hashing such that the slot to which either pair of sock goes should be known before placing them.

That's why I suggested abstracting a single sock that is placed on the floor as the hashkey itself (hence there is no need to duplicate the socks).

How to define our hashkey?

The below definition for our key would also work if there are more than one pair of similar socks. That is, let's say there are two pairs of black men socks PairA and PairB, and each socks are named PairA-L, PairA-R, PairB-L, PairB-R. So PairA-L can be paired with PairB-R, but PairA-L and PairB-L cannot be paired.

Let say any sock can be uniqly identified by,

Attribute[Gender] + Attribute[Colour] + Attribute[Material] + Attribute[Type1] + Attribute[Type2] + Attribute[Left_or_Right]

This is our first hash function. Let's use a short notation for this h1(G_C_M_T1_T2_LR). h1(x) is not our location key.

Another hash function eliminating the Left_or_Right attribute would be h2(G_C_M_T1_T2). This second function h2(x) is our location key! (for the space on the floor behind you).

To locate the slot, use h2(G_C_M_T1_T2).
Once the slot is located then use h1(x) to check their hashes. If they don't match, you have a pair. Else throw the sock into same slot.
NOTE: Since we remove a pair once we find one, it's safe to assume that there would only be at a maximum one slot with a unique h2(x) or h1(x) value.

In case we have each sock with exactly one matching pair then use h2(x) for finding the location and if no pair, a check is required, since it's safe to assume they are a pair.

Why is it important to lay the socks on the floor

Let's consider a scenario where the socks are stacked over one another in a pile (worst case). This means we would have no other choice but to do a linear search to find a pair.

Spreading them on the floor gives more visibility which improves the chance of spotting the matching sock (matching a haskey). 
When a sock was placed on the floor in step 3, our mind had subconciously registered the location. 
  - So in case this location is available in our memory we can directly find the matching pair.
  - In case the location is not remembered, don't worry, then we can always revert back to linear search.

Why is it important to remove the pair from the floor?

Short-term human memory works best when it has fewer items to remember. Thus increasing the probability of us resorting to hashing to spot the pair. 
It will also reduce the number of items to be searched through when linear searching for pair is being used.

Case 1: Worst case when Derpina cannot remember or spot the socks on the floor directly using the hashing technique. Derp does a linear search through the items on the floor. This is not worse than the iterating throught the pile to find the pair.
Upper bound for comparison: O(n^2).
Lower bound for comparison: (n/2). (when every other sock Derpina picks up is the pair of previous one).

Case 2: Derp remembers each the location of every sock that he placed on the floor and each sock has exactly one pair.
Upper bound for comparison: O(n/2).
Lower bound for comparison: O(n/2).

I am talking about comparison operations, picking the socks from the pile would necessarily be n number of operations. So a practical lower bound would be n iterations with n/2 comparisions.

Speeding up things

To achieve a perfect score so Derp gets O(n/2) comparisons, I would recommend Derpina to,

spend more time with the socks to get familiarize with it. Yes, that means spending more time with Derp's socks too.
Playing memory games like spot the pairs in a grid can improve short-term memory performance, which can be highly beneficial.
Is this equivalent to the element distinctness problem?

The method I suggested is one of the method used to solve element distinctness problem where you place them in hash table and do the comparison. 

Given your special case where there exist only one exact pair, it has become very much equivalent to the element distinct problem. Since we can even sort the socks and check adjacent socks for pairs (another solution for EDP). 

However, if there is a possibility of more than one pair that can exists for given sock then it deviates from EDP.
List<Sock> UnSearchedSocks = getAllSocks();
List<Sock> UnMatchedSocks = new list<Sock>();
List<PairOfSocks> PairedSocks = new list<PairOfSocks>();

foreach (Sock newSock in UnsearchedSocks)
  Sock MatchedSock = null;
  foreach(Sock UnmatchedSock in UnmatchedSocks)
    if (UnmatchedSock.isPairOf(newSock))
      MatchedSock = UnmatchedSock;
  if (MatchedSock != null)
    PairedSocks.Add(new PairOfSocks(MatchedSock, NewSock));

As architecture of human brain is completely different than of modern CPU this question makes no practical sense.

Humans can win over CPU algorithms using the fact that "finding a matching pair" can be one operation for a set that isn't too big.

My algorithm:

while (socks_left_on_a_surface()) {
     // thanks to human visual SIMD, this is one, quick operation
     pair = notice_any_matching_pair();

At least this is what I am using in real life and I find it very efficient. The downsize is it requires flat surface, but it's usually abundant.
Case 1: All socks are identical (this is what I do in real life by the way). 

Pick any two of them to make a pair. Constant time.

Case 2: There are a constant number of combinations (ownership, color, size, texture, etc.).

Use radix sort. This is only linear time since comparison is not required.

Case 3: The number of combinations is not known in advance (general case).

We have to do comparison to check whether two socks come in pair. Pick one of the O(n log n) comparison-based sorting algorithms.

However in real life when the number of socks is relatively small (constant), these theoretically optimal algorithms wouldn't work well. It might take even more time than sequential search, which theoretically requires quadratic time.
Non-algorithmic answer, yet "efficient" when I do it:

step 1) discard all your existing socks
step 2) go to Walmart and buy them by packets of 10 - n packet of
white and m packets of black. No need for other colors in everyday's
Yet times to times, I have to do this again (lost socks, damaged socks, etc.), and I hate to discard perfectly good socks too often (and I wished they kept selling the same socks reference!), so I recently took a different approach.

Algorithmic answer:

Consider than if you draw only one sock for the second stack of socks, as you are doing, your odds of finding the matching sock in a naive search is quite low.

So pick up five of them at random, and memorize their shape or their length.
Why five? Usually humans are good are remembering between five and seven different elements in the working memory - a bit like the human equivalent of a RPN stack - five is a safe default.

Pick up one from the stack of 2n-5.
Now look for a match (visual pattern matching - humans are good at that with a small stack) inside the five you drew, if you don't find one, then add that to your five.
Keep randomly picking socks from the stack and compare to your 5+1 socks for a match. As your stack grows, it will reduce your performance but raise your odds. Much faster.
Feel free to write down the formula to calculate how many samples you have to draw for a 50% odds of a match. IIRC it's an hypergeometric law.

I do that every morning and rarely need more than three draws - but I have n similar pairs (around 10, give or take the lost ones) of m shaped white socks. Now you can estimate the size of my stack of stocks :-)

BTW, I found that the sum of the transaction costs of sorting all the socks every time I needed a pair were far less than doing it once and binding the socks. A just-in-time works better because then you don't have to bind the socks, and there's also a diminishing marginal return (that is, you keep looking for that two or three socks that when somewhere in the laundry and that you need to finish matching your socks and you lose time on that).
What I do is that I pick up the first sock and put it down (say, on the edge of the laundry bowl). Then I pick up another sock and check to see if it's the same as the first sock. If it is, I remove them both. If it's not, I put it down next to the first sock. Then I pick up the third sock and compare that to the first two (if they're still there). Etc.

This approach can be fairly easily be implemented in an array, assuming that "removing" socks is an option. Actually, you don't even need to "remove" socks. If you don't need sorting of the socks (see below), then you can just move them around and end up with an array that has all the socks arranged in pairs in the array.

Assuming that the only operation for socks is to compare for equality, this algorithm is basically still an n2 algorithm, though I don't know about the average case (never learned to calculate that).

Sorting, of course improves efficiency, especially in real life where you can easily "insert" a sock between two other socks. In computing the same could be achieved by a tree, but that's extra space. And, of course, we're back at NlogN (or a bit more, if there are several socks that are the same by sorting criteria, but not from the same pair).

Other than that, I cannot think of anything, but this method does seem to be pretty efficient in real life. :)
The theoretical limit is O(n) because you need to touch each sock (unless some are already paired somehow).

You can achieve O(n) with radix sort. You just need to pick some attributes for the buckets. 

First you can choose (hers, mine) - split them into 2 piles,
then use colors (can have any order for the colors, e.g. alphabetically by color name) - split them into piles by color (remember to keep the initial order from step 1 for all socks in the same pile),
then length of the sock,
then texture,
If you can pick a limited number of attributes, but enough attributes that can uniquely identify each pair, you should be done in O(k * n), which is O(n) if we can consider k is limited. 
This is asking the wrong question. Right question to ask is, why am I spending time sorting socks? How much does it cost on yearly basis, when you value your free time for X monetary units of your choice? And more often than not, this is not just any free time, it's morning free time, which you could be spending in bed, or sipping your coffee, or leaving a bit early and not being caught in the traffic.

It's often good to take a step back, and think a way around the problem.

And there is a way!

Find sock you like. Take all relevant features into account: color in different lighting conditions, overall quality and durability, comfort in different climatic conditions, odor absorption. Also important is, they should not lose elasticity in storage, so natural fabrics are good, and they should be available in plastic wrapping.

It's better if there's no difference between left and right foot socks, but it's not critical. If socks are left-right symmetrical, finding a pair is O(1) operation, and sorting the socks is approximate O(M) operation, where M is amount of places in your house, which you have littered with socks, ideally some small constant number.

If you chose a fancy pair with different left and right sock, doing a full bucket sort to left and right foot buckets takes O(N+M), where N is amount of socks and M is same as above. Somebody else can give the formula for average iterations of finding first pair, but worst case for finding a pair with blind search is N/2+1, which becomes astronomically unlikely case for reasonable N. This can be sped up by using advanced image recognition algorithms and heuristics, when scanning the pile of unsorted socks with Mk1 Eyeball.

So, algorithm for achieving O(1) sock pairing efficiency (assuming symmetrical sock):

You need to estimate how many pairs of socks you will need for the rest of your life, or perhaps until you retire and move to warmer climes with no need to wear socks ever again. If you are young, you could also estimate how long it takes before we'll all have sock-sorting robots in our homes, and the whole problem becomes irrelevant.
You need to find out how you can order your selected sock in bulk, and how much it costs, and do they deliver.
Order the socks!
Get rid of your old socks.
Alternative step 3 would involve comparing costs of buying same amount of perhaps cheaper socks a few pairs at a time over the years, and adding the cost of sorting socks, but take my word for it: buying in bulk is cheaper! Also, socks in storage increase in value at the rate of sock price inflation, which is more than you would get on many investments. Then again there is also storage cost, but socks really do not take much space at the top shelf of a closet.

Problem solved. So, just get new socks, throw/donate your old ones away, and live happily ever after knowing you are saving money and time every day for the rest of your life.
As a practical solution:

Quickly make piles of easily distinguishable socks. (Say by color)
Quicksort every pile and use the length of the sock for comparison. As a human you can make a fairly quick decision which sock to use to partition that avoids worst case. (You can see multiple socks in parallel, use that to your advantage!)
Stop sorting piles when they reached a threshold at which you are comfortable to find spot pairs and unpairable socks instantly
If you have 1000 socks, with 8 colors and an average distribution, you can make 4 piles of each 125 socks in c*n time. With a threshold of 5 socks you can sort every pile in 6 runs. (Counting 2 seconds to throw a sock on the right pile it will take you little under 4 hours.)

If you have just 60 socks, 3 colors and 2 sort of socks (yours / your wife's) you can sort every pile of 10 socks in 1 runs (Again threshold = 5). (Counting 2 seconds it will take you 2 min).

The initial bucket sorting will speed up your process, because it divides your n socks into k buckets in c*n time so than you will only have to do c*n*log(k) work. (Not taking into account the threshold). So all in all you do about n*c*(1 + log(k)) work, where c is the time to throw a sock on a pile.

This approach will be favourable compared to any c*x*n + O(1) method roughly as long as log(k) < x - 1.

In computer science this can be helpful:
We have a collection of n things, an order on them (length) and also an equivalence relation (extra information, for example the color of socks). The equivalence relation allows us to make a partition of the original collection, and in every equivalence class our order is still maintained. The mapping of a thing to it's equivalence class can be done in O(1), so only O(n) is needed to assign each item to a class. Now we have used our extra information and can proceed in any manner to sort every class. The advantage is that the data sets are already significantly smaller.

The method can also be nested, if we have multiple equivalence relations -> make colour piles, than within every pile partition on texture, than sort on length. Any equivalence relation that creates a partition with more than 2 elements that have about even size will bring a speed improvement over sorting (provided we can directly assign a sock to its pile), and the sorting can happen very quickly on smaller data sets.
Here's an Omega(n log n) lower bound in comparison based model. (The only valid operation is comparing two socks.)

Suppose that you know that your 2n socks are arranged this way:

p1 p2 p3 ... pn pf(1) pf(2) ... pf(n)

where f is an unknown permutation of the set {1,2,...,n}. Knowing this cannot make the problem harder. There are n! possible outputs (matchings between first and second half), which means you need log(n!) = Omega(n log n) comparisons. This is obtainable by sorting.

Since you are interested in connections to element distinctness problem: proving the Omega(n log n) bound for element distinctness is harder, because the output is binary yes/no. Here, the output has to be a matching and the number of possible outputs suffices to get a decent bound. However, there's a variant connected to element distinctness. Suppose you are given 2n socks and wonder if they can be uniquely paired. You can get a reduction from ED by sending (a1, a2, ..., an) to (a1, a1, a2, a2, ..., an, an). (Parenthetically, the proof of hardness of ED is very interesting, via topology.)

I think that there should be an Omega(n2) bound for the original problem if you allow equality tests only. My intuition is: Consider a graph where you add an edge after a test, and argue that if the graph is not dense the output is not uniquely determined.
This question is actually deeply philosophical. At heart it's about whether the power of people to solve problems (the "wetware" of our brains) is equivalent to what can be accomplished by algorithms.

An obvious algorithm for sock sorting is:

Let N be the set of socks that are still unpaired, initially empty
for each sock s taken from the dryer
  if s matches a sock t in N
    remove t from N, bundle s and t together, and throw them in the basket
    add s to N

Now the computer science in this problem is all about the steps

"if s pairs with a sock t in N".  How quickly can we "remember" what we've seen so far?
"remove t from N" and "add s to N".  How expensive is keeping track of what we've seen so far?
Human beings will use various strategies to effect these. Human memory is associative, something like a hash table where feature sets of stored values are paired with the corresponding values themselves. For example, the concept of "red car" maps to all the red cars a person is capable of remembering. Someone with a perfect memory has a perfect mapping.  Most people are imperfect in this regard (and most others).  The associative map has a limited capacity. Mappings may bleep out of existence under various circumstances (one beer too many), be recorded in error ("I though her name was Betty, not Nettie"), or never be overwritten even though we observe that the truth has changed ("dad's car" evokes "orange Firebird" when we actually knew he'd traded that in for the red Camaro).

In the case of socks, perfect recall means looking at a sock s always produces the memory of its sibling t, including enough information (where it is on the ironing board) to locate t in constant time.  A person with photographic memory accomplishes both 1 and 2 in constant time without fail.

Someone with less than perfect memory might use a few commonsense equivalence classes based on features within his capability to track: size (papa, mama, baby), color (greenish, redish, etc.), pattern (argyle, plain, etc.), style (footie, knee-high, etc.).  So the ironing board would be divided into sections for the categories. This usually allows the category to be located in constant time by memory, but then a linear search through the category "bucket" is needed. 

Someone with no memory or imagination at all (sorry) will just keep the socks in one pile and do a linear search of the whole pile.

A neat freak might use numeric labels for pairs as someone suggested.  This opens the door to a total ordering, which allows the human to use exactly the same algorithms we might with a CPU: binary search, trees, hashes, etc.

So the "best" algorithm depends on the qualities of the wetware/hardware/software that is running it and our willingness to "cheat" by imposing a total order on pairs.  Certainly a "best" meta-algorithm is to hire the worlds best sock-sorter: a person or machine that can aquire and quickly store a huge set N of sock attribute sets in a 1-1 associative memory with constant time lookup, insert, and delete. Both people and machines like this can be procured. If you have one, you can pair all the socks in O(N) time for N pairs, which is optimal. The total order tags allow you to use standard hashing to get the same result with either a human or hardware computer.
Cost: Moving socks -> high, finding/search socks in line -> small

What we want to do is reduce the number of moves, and compensate with the number of searches. Also, we can utilize the multithreded environment of the Homo Sapiens to hold more things in the descision cache. 

X = Yours, Y = Your spouses

From pile A of all socks:

Pick two socks, place corresponding X sock in X line, and Y sock in Y line at next available position.

Do until A is empty.

For each line X and Y

Pick the first sock in line, search along the line until it finds the corresponding sock.
Put into the corresponding finished line of socks. 
Optional While you are searching the line and and the current sock you are looking at is identical to the previous, do step 2 for these socks.
Optionally to step one, you pick up two sock from that line instead of two, as the caching memory is large enough we can quickly identify if either sock matches the current one on the line you are observing. If you are fortunate enough to have three arms, you could possibly parse three socks at the same time given that the memory of the subject is large enough.

Do until both X and Y is empty.


However, as this have simillar complexity as selection sort, the time taken is far less due to the speeds of I/O(moving socks) and search(searching the line for a sock).
This is how I actually do it, for p pairs of socks (n = 2p individual socks):

Grab a sock at random from the pile.
For the first sock, or if all previously-chosen socks have been paired, simply place the sock into the first "slot" of an "array" of unpaired socks in front of you.
If you have one or more selected unpaired socks, check your current sock against all the unpaired socks in the array. 
It is possible to separate socks into general classes or types (white/black, ankle/crew, athletic/dress) when building your array, and "drill-down" to only compare like-for-like.
If you find an acceptable match, put both socks together and remove them from the array.
If you do not, put the current sock into the first open slot in the array.

Repeat with every sock.
The worst-case scenario of this scheme is that every pair of socks is different enough that it must be matched exactly, and that the first n/2 socks you pick are all different. This is your O(n2) scenario, and it's extremely unlikely. If the number of unique types of sock t is less than the number of pairs p = n/2, and the socks in each type are alike enough (usually in wear-related terms) that any sock of that type can be paired with any other, then as I inferred above, the maximum number of socks you will ever have to compare to is t, after which the next one you pull will match one of the unpaired socks. This scenario is much more likely in the average sock drawer than the worst-case, and reduces the worst-case complexity to O(n*t) where usually t << n.
In order to say how efficient it is to pair socks from a pile, we have to define the machine first, because the pairing isn't done whether by a turing nor by a random access machine, which are normally used as the basis for an algorithmic analysis.

The machine

The machine is an abstraction of a the real world element called human being. It is able to read from the environment via a pair of eyes. And our machine model is able to manipulate the environment by using 2 arms. Logical and arithmetic operations are calculated using our brain (hopefully ;-)).

We also have to consider the intrinsic runtime of the atomic operations that can be carried out with these instruments. Due to physical constraints, operations which are carried out by an arm or eye have non constant time complexity. This is because we can't move an endlessly large pile of socks with an arm nor can an eye see the top sock on an endlessly large pile of socks.

However mechanical physics give us some goodies as well. We are not limited to move at most one sock with an arm. We can move a whole couple of them at once. 

So depending on the previous analysis following operations should be used in descending order:

logical and arithmetic operations
environmental reads
environmental modifications
We can also make use of the fact that people only have a very limited amount of socks. So an environmental modification can involve all socks in the pile.

The algorithm

So here is my suggestion:

Spread all socks in the pile over the floor.
Find a pair by looking at the socks on the floor.
Repeat from 2 until no pair can be made.
Repeat from 1 until there are no socks on the floor.
Operation 4 is necessary, because when spreading socks over the floor some socks may hide others. Here is the analysis of the algorithm:

The analysis

The algorithm terminates with high probability. This is due to the fact that one is unable to find pairs of socks in step number 2.

For the following runtime analysis of pairing n pairs of socks, we suppose that at least half of the 2n socks aren't hidden after step 1. So in the average case we can find n/2 pairs. This means that the loop is step 4 is executed O(log n) times. Step 2 is executed O(n^2) times. So we can conclude:

The algorithm involves O(ln n + n) environmental modifications (step 1 O(ln n) plus picking every pair of sock from the floor)
The algorithm involves O(n^2) environmental reads from step 2
The algorithm involves O(n^2) logical and arithmetic operations for comparing a sock with another in step 2
So we have a total runtime complexity of O(r*n^2 + w*(ln n + n)) where r and w are the factors for environmental read and environmental write operations respectively for a reasonable amount of socks. The cost of the logical and arithmetical operations are omitted, because we suppose that it takes a constant amount of logical and arithmetical operations to decide whether 2 socks belong to the same pair. This may not be feasible in every scenario.
Sorting solutions have been proposed but sorting is a little too much: We don't need order, we just need equality groups.

So hashing would be enough (and faster).

For each color of socks, form a pile. Iterate over all socks in your input basket and distribute them onto the color piles.
Iterate over each pile and distribute it by some other metric (e.g. pattern) into a second set of piles
Recursively apply this scheme until you have distributed all socks onto very small piles that you can visually process immediately
This kind of recursive hash partitioning is actually being done by SQL Server when it needs to hash join or hash aggregate over huge data sets. It distributes its build input stream into many partitions which are independent. This scheme scales to arbitrary amounts of data and multiple CPUs linearly.

You don't need recursive partitioning if you can find a distribution key (hash key) that provides enough buckets that each bucket is small enough to be processed very quickly. Unfortunately, I don't think socks have such a property.

If each sock had an integer called "PairID" one could easily distribute them into 10 buckets according to PairID % 10 (the last digit).

The best real-world partitioning I can think of is creating a rectangle of piles: one dimension is color, the other is pattern. Why a rectangle? Because we need O(1) random-access to piles. (A 3D cuboid would also work, but that is not very practical.)


What about parallelism? Can multiple humans match the socks faster?

The simplest parallization strategy is to have multiple workers take from the input basket and put the socks onto the piles. This only scales up so much - imagine 100 people fighting over 10 piles. The synchronization costs (manifesting themselves as hand-collisions and human communication) destroy efficiency and speed-up (see the Universal Scalability Law!).
It scales nearly indefinitely if each worker has its own set of piles. Workers can then take from the input basket big chunks of socks (very little contention as they are doing it rarely) and they do not need to sync when distributing the socks at all (because they have thread-local piles). At the end all workers need to union their pile-sets. I believe that can be done in O(log (worker count * piles per worker)) if the workers form an aggregation tree.
What about the element distinctness problem? As the article states, the element distinctness problem can be solved in O(N). This is the same for the socks problem (also O(N), if you need only one distribution step (I proposed multiple steps only because humans are bad at calculations - one step is enough if you distribute on md5(color, length, pattern, ...), i.e. a perfect hash of all attributes)).

Clearly, one cannot go faster than O(N) so we have reached the optimal lower bound.

Although the outputs are not exactly the same (in one case, just a boolean. In the other case, the pairs of socks), the asymptotic complexities are the same.
From your question it is clear you don't have much actual experience with laundry :). You need an algorithm that works well with a small number of non-pairable socks.

The answers till now don't make good use of our human pattern recognition capabilities. The game of Set provides a clue of how to do this well: put all socks in a two-dimensional space so you can both recognize them well and easily reach them with your hands. This limits you to an area of about 120 * 80 cm or so. From there select the pairs you recognize and remove them. Put extra socks in the free space and repeat. If you wash for people with easily recognizable socks (small kids come to mind), you can do a radix sort by selecting those socks first. This algorithm works well only when the  number of single socks is low
Whenever you pick up a sock, put it in one place.  Then the next sock you pick up, if it doesn't match the first sock, set it beside the first one.  If it does, there's a pair.  This way it doesn't really matter how many combinations there are, and there are only two possibilities for each sock you pick up -- either it has a match that's already in your array of socks, or it doesn't, which means you add it to a place in the array.

This also means that you will almost certainly never have all your socks in the array, because socks will get removed as they're matched.
Consider a hash-table of  size  'N'.

If we assume normal distribution, then the estimated number of 'insertions'  to have atleast one sock mapped to one bucket is NlogN  (ie, all buckets are full) 

I had derived this as a part of another puzzle,but I would be happy to be proven wrong. 
Here's my blog article on the same   

Let 'N' correspond to an approximate upper-bound on the number of  number of unique colors/pattern of socks that you have.

Once you have a collision(a.k.a  : a match) simply remove that pair of socks. 
 Repeat the same experiment with the next batch of NlogN socks. 
The beauty of it is that you could be making NlogN parallel comparisons(collision-resolution) because of the way the human mind works.  :-)
Socks, whether real ones or some analogous data structure, would be supplied in pairs. 

The simplest answer is prior to allowing the pair to be separated, a single data structure for the pair should have been initialized that contained a pointer to the left and right sock, thus enabling socks to be referred to directly or via their pair. A sock may also be extended to contain a pointer to its partner.

This solves any computational pairing problem by removing it with a layer of abstraction.

Applying the same idea to the practical problem of pairing socks, the apparent answer is: don't allow your socks to ever be unpaired. Socks are provided as a pair, put in the drawer as a pair (perhaps by balling them together), worn as a pair. But the point where unpairing is possible is in the washer, so all that's required is a physical mechanism that allows the socks to stay together and be washed efficiently. 

There are two physical possibilities:

For a 'pair' object that keeps a pointer to each sock we could have a cloth bag that we use to keep the socks together. This seems like massive overhead.

But for each sock to keep a reference to the other, there is a neat solution: a popper (or a 'snap button' if you're American), such as these:

Then all you do is snap your socks together right after you take them off and put them in your washing basket, and again you've removed the problem of needing to pair your socks with a physical abstraction of the 'pair' concept.
I came out with another solution which would not promise less operations, neither less time consumption but should be tried to see if it can be good enough heuristic to provide less time consumption in huge series of sock pairing.

There is no guarantee that there are the same socks. If they are of the same color it doesn't mean they have the same size or pattern. Socks are randomly shuffled. There can be odd number of socks(some are missing, we don't know how many). Prepare to remember a variable "index" and set it to 0.

The result will have one or two piles: 1. "matched" and 2. "missing"


Find most distinctive sock.
Find its match.
If there is no match put it on the "missing" pile.
Repeat from 1. until there are no more most distinctive socks.
If there are less then 6 socks, go to 11.
Pair blindly all socks to its neighbor (do not pack it)
Find all matched pairs, pack it and move packed pairs to "matched" pile; 
If there were no new matches - increment "index" by 1
If "index" is greater then 2 ( this could be value dependent on sock 
number because with greater number of socks there are less chance to
pair them blindly ) go to 11
Shuffle the rest 
Go to 1
Forget "index"
Pick a sock
Find its pair
If there is no pair for the sock move it to the "missing" pile
If match found pair it, pack pair and move it to the "matched" pile
If there are still more then one socks go to 12
If there is just one left go to 14
Smile satisfied :)
Also, there could be added check for damaged socks also, as if the removal of those. It could be inserted between 2 and 3, and between 13 and 14 

I have never tried this and looking forward to hear about any experiences or corrections.
Real world approach:

As rapidly as possible, remove socks from the unsorted pile one at a time and place in piles in front of you. The piles should be arranged somewhat space-efficiently, with all socks pointing the same direction; the number of piles is limited by the distance you can easily reach. The selection of a pile on which to put a sock should be -- as rapidly as possible -- putting a sock on a pile of apparently like socks; the occasional type I (putting a sock on a pile it doesn't belong to) or type II (putting a sock in its own pile when there's an existing pile of like socks) error can be tolerated -- the most important consideration is speed. Once all the sock are in piles, rapidly go through the multi-sock piles creating pairs and removing them (these are heading for the drawer). If there are non-matching socks in the pile, re-pile them to their best (within the as-fast-as-possible constraint) pile. When all the multi-sock piles have been processed, match up remaining pairable socks that weren't paired due to type II errors. Whoosh, you're done -- and I have a lot of socks and don't wash them until a large fraction are dirty. Another practical note: I flip the top of one of a pair of socks down over the other, taking advantage of their elastic properties, so they stay together while being transported to the drawer and while in the drawer.
I have taken simple steps to reduce my effort into a process taking O(1) time.

By reducing my inputs to one of two types of socks (white socks for recreation, black socks for work), I only need to determine which of two socks I have in hand. (Technically, since they are never washed together, I have reduced the process to O(0) time)

Some upfront effort is required to find desirable socks, and to purchase in sufficient quantity as to eliminate need for your existing socks. As I'd done this before my need for black socks, my effort was minimal, but mileage may vary.

Such an upfront effort has been seen many times in very popular and effective code. Examples include #DEFINE'ing pi to several decimals (other examples exist, but that's the one that comes to mind right now).
Create a hash table which will be used for unmatched socks, using the pattern as the hash. Iterate over the socks one by one. If the sock has a pattern match in the hash table, take the sock out of the table and make a pair. If the sock does not have a match, put it into the table.
What about doing some preprocess?. I would stitch a mark or id number in every sock in a way that every pair has the same mark/id number. This process might be done every time you buy a new pair of socks and if you ask your mom/grandma, maybe it wouldn't cost any effort. Then, you could do radix sort to get O(n) total cost. Find a place for every mark/id number and just pick all the socks one by one and put them into the correct place. 
I tought about this very often during my PhD (in Computer Sciences). I came up with multiple solutions, depending on the ability of distinguishing socks and thus find correct pairs as fast as possible.

Suppose the cost of looking at socks and memorizing their distinctive patterns is negligible (ε). Then the best solution is simply to throw all socks on a table. This involves those steps:

Throw all socks on a table (1) and create a hashmap {pattern: position} (ε)
While there are remaining socks (n/2):
Pick up one random sock (1)
Find position of corresponding sock (ε)
Retrieve sock (1) and store pair

This is indeed the fastest possibility and is executed in n + 1 = O(n) complexity. But it supposes that you perfectly remember all patterns... In practice, this is not the case, and my personal experience is that you sometimes don't find the matching pair at first attempt:

Throw all socks on a table (1)
While there are remaining socks (n/2):
Pick up one random sock (1)
while it is not paired (1/P):
Find sock with similar pattern
Take sock and compare both (1)
If ok, store pair

This now depends on our ability to find matching pairs. This is particularly true if you have dark/grey pairs or white sport socks that often have very similar patterns! Let's admit that you have a probability of P of finding the corresponding sock. You'll need, on average, 1/P tries before finding the corresponding sock to form a pair. The overall complexity is 1 + (n/2) * (1 + 1/P) = O(n).

Both are linear in the number of socks, and are very similar solutions. Let's slightly modify the problem and admit you have multiple pairs of similar socks in the set, and that it is easy to store multiple pairs of socks in one move (1+ε). For K distinct patterns, you may implement:

For each sock (n):
Pick up one random sock (1)
Put it on its pattern's cluster

For each cluster (K):
Take cluster and store pairs of socks (1+ε)

The overall complexity becomes n+K = O(n). Still linear, but choosing correct algorithm may now greatly depend on the values of P and K! But one may object again that you may have difficulties to find (or create) cluster for each ssock.

Besides, you may also loose time by looking on websites what is the best algorithm and proposing your own solution :)
When i sort socks, i do an approximate radix sort, dropping socks near other socks of the same colour/pattern type. Except in the case when i can see an exact match at/near the location i'm about to drop the sock i extract the pair at that point.

Almost all the above algorithms (including the top scoring answer by Peter Mortensen) sort, then remove pairs. I find that, as a human, It is better to minimize the number of socks being considered at one time. 

I do this by:

Pick a distinctive sock (whatever catches my eye first in the pile).
Start a radix sort from that conceptual location by pulling socks from the pile based on similarity to that one. 
Place the new sock near into the current pile, with a distance based on how different it is. If you find yourself putting the sock on top of another because it is identical, form the pair there, and remove them. This means that future comparisons take less effort to find the correct place.
This takes advantage of the human ability to fuzzy-match in O(1) time, which is somewhat equivalent to the establishment of a hash-map on a computing device.

by pulling the distinctive socks first, you leave space to "zoom" in on the features which are less distinctive to begin with.

After eliminating the fluro coloured, the socks with stripes, and the 3 pairs of long socks, you might end up with mostly white socks roughly sorted by how worn they are.

At some point the differences between socks are small enough that other people won't notice the difference, and any further matching effort is not needed.
I've finished to pair my socks just right now, and I found that the best way to do it is the following:
- choose one of the socks and put it away (create a 'bucket' for that pair)
- if the next one is the pair of the previous one, than put it to the existing bucket, otherwise create a new one

in the worst case it means that you will have n/2 different buckets, and you will have n-2 determinations about that which bucket contains the pair of the current sock. Obviously this algorithm works well if you have just a few pairs, I did it with 12 pairs.

Not so scientific, but works well:)
The problem of sorting your n pairs of socks is O(n). Before you throw them in the laundry basket, you thread the left one to the right one. On taking them out, you cut the thread and put each pair into your drawer - 2 operations on n pairs, so O(n).

Now the next question is simply whether you do your own laundry and your wife does hers. That is a problem likely in an entirely different domain of problems. :)
Your Answer