Thoughts On Randomizing
One night not long ago my subconscious kicked me awake with an apparently pressing problem regarding choosing a number from one to sixty-four, inclusive. I'm not sure why sixty-four was the number. Perhaps it had to do with picking any square a chess board or other eight-by-eight grids. Maybe that there are eight bits in a byte and my unconscious self needed to figure out how to pick just one bit in a set of eight bytes. My sleeping brain had apparently been reeling about with this problem for some time and decided it needed some conscious thought to confirm some doubts.
I'm sure smarter people than me have worked on this problem with much more finesse and even experimentation, but just to peek inside my head a little bit, here's what I recall waking up to and fighting my brain to stop working out so I could resume my sleeping.
Of course, my brain had surmised, the most accurate way is to have sixty-four objects and choose one at random. This works best if the objects are just as likely to be chosen as any other, like being equal in shape and size and weight, and probably marked with the numbers. Like the ping-pong balls of a lottery, stones or other chits in a bag, or even papers in a hat. Allowing for a significant shuffling or stirring, it should be the case that a fairly even distribution of numbers will result. Of course, in a small enough sample it could be that one number could be selected more frequently, but that's just chance.
The “pick one” method has a downside as sets get larger. Sixty-four isn't that large, if the tiles being chosen are the size of Scrabble tiles or marbles, but if the tiles were the size of golf balls or baseballs there'd need to be lots of space to spread them out, as it'd be unrealistic to get a fair dig deep into a container. Also, if the tiles were small but there were more than hundreds of them, there'd be a similar problem “getting in there.”
Thinking back to the chess board, my sleeping brain considered that it seems like a fair way to randomly choose a number. Put numbers on the squares in the grid and cast an object onto the grid until the object is clearly in just one square. The numbers could be rearranged periodically, or even at each iteration, adding to the randomness.
One problem is that it seems like there might be a skewing of the position where the object lands, as there might be a tendency for the object to bounce off the edges toward the middle; this results in giving the numbers in the middle a greater chance of being used. It could be argued either way that the edges are given fair or unfair advantage as hitting the edge could prematurely stop the object on the edge squares, or cause it to bounce further. Certainly the corners would be given either an unfair advantage as striking both edges might cause enough energy loss for the object to remain nearby, or an unfair disadvantage as the additional bouncing might compel the object away.
Another problem is that it seems like it'd be possible for a person to “aim” a shot toward a number. To combat this, it might be better to have the board spin while the throw is made. This introduces difficulty as it could skew the results by putting some centrifugal force on the object, tending the results away from the center.
Spinning a board leads the sleepy brain toward something more like a roulette wheel, where sixty-four evenly-sized slices are laid out, perhaps with pockets in the ends, and the wheel is spun and a ball is sent against the spin, perhaps with some obstacles to encourage chaotic bouncing. Eventually the ball will come to rest in one of the pockets. As long as enough spin is provided to the wheel, the speed of rotation varies between spins, the ball is released at different times and velocities, and obstacles provide enough chaos, there should be enough variation to curb any unfair tendencies toward particular spots on the wheel. As long as all of that happens, that is, otherwise it should be possible to time things and “aim” for a spot on the wheel, too.
The numbers on the wheel could be spread in order or rearranged periodically, so this doesn't seem like a terrible way to do it. Works for gambling, though. Of course, casino roulette wheels have been configured to give slight advantage to the house, mostly in terms of tilting the winnings awarded for each bet in the favor of the house, but at least the chance of landing on any given spot is fairly random.
A one-piece option would be a spinning wheel with a pointer indicating into which slice contains the selected number. Commonly called “prize wheels,” these take the ball and obstacles out of the solution and rely solely on the random starting positions and velocity of the spin. Of course, numbers could be rearranged to add to the randomness and help defeat “aiming” toward a particular slice.
Really, the spinning wheel is just another mechanism of choosing from a fixed set. It has a benefit of removing some of the difficulties that having a bag of chits might have, like ensuring a fair stirring or count of the chits. If it has any drawbacks it would be that the wheel would have to be kind of large and the process would be kind of slow.
My sleepy brain spent a little time thinking of coin-flips. Heads or tails is binary. In binary the decimal 64 is represented as 1000000. Since zero isn't an allowed choice, using the binary 111111, which is decimal 63. Zero to sixty-three is sixty-four available numbers, and adding one to the result gives a number from 1 to 64. A coin could be flipped six times, each head or tail would be marked as a zero or one, as would be decided in advance. On the surface, this seems pretty random.
One could take out any skilled coin flipping trickery by using six coins at once, perhaps shaken in a container; marked in advance for position so there'd be no arguing which was the fourth or sixth bit, or whatever.
The problem with coin flipping is that there's an apparent opportunity of an uneven distribution of the numbers. How often would one flip all heads or all tails six times in a row? That's what it would take to reach the number one (or sixty-four). What if only one tails was flipped? That could be one of six values, depending on which order it was flipped.
Simply counting the ones in the binary numbers from zero to sixty-three gives us our six bits, but there's a small bell curve for the numbers of bits used in any given number.
0: 1 1: 6 2: 15 3: 20 4: 15 5: 6 6: 1
This seems like it might be more likely to land on a split three-and-three flip than on a six-of-one flip, but that really isn't the case.
The real tricky part to understanding this is remembering that each coin toss is independent. That means that it doesn't matter what the results were of any previous flip or flips. Since there's a fair 50% chance of a fairly flipped-coin landing either heads or tails (that means cheating not withstanding), there's a 1-4 chance that on two flips, both of them will be the heads or tails while there's a 2-4 chance that they'll be split one-side each; but really that means there's a ¼ chance of head-and-tail, and a ¼ chance of tail-and-head. Since this is represented by multiplying ½ by the number of flips, which in our case is six. So that's 1/2 x 1/2 x 1/2 x 1/2 x 1/2 x 1/2 or 1/64. Wait...that's what we want: a one-in-sixty-four chance of getting any particular combination.
Really there are only sixty-four possible outcomes of flipping a coin six times and treating it as a binary representation. This is a fine way of getting a random number between one and sixty four, but it could be a bit time consuming, and requires (or develops) some proficiency in converting binary to decimal. Some of us can do this in our heads, some will need paper or fingers, and still more will need some kind of calculator.
Turning to computers, which work in binary, there's a little potential for trouble. Using the computer for converting the coin flips into a decimal value would be quick, as the computer would be very fast and accurate at converting binary to decimal, but using the computer to select the random number wouldn't be quite as fair as computer random number generation is usually pseudo-randomly generated. This means that there's an underlying formula which can lead to patterns. They are usually “seeded,” or given a starting-point, which means that their randomness can be repeated, by providing the same seed. These formulas try really hard to work out random numbers, sometimes even to the point of trying to exclude previously chosen numbers, which removes some randomness from the process.
There are some ways to get more “truly” random numbers from other large sets of data. One could use an image and take the bytes apart. The image can be selected “randomly,” by pulling from whatever's on the front page of a set of news sites, or even on the front of a frequently rotating image site like flikr.com's explore page. A problem with images is that there tend to be swaths of the same or similar colors, so the true nature of random bits might require multiple images.
One site on the Internet, Random.org (conveniently http://random.org), uses atmospheric noise to come up with random numbers. What's more random than wind blowing across a microphone or whatever kind of sensor they use, right? They turn the noise into numbers in their database, and then pull the numbers out as people ask for them, which is kind of random since the order of users and the number of bits they might request is pretty random. They even offer an API for accessing their random numbers for use in other applications. Their site offers a selection of ways to request random numbers.
Here, for example, is a pull of sixty-four random numbers between one and sixty-four from their random integer generator.
58 47 3 50 22 40 19 30 31 18 39 45 27 29 46 11 45 3 31 64 34 38 39 42 24 56 15 55 38 57 3 63 61 44 60 2 34 46 9 32 4 38 14 63 20 62 59 52 28 63 16 17 63 10 26 25 5 64 52 9 56 51 3 42
It would take a larger set to see if the distribution is even, but this looks pretty random with a small number of repeats, with a few pairs and a few less numbers with more matches, and even a few missing numbers. There are, in this small set, no 1s, but there are one 2, four 3s, two 64s, and four 63s, so it seems that hitting the edges of our range isn't so hard. There are other missing digits and duplicates, so it seems pretty random.
For another example, here are sixty-four eight-bit numbers in binary from the same website.
11101011 01001100 00110000 01001101 00110110 01110110 11111111 01000111 11111000 11000010 01001000 11011011 01101100 01110111 01101000 00011011 01100010 00111000 01010110 10111111 00001100 11111110 00001100 10101011 01110101 01111011 00000110 10010100 10000110 10001001 01011110 01010100 00000110 11101111 00001110 00001011 11010011 10110001 00101010 01011011 00010101 10010110 00000001 10011000 11011011 11000011 10100001 10010110 00111101 10100101 10010010 01110010 00010001 11000111 11100001 10001000 11010100 11000011 10101011 00111100 10011010 00111011 01000011 10110100
We can see a fairly even spread of zeros and ones. There are short spans of repeated digits, and a few longer stretches. And even a few stretches of alternating digits. Ignore the two left-most digits of each number, for ease, and what is left is a range from zero to sixty-three.
There are two numbers with all ones, but none with all zeros. There are only eight numbers with only one 1 or one 0, but there are twenty-eight with two 1s or two 0s. That leaves twenty-four with a split of three 1s and 0s. This shows there is the bit distribution curve mentioned above, but as with the coin flipping, this doesn't mean there's any more or less chance of choosing a number from one to sixty-four.
Tying into a huge dataset to find “truly” random numbers can add complexity. Additionally, there may be situations where using a computer is not acceptable or possible. There might be times where using a computer might be fine, but Internet access is unavailable. A solution needed to be had where the math wouldn't be so much of a problem and the portability of the solution lends itself into selecting a random number from one to sixty-four easily.
My tired brain tried to work out how to do this with dice. Cubic, six-sided dice are easy to come by. Almost any comic, game, or book store also will sell six, eight, ten, twelve, and twenty-sided dice. I've got at least a handful of each from my board and role-playing game days.
It works out well to use a pair of ten- or twenty-sided dice to roll an even distribution of random numbers from zero to nine, or to ninety-nines and beyond. By treating 10 as zero, or adding one to your end result, you can get from one to one hundred, or even a thousand or million. Simply pick a die to represent each digit position and roll (or roll one die in order). The distribution of digits zero to nine is even no matter the size of number needed. As with the coins, the likelihood of any number coming up is one-in-however-many-numbers-the-die-has. For a ten-sided die (or only the digits on a twenty-sided die), this is one-in-ten for any digit. Rolling two times (or two dice) for a two-digit number is 1/10 x 1/10, or 1/100 for any given number. Just as random as we need.
This works, as long as you need all of the digits. But for one to sixty-four, we need to somehow just have sixty-four things to choose from. We can't use the decimal dice trick and just ignore some, unless we keep rolling until we get numbers in our range. That's a loss of an easy third of the rolls, and whatever impact one can imagine to the randomness of each die.
Obviously a 64-sided die would be the answer, but that's a pretty big demand. There isn't a simple cubic, dodecahedron, or icosahedron (a six- or twelve- or twenty-sided) shape after which a die could be modeled. I suppose someone could make something like the eight-sided die, which looks like a couple of pyramids stuck together at their four-sided bases, but with a 32-sided conic shape on each end. The die would need to be pretty large to make sure the flat part was enough to be certain which number was face-up. There might be a better design than the dual cones that a person with more time to work out the geometry could come up with. Since there isn't such a die now (or at least not at your local game shop), it isn't a realistic offer to the solution.
My thoughts jumped to the 8-sided die. Some of them are numbered one-to-four on each side, so care needs to be made to choose dice that are numbered one-to-eight, or to use one color (they tend to have two when numbered one-to-four) as one-to-four and the other color as five-to-eight.
At first I tried to work out if rolling two dice and using their digits would work. It seems to work, but you end up with numbers from one-to-eighty-eight, except anything that ends in nine. That leaves some big gaps, and gives some extra numbers we don't need.
My sleepy head thought of rolling two dice and multiplying their values would work; after all, eight times eight is sixty four. However, I quickly stopped that thought train as there are a few prime and other numbers that would never be reached. You can reach the prime numbers three and five and seven just fine, but you can't reach eleven, thirteen, or seventeen. Further, you can't reach some non-prime values like twenty-two, or thirty-three, and so on.
I thought of how to add the numbers of multiple dice together. One could roll an eight sided die, subtracting one from the result (or by treating eight as zero), nine times, and add one to the final result to get to numbers from one to sixty-four (this works because 7x9=63).
The problem with adding the values of multiple dice, as any craps or Yahtzee player knows, is that the low and high numbers have fewer combinations that lead to them than the middle numbers have. There'd be only one way each to get one (all dice roll 1s) or sixty-four (all dice roll 8s), but there are nine ways to get two (eight dice roll 1s, and the remaining die rolls a 2, in any order) or sixty-three (eight dice roll 8s, and the last die rolls a 7, in any order).
Instead of describing each, here's a breakdown of the staggering 134,217,728 combinations of rolling nine eight-sided dice result in the different possible values of one to sixty-four.
1: 1 2: 9 3: 45 4: 165 5: 495 6: 1287 7: 3003 8: 6435 9: 12861 10: 24229 11: 43353 12: 74097 13: 121515 14: 191907 15: 292743 16: 432399 17: 619677 18: 863109 19: 1170073 20: 1545777 21: 1992195 22: 2507067 23: 3083103 24: 3707559 25: 4362297 26: 5024385 27: 5667237 28: 6262237 29: 6780735 30: 7196247 31: 7486635 32: 7635987 33: 7635987 34: 7486635 35: 7196247 36: 6780735 37: 6262237 38: 5667237 39: 5024385 40: 4362297 41: 3707559 42: 3083103 43: 2507067 44: 1992195 45: 1545777 46: 1170073 47: 863109 48: 619677 49: 432399 50: 292743 51: 191907 52: 121515 53: 74097 54: 43353 55: 24229 56: 12861 57: 6435 58: 3003 59: 1287 60: 495 61: 165 62: 45 63: 9 64: 1
This, unlike the misleading binary curve, is a proper representation of the probability of achieving a particular number. In the middle of the heap we can see that to reach 32 or 33 is 7,635,987 times more likely than reaching a 1 or 64. That's not quite a one-in-sixty-four chance necessary for a fair distribution of possibilities.
After thinking of all of that, reflecting on the binary and decimal uses, I thought to just use the eight-sided dice as base-8 values. Either subtracting one from the number or treating 8 as zero, one die would represent the “eights” and the other would represent the “ones.” For the “eights” die, the number rolled (subtracting one or using eight as zero) is multiplied by eight, and the number (likewise modified) is added to the product of the first die. This gives a range of zero to sixty-three, or sixty-four choices. This is close, so adding one would give us our desired range.
This, finally gives us the a fair and easy distribution of numbers from one to sixty-four without requiring a lot of equipment or math. All right, it requires a little bit of proficiency an accuracy multiplying by eights, but it's a lot faster than working out the binary.
As with using a pair of ten- or twenty-sided dice to reach a range of one to one hundred, each die is equally likely to have any digit. Since we're using multiplication, there's none of the bell curve that happens when adding the dice together. Our odds of reaching any number in our range is 1/8 x 1/8, or 1/64.
Relieved to have worked this out, I finally fell asleep. After a little more consideration before (and during) tapping this out, it seems that the only absolute way to be sure of a fair distribution for weird-sized number sets is to have the right number of things and pick one. Return the picked item, shuffle the set, and repeat as necessary. It doesn't matter how many of the things there are. The same, pick one and repeat. This works great for smaller sets of values, but for very large sets it gets unrealistic to choose from a set of things.
For sets aligned with the decimal, use ten- or twenty-sided dice. For needs aligning with the binary, use coins or computers. For sets aligning themselves with base-4 or -6 or -8 or -12, use the appropriate dice, with the same multiplication-based solution.
For one to sixty-four, use a pair of eight-sided dice and some base-8 math.
Image from GMDice.com. Go buy some dice!
2 comments
5 star: | (1) | |
---|---|---|
4 star: | (0) | |
3 star: | (0) | |
2 star: | (0) | |
1 star: | (0) | |
(5.0)
Comment from: Jonny Awesomesausage Visitor
Comment from: Scrabble Madre Visitor
A lot of work you’ve done here. Maybe I’ll try my hand at it somehow for Sudoku. I’ve been using this http://www.anagrammer.com/sudoku-solver/
You need a hobby :P