Sunday, May 27, 2012

Prime Numbers: A Computational Approach

I have two obsessions: prime numbers and brute force algorithms.

The first ones are almost useless in my daily life, but they're somehow... beautiful.

Brute force is a very useful way to solve simple problems, because you don't have to think about how to get a result when the computer can do it for you. Well, as I said, the problem has to be simple. Today's post isn't the case, but it will be fun.

There's that urban legend about the FBI buying prime numbers over 100 digits, which I don't know if it's true. However, it certainly meets my two obsessions. So, I made a program that prints all the prime numbers that exist.

Of course, this is just for fun. I don't have the methods neither the necessary machinery to actually get over 100 digits in a reasonable period of time. But I want to share one optimization to the method I discovered in a book while I was teaching C programming to a student of Civil Engineering (the former "IngenierĂ­a de Caminos Canales y Puertos").

So, the first function I ever programmed that checked if a given number is a prime number was the following:

int isPrime(long long unsigned number)
{
    long long unsigned aux;
    for(aux = 2; aux < number; aux++)
    {
        if(number % aux == 0)
        {
            return 0;
        }
    }
    return 1;
}

As you can see, the function checks if any number divides the given number one by one until it gets to the number given in the first place. Let's say: pure brute force.

Now let's see the first optimization (founded in the previously mentioned book):

int isPrimeFancy(long long unsigned number)
{
    long long unsigned aux;
    for(aux = 2; aux*aux < number; aux++)
    {
        if(number % aux == 0)
        {
            return 0;
        }
    }
    return 1;
}

What has changed is the condition of iteration in the for loop. Why can the loop stop there? Let's give an intuitive approach.

Imagine that we want to check if 19 is a prime number. The first method (which we're sure that works because is pure brute force) will iterate exactly:

(19 - 2) times = 17 times

The second algorithm will do something like this:

does 2 divide 19? No
does 3 divide 19? No
does 4 divide 19? No

Wait. The next number is 5. If 5 divided 19, another number should have appeared first, but it hasn't. Why? Because 5 times 5 is the minimum value that would match 19 here, and that's not the case, because 5 times 5 is 25, which it's over 19. In next iterations the minimum possible match would be 6 times 6. We're getting even farther from 19. This will happen with 7, 8 and the following numbers. So, those iterations will be certainly unnecessary.

Let's compare the performance of these two algorithms in this specific case. The first one will iterate 17 times, while the second one will iterate only 3 times. So, using the second algorithm we'll save 14 iterations, more than the 400% of the execution time! Imagine how much time could be saved on bigger numbers.

So, in the general case, we have the first algorithm with a theoretical complexity of O(n) and the second one with a complexity of O(sqrt(n)). Now it's time to try this for real.

Let me introduce you to my laptop:
HP Pavilion dv6700
Memory: 3GiB
Processor: Intel® Core™2 Duo CPU T5750 @ 2.00GHz × 2
Running: Ubuntu 12.04 Precise Pangolin

In order to calculate the real performance of these two algorithms in my laptop, I've used the Linux command time in it's most simple way:

time ./prime

These are the results for the first algorithm checking numbers up to 999999.

05-27-2012_1

And here they are for the second.

05-27-2012_2

I can assure you the main program is the same. It checks number by number if it's a prime, and if it is, the program prints it on the screen. You can download the complete source code of this program by clicking here. By the way, the main program in that link will iterate forever, so you will need to add a little piece of code to make it stop when you want.

That's a difference, is it not?

That's all for today. In later posts I might do some research about how to get this to the next level...

As I said in my first post, any suggestions, mistakes, comments... congratulations... are welcome.

Saturday, May 26, 2012

The Tungsten Sword

I'm a young student that does Computer Science in Spain. I love maths, science, programming... I specially love the Python Programming language (it's like driving a Rolls-Royce: not the fastest but the most elegant car). I hope to learn Web programming soon.

I've started this blog to share my knowledge with the world. I also would like this to be a learning experience for me. Many of your comments will be very useful. Whether you point out a mistake - starting with my English - or suggest a new perspective, your comments along with my posts will serve us to reach the state that every artist wants to reach: perfection.