Sieve of Eratosthenes in R and SQL

I just completed Computing for Data Analysis on Coursera. The course is brief introduction to R programming language. R has been around for years but is gaining much attention recently due to Big Data and Data Science trends. I had an idea about R and the course offered a wonderful opportunity to learn in a systematic manner. So for some more practice and a bit of fun (ok, I admit, more for fun than practice), I decided to implement ‘Sieve of Eratosthenes’ in R and SQL and see which one is faster (because that’s what you do on a lazy Saturday!!) This is a method to find primes up to a given number. The R code looks like this.

getPrimeNumTilln <- function(n) {

# a holds a sequence of numbers from 2 to n
a <- c(2:n)
# we start from 2 since it is the beginning of prime numbers,
# it is also the loop varibale
l <- 2
# r this vector holds the results
r <- c()
#while the square of loop variable is less than n
while (l*l < n) {
# the prime number is the first element in vector a
# for e.g. in first iteration it will be 2
r <- c(r,a[1])

# remove elements from a which are multiples of l
# for e.g. in first iteration it will remove 2,4,6,8…
a <- a[-(which(a %% l ==0))]

# the loop varibale if the first variable in remaining a
# for e.g after first iteration, it will be 3, then 5 (since 4 has been removed)…
l <- a[1]
# the result is r and all the remaining elements in a

And the SQL code looks like this.

DECLARE @maxnum INT = 1000 /* The number under which you want to find primes*/

DECLARE @l INT = 2 /* Beginning of prime numbers */

DECLARE @table TABLE (a INT NOT NULL PRIMARY KEY) /* Holding table*/

;WITH ct /* Generate the sequence of numbers*/
AS (


SELECT id + 1
WHERE id < @maxnum


WHILE (@l * @l < @maxnum)
/*remove records which are divisible by l*/
FROM @table
WHERE a != @l
AND (a % @l) = 0

/* the first remaining number is the prime number */
SELECT @l = MIN(a)
FROM @table
WHERE a > @l

FROM @table
Now, I am no expert in either maths or algorithms but that looks neat. To validate that the results are right, I ran it to check how many prime numbers are under 1000000 and both returned 78948. Wolfram Alpha seems to agree.

For smaller up to  10k, the results are comparable; they are in milliseconds. Above that,however,R seems to have an upper hand.

100000 0.02 1.72
1000000 0.56 19.00
10000000 11.25 246.21

Please note that the time is in seconds. The difference is quite stark especially for large n. R is killing SQL.

A few observations on SQL side are

  1. A significant time is being spent on generating sequence. With SQL Server 2012 Sequences, we might be able improve time.
  2. The delete operation is quite slow as we all know. I tried replacing it with update but that made it worse.

I know that there are many improvements that can be made to this but I am happy with my little testing. As usual comments are always welcome.