Database Programming: Denis’ Prime Number Challenge: Fourth Place Finisher


Hugo Kornelis posted a wonderful reply on his blog to the Prime Number Challenge, which runs in 9.3 seconds on my laptop.  Denis posted his solution in a reply on Hugo's blog; Denis' solution runs in 6 seconds.


WOW!  Perhaps this isn't such a crazy undertaking after all.


I've been working on my previous syntax a little bit and I've got it to where I think it will run in four hours.  It's reproduced at the end of this post for completeness' sake.  I'll chase it down even though it stands thoroughly discredited.


I've got one potential tweak to Hugo's syntax I'm looking at that might bring it down below Denis'; I'll pursue that presently and make another post when I've got an answer.  The code below will run overnight and and update provided tomorrow.


These are the major changes:



  • DataTypes changed to int from decimal(12,2)

  • Division operations replaced with modulus tests

  • Performance tweaks in INSTEAD OF INSERT trigger for ComparisonMatrix table

  • No persisted quotient column in ComparisonMatrix table

  • No persisted SetBuilder table

  • In cursor loop, values are compared to Candidate table rather than obsolete SetBuilder table (no reason to compare to numbers we've already eliminated)

This final step did a great deal to reduce the log churn this process generates.


Here is the current state of this discredited tangent, which will finish a distant fourth in this contest.  Thanks to Denis for a great brain-teaser.


     -wp


--WAITFOR DELAY '00:30'  -- wait for 30 minutes


USE PrimeNumberTestBeta


GO


 


--  log a status message


SELECT GETDATE() AS [Starting: Building and Populating Schema..]


SET NOCOUNT ON


GO


 


--  remove the three tables we're going to use


--  if they're present from a previous run


IF EXISTS (SELECT 1 FROM sys.objects WHERE name = 'Candidate')


BEGIN


    DROP TABLE Candidate


--    SELECT GETDATE() AS [SchemaCreate: Candidate Dropped..]


END


GO


IF EXISTS (SELECT 1 FROM sys.objects WHERE NAME = 'ComparisonMatrix')


BEGIN


    DROP TABLE ComparisonMatrix


--    SELECT GETDATE() AS [SchemaCreate: ComparisonMatrix Dropped..]


END


GO


 


--  create the table to hold the set of values to test from primacy


--  (we are going to remove some low-hanging fruit for performance reasons,


--      so we need a distinct table


CREATE TABLE Candidate (


    CandidateValue  int --CONSTRAINT Candidate_PK PRIMARY KEY CLUSTERED WITH FILLFACTOR = 100


)


--    SELECT GETDATE() AS [SchemaCreate: Candidate Created..]


GO


 


--  create the table where the calculations will be performed


--      Base is the number we're testing for primacy


--      Factor is the number we're dividing it by


--      IsQuotientInteger is a bit flag indicating whether the Quotient is an integer


CREATE TABLE ComparisonMatrix (


    Base int,


    Factor int,


--    Remainder AS Base % Factor,


    IsQuotientInteger AS CASE


        WHEN Base % Factor = 0 THEN CAST(1 AS int)


        ELSE CAST(0 AS int)


    END           PERSISTED


)   


--  SELECT GETDATE() AS [SchemaCreate: ComparisonMatrix Created..]


GO


 


--  build indexes on the ComparisonMatrix table to improve perfomance


--CREATE UNIQUE /*CLUSTERED*/ INDEX ComparisonMatrix_PK ON ComparisonMatrix(Base, Factor) WITH FILLFACTOR = 90


--SELECT GETDATE() AS [SchemaCreate: Clustered Index On ComparisonMatrix Created..]


GO


 


-- CREATE INDEX ComparisonMatrix_IDX1 ON ComparisonMatrix (IsQuotientInteger) WITH FILLFACTOR = 100


--SELECT GETDATE() AS [SchemaCreate: Index On ComparisonMatrix Created..]


GO


 


--    build an INSTEAD OF INSERT trigger so we only save the integer factors


--      THIS IS PURELY A PERFORMANCE ISSUE;


--      it's cheaper to write this trigger


--  this syntax is new in SQL Server 2005; the BOL reference may be found here:


--  ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/0001f651-fcfa-401b-baf2-0ec6b1671116.htm


--


--  A BRIEF DISCUSSION:


--  what we're doing here is tuning our INSERT statement for the table CompressionMatrix below to ignore


--      the transactions which do not result in an integer quotient.  This will make the commiting of


--      transactions orders of magnitude faster, since only a small fraction of the values will result in


--      integer quotient


 


 


CREATE TRIGGER IOI_ComparisonMatrix


ON    ComparisonMatrix


INSTEAD OF INSERT


AS


INSERT      ComparisonMatrix (


            Base,


            Factor


)


SELECT      TOP 3


        Base,


            Factor


FROM  inserted


WHERE IsQuotientInteger = 1


OPTION  (FAST 3)


 


GO


--  SELECT GETDATE() AS [SchemaCreate: ComparisonMatrix Trigger Created..]


GO


 


--  log a status message


--  SELECT GETDATE() AS [SchemaCreate: Tables Created..]


GO


 


--  using Shaun's syntax (because it doesn't have a database footprint),


--  populate Candidate


;WITH digits(i) AS (


    SELECT 1 AS I UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL


    SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL


    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL


    SELECT 0


),


--      generate 1M rows each with a unique row number, i


sequence(i) AS (


    SELECT d1.i + (10*d2.i) + (100*d3.i) + (1000*d4.i) + (10000*d5.i) + (100000*d6.i)


      FROM digits AS d1,


           digits AS d2,


           digits AS d3,


           digits AS d4,


           digits AS d5,


           digits AS d6


)


INSERT  Candidate (CandidateValue)


SELECT  i+1 -- CTE gives us 0-999999; we want 1-1000000


FROM    sequence


 


--  SELECT GETDATE() AS [Candidate Populated..]


GO


 


--  now delete the low-hanging fruit from Candidate


--    WE ARE ONLY DONG THIS FOR PERFORMANCE; the algorhy\itm can handle them


--    NOTE that these fifteen DELETE statements, which encompass the first 15 prime numbers and


--      thier multiples, together remove 86.13% of the candidate values


DELETE  Candidate WHERE   (CandidateValue % 2 = 0  AND CandidateValue >= 2+1)


DELETE  Candidate WHERE   (CandidateValue % 3 = 0  AND CandidateValue >= 3+1)


DELETE  Candidate WHERE   (CandidateValue % 5 = 0  AND CandidateValue >= 5+1)


DELETE  Candidate WHERE   (CandidateValue % 7 = 0  AND CandidateValue >= 7+1)


DELETE  Candidate WHERE   (CandidateValue % 11 = 0 AND CandidateValue >= 11+1)


DELETE  Candidate WHERE   (CandidateValue % 13 = 0 AND CandidateValue >= 13+1)


DELETE  Candidate WHERE   (CandidateValue % 17 = 0 AND CandidateValue >= 17+1)


DELETE  Candidate WHERE   (CandidateValue % 19 = 0 AND CandidateValue >= 19+1)


DELETE  Candidate WHERE   (CandidateValue % 23 = 0 AND CandidateValue >= 23+1)


DELETE  Candidate WHERE   (CandidateValue % 29 = 0 AND CandidateValue >= 29+1)


DELETE  Candidate WHERE   (CandidateValue % 31 = 0 AND CandidateValue >= 31+1)


DELETE  Candidate WHERE   (CandidateValue % 37 = 0 AND CandidateValue >= 37+1)


DELETE  Candidate WHERE   (CandidateValue % 41 = 0 AND CandidateValue >= 41+1)


DELETE  Candidate WHERE   (CandidateValue % 43 = 0 AND CandidateValue >= 43+1)


DELETE  Candidate WHERE   (CandidateValue % 47 = 0 AND CandidateValue >= 47+1)


 


--  these bear less fruit, but are included because they'll still


--  help performance; I continued to add a line as long as 1% of remaining candidates


--  was deleted


DELETE  Candidate WHERE   (CandidateValue % 53  = 0 AND CandidateValue >= 53+1)


DELETE  Candidate WHERE   (CandidateValue % 59  = 0 AND CandidateValue >= 59+1)


DELETE  Candidate WHERE   (CandidateValue % 61  = 0 AND CandidateValue >= 61+1)


DELETE  Candidate WHERE   (CandidateValue % 67  = 0 AND CandidateValue >= 67+1)


DELETE  Candidate WHERE   (CandidateValue % 71  = 0 AND CandidateValue >= 71+1)


DELETE  Candidate WHERE   (CandidateValue % 73  = 0 AND CandidateValue >= 73+1)


DELETE  Candidate WHERE   (CandidateValue % 79  = 0 AND CandidateValue >= 79+1)


DELETE  Candidate WHERE   (CandidateValue % 83  = 0 AND CandidateValue >= 83+1)


DELETE  Candidate WHERE   (CandidateValue % 89  = 0 AND CandidateValue >= 89+1)


DELETE  Candidate WHERE   (CandidateValue % 97  = 0 AND CandidateValue >= 97+1)


DELETE  Candidate WHERE   (CandidateValue % 101 = 0 AND CandidateValue >= 101+1)


 


--  SELECT GETDATE() AS [Candidate DePopulated For Performance..]


 


--  retune indexes for performance


ALTER INDEX ALL ON Candidate REBUILD


 


UPDATE STATISTICS Candidate


 


--  uncomment this SELECT statement to confirm that we have cut 1M candidates down to 119,590


--  SELECT COUNT(1) AS [Candidate Depopulated Count] FROM Candidate


GO


 


--  log a status message


--  SELECT GETDATE() AS [Tables Populated..]


GO


 


--  keep some house for what comes..


DECLARE @CandidateValue int,


        @Counter int,


        @Now DATETIME,


        @PreviousInterval DECIMAL(12,2),


        @Interval DECIMAL(12,2)


 


--  initalize a counter


SET     @Counter = 1


 


--  declare a cursor to traverse the Candidate table


DECLARE TraverseCandidateTable CURSOR FAST_FORWARD READ_ONLY FOR


SELECT  CandidateValue


FROM    Candidate


ORDER BY CandidateValue


--DESC


 


--  open the cursor and get the first record


OPEN    TraverseCandidateTable


 


FETCH NEXT


FROM  TraverseCandidateTable


INTO  @CandidateValue


 


--  log a status message


SELECT GETDATE() AS [Processing: Cursor Loop Starts..]


 


--  set timer


SET @Now = GETDATE()


 


--  open a transaction


--      (careful transaction control is necessary to avoid rampant growth of the transaction logs)


BEGIN TRAN


 


--  start a loop which will run as long as the cursor finds a record


WHILE @@FETCH_STATUS = 0


BEGIN


 


--  insert the full range of remaining possible factors for the candidate value


--      into the comparison matrix


--      (use the Candidate table to aid this work)


 


        INSERT  ComparisonMatrix (


            Base,


            Factor


        )


        SELECT  @CandidateValue,


                CandidateValue


        FROM    Candidate


        WHERE   CandidateValue <= @CandidateValue


--        OPTION  (RECOMPILE)


 


--  log a database commit and start a new transaction


--  if it's been 200 transactions since we did so


    IF  (@CandidateValue >= 20000 AND @Counter % 200 = 0)


    OR  (@CandidateValue <= 20000 AND @Counter % 100 = 0)


    BEGIN


        COMMIT TRAN


 


        --  retune indexes for performance


        ALTER INDEX ALL ON ComparisonMatrix REBUILD


 


            CHECKPOINT


 


--  log a status message if it's been 2000 transactions since we did so


        IF  @Counter % 2000 = 0


 


        BEGIN


            SET @Interval = (DATEDIFF (mi, @now, GETDATE()) * 60) +


                             DATEDIFF (ss, @now, GETDATE())


 


            SELECT  GETDATE()               AS [Processing: Status Datetime..],


                    @Counter                AS [Current Count..],


                    @CandidateValue         AS [Current Candidate..],


                    @Interval               AS [ElapsedSecondsThisBatch..],


                    (@Interval/@PreviousInterval) * 100


                                            AS [PerformanceChangeSinceLastBatch]


 


--  initialize loop variables for the next batch


 


            SET @Now = GETDATE()


            SET @PreviousInterval = @Interval


 


        END


 


        BEGIN TRAN


    END


 


--  get the next candidate value


    FETCH NEXT


    FROM  TraverseCandidateTable


    INTO  @CandidateValue


 


--  increment the final transaction


    SET   @Counter = @Counter + 1


END


 


--  we're out of the loop;


--  we still need to commit the final transaction


COMMIT TRAN


 


--  clost and deallocate the cursor


CLOSE TraverseCandidateTable


 


DEALLOCATE TraverseCandidateTable


 


--  log a status message


SELECT GETDATE() AS [Cursor Loop Complete..]


 


--  generate the list of prime numbers


--      if the SUM(IsQuotientInteger)=2 (where IsQuotientInteger=1 when a particular member of the set


--      is a precise integer factor of the candidate value), then that candidate is a prime number,


--      because IsQuotientInteger will equal 1 for 1 and the candidate itself.  All candidates with


--      additional factors will thus be filtered out by the HAVING cluase.


SELECT  Base


FROM    ComparisonMatrix


GROUP BY Base


HAVING SUM(IsQuotientInteger) = 2


ORDER BY Base

Comments (6)

  1. Anonymous says:

    Our blogging site is going to go dark in an hour or so for a software upgrade, so I won’t be able to…

  2. Anonymous says:

    No less a figure than Louis Davidson has weighed in on the Prime Number Challenge . I left a comment

  3. Anonymous says:

    One more PDA-incompatible post: lessons learned from &quot;discredited&quot; prime number syntax

  4. Anonymous says:

    Well, folks, there’s a new sheriff in town..&amp;nbsp; this is a continuation of a conversation here and…

  5. Anonymous says:

    Denis’ Prime Number Challenge just won’t die.&amp;nbsp; I think this topic has spurred more dialog than any&amp;nbsp;other…

  6. Anonymous says:

    One more PDA-incompatible post: lessons learned from "discredited" prime number syntax

Skip to main content