Database Programming: Denis’ Prime Number Challenge: Fourth Place Finisher


Hugo Kornelis posted a wonderful reply on his blog to the Prime Number Challenge, which runs in 9.3 seconds on my laptop.  Denis posted his solution in a reply on Hugo’s blog; Denis’ solution runs in 6 seconds.


WOW!  Perhaps this isn’t such a crazy undertaking after all.


I’ve been working on my previous syntax a little bit and I’ve got it to where I think it will run in four hours.  It’s reproduced at the end of this post for completeness’ sake.  I’ll chase it down even though it stands thoroughly discredited.


I’ve got one potential tweak to Hugo’s syntax I’m looking at that might bring it down below Denis’; I’ll pursue that presently and make another post when I’ve got an answer.  The code below will run overnight and and update provided tomorrow.


These are the major changes:



  • DataTypes changed to int from decimal(12,2)

  • Division operations replaced with modulus tests

  • Performance tweaks in INSTEAD OF INSERT trigger for ComparisonMatrix table

  • No persisted quotient column in ComparisonMatrix table

  • No persisted SetBuilder table

  • In cursor loop, values are compared to Candidate table rather than obsolete SetBuilder table (no reason to compare to numbers we’ve already eliminated)

This final step did a great deal to reduce the log churn this process generates.


Here is the current state of this discredited tangent, which will finish a distant fourth in this contest.  Thanks to Denis for a great brain-teaser.


     -wp


–WAITFOR DELAY ’00:30′  — wait for 30 minutes


USE PrimeNumberTestBeta


GO


 


  log a status message


SELECT GETDATE() AS [Starting: Building and Populating Schema..]


SET NOCOUNT ON


GO


 


  remove the three tables we’re going to use


  if they’re present from a previous run


IF EXISTS (SELECT 1 FROM sys.objects WHERE name = ‘Candidate’)


BEGIN


    DROP TABLE Candidate


    SELECT GETDATE() AS [SchemaCreate: Candidate Dropped..]


END


GO


IF EXISTS (SELECT 1 FROM sys.objects WHERE NAME = ‘ComparisonMatrix’)


BEGIN


    DROP TABLE ComparisonMatrix


    SELECT GETDATE() AS [SchemaCreate: ComparisonMatrix Dropped..]


END


GO


 


  create the table to hold the set of values to test from primacy


  (we are going to remove some low-hanging fruit for performance reasons,


      so we need a distinct table


CREATE TABLE Candidate (


    CandidateValue  int –CONSTRAINT Candidate_PK PRIMARY KEY CLUSTERED WITH FILLFACTOR = 100


)


    SELECT GETDATE() AS [SchemaCreate: Candidate Created..]


GO


 


  create the table where the calculations will be performed


      Base is the number we’re testing for primacy


      Factor is the number we’re dividing it by


      IsQuotientInteger is a bit flag indicating whether the Quotient is an integer


CREATE TABLE ComparisonMatrix (


    Base int,


    Factor int,


    Remainder AS Base % Factor,


    IsQuotientInteger AS CASE


        WHEN Base % Factor = 0 THEN CAST(1 AS int)


        ELSE CAST(0 AS int)


    END           PERSISTED


)   


  SELECT GETDATE() AS [SchemaCreate: ComparisonMatrix Created..]


GO


 


  build indexes on the ComparisonMatrix table to improve perfomance


–CREATE UNIQUE /*CLUSTERED*/ INDEX ComparisonMatrix_PK ON ComparisonMatrix(Base, Factor) WITH FILLFACTOR = 90


–SELECT GETDATE() AS [SchemaCreate: Clustered Index On ComparisonMatrix Created..]


GO


 


— CREATE INDEX ComparisonMatrix_IDX1 ON ComparisonMatrix (IsQuotientInteger) WITH FILLFACTOR = 100


–SELECT GETDATE() AS [SchemaCreate: Index On ComparisonMatrix Created..]


GO


 


    build an INSTEAD OF INSERT trigger so we only save the integer factors


      THIS IS PURELY A PERFORMANCE ISSUE;


      it’s cheaper to write this trigger


  this syntax is new in SQL Server 2005; the BOL reference may be found here:


  ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/0001f651-fcfa-401b-baf2-0ec6b1671116.htm



  A BRIEF DISCUSSION:


  what we’re doing here is tuning our INSERT statement for the table CompressionMatrix below to ignore


      the transactions which do not result in an integer quotient.  This will make the commiting of


      transactions orders of magnitude faster, since only a small fraction of the values will result in


      integer quotient


 


 


CREATE TRIGGER IOI_ComparisonMatrix


ON    ComparisonMatrix


INSTEAD OF INSERT


AS


INSERT      ComparisonMatrix (


            Base,


            Factor


)


SELECT      TOP 3


        Base,


            Factor


FROM  inserted


WHERE IsQuotientInteger = 1


OPTION  (FAST 3)


 


GO


  SELECT GETDATE() AS [SchemaCreate: ComparisonMatrix Trigger Created..]


GO


 


  log a status message


  SELECT GETDATE() AS [SchemaCreate: Tables Created..]


GO


 


  using Shaun’s syntax (because it doesn’t have a database footprint),


  populate Candidate


;WITH digits(i) AS (


    SELECT 1 AS I UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL


    SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL


    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL


    SELECT 0


),


      generate 1M rows each with a unique row number, i


sequence(i) AS (


    SELECT d1.i + (10*d2.i) + (100*d3.i) + (1000*d4.i) + (10000*d5.i) + (100000*d6.i)


      FROM digits AS d1,


           digits AS d2,


           digits AS d3,


           digits AS d4,


           digits AS d5,


           digits AS d6


)


INSERT  Candidate (CandidateValue)


SELECT  i+1 — CTE gives us 0-999999; we want 1-1000000


FROM    sequence


 


  SELECT GETDATE() AS [Candidate Populated..]


GO


 


  now delete the low-hanging fruit from Candidate


    WE ARE ONLY DONG THIS FOR PERFORMANCE; the algorhy\itm can handle them


    NOTE that these fifteen DELETE statements, which encompass the first 15 prime numbers and


      thier multiples, together remove 86.13% of the candidate values


DELETE  Candidate WHERE   (CandidateValue % 2 = 0  AND CandidateValue >= 2+1)


DELETE  Candidate WHERE   (CandidateValue % 3 = 0  AND CandidateValue >= 3+1)


DELETE  Candidate WHERE   (CandidateValue % 5 = 0  AND CandidateValue >= 5+1)


DELETE  Candidate WHERE   (CandidateValue % 7 = 0  AND CandidateValue >= 7+1)


DELETE  Candidate WHERE   (CandidateValue % 11 = 0 AND CandidateValue >= 11+1)


DELETE  Candidate WHERE   (CandidateValue % 13 = 0 AND CandidateValue >= 13+1)


DELETE  Candidate WHERE   (CandidateValue % 17 = 0 AND CandidateValue >= 17+1)


DELETE  Candidate WHERE   (CandidateValue % 19 = 0 AND CandidateValue >= 19+1)


DELETE  Candidate WHERE   (CandidateValue % 23 = 0 AND CandidateValue >= 23+1)


DELETE  Candidate WHERE   (CandidateValue % 29 = 0 AND CandidateValue >= 29+1)


DELETE  Candidate WHERE   (CandidateValue % 31 = 0 AND CandidateValue >= 31+1)


DELETE  Candidate WHERE   (CandidateValue % 37 = 0 AND CandidateValue >= 37+1)


DELETE  Candidate WHERE   (CandidateValue % 41 = 0 AND CandidateValue >= 41+1)


DELETE  Candidate WHERE   (CandidateValue % 43 = 0 AND CandidateValue >= 43+1)


DELETE  Candidate WHERE   (CandidateValue % 47 = 0 AND CandidateValue >= 47+1)


 


  these bear less fruit, but are included because they’ll still


  help performance; I continued to add a line as long as 1% of remaining candidates


  was deleted


DELETE  Candidate WHERE   (CandidateValue % 53  = 0 AND CandidateValue >= 53+1)


DELETE  Candidate WHERE   (CandidateValue % 59  = 0 AND CandidateValue >= 59+1)


DELETE  Candidate WHERE   (CandidateValue % 61  = 0 AND CandidateValue >= 61+1)


DELETE  Candidate WHERE   (CandidateValue % 67  = 0 AND CandidateValue >= 67+1)


DELETE  Candidate WHERE   (CandidateValue % 71  = 0 AND CandidateValue >= 71+1)


DELETE  Candidate WHERE   (CandidateValue % 73  = 0 AND CandidateValue >= 73+1)


DELETE  Candidate WHERE   (CandidateValue % 79  = 0 AND CandidateValue >= 79+1)


DELETE  Candidate WHERE   (CandidateValue % 83  = 0 AND CandidateValue >= 83+1)


DELETE  Candidate WHERE   (CandidateValue % 89  = 0 AND CandidateValue >= 89+1)


DELETE  Candidate WHERE   (CandidateValue % 97  = 0 AND CandidateValue >= 97+1)


DELETE  Candidate WHERE   (CandidateValue % 101 = 0 AND CandidateValue >= 101+1)


 


  SELECT GETDATE() AS [Candidate DePopulated For Performance..]


 


  retune indexes for performance


ALTER INDEX ALL ON Candidate REBUILD


 


UPDATE STATISTICS Candidate


 


  uncomment this SELECT statement to confirm that we have cut 1M candidates down to 119,590


  SELECT COUNT(1) AS [Candidate Depopulated Count] FROM Candidate


GO


 


  log a status message


  SELECT GETDATE() AS [Tables Populated..]


GO


 


  keep some house for what comes..


DECLARE @CandidateValue int,


        @Counter int,


        @Now DATETIME,


        @PreviousInterval DECIMAL(12,2),


        @Interval DECIMAL(12,2)


 


  initalize a counter


SET     @Counter = 1


 


  declare a cursor to traverse the Candidate table


DECLARE TraverseCandidateTable CURSOR FAST_FORWARD READ_ONLY FOR


SELECT  CandidateValue


FROM    Candidate


ORDER BY CandidateValue


–DESC


 


  open the cursor and get the first record


OPEN    TraverseCandidateTable


 


FETCH NEXT


FROM  TraverseCandidateTable


INTO  @CandidateValue


 


  log a status message


SELECT GETDATE() AS [Processing: Cursor Loop Starts..]


 


  set timer


SET @Now = GETDATE()


 


  open a transaction


      (careful transaction control is necessary to avoid rampant growth of the transaction logs)


BEGIN TRAN


 


  start a loop which will run as long as the cursor finds a record


WHILE @@FETCH_STATUS = 0


BEGIN


 


  insert the full range of remaining possible factors for the candidate value


      into the comparison matrix


      (use the Candidate table to aid this work)


 


        INSERT  ComparisonMatrix (


            Base,


            Factor


        )


        SELECT  @CandidateValue,


                CandidateValue


        FROM    Candidate


        WHERE   CandidateValue <= @CandidateValue


        OPTION  (RECOMPILE)


 


  log a database commit and start a new transaction


  if it’s been 200 transactions since we did so


    IF  (@CandidateValue >= 20000 AND @Counter % 200 = 0)


    OR  (@CandidateValue <= 20000 AND @Counter % 100 = 0)


    BEGIN


        COMMIT TRAN


 


          retune indexes for performance


        ALTER INDEX ALL ON ComparisonMatrix REBUILD


 


            CHECKPOINT


 


  log a status message if it’s been 2000 transactions since we did so


        IF  @Counter % 2000 = 0


 


        BEGIN


            SET @Interval = (DATEDIFF (mi, @now, GETDATE()) * 60) +


                             DATEDIFF (ss, @now, GETDATE())


 


            SELECT  GETDATE()               AS [Processing: Status Datetime..],


                    @Counter                AS [Current Count..],


                    @CandidateValue         AS [Current Candidate..],


                    @Interval               AS [ElapsedSecondsThisBatch..],


                    (@Interval/@PreviousInterval) * 100


                                            AS [PerformanceChangeSinceLastBatch]


 


  initialize loop variables for the next batch


 


            SET @Now = GETDATE()


            SET @PreviousInterval = @Interval


 


        END


 


        BEGIN TRAN


    END


 


  get the next candidate value


    FETCH NEXT


    FROM  TraverseCandidateTable


    INTO  @CandidateValue


 


  increment the final transaction


    SET   @Counter = @Counter + 1


END


 


  we’re out of the loop;


  we still need to commit the final transaction


COMMIT TRAN


 


  clost and deallocate the cursor


CLOSE TraverseCandidateTable


 


DEALLOCATE TraverseCandidateTable


 


  log a status message


SELECT GETDATE() AS [Cursor Loop Complete..]


 


  generate the list of prime numbers


      if the SUM(IsQuotientInteger)=2 (where IsQuotientInteger=1 when a particular member of the set


      is a precise integer factor of the candidate value), then that candidate is a prime number,


      because IsQuotientInteger will equal 1 for 1 and the candidate itself.  All candidates with


      additional factors will thus be filtered out by the HAVING cluase.


SELECT  Base


FROM    ComparisonMatrix


GROUP BY Base


HAVING SUM(IsQuotientInteger) = 2


ORDER BY Base

Comments (6)

  1. Anonymous says:

    Our blogging site is going to go dark in an hour or so for a software upgrade, so I won’t be able to…

  2. Anonymous says:

    No less a figure than Louis Davidson has weighed in on the Prime Number Challenge . I left a comment

  3. Anonymous says:

    One more PDA-incompatible post: lessons learned from &quot;discredited&quot; prime number syntax

  4. Anonymous says:

    Well, folks, there’s a new sheriff in town..&amp;nbsp; this is a continuation of a conversation here and…

  5. Anonymous says:

    Denis’ Prime Number Challenge just won’t die.&amp;nbsp; I think this topic has spurred more dialog than any&amp;nbsp;other…

  6. Anonymous says:

    One more PDA-incompatible post: lessons learned from "discredited" prime number syntax