Hello,
Sorry for my late answer (I wrote the code Friday but could only debug it today, having been busy and traveling last weekend).
I modified my code to work with 'futures', and the speed is now incredible! (And it still computes the exact result.)
The speed even seems multiplied by more than the number of CPUs (6):
cat /proc/cpuinfo | grep processor
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
(declare minterms-vector unified-minterms-vector-1)
(define (funct-unify-minterms-set-1-unit-future set1 set2)
{function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))}
;; note : sorting is useless
{minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)} ;; set multiplication: create a list of pairs of minterms (other versions tried: product-set-with-set-imperative, product-set-with-set, associate-set-with-set)
{minterms-vector <- (list->vector minterms-set)}
{minterms-vector-length <+ (vector-length minterms-vector)}
{nb-procs <+ (current-processor-count)}
{segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute the segments
{unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)}
(run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code
{unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)}
{unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} ;; remove #f results
{unified-minterms-set <+ (remove-duplicates-sorted unified-minterms-set-2)} ;; uniq
unified-minterms-set)
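For readers who do not run Guile, the same pipeline (set multiplication, parallel unification via futures, filtering out the failed results, then removing duplicates) can be sketched with Python's concurrent.futures. All names and the toy unification rule below are hypothetical stand-ins for the Scheme functions, not the author's actual code:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def unify(pair):
    # Toy stand-in for function-unify-two-minterms-and-tag:
    # merge two minterms if they differ in exactly one position,
    # otherwise return None (the analogue of #f).
    a, b = pair
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) != 1:
        return None
    merged = list(a)
    merged[diff[0]] = '-'                      # mark the eliminated variable
    return tuple(merged)

def unify_minterms_sets(set1, set2):
    pairs = list(product(set1, set2))          # set multiplication: all pairs
    with ThreadPoolExecutor() as pool:         # futures over a thread pool
        results = list(pool.map(unify, pairs))
    kept = [r for r in results if r is not None]   # remove the failed unifications
    return sorted(set(kept))                       # uniq

print(unify_minterms_sets([('0', '0'), ('0', '1')],
                          [('0', '1'), ('1', '1')]))
# [('-', '1'), ('0', '-')]
```

A thread pool is used rather than processes because Guile's futures also run on a pool of threads sized to the number of cores.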
I kept your 'segment' definitions to index an array of data after converting it from a list to a vector (list->vector), which probably consumes additional time on big data lists (I had lists of more than 1,000,000 elements at some point).
I used a simplified version of run-in-parallel (one that does not 'reduce' after processing the data).
The computation on a segment is done by these functions:
{function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))}
;; proc to be called with futures
(define (proc-unify-minterms-seg seg)
{start <+ (segment-start seg)}
{end <+ (segment-end seg)}
(for ({i <+ start} {i <= end} {i <- {i + 1}})
{mtL <+ {minterms-vector[i]}}
{unified-minterms-vector-1[i] <- (function-unify-minterms-list mtL)}))
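The idea behind segment and the per-segment worker above can be sketched like this (hypothetical Python equivalents, with squaring standing in for the real per-element work): split an inclusive index range into contiguous segments, hand each segment to a future, and let each worker write into its own slice of a shared preallocated result vector, so the workers never touch the same slot:

```python
from concurrent.futures import ThreadPoolExecutor
import os

def segment(start, end, n):
    """Split the inclusive range [start, end] into at most n contiguous
    (start, end) segments -- a sketch of the Scheme 'segment' helper."""
    total = end - start + 1
    base, extra = divmod(total, n)
    segs, lo = [], start
    for i in range(n):
        hi = lo + base - 1 + (1 if i < extra else 0)
        if hi >= lo:                     # skip empty segments when n > total
            segs.append((lo, hi))
        lo = hi + 1
    return segs

def run_in_parallel(segs, proc):
    """Run proc on each segment in its own future and wait for all of them."""
    with ThreadPoolExecutor(max_workers=len(segs)) as pool:
        futures = [pool.submit(proc, s) for s in segs]
        for f in futures:
            f.result()                   # re-raise any worker exception

# usage: square every element of a shared vector in parallel
data = list(range(10))
out = [None] * len(data)                 # analogue of unified-minterms-vector-1

def proc_seg(seg):
    start, end = seg
    for i in range(start, end + 1):      # inclusive bounds, as in the Scheme loop
        out[i] = data[i] * data[i]

run_in_parallel(segment(0, len(data) - 1, os.cpu_count() or 4), proc_seg)
print(out)    # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because each segment owns a disjoint index range, no locking is needed on the result vector.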
I use this code in another program to symbolically compute a value named Cn:
C0 = T
C1 = T
C2 = T
C3 = (B1 ∨ B2)
C4 = (B2 ∨ (B1 ∧ B3))
C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))
C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5))
C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5) ∨ (B5 ∧ B6))
C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7))
C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ B8))
C1 through C9 used to be fast: less than a minute for the whole set, with or without parallelism (//).
C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9))
C10 takes a few minutes.
But C11 used to take one day before parallelizing with 'futures', and I now get the result in less than one hour!
C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9))
I never got C12 in the past, even with 7 days of computation, and I got it today during the lunch break!
C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9))
(Note that a clever person could find a general formula for Cn based on Cn-1, but that is another (mathematical) story...)
I'm computing C13, but I see that it consumes 12 GB out of the 16 GB of my system, and it stopped on the Linux box:
GC Warning: Repeated allocation of very large block (appr. size 510935040):
May lead to memory leak and poor performance
Processus arrêté (process killed)
It is still running on the Mac laptop... OK, I will look into that, but I think the data set is now huge and there is nothing that can be done about that...
Note that there is still a hash table used in function-unify-two-minterms-and-tag; I will switch to another data structure and algorithm to avoid it. I feared that concurrent access to the hash table could cause the Guile 'futures' to crash, fail, or slow down the process, but no: the result is incredibly fast. I also use continuations, and I have read that they can cause problems with 'futures', so I will improve that too...
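If the shared hash table does need to stay, one common pattern (shown here as a hypothetical Python sketch, not the author's code) is to guard the check-then-insert sequence with a lock while keeping the expensive computation outside it:

```python
import threading

class SafeMemo:
    """A minimal thread-safe memo table: a plain dict guarded by a lock.
    A dict lookup and a later insert are each fine on their own, but the
    check-then-insert sequence needs a lock to behave atomically under
    concurrent futures."""
    def __init__(self):
        self._table = {}
        self._lock = threading.Lock()

    def get_or_compute(self, key, compute):
        with self._lock:
            if key in self._table:
                return self._table[key]
        value = compute(key)             # expensive work, done outside the lock
        with self._lock:
            # setdefault keeps the first value if another thread raced us here
            return self._table.setdefault(key, value)

memo = SafeMemo()
print(memo.get_or_compute(3, lambda k: k * k))    # computed: 9
print(memo.get_or_compute(3, lambda k: k + 100))  # cached:   9
```

The price is that two threads may occasionally compute the same value redundantly, which is usually cheaper than holding the lock during the computation.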
I will see where I can improve the algorithm and data structures to optimize further, but this is already really good.
Thanks
Damien