From: Ben Pfaff
Subject: [Pspp-cvs] pspp/src data/ChangeLog data/dictionary.c data/...
Date: Mon, 13 Aug 2007 03:44:46 +0000

CVSROOT:        /cvsroot/pspp
Module name:    pspp
Changes by:     Ben Pfaff <blp> 07/08/13 03:44:46

Modified files:
        src/data       : ChangeLog dictionary.c dictionary.h procedure.c 
                         scratch-writer.c 
        src/language/data-io: get.c 
        src/language/stats: ChangeLog flip.c 
        src/ui/gui     : ChangeLog psppire-dict.c 

Log message:
        * psppire-dict.c (psppire_dict_dump): Don't use
        dict_get_compacted_dict_index_to_case_index, as that function has
        been deleted.
        
        * flip.c: Drop use of dict_get_compacted_dict_index_to_case_index
        and just use the ordinary case indexes.  There seemed to be no
        reason for the former method.
        
        * get.c (case_map_get_value_cnt): New function.
        
        * dictionary.c (dict_compact_values): No longer delete scratch
        variables; only compact case indexes.  Update all callers.
        (dict_get_compacted_value_cnt): Rename dict_count_values and
        change interface.  Update all callers.
        (dict_get_compacted_dict_index_to_case_index): Remove.
        (dict_compacting_would_shrink): Remove.
        (dict_compacting_would_change): Remove.
        (dict_make_compactor): Add new parameter.  Update all callers.
        
        * procedure.c (proc_casereader_read): Use casewriter_get_value_cnt
        instead of dict_count_values, changing an O(N) operation into
        O(1).
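
The new exclude_classes interface mentioned above can be pictured with a
small standalone sketch.  It is not the PSPP code: struct toy_var,
toy_count_values and DC_ALL are hypothetical stand-ins, and only the
DC_ORDINARY, DC_SYSTEM and DC_SCRATCH names are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PSPP's types; not the real definitions. */
enum dict_class { DC_ORDINARY, DC_SYSTEM, DC_SCRATCH };

struct toy_var
  {
    enum dict_class class;   /* Class of the variable. */
    size_t value_cnt;        /* Number of values the variable occupies. */
  };

/* Every class bit that a caller is allowed to set. */
#define DC_ALL ((1u << DC_ORDINARY) | (1u << DC_SYSTEM) | (1u << DC_SCRATCH))

/* Sums the value counts of the VAR_CNT variables in VARS, skipping
   any variable whose class bit is set in EXCLUDE_CLASSES. */
static size_t
toy_count_values (const struct toy_var *vars, size_t var_cnt,
                  unsigned int exclude_classes)
{
  size_t cnt = 0;
  size_t i;

  assert ((exclude_classes & ~DC_ALL) == 0);
  for (i = 0; i < var_cnt; i++)
    if (!(exclude_classes & (1u << vars[i].class)))
      cnt += vars[i].value_cnt;
  return cnt;
}
```

Passing 0 counts every variable; passing 1u << DC_SCRATCH gives the count
that the old dict_get_compacted_value_cnt reported.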

CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/pspp/src/data/ChangeLog?cvsroot=pspp&r1=1.150&r2=1.151
http://cvs.savannah.gnu.org/viewcvs/pspp/src/data/dictionary.c?cvsroot=pspp&r1=1.41&r2=1.42
http://cvs.savannah.gnu.org/viewcvs/pspp/src/data/dictionary.h?cvsroot=pspp&r1=1.18&r2=1.19
http://cvs.savannah.gnu.org/viewcvs/pspp/src/data/procedure.c?cvsroot=pspp&r1=1.33&r2=1.34
http://cvs.savannah.gnu.org/viewcvs/pspp/src/data/scratch-writer.c?cvsroot=pspp&r1=1.8&r2=1.9
http://cvs.savannah.gnu.org/viewcvs/pspp/src/language/data-io/get.c?cvsroot=pspp&r1=1.38&r2=1.39
http://cvs.savannah.gnu.org/viewcvs/pspp/src/language/stats/ChangeLog?cvsroot=pspp&r1=1.60&r2=1.61
http://cvs.savannah.gnu.org/viewcvs/pspp/src/language/stats/flip.c?cvsroot=pspp&r1=1.27&r2=1.28
http://cvs.savannah.gnu.org/viewcvs/pspp/src/ui/gui/ChangeLog?cvsroot=pspp&r1=1.77&r2=1.78
http://cvs.savannah.gnu.org/viewcvs/pspp/src/ui/gui/psppire-dict.c?cvsroot=pspp&r1=1.31&r2=1.32

Patches:
Index: data/ChangeLog
===================================================================
RCS file: /cvsroot/pspp/pspp/src/data/ChangeLog,v
retrieving revision 1.150
retrieving revision 1.151
diff -u -b -r1.150 -r1.151
--- data/ChangeLog      13 Aug 2007 00:43:48 -0000      1.150
+++ data/ChangeLog      13 Aug 2007 03:44:45 -0000      1.151
@@ -1,5 +1,20 @@
 2007-08-12  Ben Pfaff  <address@hidden>
 
+       * dictionary.c (dict_compact_values): Don't delete scratch
+       variables as well as compacting case indexes.  Update all callers.
+       (dict_get_compacted_value_cnt): Rename dict_count_values and
+       change interface.  Update all callers.
+       (dict_get_compacted_value_cnt): Remove.
+       (dict_compacting_would_shrink): Remove.
+       (dict_compacting_would_change): Remove.
+       (dict_make_compactor): Add new parameter.  Update all callers.
+       
+       * procedure.c (proc_casereader_read): Use casewriter_get_value_cnt
+       instead of dict_count_values, changing an O(N) operation into
+       O(1).
+
+2007-08-12  Ben Pfaff  <address@hidden>
+
        * casereader.c (casereader_read): Don't require cases read by a
        casereader to be exactly the expected size: as long as they're big
        enough, it's OK.

Index: data/dictionary.c
===================================================================
RCS file: /cvsroot/pspp/pspp/src/data/dictionary.c,v
retrieving revision 1.41
retrieving revision 1.42
diff -u -b -r1.41 -r1.42
--- data/dictionary.c   23 Jul 2007 05:05:45 -0000      1.41
+++ data/dictionary.c   13 Aug 2007 03:44:46 -0000      1.42
@@ -824,30 +824,22 @@
   return sizeof (union value) * dict_get_next_value_idx (d);
 }
 
-/* Deletes scratch variables in dictionary D and reassigns values
-   so that fragmentation is eliminated. */
+/* Reassigns values in dictionary D so that fragmentation is
+   eliminated. */
 void
 dict_compact_values (struct dictionary *d)
 {
   size_t i;
 
   d->next_value_idx = 0;
-  for (i = 0; i < d->var_cnt; )
+  for (i = 0; i < d->var_cnt; i++)
     {
       struct variable *v = d->var[i];
-
-      if (dict_class_from_id (var_get_name (v)) != DC_SCRATCH)
-        {
           set_var_case_index (v, d->next_value_idx);
           d->next_value_idx += var_get_value_cnt (v);
-          i++;
-        }
-      else
-        dict_delete_var (d, v);
     }
 }
 
-
 /*
    Reassigns case indices for D, increasing each index above START by
    the value PADDING.
@@ -874,88 +866,34 @@
 }
 
 
-/* Returns the number of values that would be used by a case if
-   dict_compact_values() were called. */
+/* Returns the number of values occupied by the variables in
+   dictionary D.  All variables are considered if EXCLUDE_CLASSES
+   is 0, or it may contain one or more of (1u << DC_ORDINARY),
+   (1u << DC_SYSTEM), or (1u << DC_SCRATCH) to exclude the
+   corresponding type of variable.
+
+   The return value may be less than the number of values in one
+   of dictionary D's cases (as returned by
+   dict_get_next_value_idx) even if E is 0, because there may be
+   gaps in D's cases due to deleted variables. */
 size_t
-dict_get_compacted_value_cnt (const struct dictionary *d)
+dict_count_values (const struct dictionary *d, unsigned int exclude_classes)
 {
   size_t i;
   size_t cnt;
 
-  cnt = 0;
-  for (i = 0; i < d->var_cnt; i++)
-    if (dict_class_from_id (var_get_name (d->var[i])) != DC_SCRATCH)
-      cnt += var_get_value_cnt (d->var[i]);
-  return cnt;
-}
+  assert ((exclude_classes & ~((1u << DC_ORDINARY)
+                               | (1u << DC_SYSTEM)
+                               | (1u << DC_SCRATCH))) == 0);
 
-/* Creates and returns an array mapping from a dictionary index
-   to the case index that the corresponding variable will have
-   after calling dict_compact_values().  Scratch variables
-   receive -1 for case index because dict_compact_values() will
-   delete them. */
-int *
-dict_get_compacted_dict_index_to_case_index (const struct dictionary *d)
-{
-  size_t i;
-  size_t next_value_idx;
-  int *map;
-
-  map = xnmalloc (d->var_cnt, sizeof *map);
-  next_value_idx = 0;
+  cnt = 0;
   for (i = 0; i < d->var_cnt; i++)
     {
-      struct variable *v = d->var[i];
-
-      if (dict_class_from_id (var_get_name (v)) != DC_SCRATCH)
-        {
-          map[i] = next_value_idx;
-          next_value_idx += var_get_value_cnt (v);
-        }
-      else
-        map[i] = -1;
-    }
-  return map;
-}
-
-/* Returns true if a case for dictionary D would be smaller after
-   compacting, false otherwise.  Compacting a case eliminates
-   "holes" between values and after the last value.  Holes are
-   created by deleting variables (or by scratch variables).
-
-   The return value may differ from whether compacting a case
-   from dictionary D would *change* the case: compacting could
-   rearrange values even if it didn't reduce space
-   requirements. */
-bool
-dict_compacting_would_shrink (const struct dictionary *d)
-{
-  return dict_get_compacted_value_cnt (d) < dict_get_next_value_idx (d);
-}
-
-/* Returns true if a case for dictionary D would change after
-   compacting, false otherwise.  Compacting a case eliminates
-   "holes" between values and after the last value.  Holes are
-   created by deleting variables (or by scratch variables).
-
-   The return value may differ from whether compacting a case
-   from dictionary D would *shrink* the case: compacting could
-   rearrange values without reducing space requirements. */
-bool
-dict_compacting_would_change (const struct dictionary *d)
-{
-  size_t case_idx;
-  size_t i;
-
-  case_idx = 0;
-  for (i = 0; i < dict_get_var_cnt (d); i++)
-    {
-      struct variable *v = dict_get_var (d, i);
-      if (var_get_case_index (v) != case_idx)
-        return true;
-      case_idx += var_get_value_cnt (v);
+      enum dict_class class = dict_class_from_id (var_get_name (d->var[i]));
+      if (!(exclude_classes & (1u << class)))
+        cnt += var_get_value_cnt (d->var[i]);
     }
-  return false;
+  return cnt;
 }
 
 /* How to copy a contiguous range of values between cases. */
@@ -977,10 +915,14 @@
    compact cases for dictionary D.
 
    Compacting a case eliminates "holes" between values and after
-   the last value.  Holes are created by deleting variables (or
-   by scratch variables). */
+   the last value.  (Holes are created by deleting variables.)
+
+   All variables are compacted if EXCLUDE_CLASSES is 0, or it may
+   contain one or more of (1u << DC_ORDINARY), (1u << DC_SYSTEM),
+   or (1u << DC_SCRATCH) to cause the corresponding type of
+   variable to be deleted during compaction. */
 struct dict_compactor *
-dict_make_compactor (const struct dictionary *d)
+dict_make_compactor (const struct dictionary *d, unsigned int exclude_classes)
 {
   struct dict_compactor *compactor;
   struct copy_map *map;
@@ -988,6 +930,10 @@
   size_t value_idx;
   size_t i;
 
+  assert ((exclude_classes & ~((1u << DC_ORDINARY)
+                               | (1u << DC_SYSTEM)
+                               | (1u << DC_SCRATCH))) == 0);
+
   compactor = xmalloc (sizeof *compactor);
   compactor->maps = NULL;
   compactor->map_cnt = 0;
@@ -998,9 +944,10 @@
   for (i = 0; i < d->var_cnt; i++)
     {
       struct variable *v = d->var[i];
-
-      if (dict_class_from_id (var_get_name (v)) == DC_SCRATCH)
+      enum dict_class class = dict_class_from_id (var_get_name (v));
+      if (exclude_classes & (1u << class))
         continue;
+
       if (map != NULL && map->src_idx + map->cnt == var_get_case_index (v))
         map->cnt += var_get_value_cnt (v);
       else
@@ -1023,8 +970,7 @@
    COMPACTOR.
 
    Compacting a case eliminates "holes" between values and after
-   the last value.  Holes are created by deleting variables (or
-   by scratch variables). */
+   the last value.  (Holes are created by deleting variables.) */
 void
 dict_compactor_compact (const struct dict_compactor *compactor,
                         struct ccase *dst, const struct ccase *src)
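
As a rough illustration of what dict_compact_values does after this
change (reassign case indexes only, no deletion), here is a minimal
standalone sketch.  struct packed_var and toy_compact_values are
hypothetical names invented for the example, not part of PSPP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PSPP variable. */
struct packed_var
  {
    size_t case_index;   /* Index of the variable's first value in a case. */
    size_t value_cnt;    /* Number of values the variable occupies. */
  };

/* Reassigns case indexes so that the VAR_CNT variables in VARS are
   packed back to back, in dictionary order.  Returns the resulting
   case width, which plays the role of next_value_idx. */
static size_t
toy_compact_values (struct packed_var *vars, size_t var_cnt)
{
  size_t next_value_idx = 0;
  size_t i;

  for (i = 0; i < var_cnt; i++)
    {
      assert (vars[i].value_cnt > 0);
      vars[i].case_index = next_value_idx;
      next_value_idx += vars[i].value_cnt;
    }
  return next_value_idx;
}
```

With variables at case indexes 0, 3 and 7 occupying 1, 2 and 1 values,
the gaps left by deleted variables disappear and the indexes become
0, 1 and 3.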

Index: data/dictionary.h
===================================================================
RCS file: /cvsroot/pspp/pspp/src/data/dictionary.h,v
retrieving revision 1.18
retrieving revision 1.19
diff -u -b -r1.18 -r1.19
--- data/dictionary.h   23 Jul 2007 05:05:45 -0000      1.18
+++ data/dictionary.h   13 Aug 2007 03:44:46 -0000      1.19
@@ -102,13 +102,12 @@
 int dict_get_next_value_idx (const struct dictionary *);
 size_t dict_get_case_size (const struct dictionary *);
 
+size_t dict_count_values (const struct dictionary *,
+                          unsigned int exclude_classes);
 void dict_compact_values (struct dictionary *);
-size_t dict_get_compacted_value_cnt (const struct dictionary *);
-int *dict_get_compacted_dict_index_to_case_index (const struct dictionary *);
-bool dict_compacting_would_shrink (const struct dictionary *);
-bool dict_compacting_would_change (const struct dictionary *);
 
-struct dict_compactor *dict_make_compactor (const struct dictionary *);
+struct dict_compactor *dict_make_compactor (const struct dictionary *,
+                                            unsigned int exclude_classes);
 void dict_compactor_compact (const struct dict_compactor *,
                              struct ccase *, const struct ccase *);
 void dict_compactor_destroy (struct dict_compactor *);

Index: data/procedure.c
===================================================================
RCS file: /cvsroot/pspp/pspp/src/data/procedure.c,v
retrieving revision 1.33
retrieving revision 1.34
diff -u -b -r1.33 -r1.34
--- data/procedure.c    26 Jul 2007 02:02:23 -0000      1.33
+++ data/procedure.c    13 Aug 2007 03:44:46 -0000      1.34
@@ -170,11 +170,13 @@
   /* Prepare sink. */
   if (!ds->discard_output)
     {
-      ds->compactor = (dict_compacting_would_shrink (ds->permanent_dict)
-                       ? dict_make_compactor (ds->permanent_dict)
+      struct dictionary *pd = ds->permanent_dict;
+      size_t compacted_value_cnt = dict_count_values (pd, 1u << DC_SCRATCH);
+      bool should_compact = compacted_value_cnt < dict_get_next_value_idx (pd);
+      ds->compactor = (should_compact
+                       ? dict_make_compactor (pd, 1u << DC_SCRATCH)
                        : NULL);
-      ds->sink = autopaging_writer_create (dict_get_compacted_value_cnt (
-                                             ds->permanent_dict));
+      ds->sink = autopaging_writer_create (compacted_value_cnt);
     }
   else
     {
@@ -257,7 +259,7 @@
           struct ccase tmp;
           if (ds->compactor != NULL)
             {
-              case_create (&tmp, dict_get_compacted_value_cnt (ds->dict));
+              case_create (&tmp, casewriter_get_value_cnt (ds->sink));
               dict_compactor_compact (ds->compactor, &tmp, c);
             }
           else
@@ -325,8 +327,10 @@
       if (ds->compactor != NULL)
         {
           dict_compactor_destroy (ds->compactor);
-          dict_compact_values (ds->dict);
           ds->compactor = NULL;
+
+          dict_delete_scratch_vars (ds->dict);
+          dict_compact_values (ds->dict);
         }
 
       /* Old data sink becomes new data source. */
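
The O(N) to O(1) change in this file comes from counting values once,
when the sink is set up, and reading the stored width back for each
case.  A minimal sketch of that idea follows; struct toy_sink and both
functions are hypothetical and only mimic the shape of the code above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sink that remembers the width of the cases it accepts. */
struct toy_sink
  {
    size_t value_cnt;    /* Values per case written to the sink. */
    bool compacting;     /* True if cases must be compacted first. */
  };

/* Sets up a sink given the compacted width and the current case width.
   Compaction is needed only if it would make cases narrower. */
static struct toy_sink
toy_prepare_sink (size_t compacted_value_cnt, size_t next_value_idx)
{
  struct toy_sink sink;

  assert (compacted_value_cnt <= next_value_idx);
  sink.value_cnt = compacted_value_cnt;
  sink.compacting = compacted_value_cnt < next_value_idx;
  return sink;
}

/* Constant-time lookup used for each case, instead of recounting the
   dictionary's variables every time. */
static size_t
toy_sink_get_value_cnt (const struct toy_sink *sink)
{
  return sink->value_cnt;
}
```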

Index: data/scratch-writer.c
===================================================================
RCS file: /cvsroot/pspp/pspp/src/data/scratch-writer.c,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
--- data/scratch-writer.c       13 Aug 2007 00:41:35 -0000      1.8
+++ data/scratch-writer.c       13 Aug 2007 03:44:46 -0000      1.9
@@ -27,6 +27,7 @@
 #include <data/dictionary.h>
 #include <data/file-handle-def.h>
 #include <data/scratch-handle.h>
+#include <data/variable.h>
 #include <libpspp/compiler.h>
 #include <libpspp/taint.h>
 
@@ -68,9 +69,11 @@
 
   /* Copy the dictionary and compact if needed. */
   scratch_dict = dict_clone (dictionary);
-  if (dict_compacting_would_shrink (scratch_dict))
+  dict_delete_scratch_vars (scratch_dict);
+  if (dict_count_values (scratch_dict, 0)
+      < dict_get_next_value_idx (scratch_dict))
     {
-      compactor = dict_make_compactor (scratch_dict);
+      compactor = dict_make_compactor (scratch_dict, 0);
       dict_compact_values (scratch_dict);
     }
   else

Index: language/data-io/get.c
===================================================================
RCS file: /cvsroot/pspp/pspp/src/language/data-io/get.c,v
retrieving revision 1.38
retrieving revision 1.39
diff -u -b -r1.38 -r1.39
--- language/data-io/get.c      13 Aug 2007 00:41:35 -0000      1.38
+++ language/data-io/get.c      13 Aug 2007 03:44:46 -0000      1.39
@@ -337,6 +337,7 @@
       goto error;
     }
 
+  dict_delete_scratch_vars (dict);
   dict_compact_values (dict);
 
   if (fh_get_referent (handle) == FH_REF_FILE)
@@ -983,6 +984,7 @@
       || !create_flag_var ("LAST", last_name, mtf.dict, &mtf.last))
     goto error;
 
+  dict_delete_scratch_vars (mtf.dict);
   dict_compact_values (mtf.dict);
   mtf.output = autopaging_writer_create (dict_get_next_value_idx (mtf.dict));
   taint = taint_clone (casewriter_get_taint (mtf.output));

Index: language/stats/ChangeLog
===================================================================
RCS file: /cvsroot/pspp/pspp/src/language/stats/ChangeLog,v
retrieving revision 1.60
retrieving revision 1.61
diff -u -b -r1.60 -r1.61
--- language/stats/ChangeLog    5 Aug 2007 17:20:22 -0000       1.60
+++ language/stats/ChangeLog    13 Aug 2007 03:44:46 -0000      1.61
@@ -1,3 +1,9 @@
+2007-08-12  Ben Pfaff  <address@hidden>
+
+       * flip.c: Drop use of dict_get_compacted_dict_index_to_case_index
+       and just use the ordinary case indexes.  There seemed to be no
+       reason for the former method.
+
 2007-08-03  Ben Pfaff  <address@hidden>
 
        * rank.q (rank_cmd): Instead of sorting by SPLIT FILE vars, group

Index: language/stats/flip.c
===================================================================
RCS file: /cvsroot/pspp/pspp/src/language/stats/flip.c,v
retrieving revision 1.27
retrieving revision 1.28
diff -u -b -r1.27 -r1.28
--- language/stats/flip.c       25 Jul 2007 04:03:58 -0000      1.27
+++ language/stats/flip.c       13 Aug 2007 03:44:46 -0000      1.28
@@ -62,7 +62,6 @@
   {
     struct pool *pool;          /* Pool containing FLIP data. */
     const struct variable **var;      /* Variables to transpose. */
-    int *idx_to_fv;             /* var[]->index to compacted sink case fv. */
     size_t var_cnt;             /* Number of elements in `var'. */
     int case_cnt;               /* Pre-flip case count. */
 
@@ -101,8 +100,6 @@
 
   flip = pool_create_container (struct flip_pgm, pool);
   flip->var = NULL;
-  flip->idx_to_fv = dict_get_compacted_dict_index_to_case_index (dict);
-  pool_register (flip->pool, free, flip->idx_to_fv);
   flip->var_cnt = 0;
   flip->case_cnt = 0;
   flip->new_names = NULL;
@@ -171,7 +168,6 @@
   flip->case_cnt = 1;
 
   /* Read the active file into a flip_sink. */
-  proc_make_temporary_transformations_permanent (ds);
   proc_discard_output (ds);
 
   input = proc_open (ds);
@@ -318,11 +314,10 @@
   if (flip->new_names != NULL)
     {
       struct varname *v = pool_alloc (flip->pool, sizeof *v);
-      int fv = flip->idx_to_fv[var_get_dict_index (flip->new_names)];
       v->next = NULL;
       if (var_is_numeric (flip->new_names))
         {
-          double f = case_num_idx (c, fv);
+          double f = case_num (c, flip->new_names);
 
           if (f == SYSMIS)
             strcpy (v->name, "VSYSMIS");
@@ -336,7 +331,7 @@
       else
        {
          int width = MIN (var_get_width (flip->new_names), MAX_SHORT_STRING);
-         memcpy (v->name, case_str_idx (c, fv), width);
+         memcpy (v->name, case_str (c, flip->new_names), width);
          v->name[width] = 0;
        }
 
@@ -350,15 +345,8 @@
   /* Write to external file. */
   for (i = 0; i < flip->var_cnt; i++)
     {
-      double out;
-
-      if (var_is_numeric (flip->var[i]))
-        {
-          int fv = flip->idx_to_fv[var_get_dict_index (flip->var[i])];
-          out = case_num_idx (c, fv);
-        }
-      else
-        out = SYSMIS;
+      const struct variable *v = flip->var[i];
+      double out = var_is_numeric (v) ? case_num (c, v) : SYSMIS;
       fwrite (&out, sizeof out, 1, flip->file);
     }
   return true;

Index: ui/gui/ChangeLog
===================================================================
RCS file: /cvsroot/pspp/pspp/src/ui/gui/ChangeLog,v
retrieving revision 1.77
retrieving revision 1.78
diff -u -b -r1.77 -r1.78
--- ui/gui/ChangeLog    12 Aug 2007 23:18:03 -0000      1.77
+++ ui/gui/ChangeLog    13 Aug 2007 03:44:46 -0000      1.78
@@ -1,3 +1,9 @@
+2007-08-12  Ben Pfaff  <address@hidden>
+
+       * psppire-dict.c (psppire_dict_dump): Don't use
+       dict_get_compacted_dict_index_to_case_index, as that function has
+       been deleted.
+
 2007-08-13  John Darrington <address@hidden>
 
         * psppire-case-file.c (psppire_case_file_append_case):  

Index: ui/gui/psppire-dict.c
===================================================================
RCS file: /cvsroot/pspp/pspp/src/ui/gui/psppire-dict.c,v
retrieving revision 1.31
retrieving revision 1.32
diff -u -b -r1.31 -r1.32
--- ui/gui/psppire-dict.c       18 Jul 2007 00:50:59 -0000      1.31
+++ ui/gui/psppire-dict.c       13 Aug 2007 03:44:46 -0000      1.32
@@ -781,17 +781,14 @@
   gint i;
   const struct dictionary *d = dict->dict;
 
-  int *map = dict_get_compacted_dict_index_to_case_index (d);
-
   for (i = 0; i < dict_get_var_cnt (d); ++i)
     {
       const struct variable *v = psppire_dict_get_variable (dict, i);
       int di = var_get_dict_index (v);
-      g_print ("\"%s\" idx=%d, fv=%d(%d), size=%d\n",
+      g_print ("\"%s\" idx=%d, fv=%d, size=%d\n",
               var_get_name(v),
               di,
               var_get_case_index(v),
-              map[di],
               value_cnt_from_width(var_get_width(v)));
 
     }
