bug-gnu-pspp

From: Ben Pfaff
Subject: PSPP-BUG: [bug #35758] regression: incorrect results with many missing values
Date: Thu, 08 Mar 2012 06:31:09 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.16) Gecko/20120201 Iceweasel/3.5.16 (like Firefox/3.5.16)

URL:
  <http://savannah.gnu.org/bugs/?35758>

                 Summary: regression: incorrect results with many missing values
                 Project: PSPP
            Submitted by: blp
            Submitted on: Wed Mar  7 22:31:08 2012
                Category: Numerical Errors
                Severity: 7 - Major
                  Status: None
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
                 Release: None
                  Effort: 0.00

    _______________________________________________________

Details:

Renan Levine writes on pspp-users (archived at
http://lists.gnu.org/archive/html/pspp-users/2012-03/msg00015.html):

> This appears to be a bug in the PSPP regression routine on data with a
> large number of missing values!
>
> I recently noticed some small discrepancies between simple bivariate  
> regression results between IBM SPSS, STATA and PSPP. Until Prof.  
> Shackman's email, I hadn't realized that the discrepancies only occur  
> when there are many missing values. I was just confused...
>
> Sadly, I also find problems when running linear regressions using PSPP  
> on data with missing values. I wish I knew what was causing the problem.
>
> So, using Dropbox, I wanted to make available some data which seems to  
> illustrate the issue.
>
> Using psppire.exe 0.7.9-gab8ce2 on Windows AND psppire 0.7.8 on  
> LinuxMint LXDE, PSPP calculates descriptive statistics just like SPSS  
> and STATA on the same dataset, but does not calculate identical b  
> coefficients when running bivariate or multivariate regressions.
>
> I created the following public opinion survey data files consisting of
> three variables from the 2004 Canadian Election Study
> <http://www.queensu.ca/cora/ces.html>, which I recoded, declaring
> certain values to be missing:
> http://dl.dropbox.com/u/35198072/ces2004-regtest.sav has many
> observations with missing values.
> http://dl.dropbox.com/u/35198072/ces2004-regtest2.sav has the same three
> variables, but I dropped all of the cases with missing values.
>
> This is the syntax file used to run descriptive statistics and three  
> regression analyses.
> http://dl.dropbox.com/u/35198072/regression-tests.sps
>
> PSPP generates these regression results and descriptive statistics with  
> missing values:
> http://dl.dropbox.com/u/35198072/regression-test-pspp1.html
> PSPP generates these regression results and descriptive statistics using  
> the data without any missing values:
> http://dl.dropbox.com/u/35198072/regression-test-pspp2.html
>
> Here is the STATA output for the same data (.log is a text file; email
> me if you have a problem opening it). The first three regressions should
> match the output in regression-test-pspp1.html.
> They are close, but not close enough... The bottom three regressions use
> the data with no missing values and these DO match PSPP's output (in
> regression-test-pspp2.html).
> http://dl.dropbox.com/u/35198072/regression-test-stata.log
>
> I also ran the data on SPSS and found results consistent with STATA.  
> There did not seem to be any problems with Pearson's Chi-Square or  
> Kendall's Tau-B when running a crosstab on the data with the missing 
> values.
>
> I am sorry I don't know what has gone wrong, so I am making available  
> this data in hopes someone might figure out where there is a mistake.  I  
> caution other users running regression on PSPP.
>
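The comparison described in the report can be sketched in a few lines of Python. This is only an illustration on made-up numbers, not PSPP's code and not a diagnosis of the bug. It assumes listwise deletion (dropping every case with any missing value) is the reference behaviour, since the report says the SPSS and STATA results agree with the file from which incomplete cases were dropped. The names ols_slope, listwise and slope_with_available_case_means are hypothetical. The last function shows one purely hypothetical way a "close, but not close enough" coefficient could arise: taking the means from all available values while taking the cross-products from complete cases only.

```python
def ols_slope(xs, ys):
    """Slope b of the bivariate regression y = a + b*x on complete pairs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def listwise(rows):
    """Keep only the rows in which no value is missing (None)."""
    return [r for r in rows if all(v is not None for v in r)]


def slope_with_available_case_means(rows):
    """Hypothetical faulty variant: means use every non-missing value,
    but the sums of products use complete cases only."""
    all_x = [x for x, _ in rows if x is not None]
    all_y = [y for _, y in rows if y is not None]
    mx = sum(all_x) / len(all_x)
    my = sum(all_y) / len(all_y)
    complete = listwise(rows)
    sxy = sum((x - mx) * (y - my) for x, y in complete)
    sxx = sum((x - mx) ** 2 for x, _ in complete)
    return sxy / sxx


# Synthetic (x, y) cases; None marks a missing value.
rows = [(1.0, 2.0), (2.0, None), (3.0, 6.5), (None, 1.0),
        (4.0, 8.0), (5.0, None), (10.0, 19.0)]

# Equivalent of the second data file: incomplete cases already dropped.
pre_dropped = [(1.0, 2.0), (3.0, 6.5), (4.0, 8.0), (10.0, 19.0)]

complete = listwise(rows)
b_listwise = ols_slope([x for x, _ in complete], [y for _, y in complete])
b_pre_dropped = ols_slope([x for x, _ in pre_dropped],
                          [y for _, y in pre_dropped])
b_mixed = slope_with_available_case_means(rows)

print("listwise deletion:   ", b_listwise)
print("pre-dropped file:    ", b_pre_dropped)
print("available-case means:", b_mixed)
```

With listwise deletion, the first two slopes are identical by construction, which is the property the two .sav files are meant to test. The third differs slightly, so a comparison of this kind can distinguish a missing-value handling problem from an error in the regression arithmetic itself.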




    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?35758>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



