From: Ben Pfaff
Subject: PSPP-BUG: [bug #35758] regression: incorrect results with many missing values
Date: Thu, 08 Mar 2012 06:31:09 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.16) Gecko/20120201 Iceweasel/3.5.16 (like Firefox/3.5.16)
URL:
<http://savannah.gnu.org/bugs/?35758>
Summary: regression: incorrect results with many missing
values
Project: PSPP
Submitted by: blp
Submitted on: Wed Mar 7 22:31:08 2012
Category: Numerical Errors
Severity: 7 - Major
Status: None
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Release: None
Effort: 0.00
_______________________________________________________
Details:
Renan Levine writes on pspp-users (archived at
http://lists.gnu.org/archive/html/pspp-users/2012-03/msg00015.html):
> This appears to be a bug in the PSPP regression routine when the data
> have a large number of missing values!
>
> I recently noticed some small discrepancies in simple bivariate
> regression results between IBM SPSS, STATA and PSPP. Until Prof.
> Shackman's email, I hadn't realized that the discrepancies only occur
> when there are many missing values. I was just confused...
>
> Sadly, I also find problems when running linear regressions using PSPP
> on data with missing values. I wish I knew what was causing the problem.
>
> So, using Dropbox, I wanted to make available some data which seems to
> illustrate the issue.
>
> Using psppire.exe 0.7.9-gab8ce2 on Windows AND psppire 0.7.8 on
> LinuxMint LXDE, PSPP calculates descriptive statistics just like SPSS
> and STATA on the same dataset, but does not calculate identical b
> coefficients when running bivariate or multivariate regressions.
>
> I created the following public opinion survey data files consisting of
> three variables from the 2004 Canadian Election Study
> <http://www.queensu.ca/cora/ces.html>, which I recoded, declaring
> certain values to be missing:
> http://dl.dropbox.com/u/35198072/ces2004-regtest.sav has many
> observations with missing values.
> http://dl.dropbox.com/u/35198072/ces2004-regtest2.sav has the same three
> variables, but I dropped all of the cases with missing values.
>
> This is the syntax file used to run descriptive statistics and three
> regression analyses.
> http://dl.dropbox.com/u/35198072/regression-tests.sps
>
> PSPP generates these regression results and descriptive statistics with
> missing values:
> http://dl.dropbox.com/u/35198072/regression-test-pspp1.html
> PSPP generates these regression results and descriptive statistics using
> the data without any missing values:
> http://dl.dropbox.com/u/35198072/regression-test-pspp2.html
>
> Here is the STATA output on the same data (.log is a text file; email
> me if you have a problem opening it). The first three regressions should
> match the output in regression-test-pspp1.html.
> They are close, but not close enough... The bottom three regressions use
> the data with no missing values and these DO match PSPP's output (in
> regression-test-pspp2.html).
> http://dl.dropbox.com/u/35198072/regression-test-stata.log
>
> I also ran the data on SPSS and found results consistent with STATA.
> There did not seem to be any problems with Pearson's Chi-Square or
> Kendall's Tau-B when running a crosstab on the data with the missing
> values.
>
> I am sorry I don't know what has gone wrong, so I am making available
> this data in hopes someone might figure out where there is a mistake. I
> caution other users running regression on PSPP.
>
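For anyone trying to reproduce this kind of discrepancy outside PSPP, here is a minimal Python sketch of a bivariate least-squares fit with listwise deletion, which as far as I know is the default treatment of missing values in regression in SPSS and Stata. It is not a diagnosis of this bug. The function names, the toy data, and in particular the "mixed means" variant are hypothetical and only illustrate how an inconsistent treatment of missing values can shift the b coefficient while leaving the fit on complete data unchanged.

```python
from statistics import fmean

def complete_cases(xs, ys):
    """Listwise deletion: keep only pairs where neither value is missing (None)."""
    return [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]

def ols_slope_intercept(pairs):
    """Ordinary least squares fit of y = a + b*x; returns (b, a)."""
    mx = fmean(x for x, _ in pairs)
    my = fmean(y for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    b = sxy / sxx
    return b, my - b * mx

def slope_with_mixed_means(xs, ys):
    """Hypothetical, deliberately inconsistent variant: each mean is taken over
    all non-missing values of its own variable, but the sums of squares and
    cross-products run over complete pairs only."""
    mx = fmean(x for x in xs if x is not None)
    my = fmean(y for y in ys if y is not None)
    pairs = complete_cases(xs, ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx

# Toy data (made up for illustration); None marks a missing value.
x = [1.0, 2.0, 3.0, 4.0, None, 6.0, 10.0]
y = [2.0, 4.1, 6.2, None, 9.0, 12.1, None]

# Fit on the data with missing values, using listwise deletion.
b_listwise, a_listwise = ols_slope_intercept(complete_cases(x, y))

# Fit on a copy from which the incomplete cases were dropped beforehand.
x_clean = [1.0, 2.0, 3.0, 6.0]
y_clean = [2.0, 4.1, 6.2, 12.1]
b_prefiltered, a_prefiltered = ols_slope_intercept(list(zip(x_clean, y_clean)))

# Slope from the inconsistent variant, for comparison.
b_mixed = slope_with_mixed_means(x, y)

print("listwise slope:   ", b_listwise)
print("prefiltered slope:", b_prefiltered)
print("mixed-means slope:", b_mixed)
```

With consistent listwise deletion the first two slopes agree, whereas the mixed-means slope differs, and it would coincide with the others if no values were missing. Note that this two-variable comparison is only exact when the pre-filtered file drops precisely the cases that are incomplete for the variables in the model.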
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?35758>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/