help-octave

Re: urlread


From: John W. Eaton
Subject: Re: urlread
Date: Tue, 09 Oct 2007 19:20:01 -0400

On  9-Oct-2007, Alexander Barth wrote:

| I would not remove the progress callback function entirely, since it also
| calls the macro OCTAVE_QUIT.  Without the callback function, a large download
| can only be interrupted by stopping Octave.

OK, I wasn't thinking about interrupts.

| Attached is a patch that
| restores the callback but there is no output within the callback function.

| Index: urlwrite.cc
| ===================================================================
| RCS file: /cvs/octave/src/DLD-FUNCTIONS/urlwrite.cc,v
| retrieving revision 1.12
| diff -u -r1.12 urlwrite.cc
| --- urlwrite.cc       9 Oct 2007 17:43:00 -0000       1.12
| +++ urlwrite.cc       9 Oct 2007 20:29:07 -0000
| @@ -58,6 +58,19 @@
|    return (stream.fail () ? 0 : size * nmemb);
|  }
|  
| +
| +// Progress callback function for curl.
| +
| +int
| +progress_func (const char * /*url*/, double /*dltotal*/, double /*dlnow*/,
| +               double /*ultotal*/, double /*ulnow*/)
| +{
| +  // macro that picks up Ctrl-C signalling
| +  OCTAVE_QUIT;
| +  return 0;
| +}
| +
| +
|  // Form the query string based on param.
|  
|  static std::string
| @@ -131,6 +144,7 @@
|    curl_easy_setopt (curl, CURLOPT_FOLLOWLOCATION, 1);
|  
|    curl_easy_setopt (curl, CURLOPT_NOPROGRESS, false);
| +  curl_easy_setopt (curl, CURLOPT_PROGRESSFUNCTION, progress_func);
|    curl_easy_setopt (curl, CURLOPT_PROGRESSDATA, url.c_str ());
|    curl_easy_setopt (curl, CURLOPT_FAILONERROR, true);

I don't think this is the right thing to do.  First, OCTAVE_QUIT
throws an exception, which is probably not what we want to do from the
callback.  Instead, I think you would want to do something like this:

--- urlwrite.cc 09 Oct 2007 13:42:30 -0400      1.12
+++ urlwrite.cc 09 Oct 2007 18:51:44 -0400      
@@ -58,6 +58,20 @@
   return (stream.fail () ? 0 : size * nmemb);
 }
 
+// Progress callback function for curl.
+
+int
+progress_func (const char *, double, double, double, double)
+{
+  // If octave_interrupt_state is greater than zero, an interrupt is
+  // pending so we should tell curl to stop the transfer so we can
+  // clean up.  Once that's done, we will eventually encounter a call
+  // to OCTAVE_QUIT that will throw an exception that will get us back
+  // to the main loop, cleaning up any local C++ objects as we go.
+
+  return octave_interrupt_state > 0;
+}
+
 // Form the query string based on param.
 
 static std::string
@@ -131,6 +145,7 @@
   curl_easy_setopt (curl, CURLOPT_FOLLOWLOCATION, 1);
 
   curl_easy_setopt (curl, CURLOPT_NOPROGRESS, false);
+  curl_easy_setopt (curl, CURLOPT_PROGRESSFUNCTION, progress_func);
   curl_easy_setopt (curl, CURLOPT_PROGRESSDATA, url.c_str ());
   curl_easy_setopt (curl, CURLOPT_FAILONERROR, true);
 
That works fine, except that I noticed the connection was hanging
when I tried to get the URL

  ftp://ftp.octave.org/pub/octave/octave-2.9.14.tar.gz

and that the progress callback was apparently never called.  Then the
only way to interrupt the process was to hit C-c multiple times and
abort the Octave process.  I'm not sure precisely why this is
hanging.  If I wait long enough (a minute or two), urlwrite returns
with

  octave:2> [a,b,c] = urlwrite ("ftp://ftp.octave.org/pub/octave/octave-2.9.14.tar.gz", "foo.tgz")
  a = /scratch/jwe/build/octave-trunk/foo.tgz
  b =  1
  c = 

but that isn't telling me much.

I tried switching to allowing anonymous FTP without a password on
ftp.octave.org, but that doesn't seem to be it.

In any case, even if I solve the problem for ftp.octave.org, we can't
solve the connection problem for all URLs out there, so I think
another solution is needed.  Is it intentional that the progress
function is not called when curl is attempting to make the network
connection?  Is there some other callback available here that would
allow us to check the Octave interrupt state and stop the
transfer?
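
Even if no such callback exists, we could at least bound the connect
phase with a timeout so that a dead server can't hang urlwrite
forever.  A rough, untested sketch (the 30 second value is just a
placeholder, and I'm not sure whether we want CURLOPT_NOSIGNAL here):

  // Untested sketch: give up on the connect phase after 30 seconds,
  // and keep libcurl from installing its own signal handlers so that
  // Octave's SIGINT handling is left alone.
  curl_easy_setopt (curl, CURLOPT_CONNECTTIMEOUT, 30);
  curl_easy_setopt (curl, CURLOPT_NOSIGNAL, 1);

But a timeout only limits how long we wait; it still would not make
the connection phase interruptible with C-c.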

If there is no way to do this reliably with callbacks, then I think we
will need to use something like this instead:

--- urlwrite.cc 09 Oct 2007 13:42:30 -0400      1.12
+++ urlwrite.cc 09 Oct 2007 19:06:57 -0400      
@@ -90,6 +90,13 @@
 
 // curl front-end
 
+static void
+urlget_cleanup (CURL *curl)
+{
+  curl_easy_cleanup (curl);
+  curl_global_cleanup ();
+}
+
 static CURLcode
 urlget (const std::string& url, const std::string& method,
        const Cell& param, std::ostream& stream)
@@ -130,19 +137,44 @@
   // Follow redirects.
   curl_easy_setopt (curl, CURLOPT_FOLLOWLOCATION, 1);
 
-  curl_easy_setopt (curl, CURLOPT_NOPROGRESS, false);
+  curl_easy_setopt (curl, CURLOPT_NOPROGRESS, true);
   curl_easy_setopt (curl, CURLOPT_PROGRESSDATA, url.c_str ());
   curl_easy_setopt (curl, CURLOPT_FAILONERROR, true);
 
   // Switch on full protocol/debug output
   // curl_easy_setopt(curl, CURLOPT_VERBOSE, true);
 
-  CURLcode res = curl_easy_perform (curl);
+  CURLcode res = CURLE_OK;
 
-  // Always cleanup.
-  curl_easy_cleanup (curl);
+  // To understand the following, see the definitions of these macros
+  // in libcruft/misc/quit.h.  The idea is that we call sigsetjmp here,
+  // then the signal handler calls siglongjmp to get back here
+  // immediately.  Then we perform some cleanup and throw an interrupt
+  // exception which will get us back to the top level, cleaning up
+  // any local C++ objects on the stack as we go.
 
-  curl_global_cleanup ();
+  BEGIN_INTERRUPT_IMMEDIATELY_IN_FOREIGN_CODE_1;
+
+  // We were interrupted (this code is inside a block that is only
+  // called when siglongjmp is called from a signal handler).
+
+  // Is there a better error code to use?  Maybe it doesn't matter
+  // because we are about to throw an exception.
+
+  res = CURLE_ABORTED_BY_CALLBACK;
+  urlget_cleanup (curl);
+  throw_interrupt_exception ();
+
+  BEGIN_INTERRUPT_IMMEDIATELY_IN_FOREIGN_CODE_2;
+
+  res = curl_easy_perform (curl);
+
+  END_INTERRUPT_IMMEDIATELY_IN_FOREIGN_CODE_2;
+
+  // If we are not interrupted, we will end up here, so we still need
+  // to clean up.
+
+  urlget_cleanup (curl);
 
   return res;
 }

However, I'm not sure that this will clean up everything that happens
inside the curl_easy_perform function.  Are all the resources required
by curl_easy_perform stored in the CURL structure?  If so, then I
think this is OK.
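
For reference, on the normal (non-interrupted) return path the
CURLcode from urlget would still be consumed by the caller; something
like the following is what I have in mind (just a sketch, not the
actual urlwrite code):

  // Sketch only: map a libcurl failure to an Octave error message.
  // (ofile here stands for whatever std::ostream the output file was
  // opened as.)
  CURLcode res = urlget (url, method, param, ofile);

  if (res != CURLE_OK)
    error ("urlwrite: %s", curl_easy_strerror (res));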

jwe

