Here are a few thoughts on numjac.
numjac implements an algorithm for finite difference (FD) step size selection based on
this paper by Salane:
D. E. Salane, Adaptive routines for forming Jacobians numerically, Tech. Report SAND86
1319, Sandia National Laboratories, Albuquerque, NM, 1986.
I have done some experimentation with this algorithm and numjac and am much less
positive on this approach than either Salane or the numjac authors; I have seen
cases where the algorithm prescribes very tiny step sizes that lead to large
roundoff errors.
A second part of numjac is that, for sparse matrices (which I assume applies
to the bvp4c algorithm) it is usually necessary to make far fewer than n
(number of equations) calls to the residual function when evaluating the
jacobian by FD. This substantially improves performance.
A discussion of this can be found in this paper:
T. F. Coleman, B. S. Garbow, and J. J. More, Software for estimating sparse Jacobian
matrices, ACM Transactions on Mathematical Software, 11 (1984), pp. 329-345.
Unfortunately, implementing this is not trivial.
A possible alternative to both of these issues is the approach used by the algorithm
for calculation of jacobians by FD when the matrix is banded (I'm
assuming this applies to bvp4c). The routine is idaDlsBandDQJac and it is discussed
somewhat in the Sundials documentation. A downside of this (other than the
implementation being in C) is that, as far as I am aware, Octave (and MATLAB) has
no specific support for banded matrices.