Friday, July 13, 2012

Progress Report

This post covers the milestones achieved so far and some upcoming features in statsmodels.

NONLINEAR MODELS

The model implemented is

y = f(x,θ) + e
where y is the one-dimensional array of endogenous data, f is the nonlinear function, x is the exogenous data matrix, θ is the parameter vector and e is the noise in the data.

Estimation
Parameters are estimated with the 'leastsq' function from scipy.optimize, which minimizes the sum of squared residuals. The user subclasses the model class 'NonlinearLS' and provides an 'expr' function, which calculates f in the expression above from the parameter values and exogenous data passed to it. Users are encouraged to also supply the analytical derivative of the function by defining a 'jacobian' function in the same way as 'expr'.
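
As a rough illustration of the estimation step, the sketch below fits a Misra1a-type function, y = b1*(1 - exp(-b2*x)), by calling scipy.optimize.leastsq directly rather than through the NonlinearLS class. The data are synthetic, and the helper names 'residuals' and 'residuals_jac', the true parameter values and the start values are assumptions made for this example; only 'expr' and 'jacobian' mirror the names used above.

```python
import numpy as np
from scipy.optimize import leastsq


def expr(params, x):
    """Nonlinear function f(x, theta) of Misra1a type."""
    b1, b2 = params
    return b1 * (1.0 - np.exp(-b2 * x))


def jacobian(params, x):
    """Analytical derivative of expr with respect to (b1, b2), shape (n, 2)."""
    b1, b2 = params
    decay = np.exp(-b2 * x)
    return np.column_stack([1.0 - decay, b1 * x * decay])


def residuals(params, x, y):
    return y - expr(params, x)


def residuals_jac(params, x, y):
    # leastsq wants the derivative of the residuals, hence the sign flip.
    return -jacobian(params, x)


# Synthetic data (assumed values, not the NIST dataset).
rng = np.random.default_rng(0)
x = np.linspace(50.0, 800.0, 14)
true_params = np.array([240.0, 5.5e-4])
y = expr(true_params, x) + rng.normal(scale=0.1, size=x.size)

start = np.array([500.0, 1e-4])

# Fit with the analytical jacobian, then with leastsq's finite differences.
params_hat, ier = leastsq(residuals, start, args=(x, y), Dfun=residuals_jac)
params_numeric, ier_numeric = leastsq(residuals, start, args=(x, y))

ssr = np.sum(residuals(params_hat, x, y) ** 2)
print(params_hat, params_numeric, ssr)
```

If 'Dfun' is omitted, leastsq approximates the derivative by finite differences, which is why supplying 'jacobian' is optional but recommended.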

Testing
For testing we used the 'Misra1a' model from the NIST data. Details are given in the previous post. In summary, we obtained satisfactory results compared to Gretl, which uses the same MINPACK module that scipy uses.

Miscellaneous Features

  • Parameters calculated at each iteration of the algorithm can be viewed using the view_iter() method
  • A prediction table with confidence intervals for each predicted value of the endogenous data is available through the prediction_table(alpha) method
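
The internals of prediction_table(alpha) are not shown in this post. As a hedged sketch of one common way to get such intervals, the code below applies the delta method to the fitted mean, using the covariance information returned by scipy.optimize.leastsq. The function name 'mean_prediction_table', the synthetic data and the start values are assumptions for illustration and are not taken from the statsmodels code.

```python
import numpy as np
from scipy import stats
from scipy.optimize import leastsq


def expr(params, x):
    b1, b2 = params
    return b1 * (1.0 - np.exp(-b2 * x))


def jacobian(params, x):
    b1, b2 = params
    decay = np.exp(-b2 * x)
    return np.column_stack([1.0 - decay, b1 * x * decay])


def residuals(params, x, y):
    return y - expr(params, x)


def mean_prediction_table(params, cov_params, x, df_resid, alpha=0.05):
    """Columns: predicted value, lower bound, upper bound (hypothetical helper)."""
    grad = jacobian(params, x)
    # Delta method: var(f_i) is approximately g_i' Cov(theta) g_i.
    var_mean = np.einsum("ij,jk,ik->i", grad, cov_params, grad)
    se_mean = np.sqrt(np.maximum(var_mean, 0.0))
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df_resid)
    predicted = expr(params, x)
    return np.column_stack(
        [predicted, predicted - t_crit * se_mean, predicted + t_crit * se_mean]
    )


# Synthetic data (assumed values, not the NIST dataset).
rng = np.random.default_rng(0)
x = np.linspace(50.0, 800.0, 14)
y = expr([240.0, 5.5e-4], x) + rng.normal(scale=0.1, size=x.size)

params_hat, cov_x, info, mesg, ier = leastsq(
    residuals, [500.0, 1e-4], args=(x, y), full_output=True
)

# cov_x must be scaled by the residual variance to give Cov(theta).
df_resid = x.size - len(params_hat)
scale = np.sum(residuals(params_hat, x, y) ** 2) / df_resid
cov_params = cov_x * scale

table = mean_prediction_table(params_hat, cov_params, x, df_resid, alpha=0.05)
print(table)
```

These are intervals for the fitted mean only; intervals for a new observation would also need the residual variance added to 'var_mean'.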

Example
A complete example can be viewed here: https://github.com/divyanshubandil/statsmodels/commit/db2e388232303323cc9bb36e0fe9f682892993ba

ROBUST NONLINEAR MODELS
I have been working on the M-estimation of nonlinear models for some time now. The best research paper I found, covering the tests, the computational algorithm and simulation data, is here: http://www.tandfonline.com/doi/abs/10.1080/03610920802074836
Recently I implemented the algorithm in my first commit on this topic, here: https://github.com/divyanshubandil/statsmodels/commit/1745e02b45ebe3f83a8e0d55f477fcef33621d6f
I am now testing this model on the 'Numerical example' given in the paper.

Monday, July 9, 2012

Nonlinear models - Results and Tests

After computing parameters for nonlinear models using the least squares (LQ) method of estimation, the next step was to add result statistics and write tests comparing results from another statistical package with those from statsmodels.

Model and Results
The model used for analysis was the 'Misra1a' model selected from the NIST nonlinear models. The misra dataset was added to the datasets directory. Parameter estimates and the standard error of residuals were provided in the NIST data file. The other results of the nonlinear regression analysis of this model were obtained from Gretl. Gretl was chosen because it uses the same MINPACK module for LQ estimation that scipy.optimize uses. All the results were stored in the Python class Misra1a.

Testing
Four sets of 18 tests (2 start values, each with and without the jacobian) provided in test_Misra1a.py were run, and we obtained exact matches up to 4 decimal places.

Weighted NLS
Weights (1 for the first seven observations and 0.49 for the next seven observations) were provided while fitting the data.
Gretl does not support weighted NLS, so we had to use results from R, which does. A test class for weighted NLS was added to test_Misra1a.py, along with 14 tests. Only one test, on the confidence intervals of the parameters, failed, and this is only a precision issue (none of the digits after the decimal point match).
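
A minimal sketch of the weighted fit follows, assuming the weights enter the objective as sum(w_i * r_i**2), which is the convention R's nls uses. It calls scipy.optimize.leastsq directly on synthetic data. The helper names, data and start values are assumptions for illustration and are not taken from test_Misra1a.py.

```python
import numpy as np
from scipy.optimize import leastsq


def expr(params, x):
    b1, b2 = params
    return b1 * (1.0 - np.exp(-b2 * x))


def weighted_residuals(params, x, y, weights):
    # Squaring these gives w_i * (y_i - f(x_i, theta))**2.
    return np.sqrt(weights) * (y - expr(params, x))


# Synthetic data (assumed values, not the NIST dataset).
rng = np.random.default_rng(0)
x = np.linspace(50.0, 800.0, 14)
true_params = np.array([240.0, 5.5e-4])
y = expr(true_params, x) + rng.normal(scale=0.1, size=x.size)

# 1 for the first seven observations, 0.49 for the next seven.
weights = np.r_[np.ones(7), np.full(7, 0.49)]

params_w, ier = leastsq(weighted_residuals, [500.0, 1e-4], args=(x, y, weights))


def weighted_ssr(params):
    return np.sum(weighted_residuals(params, x, y, weights) ** 2)


print(params_w, weighted_ssr(params_w))
```

Since sqrt(0.49) = 0.7, the residuals of the last seven observations are scaled by 0.7 before leastsq squares and sums them.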