MVR generates and selects non-linear regression models. It is written in Matlab and is intended to be used as open-source code.
Introduction
This software is intended as a curve-fitting tool. The models (curves) are generated from a set of primitive functions. More information on the algorithms can be found in the presentation and in the paper. Complete documentation in English is forthcoming. Fields of application include biology, physics, ecology, and economics.
Mathematical modelling has two tasks: first, to create a model of a dynamic system using existing knowledge, and second, to discover a model and knowledge from measured data. These correspond to the model-driven and the data-driven approaches, and each has its own strengths and weaknesses. The first gives models that experts in the field of application can interpret, but they usually have poor prediction quality. The second gives models of good quality, but they are often too complex and cannot be interpreted by experts. The suggested approach combines the strengths of the two: the resulting model can be explained, and it relies on the measured data. This yields a model of fair quality and generalization ability in comparison with universal models.
A model is selected from an inductively generated set of trial models according to the notion of adequacy: the model must be simple, stable and precise. These criteria are target functions, and they are assigned according to the given data. The given data are assumed to carry information on both the sought model and the noise. A hypothesis about the probability distribution of the data defines the data generation hypothesis and, consequently, the target functions.
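As an illustration of how a distribution hypothesis leads to a target function, consider the textbook case of additive Gaussian noise. This is only an assumed example: the text does not state which distribution MVR uses, and the symbols f, N and sigma are notation introduced here (w and x follow the usage elsewhere in this document).

```latex
% Assumed notation: f is the model, w its parameter vector, x_i the i-th sample.
y_i = f(w, x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \ \text{i.i.d.}, \quad i = 1, \dots, N

-\ln p(y \mid w) = \frac{1}{2\sigma^2} \sum_{i=1}^{N} \bigl(y_i - f(w, x_i)\bigr)^2 + \frac{N}{2} \ln\bigl(2\pi\sigma^2\bigr)

\hat{w} = \arg\min_{w} \sum_{i=1}^{N} \bigl(y_i - f(w, x_i)\bigr)^2
```

Under this hypothesis, maximizing the likelihood over w is the same as minimizing the sum of squared residuals, so the precision criterion becomes the familiar least-squares target function.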
The outline of the automatic model creation is the following. A data sample, consisting of several independent variables and one dependent variable, is given. Experts make a set of terminal functions. The models are arbitrary superpositions, inductively generated from the terminal functions. Experts can also make initial models for inductive modification. Once the generated models are tuned, a model of the optimal structure is selected.
Thus, the result is a non-linear regression model of the optimal structure, provided as:
- the model as a formula, a symbolic description to be used in further research and publication
- the vector of the model parameters, to be used in forecasting
- a plot of the model in .png, or in .eps for TeX publications.
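The outline above can be illustrated with a minimal sketch. It is not MVR's own code: the two primitives (lin_a, exp_l) are hypothetical, the superposition is written by hand rather than generated, the data are synthetic, and the use of Matlab's built-in fminsearch for tuning is an assumption, since the text does not say which optimizer MVR uses.

```matlab
% Hypothetical primitives in the style y = g(w, x); names are illustrative only.
lin_a = @(w, x) w(1) + w(2) * x;        % affine transformation of the argument
exp_l = @(w, x) exp(w(1) * x);          % linear transformation inside exp

% A trial model as a superposition: f(w, x) = lin_a(w(1:2), exp_l(w(3), x)).
model = @(w, x) lin_a(w(1:2), exp_l(w(3), x));

% Synthetic sample: one independent variable x, one dependent variable y.
x = linspace(0, 2, 50)';
w_true = [1; 2; -1.5];
y = model(w_true, x);                   % noise-free, so the sketch is reproducible

% Tune the parameters by minimising the sum of squared residuals.
sse = @(w) sum((y - model(w, x)).^2);
w0 = [0; 1; -1];                        % initial parameter guess
opts = optimset('TolX', 1e-10, 'TolFun', 1e-12, ...
                'MaxIter', 5000, 'MaxFunEvals', 5000);
w_hat = fminsearch(sse, w0, opts);

% The tuned parameter vector can then be used for forecasting.
y_new = model(w_hat, 2.5);
fprintf('w = [%g %g %g], forecast at x = 2.5: %g\n', w_hat, y_new);
```

The sketch tunes a single fixed structure; selecting among many generated structures by simplicity, stability and precision is the part that MVR automates.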
Installation
This software is a Matlab toolbox so you need the Matlab system. There are two ways to use the software:
A. Download it from http://strijov.com/files/mvr61.zip, unzip it and run "main.m".
B. Get the latest version by connecting your SVN shell extension to sourceforge.net. To do that:
- download and install TortoiseSVN;
- make the folder somedrive:\somefolder\mvr;
- right-click the folder to open the context menu and choose Tortoise->Checkout;
- enter the URL of the repository, https://mvr.svn.sourceforge.net/svnroot/mvr;
- this will download the software.
Run demo project
To watch the MVR demo, run main.m, the demo project. There are three examples:
- main('demo.prj.txt'), two-variate regression
- main('sinc.prj.txt'), one-variate regression
- main('options.prj.txt'), stock-market options (Brent Crude Oil) modelling
Make your own project
To make your own project you have to do the following.
- Make the data file "filename.dat.txt" with the content
  y, x1, x2, ..., xn
  ...
  y, x1, x2, ..., xn
  See, for example, "demo.dat.txt" or "sinc.dat.txt".
- Make the registry file "filename.reg.txt" with the content
  function_, n, m, [opt initial parameters], [domain]
  where n is the number of arguments of the function and m is the number of its parameters. Choose the initial parameters so that the function is the identity, if possible, for example
  parabola_ 1, 3, [0 1 0], [] % y = w(1) + w(2)*x + w(3)*x.^2;
  See more examples in "demo.reg.txt". The file "function_.m" must be placed in the folder "mvr\func\" with the content
  function y=function_(w,x)
  y = w(1)*x;
  See more examples in this folder, and note the main rules:
  - no matter what shape x has (scalar, vector or matrix), y must be of the same shape.
  - use the parameter vector w as a set of scalars, say w(1), ..., w(k). See the example above.
  - function names are "function[number of arguments][a|l]_.m", where a stands for the affine transformation of the argument, say y = sqrt(w(2)* x + w(1)); and l stands for the linear transformation, say y = sqrt(w(1)* x);
  - the sign "_" is used to avoid possible collisions with other Matlab functions
- Make the initial model file "filename.mdl.txt" with the content
  foo2_(foo_(foo2_(x1, foo_(x2))),...)
  All functions, such as foo_ and foo2_, must be in the registry file. See more examples in "demo.mdl.txt".
- Make the project file "filename.prj.txt" with the content
  DataFile = 'filename.dat.txt';
  ModelsFile = 'filename.mdl.txt';
  RegistryFile = 'filename.reg.txt';
  ...
  and so on; see, for example, "demo.prj.txt".
- Place these files in the folder "mvr\data\" and run main('filename.prj.txt').
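Two of the steps above can be tried out before touching MVR itself. The sketch below is only an illustration under stated assumptions: it writes a toy sample to a hypothetical file "toy.dat.txt" in the temporary folder (not in "mvr\data\"), assuming comma-separated rows as the description of the data file suggests, and it checks the parabola from the registry example, written here as an anonymous function instead of a file in "mvr\func\", against the shape and identity rules.

```matlab
% Write a toy sample in the "y, x1, x2" row format described above.
% The file name and location are hypothetical; MVR expects files in mvr\data\.
x1 = (0:0.25:1)';
x2 = (1:-0.25:0)';
y  = x1 + 2 * x2;                              % synthetic dependent variable
data_file = fullfile(tempdir, 'toy.dat.txt');
fid = fopen(data_file, 'w');
fprintf(fid, '%g, %g, %g\n', [y, x1, x2]');    % one sample per line
fclose(fid);

% Check a primitive against the rules, using the parabola from the registry example.
parabola = @(w, x) w(1) + w(2) * x + w(3) * x.^2;
w_init = [0 1 0];                              % initial parameters from the registry line
X = rand(3, 4);                                % x may be a scalar, vector or matrix
Y = parabola(w_init, X);
same_shape  = isequal(size(Y), size(X));       % rule: y has the same shape as x
is_identity = isequal(Y, X);                   % initial parameters give the identity
fprintf('same shape: %d, identity at w_init: %d\n', same_shape, is_identity);
```

Element-wise operators such as .^ are what keep the output the same shape as the input; a primitive written with the matrix operator ^ would fail the first check for non-square x.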