.LOG
-can we assign the bettas to the lm.starlag, lm.starerr, lm.stargen
-^correlation between variables
-listw2mat for the STAR weights.
-different random selection
-wald test
-different start value of optimization
optimiztaion does not respond to many running. all of them are the
same.
-different Lphi function
-what if the matrices are not lower triangular
-why minus in phis
-is there a minus in logliklihood?
"Tue Jan 17 10:52:16 2012"
we rand the experiments on lag,error and general. the result shows
that the star model is incompetent in the prediction process and is
always behind the other models and in the best situation is works like
the base model. We should try and remove the heteroskedasticity.
Also there is a big problem and it is that since our star model is
constructed based on literature, and it differs with other models in that
it dose not come from lm(), this could be the problem. Since some of
the preliminery results for the error model showed good stats
(pdpi>50) whihc is good. However for the case of clarity and prove
that our model works we will proceed.
"Tue Jan 17 10:58:00 2012"
time to make a backup of our data.
"Tue Jan 17 11:00:56 2012"
now we will proceed with another probable cause of variation which
is the likelihood function. we will try to see the variation results.
the issue of no neighbor for time and in face some of the weights that
are zero can be annoying.
"Tue Jan 17 13:00:37 2012"
definitly our weight matrices are not making any difference. we shuld
then try to see
-at least one different neighbour
-at least one different weight
-remove triangulation and see
"Tue Jan 17 13:27:56 2012"
nope, the matrices of time and space are not making any difference at
all.... the result shows that my star model is not effective at all and in
the best scenario,
"Tue Jan 17 13:59:27 2012"
at least the result shows that -since the result of error star model
result is the same as the lm.base model, the code for making the test
results is correct. and of course all the computations does not make
any difference. so now I am going to test if remving triangulation
works.
the time matrix is lower triangular by itself since we order ascending.
But the mulitiplication is not lower triangular.
TS is row standardized with exactly 11 zeros:
11 regions with no links:
1 2 3 20 21 23 125 142 146 174 175
ST is not row standardized. does not have any zeros. but it has a
combination of 1s and other less than 1 values. it has 170 of 1 and 54
other values.
-test remove lower triangularity
the result showed not improvement with T only in the optimization.
Now repeating with full parameters.
"Tue Jan 17 14:39:42 2012"
nopes, it didnt have any impact whatsoever.
checking to see how the true residual will work in the prediction.
"Tue Jan 17 14:57:20 2012"
nope true residual also did not work reducing heteroskedasticity.
definitly there was an impact though since (I-W) has some.
-checking (I-W): it is row-standardized and of course contains
negative records.
Next: what if we say we want to not work on pur y and instead we
want to work on pure e so that we will workout the prediciton
formula that has nothing attached to e. and we will compute the yhat
again and see what will be the impact.
"Tue Jan 17 16:22:01 2012"
The break through was that the betas coming from the regression
model (errorsarlm () style) was completely conformant to the betas
that we computed manually.
now lets see how this will help our prediction.
nope. it didnt. the %pdp is 25 that it was.
"Tue Jan 17 16:57:23 2012"
my study shows that there is perfect relationshp betwee the lm.starerr
and lm.base. their residulas are conforming to a strait line, their
coefficients are almost the same. their fitted values are also almost the
same. These results are from error model with full matrix neighbours
(not lower triangular spatial).
Next we want to check if using only S will be the same as the
lm.errorsarlm results?
this will be a good test of our procedure since we will only optimze
on S (which is the rho for error sarlm) .
the result shows that using only S in our procedure does not create a
result close to errorsarlm().
between IV%*%(yV -
(ter1RIorigV_S1aWT1W_STARW_ERR8%*%yV))
and (IV%*%(yV - (listw2mat(ter1RIorigV_S1aW)%*%yV)))-yV
is alot of difference. that mean my weight matrix does not actually add
anything and if fact it is nothing.
Next: what is the difference between ter1RIorigV_S1aW_matrix and
ter1RIorigV_STARW_matrix?
can we use the later in the errorsarlm?
I did use mat2listw for the W matrix that I created with only S
parameter and it worked (probably because we dont have negtive
weights on the final W) and I insert it to the errorsarlm. Got a lot of
warnings and the result was the same as using the lagged varible
inside errorsarlm.
As if my W does not make any difference at all whatsoever.
"Tue Jan 17 18:12:00 2012"
the conclusion is that our laged varibles are not working well. we
need to check the other neighbours.
MFA is minus in coefficient while for us and lm.base it is positive.
> lm.errorsarlm
Call:
errorsarlm(formula = yV ~ XV - 1, listw = ter1RIorigV_S1aW,
method = "spam")
Type: error
Coefficients:
lambda XVLA XVMFA XVAFA XVCBDdist
0.851 230.696 -17.694 636.777 2.814
Log likelihood: -1111
> lm(yV~XV-1)
Call:
lm(formula = yV ~ XV - 1)
Coefficients:
XVLA XVMFA XVAFA XVCBDdist
329.778 606.487 827.800 0.315
> lm.starerr
Call:
lm(formula = IV %*% (yV -
(ter1RIorigV_S1aWT1W_STARW_ERR8 %*%
yV)) ~ IV %*% (XV - ter1RIorigV_S1aWT1W_STARW_ERR8
%*% XV) -
1)
Coefficients:
IV %*% (XV - ter1RIorigV_S1aWT1W_STARW_ERR8 %*%
XV)LA
329.778
IV %*% (XV - ter1RIorigV_S1aWT1W_STARW_ERR8 %*%
XV)MFA
606.485
IV %*% (XV - ter1RIorigV_S1aWT1W_STARW_ERR8 %*%
XV)AFA
827.799
IV %*% (XV - ter1RIorigV_S1aWT1W_STARW_ERR8 %*%
XV)CBDdist
0.315
"Tue Jan 17 18:16:18 2012"
checking diferent time neighbours to see if it will make a difference.
"Tue Jan 17 22:29:34 2012"
next we did for T3 of 1 year or 365 days to see how it goes. for T2
that was 200 days the pdpi droped to 24 from original 25.
T3 shows only 24 again. and the same results for heteroskedasticity.
all of the above was for error model.
all the codes that we have done till now should be corrrected because
we have changed the code for the starerr model and now we are
using the lm.starerr instead of doing the manual computation over the
AIC, bptest and real estate indicators.
"Wed Jan 18 13:08:22 2012"
running ter1.r to make different samples for ter1RIorigV and M. to
see the effect. and will also run test.r which is basically ERR1 with
lower triangularity of S not removed but it is row-standardized since
we used style W. Code for making sure that the S and T are lower
triangular and rowstandardized in commened out. these are test
results for making the raw data ter1RIorigM and V using different
random sampling.
tab
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2614 0.00e+00 6.09e-32 1.15 23.7 1.24 28.1
[2,] 2578 2.71e-07 3.34e+01 1.08 59.4 1.23 15.6
[3,] 2579 2.61e-07 3.34e+01 1.11 57.1 1.23 15.6
[4,] 2578 2.71e-07 3.34e+01 1.09 59.4 1.23 15.6
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2257 0.0000 1.87e-33 0.990 18.9 1.05 42.7
[2,] 2275 0.0291 9.01e+00 0.940 23.4 1.04 22.9
[3,] 2226 0.8131 9.51e-01 0.986 13.6 1.03 55.2
[4,] 2276 0.0264 9.23e+00 0.932 23.5 1.04 22.9
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2268 0.000 3.62e-31 0.985 25.5 1.10 34.4
[2,] 2291 0.119 5.85e+00 0.971 26.2 1.07 26.0
[3,] 2240 0.425 2.79e+00 0.972 17.5 1.06 55.2
[4,] 2291 0.121 5.82e+00 0.942 26.5 1.07 28.1
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2368 0.0000 4.98e-34 1.008 39.5 1.26 40.6
[2,] 2377 0.0784 6.80e+00 0.992 31.8 1.13 26.0
[3,] 2367 0.0666 7.17e+00 1.000 28.8 1.15 32.3
[4,] 2377 0.0694 7.08e+00 0.958 32.3 1.12 25.0
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2612 0.00e+00 3.91e-32 1.14 27.8 1.27 35.4
[2,] 2580 2.81e-06 2.85e+01 1.23 49.9 1.23 17.7
[3,] 2578 2.50e-06 2.88e+01 1.17 50.1 1.20 16.7
[4,] 2580 2.81e-06 2.85e+01 1.23 49.9 1.23 17.7
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.3
[2,] 2295 0.227 4.35e+00 1.004 22.5 1.05 30.2
[3,] 2238 0.979 1.94e-01 0.999 14.4 1.04 56.2
[4,] 2295 0.230 4.31e+00 1.009 22.6 1.05 28.1
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.3
[2,] 2295 0.227 4.35e+00 1.004 22.5 1.05 30.2
[3,] 2238 0.979 1.94e-01 0.999 14.4 1.04 56.2
[4,] 2295 0.230 4.31e+00 1.009 22.6 1.05 28.1
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2358 0.000 4.23e-32 1.04 23.1 1.11 35.4
[2,] 2364 0.457 2.60e+00 1.04 23.7 1.08 26.0
[3,] 2353 0.403 2.92e+00 1.04 18.5 1.07 37.5
[4,] 2364 0.455 2.61e+00 1.03 23.7 1.08 25.0
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2290 0.0000 9.94e-32 1.03 21.2 1.07 35.4
[2,] 2308 0.0984 6.29e+00 1.02 23.1 1.05 20.8
[3,] 2267 0.4758 2.50e+00 1.01 15.1 1.04 50.0
[4,] 2308 0.0983 6.29e+00 1.02 23.1 1.05 20.8
"Wed Jan 18 15:25:46 2012"
Yes the MRA and STAR are the same. That means my weigth lists
are not making any difference.
Repeat with the code that were commented out is not back to the
game.
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2290 0.0000 9.94e-32 1.03 21.2 1.07 35.4
[2,] 2308 0.0984 6.29e+00 1.02 23.1 1.05 20.8
[3,] 2267 0.4758 2.50e+00 1.01 15.1 1.04 50.0
[4,] 2308 0.1024 6.20e+00 1.01 23.2 1.05 22.9
NEXT: cleanly see if we can reproduce the result of "sptial only" using
lagsarlm and errorsarlm using the spatial only weighting matrix and
optim or nlminb. This is the bottleneck and if we can solve it, I guess
we will be able to have good results.
The Idea is to recreate the
1) optimization parameter or rho and lambda that is done inside the
errorsarlm and lagsarlm with our procedure. so that we could see if
the sptatial autoregressive parameter coming out of the errorsarlm
and lagsarlm is equal to our optimization parameter.
2)later we should see if the betas are the same.
"Fri Jan 20 14:00:00 2012"
Starting with the imposition 1. we will only convert the listw opject
create by the nb2listw to matrix using mat2listw and then we will use
the that in optimiztion and in the errorsarlm and lag sarlm to see what
is the value for rho and then we will compare it.
by studying the lagsarlm and errorsarlm it turns out that this code
does seperate w from phi*w in the code. wo we are going to create a
wx and wy and will use the rho or lambad inside the regression which
is now simply called phi.
ok the result. showed that the optimize function and nlminb functions
create the same opjective (3134 for optimize there was a minus
before that) and the parameter is 0.21. This however was different
than the lagsarlm which showed 0.13 and log-liklihood was -2925.
increasing the tol.solve in lagsarlm made possible using eigen method
but the result is the same as before.
"Fri Jan 20 16:56:58 2012"
I couldnt find out how the lagsarlm could do the optimization. it
stocks at ..... So I gues we need to check out how the optimizatio is
done for error and lag in the literature. The candidates will be the
Anseline book, the the lesage tutorial with matlab and the Lesage
book.
Also changing lagsarlm and errorsarlm to accomodate ours and also
ingonring the problem and continuting with the experiment and see
how it will show up .
Finall we should ask roger himself for the problem.
"Fri Jan 20 18:47:28 2012"
it seems that the models with intercept are more accurate in terms of
sign of the betas.
Call:
lm(formula = yV ~ 1)
Coefficients:
(Intercept)
133799
Call:
lm(formula = yV ~ XV - 1)
Coefficients:
XVLA XVMFA XVAFA XVCBDdist
407.538 548.856 644.434 0.442
Call:
lm(formula = yV ~ XV)
Coefficients:
(Intercept) XVLA XVMFA XVAFA XVCBDdist
155480.59 221.58 -138.44 538.92 -2.62
Call:
lagsarlm(formula = yV ~ XV - 1, listw = ter1RIorigV_S1aW,
tol.solve = 1e-40)
Type: lag
Coefficients:
rho XVLA XVMFA XVAFA XVCBDdist
0.626 237.221 86.271 579.197 -0.301
Log likelihood: -1121
Call:
lagsarlm(formula = yV ~ XV, listw = ter1RIorigV_S1aW, tol.solve =
1e-40)
Type: lag
Coefficients:
rho (Intercept) XVLA XVMFA XVAFA
XVCBDdist
2.66e-01 1.03e+05 2.12e+02 -1.04e+02 5.47e+02 -
1.90e+00
Log likelihood: -1114
my own: rho=0.21
(solve(t(XV)%*%XV))%*%(t(XV))%*%(IV-
ter1RIorigV_S1aWT1W_STARW_LAG1)%*%yV
[,1]
LA 350.408
MFA 393.690
AFA 622.551
CBDdist 0.193
I read the manual for the le sage matlab code and it was interesting
that it had a procedure special to each model to find the parameters.
for example for error model it first runs a simple regression, then it
will take betas and residuals for that simple "base" regression to find
the best lambda (for us are phis). the criteria was to reduce the error
model close to error of the base model. I dont know why.
second the interesting things was as I reduced number of days two
observations were apart, the criteria for pdp was increaseing so that
for 2 days apart I had 34 for pdp and 23 for cod. this shows that the
combinations for the spatial and temporal arrangements should have
an impact on the results:
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.3
[2,] 2295 0.227 4.35e+00 1.004 22.5 1.05 30.2
[3,] 2227 0.970 5.36e-01 1.019 13.6 1.04 55.2
[4,] 2287 0.294 3.71e+00 0.917 23.4 1.05 34.4
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
the model with intercept and 2 days aprat:
> tab
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.3
[2,] 2234 0.961 6.20e-01 1.023 14.9 1.05 52.1
[3,] 2227 0.970 5.36e-01 1.019 13.6 1.04 55.2
[4,] 2231 0.954 6.77e-01 0.931 15.3 1.04 49.0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
changing the weight matrices for both time and space to B and 2 days
:
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.29
[2,] 2234 0.961 6.20e-01 1.023 14.9 1.05 52.08
[3,] 2229 0.941 7.78e-01 1.018 13.9 1.04 56.25
[4,] 2245 0.797 1.67e+00 0.729 17.0 1.04 8.33
"Mon Jan 23 00:25:58 2012"
1- Prof bivand answer
He gave me his maximum likelihood function and also introduced me
to the problem of nlminb for estimating the parameters. I should see
my likelihood and check if something is missing there. I guess it
should be i-eigw.
2- splm package
this package tries to extend fixed effect models with panel data and
then will try to introduce the lag, error and general models to this kind
of problem. I read quite a log of stuff about the random and fixed
effect models as well mixed models that will help me to comprehend
better.
my observation is that I might be able to see and incorporate the
code for my lag, error and general model from splm and more
importantly, I might be able to see how their ml function is formed .
3.- I noticed that splm uses intercept model. why not we try model
with intercept or just add another option for our grand comparison.
4. see if we can implement Husman test in our thing.
5- extend the the lesage methodology (using his FAR code) to see if
we can improve our LAG models.
"Wed Jan 25 13:42:01 2012"
will start on the answer of bivand to see how we can improve our
prediction. Till now we could only improve the prediction with
reducing the number of days the transactions are different. we are
developing on the "test for imposition 2 lesage way" whihc will be our
benchmark for the problem.
"Wed Jan 25 14:12:30 2012"
I checked the code for lagsarlm. I think eventually the code will
resemble the Lesage case where both perform regression on (y~x-1)
and (wy~wx-1) and then tries to converge both residuals to find the
best set of parameters. The code for Bivand is very complicated and
he only gave me a clue about his lagsarlm likelihood function. I need
to see how I can skip the do_idet part.
10:27 PM 1/23/2012
the bivand ways need more knowledge and search since it uses a lot
of measures and also most of these measures are for speeding up the
process of finding the best parameter. So for now I think we will skip
looking at the bivand code and will see if we can come back after
some time and check again. especially after reading the Le sage book
on spatial econometrics.
"Wed Jan 25 14:49:42 2012"
the result for
Spatial weight matrix: lower-triangular and row-sochastic,
timed=2days, all models with intercept:
tab
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.3
[2,] 2234 0.961 6.20e-01 1.023 14.9 1.05 52.1
[3,] 2227 0.970 5.36e-01 1.019 13.6 1.04 55.2
[4,] 2231 0.954 6.77e-01 0.931 15.3 1.04 49.0
this means that lowertriangular and rowstandardization does not have
an impact on the result, time however still does and the results are
behind. repeating the experiment with all models whithout intercept.
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.3
[2,] 2295 0.227 4.35e+00 1.004 22.5 1.05 30.2
[3,] 2238 0.979 1.94e-01 0.999 14.4 1.04 56.2
[4,] 2287 0.294 3.71e+00 0.917 23.4 1.05 34.4
model without intercept actually reduced our resul dramatically.
LOOKS LIKE THERE IS A GOOD RELATIONSHIP BETWEEN
THE GOOD PDPI RESULTS AND THE AMOUNT OF
HETERSKEDASTICITY THAT IS REMOVED BY THE
MODELS.
NEXT: Add coeficients to the table.
"Wed Jan 25 16:45:53 2012"
added the r-squared and also the betas.
AIC bp.pvalue bp.statistic Median Ri COD PRD %PDP R-
squared
[1,] 2297 0.000 9.24e-32 1.007 26.3 1.10 32.3 0.000
[2,] 2234 0.961 6.20e-01 1.023 14.9 1.05 52.1 0.525
[3,] 2227 0.970 5.36e-01 1.019 13.6 1.04 55.2 0.569
[4,] 2231 0.954 6.77e-01 0.931 15.3 1.04 49.0 0.519
intercept LA MFA AFA CBDdist
[1,] 125910 NA NA NA NA
[2,] 137469 335 -40.88 318 -2.98
[3,] 126266 337 2.93 361 -2.66
[4,] 120221 350 -30.42 362 -2.84
NEXT: Correct for the LAG model based on the lasage code and
also the previous T3ERR1 code.
"Wed Jan 25 23:15:57 2012"
The imposition 1 was designed for one parameter so I am going to
have to change the LAG1 into the new process which is
1) change the optimization algorithm
2) add the intercept to the models
3) add the rsquare and betas to the code
4).....
DONT FORGET TO EQUIP THE CODE FOR ERROR AND
GENERAL TOO.
The lag code also gets the betas that we dont need. remove it from
the code.
"Thu Jan 26 00:27:49 2012"
reading La sage manual for optimzing the code for the lag.
MAIN ERR code: test of imposition 2 the le sage way.R
MAIN LAG code: test of impostition 1 lag 2.R
FACTFACT: if the matrix is not row standardized the minimum and
maximum values for the autocorrelation parameters are set by min
and max of the eigenvalues of the W. So we need to set that if our
weight matrix is not syle=W.
FACT: Lesage also return a negative of the log likelihood. So we
need to put a negative in front of the objective.
FACT: least squar is biased. we use ML to find parameters. The
dispersion of the parameters estimated using ML are made with
partial derivative over all parameters in the Log likelihood. Fisher
information matrix is the covariance matrix of all the estimated
paraemters: phis, sigma and betas. so here we will have for example:
4+1+4.
IDEA: follow the code from Le sage to create a t-statistics for our
estimate p. 62.
IDEA: see if we can make the information matrix for our parameters
using the R partial derivative and theta=c(phis,sigma,betas). or
whithoug betas since in our likelihood we will have only phis and
sigma.
IDEA: use temporal weight for error and spatial weights for error.
W1=spatial, W2=temporal. like the general spatial autoregressive
model.
IDEA:see if we can implement Husman test in our thing.
x_IDEA: add the coeficients to the table.
IDEA: read the damn Lesage book
x_IDEA: R square
IDEA: death to kappa
x_IDEA: make the lower boundary of phis as 0 instead of 1 and: this
has resolved since I now get good coeficients for phis see what
happens.
PROBLEM: should ask roger why mat2listw with the same listw that
has been converted to matrix gives error in errorsarlm.
PROBLEM: what is the difference between (I-W) and (I-eigW)?I
guess this is for increasing the speed of computation only.
x_NEXT: check to see if change yhat will make any differene: yes it
somehow does and we decided to continue with the procedure in the
errorsarlm and lagsarlm codes to find the rho and betas. and we dont
need to do it since the regression takes care of it.
IDEA-test add some value to no neighbour in time.
1:40 PM 1/25/2012
1:59 PM 1/25/2012
5:02 PM 1/25/2012
10:11 PM 1/26/2012
12:55 PM 1/27/2012
5:31 PM 1/27/2012
4:19 PM 1/28/2012