Comments (28)
What's the output of reghdfe, version?
from reghdfe.
. reghdfe, version
2.1.47 12may2015
Dependencies installed?
- ivreg2 not
- avar yes
- tuples not
- parallel not
from reghdfe.
I'll try with the GitHub bleeding edge.
from reghdfe.
No change with
. reghdfe, version
3.0.10 13may2015
Dependencies installed?
- ivreg2 not
- avar yes
- tuples not
from reghdfe.
Hi Nils,
I hunted the bug a bit but I'm not sure how to proceed:
The problem lies behind -test- (which in turn calls the builtin _test).
What I did:
- Ran just the demeaning without the regression:
  reghdfe y x1##x2##(x3 x4)##x5, absorb(fe) vce(cl clustervar, suite(default)) tol(1e-8) savecache
- Renamed and saved the resulting dta here:
  https://github.com/sergiocorreia/reghdfe/blob/updated_mata/misc/fstat.dta
- Ran this:
  use fstat, clear
  reg y x_, vce(cluster clustervar) nocons
  testparm x_
Sometimes test/testparm drops constraint 2, and sometimes it drops constraint 5. In the first case you'll see F = 6.31; in the second (the more usual case, occurring in 5 out of 6 runs) you'll see 31.10.
Why? I'm not sure at this stage (so far I've been using -test- as a way to avoid coding the Wald test myself)
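The instability can be seen by computing the Wald statistic that -test- builds by hand. This is a minimal sketch (not reghdfe's actual code), assuming a prior -regress y x_*, vce(cluster clustervar) nocons- as in the steps above:

```stata
* Sketch only: the Wald F behind -test-, computed by hand.
* Assumes -regress y x_*, vce(cluster clustervar) nocons- was just run.
matrix b = e(b)
matrix V = e(V)
scalar q = colsof(b)            // -test- reduces q if it drops constraints
matrix W = b * invsym(V) * b'
scalar F = W[1,1] / q
display "F(" q ", " e(df_r) ") = " F
```

When V is rank deficient, invsym() must zero out a row/column, and which one gets chosen can depend on numerical details of the data (and apparently its sort order), which is consistent with the constraint-2-versus-constraint-5 behavior.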
from reghdfe.
What does areg do?
from reghdfe.
Wait, there is no "x" in that dataset.
from reghdfe.
If I reg and test "x*", it always drops 5.
from reghdfe.
Sorry, I meant x*
(markdown replaced the two stars/asterisks)
What -areg- does is call _regress and _robust2, computing the values somewhere in between.
from reghdfe.
Oh, I need to reg again for it to sometimes drop constraint 2.
from reghdfe.
. reg y x*, vce(cluster clustervar) nocons
[omitted]
. testparm x*
( 1) x12 = 0
( 2) x13 = 0
( 3) x14 = 0
( 4) x16 = 0
( 5) x17 = 0
( 6) x18 = 0
( 7) x19 = 0
( 8) x20 = 0
( 9) x21 = 0
(10) x23 = 0
Constraint 2 dropped
F( 9, 19) = 6.31
Prob > F = 0.0004
In most cases it drops 5, but at least on the Stata version I tested (v13.1) it dropped 2 in a few cases.
from reghdfe.
I don't remember right now, but I had some pretty good reason for running -test- instead of just using the output of -regress-. In any case, the output of -test- and -regress- should coincide, so something is off.
Since this problem can be framed in terms of StataCorp's own regressions, one option may be to contact them; maybe they have more insight into what's going on behind the scenes in _test (I get that sometimes it drops 2 and sometimes it drops 5, but if the constraints are equivalent the results should be the same).
from reghdfe.
I think I might do that. The demeaned set of regressors seem to be well specified to my eye. Thanks for your help, Sergio.
from reghdfe.
Let me know what they reply.
Best,
S
from reghdfe.
Of course.
from reghdfe.
Sorting by clustervar eliminates the instability, FWIW.
from reghdfe.
Interesting... I'm still wondering what the correct F-stat is, in any case.
from reghdfe.
I got a beautiful response from StataCorp. TL;DR: report a missing F statistic if r(drop) == 1. There is also r(dropped_1), r(dropped_2), ..., which is probably unnecessary for this use case.
Dear Nils,
For this model, it is not possible to perform a joint test that the
coefficients on all of the x variables are zero. Both -regress- and
-testparm- inform you of this issue, but they have different ways of
handling the situation.

The -regress- command tries to test the 10 constraints jointly as an overall
model test. It would be misleading for us to report a test for 9 of the
coefficients as if it was a test for all 10. Stata's estimation commands
report missing values for the overall model test in cases where it is not
possible to perform a joint test of coefficients being zero. You can type

help j_robustsingular
for more information on this.
The -testparm- command, on the other hand, tests as many constraints as possible
and reports which constraint or constraints were dropped. Because of finite
precision math, the algorithm that determines which constraint to drop may
occasionally choose different constraints. However, the note in the output
indicates which constraints are dropped, and the results that are reported
are a test of the remaining constraints. You may not be interested in the
results from -testparm- when it is not possible to test all of the constraints
you requested. If you are not looking at -testparm- output to see the note
about constraints that are omitted (say you are using -testparm- within a
larger program), you can check the returned scalar r(drop) to determine
whether any constraints were dropped before utilizing other results from the
-testparm- command.

I hope this helps.
Sincerely,
XXXXXX
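Following StataCorp's advice, a minimal guard (a sketch, not reghdfe's actual code) would check r(drop) right after -testparm- and blank out the F rather than report an unreliable one:

```stata
* Sketch: guard against silently dropped constraints after -testparm-
quietly testparm x*
if r(drop) {
    display as error "note: -testparm- dropped constraint(s); reporting F as missing"
    scalar F = .
}
else {
    scalar F = r(F)
}
```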
from reghdfe.
The problem with setting it to missing when r(drop)==1 is that every regression with omitted variables would end up with a missing F-stat.
One option may be to give -test- only the variables that were not omitted in the -regress- step (i.e., those without the o. prefix), and then check r(drop). In that case, we could either i) set the F as missing, or ii) just give a warning in red. I need to think about it... (especially because it touches not just the underlying -regress-, but also the calls to ivreg2, ivregress, and multi-way clustering)
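The "only pass non-omitted variables" idea could look roughly like this. A sketch only: it uses the builtin _ms_parse_parts to flag o.-prefixed terms, and a real implementation would still need to handle factor variables and the other estimators mentioned:

```stata
* Sketch: run -test- only on regressors that were not omitted
local names : colnames e(b)
local keep ""
foreach term of local names {
    _ms_parse_parts `term'
    if !r(omit) & "`term'" != "_cons" {
        local keep `keep' `term'
    }
}
testparm `keep'
if r(drop) {
    display as error "warning: constraints dropped; F-stat may be unreliable"
}
```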
from reghdfe.
Running -testparm- on non-omitted covariates seems like the right thing to do. The only reason to not set the F as missing would be if you also reported the dropped covariate(s), right? That could get out of hand, so it seems better to set it to missing. I don't know all the implications, though.
from reghdfe.
This was partly fixed with c74bd59 so I'm closing the issue, but feel free to reopen it if needed.
For the record, what the commit does is:
- test x1 x2 ... will now exclude variables omitted in the -regress- stage.
- If -test- reports r(drop), a warning in red is given that the F-stats are unreliable.
- I thought about either converting e(F) to missing or raising an error, but I need to think about it a bit more. For instance, are there cases where r(drop) occurs but the F-stats are reliable, or can be made reliable?
from reghdfe.
OK. How can I determine whether that warning was raised when I run reghdfe in non-interactive mode? I would like to replace the wrong F statistics myself in those cases.
It seems to me like the conservative thing to do is convert e(F) to missing, as a missing number is always preferable to a wrong one, but I'm sure you're thinking about this harder than I am.
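In the meantime, a caller can detect the warning case itself by repeating the joint test on the stored estimates and checking r(drop) directly. A sketch (assumes the same varlist can be re-tested after estimation):

```stata
* Sketch: after reghdfe, repeat the joint test and check r(drop) yourself
quietly reghdfe y x*, absorb(fe) vce(cluster clustervar)
quietly testparm x*
if r(drop) {
    * the warning case: treat the reported F as unreliable
    scalar F_safe = .
}
else {
    scalar F_safe = e(F)
}
```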
from reghdfe.
OK, just pushed a weird rebase that should fix the typo in the commit; hopefully I didn't mess up anyone's private repo.
About the F-test thing... yeah, I agree a simple approach would be to set it as missing. Anyone who wants it later can just recover it with the -test- command.
Speaking of which, maybe we could replace the Wald test (what -test- uses) with an LR test:
"The likelihood ratio test, on the other hand ... is computationally more demanding, but also provides the asymptotically more powerful and reliable test"
I already have all the inputs for an LR test, so I could just replace it, or maybe add it as an alternative that would not suffer from the problems the Wald test has...
from reghdfe.
Never mind, LR tests are not compatible with robust SEs, so the Wald test will have to do...
from reghdfe.
I'd like to reopen this issue to keep it on your radar, since I don't think it has been resolved. If you declare it "wontfix" I'll be happy to take that as a resolution, but in the meantime there are loose threads.
Specifically, your comments include:
* I thought about either converting e(F) to missing or raising an error, but I need to think about it a bit more.
* For instance, are there cases where r(drop) occurs but the FStats are reliable, or can be made reliable?
* About the Ftest thing.. yeah I agree a simple thing would be to set it as missing. Anyone that wants it later can just recover it with the -test- command.
Again, IMHO, -areg- has the desirable behavior.
from reghdfe.
Oh wait, I just saw bf218f3. Whoops. Disregard. Time to update reghdfe. I didn't get an email notification of that commit.
from reghdfe.
GitHub doesn't let me auto-close issues from commits unless the commit goes to the master branch, so yeah, that leads to a bit of confusion.
Thus, I'm thinking about pushing the dev branch (v3) into master (v2), which would also give it more eyeballs. The only missing pieces are rewriting hdfe.ado and updating the help files; hopefully after a month or so it will be stable enough for SSC. (It's gotten to the point that even minor improvements to the DoF computations lead to emails from authors/referees wondering why they can't perfectly replicate past regressions.)
from reghdfe.
Indeed, bf218f3 seems lovely so far. Thank you!
from reghdfe.