Comments (6)
The speed gains are not there for me, so I can't assume it will be faster in every version of Stata.
I have an internal check to see if the data is already sorted for hashsort. Basically a loop of calls to the comparator:
    for (i = start + 1; i < end; i++)
        if ( comp(i - 1, i) > 0 ) return (0);
    return (1);
So if the previous element is greater, it is not sorted. If every element satisfies i - 1 <= i, then it is sorted. I think I can modify it for isid:
    for (i = start + 1; i < end; i++)
        if ( (rc = comp(i, i - 1)) <= 0 ) return (rc);
    return (1);
If 1 then this is an id because every element is such that i > i - 1; if 0 then this is not an id because there exists an element such that i == i - 1. If -1 then this is not sorted and I can proceed with the hash normally.
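To make the three return values concrete, here is a minimal self-contained sketch of that check. It assumes a single integer key column and a three-way comparator; the names comp_rows and check_sorted_id are hypothetical and are not taken from the gtools source, where the comparator would handle multiple variables and types.

```c
#include <stddef.h>

/* Hypothetical three-way comparator on one integer key column:
 * returns 1 if keys[a] > keys[b], 0 if equal, -1 if keys[a] < keys[b]. */
static int comp_rows(const long *keys, size_t a, size_t b)
{
    return (keys[a] > keys[b]) - (keys[a] < keys[b]);
}

/* Scan adjacent pairs in keys[start..end).
 * Returns  1 if every row is strictly greater than the previous one
 *            (sorted with no duplicates, so the key is an id),
 *          0 if the first non-increasing pair is a duplicate
 *            (so the key is not an id),
 *         -1 if the first non-increasing pair is out of order
 *            (not sorted; fall back to the hash-based check). */
static int check_sorted_id(const long *keys, size_t start, size_t end)
{
    int rc;
    for (size_t i = start + 1; i < end; i++) {
        if ((rc = comp_rows(keys, i, i - 1)) <= 0) return rc;
    }
    return 1;
}
```

Returning at the first non-increasing pair is safe for the 0 case: an adjacent duplicate rules out an id whether or not the rest of the data is sorted.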
from stata-gtools.
I just tried a version of this on Linux and got a 5x speedup. I'll write tests to make sure I'm not misidentifying whether a set of variables is an id.
Cool! Let me know if you want me to run any benchmarks.
I haven't pushed yet. I need to write tests to make sure I haven't mis-coded anything. I'll write those tomorrow and let you know, so you can see if the speedup holds on your end.
I have added the changes. Let me know how it benchmarks.
Yep, much faster! gisid is now at half the speed of the gen + assert trick (which is 3x faster than gisid on the master branch, which is 4x faster than fisid, which is 5-10x faster than isid).
In the past I rarely used isid because it was slow, so the new way of doing it is a game changer in my opinion, as I can check almost instantly whether I have unique IDs or not.