Resources

Stata programs (snippets)

These are a few programs that I have written for Stata. They come without instructions, as I believe most of them are fairly self-explanatory to the average Stata user. To use them, either copy and paste the code into your .do-file, or save the code in an .ado-file named after the program (for example fastmax.ado) and put it in your ado folder. If you don’t know where that folder is located, type “sysdir” in your Stata console.
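As a rough sketch of the installation step described above: `sysdir` and `c(sysdir_personal)` are built into Stata, while the file name fastmax.ado is only an assumption here (an .ado-file is normally named after the program it defines).

```stata
* List Stata's system directories; PERSONAL is the usual place for your own ado-files
sysdir

* Print only the PERSONAL directory
display "`c(sysdir_personal)'"

* After saving fastmax.ado in that folder, check that Stata can find it.
* capture noisily shows the error message instead of stopping if it is not there yet.
capture noisily which fastmax
```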

FAST MAX  : Equivalent to the max() function of the egen command, but speeds up the process by a factor of 20.

/*           FASTMAX PROGRAM
Creator:     Jonas Cederlöf
Date:        March 2017
Contact:     jonas.cederlof@ne.su.se
Description: The program draws on the "fegen" package by Sergio Correia.
             The purpose of the program is to speed up the max function
             in the much slower egen command.
*/

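program define fastmax, rclass
    * name() is the variable to create; by() optionally defines the groups
    syntax varname(numeric) [if] [in] , [by(varlist)] name(string)

    confirm new variable `name'
    * Honor [if] [in]; novarlist keeps observations with missing values in the sample
    marksample touse, novarlist

    tempvar maxvar
    qui clonevar `maxvar' = `varlist' if `touse'

    if "`by'" != "" {
        * Running maximum within each group; max() ignores missing values
        bys `by' : replace `maxvar' = max(`maxvar'[_n-1], `maxvar')
        * The last observation of each group now holds the group maximum
        bys `by' : gen `name' = `maxvar'[_N] if `touse'
    }
    else {
        replace `maxvar' = max(`maxvar'[_n-1], `maxvar')
        gen `name' = `maxvar'[_N] if `touse'
    }
end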
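The running-maximum idea behind the program can be checked on its own, without installing anything. The following is a minimal sketch on Stata's bundled auto dataset; the variable names max_egen, runmax and max_fast are made up for illustration.

```stata
sysuse auto, clear

* Reference result from egen
egen max_egen = max(price), by(foreign)

* Same result via a running maximum within each group
clonevar runmax = price
bysort foreign : replace runmax = max(runmax[_n-1], runmax)
bysort foreign : gen max_fast = runmax[_N]

* Stops with an error if the two approaches ever disagree
assert max_fast == max_egen
```

With the program above loaded, the corresponding call would be along the lines of `fastmax price, by(foreign) name(max_price)`.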

GSAMPLE : Sampling by group (a command that is completely lacking in Stata)

* Created by     : Jonas Cederlöf
* Date           : February 2017
* Contact        : jonas.cederlof@ne.su.se
* Description    : Random sampling by group variable(s). Each group is kept in
*                  full with probability keep(), so keep(0.1) retains roughly
*                  10% of the population.


capture program drop gsample
program define gsample , rclass
    * keep() is the sampling fraction, e.g. keep(0.1) keeps about 10% of the groups
    * [if] [in] are accepted but currently not used
    syntax varlist [if] [in] , keep(numlist max=1 >0 <=1)

    qui count
    local xN = r(N)

    tempvar x_rand

    * One uniform draw per group, stored on the group's first observation ...
    qui bys `varlist' : gen double `x_rand' = runiform() if _n == 1
    * ... and then copied to every observation in that group
    qui bys `varlist' : replace `x_rand' = `x_rand'[1]

    local temp = `keep'*100
    qui keep if `x_rand' < `keep'
    qui count
    local xnewN = r(N)
    display "You have sampled about `temp'% of the population by the variable(s) `varlist'."
    display "Number of remaining observations is `xnewN' out of the original `xN'."
end
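
The sampling logic can be tried out on made-up data to see that groups really are kept or dropped as a whole. This is only a sketch under stated assumptions: the data set, the seed, the 0.25 fraction and the variable names id, u, size_before and size_after are all invented for illustration.

```stata
clear
set seed 12345
set obs 1000

* 100 hypothetical groups of 10 observations each
gen id = ceil(_n/10)
bysort id : gen size_before = _N

* One uniform draw per group, copied to all members of the group
bysort id : gen double u = runiform() if _n == 1
bysort id : replace u = u[1]

* Keep each group with probability 0.25
keep if u < 0.25

* Every surviving group must still have all of its observations
bysort id : gen size_after = _N
assert size_after == size_before
```

With the program above loaded, the same step on real data would be a call such as `gsample id, keep(0.25)`.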