*** Translation between Stata format data and plain text files used in GEODE processor.
*********************************************.

***** File locations :.

* The name of the user's original data file :.
global file1 "c:\geode\workshops\stir07\demos\data\lfs_2002extract.dta"

* The name of the plain text file used for the GEODE processor :.
global file2 "c:\temp\lfs_input.dat"

* The name of the plain text file produced by GEODE processor :.
global file3 "c:\temp\lfs_output.dat"

* The name given to the final Stata file produced by this exercise :.
global file4 "c:\geode\workshops\stir07\demos\data\lfs_2002extract_v2.dta"

********************************************.

*****************.
** Step (1) Convert the original Stata file into plain text
**    (with variable names in first row).
*****************.

use "$file1", clear
outsheet using "$file2", nolabel replace

*****************.
** Step (2) {Run the GEODE matching procedure on the plain text file}.
*****************.

* {no Stata contribution}.
* {GEODE portal reads in file2, and produces file3}.

*****************.
** Step (3) Read the derived plain text file and convert it into Stata.
*****************.

insheet using "$file3", clear
save "$file4", replace

********************************************.

** Extension :.

** Note that translating out to plain text format then reading back in will have the
*   effect of losing 'data dictionary' information from the original Stata data file
*   (i.e. variable labels, value labels, missing value declarations, file notes, etc).

** There are a few different ways to prevent this happening, though all involve taking
*   additional steps in the analysis process.

** We recommend the following, which involves extracting a subset from your data
*   and running the GEODE procedure only on that subset.
** (This example involves a file where there are two key identifier variables, called
*   soc2km and ukempst; in other examples other names would be used, and there may be only
*   one identifier variable needed).

** Further file location declaration:.
* A temporary file name :.
global file5 "c:\temp\part1.dta"

** Define linking variables :.
global var1 "soc2km"
global var2 "ukempst"

use "$file1", clear
* Create a case identifier so that the GEODE output can be merged back onto the full file.
gen long caseid = _n
sort caseid
* Keep a copy of the full data (with labels etc.) in the temporary file.
save "$file5", replace
gen occ1 = $var1
gen occ2 = $var2

* Stage (1).
keep caseid occ1 occ2
outsheet using "$file2", nolabel replace

* {Stage (2)}.
* {GEODE portal reads in file2, and produces file3; file3 must still contain caseid}.

* Stage (3).
insheet using "$file3", clear
sort caseid
* Old-style merge syntax; in Stata 11 or later the equivalent form is: merge 1:1 caseid using.
merge caseid using "$file5"
drop _merge
save "$file4", replace

**************************************************************.
** EOF.