This code builds on the original script provided by Nick Hillman, available here. It adds a little code that uses Stata's recently added frames feature to append files together, deletes some unnecessary files to save disk space, and updates the loops to process newer data files.
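As a rough illustration of the frames-based append described above, here is a minimal sketch. It is not necessarily how this script does it. The file names (`scorecard_2019.csv` and so on), the year range, the frame names `combined` and `yearly`, and the variable `collection_year` are all hypothetical placeholders, and frames require Stata 16 or later. The real Scorecard files may also need type cleaning before they will append without errors.

```stata
* Sketch only: file names, year range, and frame names are hypothetical.
* Requires Stata 16 or later (frames).

frame create combined                  // empty frame that accumulates all years

forvalues y = 2019/2021 {
    tempfile chunk                     // scratch file for this year's data
    frame create yearly                // working frame for one year
    frame yearly {
        import delimited "scorecard_`y'.csv", clear
        generate collection_year = `y' // record which collection year this is
        save "`chunk'"
    }
    frame combined: append using "`chunk'"
    frame drop yearly                  // free the working frame before the next pass
}

frame change combined                  // make the stacked data the current frame
count
```

Doing each import in its own frame keeps the accumulated data in memory the whole time, so the loop never has to reload the combined dataset between years.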
Note: The Scorecard data is organized by collection year, but a collection year maps to different cohorts of students depending on the variable you are looking at. Take earnings as an example. The Scorecard reports average earnings at several time horizons, so the 5- and 7-year earnings estimates collected in a given year are estimated using different cohorts of students. Many analyses follow a single cohort across several variables, so the analyst must use the cohort maps provided by the Department of Education to make sure that every variable refers to the same cohort. This data intake code does not reorganize the data by cohort year; the output stays organized by collection year, so users need to do the cohort mapping themselves.