Proc Sort Options in SAS Cheat Sheet (DRAFT) by

Sorting data is a resource-intensive operation, so using PROC SORT efficiently can save both time and computing resources. PROC SORT has a number of options that control not only the performance and capabilities of the procedure but also the contents of the resulting data set.

This is a draft cheat sheet. It is a work in progress and is not finished yet.

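NODUPRECS

Removes duplicate observations by comparing all variable values of each observation with those of the previous observation written to the output data set. Only duplicates that are adjacent after sorting are removed, so identical records that the BY variables do not bring together can survive.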
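A minimal sketch of removing exact duplicates. The data set WORK.VISITS and its variables are hypothetical and exist only for illustration. Sorting BY _ALL_ is one way to make sure identical records end up next to each other.

```sas
/* Hypothetical demo data: the first and third rows match in every variable */
data work.visits;
   input id visit $;
   datalines;
1 A
2 B
1 A
;
run;

/* BY _ALL_ sorts on every variable, so exact duplicates become adjacent */
/* and NODUPRECS can drop the repeated row                              */
proc sort data=work.visits out=work.visits_nodup noduprecs;
   by _all_;
run;
```

With this input, WORK.VISITS_NODUP keeps one copy of the repeated row alongside the row for id 2.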

NODUPKEY

The NODUPKEY option checks for and eliminates observations with duplicate BY values, keeping only the first observation in each BY group.

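DUPOUT

DUPOUT= SAS-data-set specifies the output data set that receives the duplicate observations removed by NODUPKEY or NODUPRECS.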
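As a rough illustration, NODUPKEY and DUPOUT= can be combined so that the discarded observations are kept for inspection rather than lost. The output data set names below are assumptions chosen for this sketch.

```sas
/* Keep the first observation per AGE in WORK.CLASS_FIRST;        */
/* every later observation with the same AGE goes to CLASS_DUPS   */
proc sort data=sashelp.class
          out=work.class_first
          nodupkey
          dupout=work.class_dups;
   by age;
run;
```

Writing to OUT= leaves SASHELP.CLASS untouched, which makes it easy to compare the two result sets against the original.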

NOUNIQUEKEY and UNIQUEOUT

The NOUNIQUEKEY option checks for and eliminates from the output data set observations that have a unique sort key.
The UNIQUEOUT= option can be used with the NOUNIQUEKEY option. UNIQUEOUT= SAS-data-set specifies the output data set that receives the observations eliminated by NOUNIQUEKEY, that is, those whose sort key is unique.
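A short sketch of splitting a data set into shared-key and unique-key observations. The output data set names are hypothetical.

```sas
/* WORK.CLASS_SHARED receives observations whose AGE occurs more than once; */
/* WORK.CLASS_UNIQUE receives observations whose AGE occurs exactly once    */
proc sort data=sashelp.class
          out=work.class_shared
          nouniquekey
          uniqueout=work.class_unique;
   by age;
run;
```

This is the mirror image of NODUPKEY with DUPOUT=: the main output holds every member of the repeated BY groups, and the side output holds the singletons.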
 

OVERWRITE

The OVERWRITE option deletes the input data set before the replacement output data set of the same name is populated with observations.

Example

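/* Make a working copy so the SASHELP original is not touched */
data class;
   set sashelp.class;
run;

/* Sort in place; OVERWRITE deletes WORK.CLASS before the sorted copy is written */
proc sort data=class overwrite;
   by age;
run;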
The OVERWRITE option has no effect when an OUT= data set is specified.

PRESORTED

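The PRESORTED option checks the input data set to determine whether its observations are already in BY-variable order before any sorting is done. If they are, the sort is skipped, so the option is most useful when you know or strongly suspect that the data are already in order.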
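A minimal sketch, assuming SASHELP.CLASS is stored in NAME order (which is how it normally ships). The output data set name is hypothetical.

```sas
/* PRESORTED verifies the sequence first; if the rows are already in */
/* NAME order, SAS can skip the sort and simply copy them to OUT=    */
proc sort data=sashelp.class out=work.class_by_name presorted;
   by name;
run;
```

If the check fails, PROC SORT falls back to an ordinary sort, so the result is the same either way; only the cost differs.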

Relative Order of Observations in Each BY Group

EQUALS
NOEQUALS
The EQUALS option maintains, in the output data set, the relative order that observations with identical BY variable values had in the input data set.
NOEQUALS does not necessarily preserve this order in the output data set.
The SORTEQUALS system option specifies that observations with identical BY variable values are to retain the same relative positions in the output data set as in the input data set.
NOSORTEQUALS specifies that no resources should be used to control the order of observations in the output data set that have the same value for a BY variable.
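To make the effect of EQUALS concrete, here is a small sketch. The data set WORK.SCORES and the variable SEQ, which records the original row position, are hypothetical.

```sas
/* SEQ records the input order so that ties within TEAM can be checked */
data work.scores;
   input team $ seq;
   datalines;
B 1
A 2
B 3
A 4
;
run;

/* EQUALS keeps tied rows in input order: A 2, A 4, B 1, B 3 */
proc sort data=work.scores out=work.scores_stable equals;
   by team;
run;
```

With NOEQUALS in place of EQUALS, the rows would still be grouped by TEAM, but the order of SEQ inside each team would not be guaranteed.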
 

Sorting Orders

Numeric Variables
For numeric variables, the smallest-to-largest comparison sequence is:
1. SAS missing values (shown as a period or special missing value)
2. negative numeric values
3. zero
4. positive numeric values.
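The sequence above can be checked with a small sketch. The data set WORK.NUMS and the variable X are hypothetical; the values are assigned in a DATA step so that the special missing value .A is created explicitly.

```sas
/* One value of each kind: positive, missing, negative, zero, special missing */
data work.nums;
   x = 5;  output;
   x = .;  output;
   x = -3; output;
   x = 0;  output;
   x = .A; output;
run;

/* Ascending sort: missing values first (. before .A), then -3, 0, 5 */
proc sort data=work.nums out=work.nums_sorted;
   by x;
run;
```

Adding DESCENDING before X in the BY statement reverses the sequence, which puts the missing values last.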