I'm not sure if the title does this question justice, but here it goes:

I have three datasets: Forecasts1, Forecasts2 and Forecasts3. They are all time series data composed of a date variable and variables r1 through r241.

For a given r variable (let's just use r1-r3, and only Forecasts1 and Forecasts2 for now), each dataset has only one row where the value isn't null, and it is a different row in each dataset. I need to be able to combine them such that r1-r3 contain all the non-null values, without creating duplicate date rows to hold the null values. So ideally the finished product would look like this:

I've tried various types of merges and sets, but I keep getting duplicate date rows.

When preparing data for analysis, you may need to "filter out" cases (rows) from your dataset, or you may need to divide a dataset into separate pieces. In this tutorial, we use the following terms to refer to these two tasks:

A subset is a selection of cases taken from a dataset that match certain criteria. You can also think of this as "filtering" a dataset so that only some cases are included. When subsetting a dataset, you will only have a single new dataset as a result.

A split acts as a partition of a dataset: it separates the cases in a dataset into two or more new datasets. When splitting a dataset, you will have two or more datasets as a result.

Both subsetting and splitting are performed within a data step, and both make use of conditional logic. Both processes create new datasets by pulling information out of an existing dataset based on certain criteria. The difference between the two processes is in how the cases are selected.

Note: A related task is to select a subset of variables (columns) from a dataset. For instructions on how to drop or keep variables from a dataset, see our Data Step tutorial.

Subsets can be created using either inclusion or exclusion criteria. Inclusion and exclusion criteria are both statements of conditional logic that are based on one or more variables, and one or more values of those variables.

Creating a subset that contains only records with a certain value: In this case, your subset will keep the records that meet the criteria you specify. The criteria for keeping an observation are called the inclusion criteria.

DATA New-Dataset-Name (OPTIONS)

Creating a subset that contains only records without a certain value: In this case, your subset will be all of the cases that remain after dropping observations with "disqualifying" values. The "disqualifying" values you specify are called the exclusion criteria.

The inclusion or exclusion criteria appear after the IF statement.

Example - Delete cases with a specific value

Let's create a subset of the sample data that doesn't contain any freshmen. To do this, we can use the DELETE keyword to remove observations where Rank = 1, which is the indicator value for freshmen. The resulting subset has 288 observations.

Example - Extract cases matching a logical condition

Conditional logic can get very complex, particularly when the criteria are based on multiple variables and/or multiple values. (Can you name what groups of students are included in this subset? Hint: there are four different groups.)
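As a rough illustration of the exclusion and inclusion criteria described above, the following SAS sketch works on a small made-up dataset. The dataset names (sample, no_freshmen, upperclass), the id variable, and the Rank codes 2 through 4 are assumptions made for this example; only the variable Rank and its freshman code 1 come from the tutorial text. The first data step uses IF ... THEN DELETE, and the second uses a subsetting IF.

```sas
/* Hypothetical stand-in for the tutorial's sample data.          */
/* Only Rank (with 1 = freshman) is taken from the text; id and   */
/* the other Rank codes are assumptions for illustration.         */
data sample;
    input id Rank;
    datalines;
1 1
2 2
3 3
4 4
5 1
6 .
;
run;

/* Exclusion criteria: drop observations where Rank = 1.          */
/* Rows with a missing Rank are kept, because the comparison      */
/* Rank = 1 is false for a missing value.                         */
data no_freshmen;
    set sample;
    if Rank = 1 then delete;
run;

/* Inclusion criteria: a subsetting IF keeps only the rows for    */
/* which the condition is true, so missing Rank is dropped here.  */
data upperclass;
    set sample;
    if Rank in (2, 3, 4);
run;
```

The two steps are not equivalent when Rank has missing values: the DELETE version keeps them and the subsetting IF version drops them, which is worth checking whenever a subset has more or fewer rows than expected.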