c. 3
d. 4
13. When creating a table using PROC SQL, which of the following clauses would select only rows that have a value greater than 10 for the column called Age?
a. WHERE Age > 10
b. IF Age > 10
c. SELECT Age > 10
d. None of the above
14. Given the following program and SAS data set ANIMALS, what will be the value of the variable DogYears for the second observation in the resulting SAS data set called DOGS?
ANIMALS
Name | Type | Breed | Age |
Mina | Canine | German Shepherd | 5 |
Bailey | Feline | Norwegian Forest | 9 |
Sammy | Canine | Shetland Sheepdog | 10 |
Taco | Canine | Terrier | 14 |
DATA dogs;
   SET animals;
   DogYears = Age * 7;
   IF Type = 'Canine' THEN OUTPUT;
RUN;
a. .
b. 35
c. 63
d. 70
15. How many observations will be produced with the following program?
DATA new;
   DO p = 1 TO 5;
      OUTPUT;
   END;
RUN;
a. 0
b. 1
c. 5
d. 6
16. Suppose that the YEARCUTOFF= option is set to 1950, and your raw data file has the following date that is read using the MMDDYY8. informat and then printed using the MMDDYY10. format. How would the resulting date appear in the output?
----+----1----+----2
01/01/1920
a. 01/01/1920
b. 01/01/1919
c. 01/01/2019
d. 01/01/2020
17. What is the SAS date value that corresponds to December 25, 1959?
a. -25
b. -7
c. 25
d. 359
18. What will be the value of Quarter in the following statement?
Quarter = QTR(MDY(04,05,2063));
a. 1
b. 2
c. 3
d. 4
19. Which type of DATA step statement can be used to initialize a variable to a specified value?
a. sum
b. RETAIN
c. Both of the above
d. Neither of the above
20. Which of the following is considered a sum statement in the DATA step?
a. X = A + B;
b. X = SUM(A,B);
c. A + B;
d. All of the above
21. The raw data file called Class.dat contains three test scores for each of two students in a class. If you submit the following SAS program, what will be the value of the variable represented by p(i) for the first observation after the second time through the iterative DO group?
----+----1----+----2----+----3
222 Jimmy 95 85 75
333 Ulric 90 80 70
DATA score;
   INFILE 'c:\MyRawData\Class.dat';
   INPUT ID Name $ Test1 Test2 Test3;
   ARRAY t(3) Test1 - Test3;
   ARRAY p(3) Prop1 - Prop3;
   DO i = 1 TO 3;
      p(i) = t(i) / 100;
   END;
   Total = SUM(Test1 - Test3);
RUN;
a. 0.85
b. 0.80
c. 0.75
d. 0.70
22. Referring to the preceding raw data and SAS program, what will be the value of Total for the second observation?
a. 255
b. 240
c. 160
d. 20
Short Answer
23. Discuss a situation where it would not be a good idea to overwrite a permanent SAS data set by specifying the same name in the DATA and SET statements.
24. Describe why you would not use a SET statement and an INFILE statement to refer to the same data in a DATA step.
25. Explain why the following assignment statement is incorrect for creating a numeric variable X that has a missing value.
X = '.';
26. Is there a difference between calculating the mean of three variables using a function compared to calculating the mean using an assignment statement as shown in the following code? Explain your answer.
Avg1 = MEAN(X1,X2,X3);
Avg2 = (X1 + X2 + X3) / 3;
27. Would there be any advantage to using the UPCASE, LOWCASE, or PROPCASE functions when working with messy character data? Explain your answer.
28. An elementary school is holding a public fun run for children and adults as a fundraiser. Runners will start at different times based on age, and must be at least four years old. The following code classifies runners into three groups. Rewrite the code so that once a runner is assigned to a group, SAS will skip the rest of the statements. In addition, make sure that anyone who does not fit into one of the age groups or has a missing value for age is assigned to a fourth group of entrants who require follow-up.
** Assign runners to groups 1-3 based on age;
IF 4 <= Age < 9 THEN Group = 1;
IF 9 <= Age < 13 THEN Group = 2;
IF Age >= 13 THEN Group = 3;
29. The following portion of code was used to classify patients into stroke risk groups based on their smoking status and blood pressure measurements. Rewrite the code so that it is less repetitive and will keep SAS from checking every condition for every observation. In addition, make sure that patients who fall into more than one group, based on their systolic blood pressure and diastolic blood pressure, will be placed in the group with the highest risk. Add code that will create an unknown risk group for patients with any data that do not fall into the specified ranges.
** for smokers;
IF Smoke > 0 AND (0 < Sbp < 120 AND 0 < Dbp < 80)
   THEN Risk = 'Medium';
IF Smoke > 0 AND (120 <= Sbp < 140 OR
   80 <= Dbp < 90)
   THEN Risk = 'High';
IF Smoke > 0 AND (Sbp >= 140 OR Dbp >= 90)
   THEN Risk = 'Severe';
** for non-smokers;
IF Smoke = 0 AND (0 < Sbp < 120 AND 0 < Dbp <