In this interview, we speak with Ronald Fehd, who holds a B.S. in computer science. He began contributing to SAS-L in the spring of 1997 and, as of 2017, has published 40 papers on SAS. In the interview, Ronald Fehd shares tips and strategies for becoming a top SAS programmer, including the Pareto (80%/20%) rule and having a style guide. He also recommends one of his favorite tools: keeping a daily job diary. In his words, you need to get good at estimating task time, and a daily job diary is one way to see how much time you spend programming versus attending meetings or writing reports; it also helps you stay focused. He describes the process he finds most useful, his autoexec, and discusses the macro language and the procedures he relies on: PROC CONTENTS, PROC FREQ, and PROC SUMMARY. Ronald Fehd advises that, to be a good macro programmer, you need to know your assets: like any language, the macro language has its strengths.
I have a B.S. in computer science. I graduated in 1986, got my first job that year, and learned SAS at that university job. A couple of years later I moved on to a government research job, and in 1989 I attended my first SAS Users Group International conference, SUGI 14, in San Francisco. In 1996 I finally got around to putting together some papers, three papers at SESUG, and in the spring of 1997 I began contributing to SAS-L; that was my contribution, where I started working with my peers to learn SAS. I have been reading SAS-L since I came to CDC. In 2001 I was recognized as the most valuable contributor to SAS-L; in 2007 I began contributing to the sasCommunity.org wiki. In 2012 I retired, and I’ve been writing a couple of papers or so since; this year, 2017, I published my 40th paper.
Can you propose 3 valuable tips or strategies that are necessary to become a top SAS programmer?
OK, I have half a dozen rules; since you’re going to edit this, you can work around that. The Pareto rule, the 80%/20% rule, is very important to me. I have some knowledge of quality assurance, and I think it’s very important for programmers to have a style guide. Over the years I’ve studied database design through reading papers at SAS conferences; there was a big push to learn SQL in 2000-2001. In the last couple of years I’ve gotten good at using my operating system, which I consider very important.
Bricolage, creative tinkering: that’s where my contributions to SAS-L and thinking through problems became very important to me. I recommend keeping a daily job diary, which I will get to in a minute, and I have a paradigm, list processing, which has helped me and my overview of programming. I have pretty much a page on each one of these items, if I can separate my pages here. I was surprised to see early on that I spent 80% of my time cleaning my data. I like this 80%/20% split; they’re big numbers, you go 20, 40, 60, 80, so you can remember it. Programs for me consist of 20% data structure and 80% algorithm. Whenever I look at my daily job diary, I see that I spend 80% of my time fixing that small piece, the 20% that is data structure. I also spend a lot of time after I write a program on the test programs and the documentation, which I think is very important.
I learned a little bit about quality assurance. My most important guy here is Phil Crosby, the author of the “zero defects” idea; that’s my standard, and writing a program means making sure that it runs through all the unit tests and the integration tests. There are several other people: Pareto I have already mentioned, we have the Shewhart procedures, and you need to know who Deming and Juran were, and Six Sigma and total quality management. All these things contribute to my understanding of what it takes to write a program that is reusable, which is one of my keywords. That’s why I’m called the macro maven: I want to write reusable programs.
The most important thing for me as a programmer is to have a style guide. The first of its two most important aspects is naming conventions. Every time I have not paid attention to naming conventions at the beginning of a project, it has involved a lot of rework: when I finally decide what the naming conventions are, I have to go back and rename variables or data sets, things like that. The second aspect of the style guide is white space: indentation, so that the code is easy not only to read but to search, so you can find what you’re looking for when you go back to fix a program. This is one of the things that I emphasize in my style guide. There are many other aspects, but I’ll skip them.
In my knowledge of database design, it’s very important to me to understand, when I think about a problem, what my data is supposed to be doing in the larger project. You know, there are five types of database tables: we have lookup tables, we have transaction tables, we have inventories (that’s why we call the rows of our data sets observations, because they’re actually inventories), and there are snapshots, you know, the looking at and recording of an event.
I said earlier that learning SQL was very important for me. There are two aspects that people who know SQL don’t realize when they come to SAS: the SQL dictionaries, and the interface to the macro language with select into a macro variable. It was a very important piece for me to learn the SQL select and to write a good hands-on workshop about it.
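The two SQL features mentioned here can be combined in a few lines. The following is a minimal sketch, not taken from the interview: it assumes the sample data set sashelp.class that ships with SAS, and the macro variable name varlist is made up for illustration.

```sas
/* Read the SQL dictionary and store the result in a macro variable. */
/* Assumption: sashelp.class is available; varlist is a hypothetical name. */
proc sql noprint;
  select name
    into :varlist separated by ' '   /* one space-delimited list          */
    from dictionary.columns          /* metadata about every column       */
   where libname = 'SASHELP'         /* dictionary values are upper case  */
     and memname = 'CLASS';
quit;

/* Write the list of variable names to the log */
%put NOTE: varlist=&varlist;
```

The resulting list can then be used anywhere a list of variable names is valid, for example in a var or keep statement.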
One thing is the configuration files that SAS uses, and another is the user-customizable autoexec. So in my knowledge of the operating system, it’s very important to understand the SAS configuration files and the user-written autoexec. I wrote a paper, an autoexec companion, a couple of years ago, which covered all the things that you need to know to set up access to the reusable programs and macros that you’ve written.
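As an illustration of what such an autoexec might contain, here is a minimal sketch. The folder paths and the filerefs proj_mac and site_mac are hypothetical, chosen only for this example; they are not the setup described in the interview or in the paper.

```sas
/* autoexec.sas: a sketch; all paths and filerefs are hypothetical */

/* Folder of macros that belong to the current project */
filename proj_mac '.';

/* Central folder of reusable macros shared by all projects */
filename site_mac '/home/me/sas/macros';

/* Search the project folder first, then the central folder, */
/* then the autocall library supplied by SAS (fileref sasautos) */
options mautosource
        sasautos = (proj_mac site_mac sasautos);
```

With this in place, a macro stored as a file named after the macro (for example nobs.sas containing %macro nobs) in either folder can be called without a %include statement.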
Batch processing is very important once you understand how to use your operating system. I rely on the discipline of test-driven development, and I can’t do that unless I’m running my programs in batch.
Many people ask me why I don’t open up SAS and use its editor. I have a professional text editor; when I run a program, the editor opens the log for me and refreshes it. So I only open up SAS to look at an actual SAS session maybe once a month, if I need something quick that I don’t want to write a small program for.
Bricolage! Creative tinkering! I would like to work for that great search engine in the sky and spend 20% of my time on self-education, and also take the time to share that with my peers. So I’ve gone back and forth on SAS-L, sasCommunity, and the SAS communities site, sharing the knowledge that I’ve learned in my own self-education.
I keep a daily job diary. This is much more important to me now that I’m retired; I wish I had spent more time doing this when I was actually working, because it would have been useful in my performance reviews. You know: what were you doing that week? Well, I was running a thousand reports for you. You need to get good at estimating task time, and your daily job diary is one way to see how much time you spend programming versus attending meetings or writing reports. For me now, it helps me stay focused. If I get stuck on a problem and I look at my daily job diary, it’s like: you’ve spent an hour on this, it’s time to move on.
I have spent a lot of time, since I wrote a paper with Art Carpenter called “List Processing Basics,” I think in 2007-2008, developing the concept of list processing, and out of that I’ve come away with two other ideas. The first is the cardinality ratio, and the second, which I laid out in my sea-side paper this last month, is the cardinality types. They are few, many, and unique: variables that have unique values are your identifiers, variables that have many values are your analysis variables, and variables that have few values are your lookup, your categorical variables. Once I know this about a data set, the work of programming goes much faster: I have my by-variables, I have my analysis variables, and when I want to clean the data I know what the identifiers of the data set are.
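The interview does not define the cardinality ratio, so the sketch below rests on an assumption: that it means the number of distinct values of a variable divided by the number of rows in the data set. Under that assumption a unique variable has a ratio of 1 and a categorical variable has a ratio close to 0. The data set sashelp.class and the column aliases are chosen only for illustration.

```sas
/* Sketch: cardinality ratio = distinct values / number of rows. */
/* Assumption: this definition, and the aliases, are illustrative only. */
proc sql;
  select count(*)                        as n_obs,
         count(distinct name)            as n_levels_name,
         count(distinct name) / count(*) as cr_name format=6.3,
         count(distinct sex)             as n_levels_sex,
         count(distinct sex)  / count(*) as cr_sex  format=6.3
    from sashelp.class;
quit;
```

In this sample data every name is distinct, so name behaves like an identifier, while sex has only a few levels and behaves like a categorical, by-variable.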
Which SAS procedure do you find the most useful (do you use most often)?
I’ll kind of lay them out, and I want to differentiate between a process and a procedure. I think your original question was which procedures I use, and I have a process which I run all the time: my autoexec. That is the most important thing to me, since I have a library, and I have to tell SAS where the folder is that contains all the reusable programs and macros that I have. So autoexec is my first process; I run it every time I run a batch program.
My screen name is the macro maven, so I have an interest in the macro language. I’ve been writing macros since the first year that I learned SAS, and that’s now 30 years ago. In the last decade I’ve come to know the %sysfunc macro function, and just in the last couple of years I learned the SCL (I think that is screen control language or source control language) functions that I can use in the macro language. So now I’m writing macros which return only macro code; they don’t write SAS code. That’s an interesting aspect to me, to rise up to this level of writing in the macro language that only writes other macro statements. So I’ve given up on SAS; I’m just writing in the SAS macro language.
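A macro of the kind described here, one that contains only macro statements and returns a value instead of generating SAS steps, might look like the following sketch. The macro name nobs is hypothetical; open, attrn, and close are existing SAS functions called through %sysfunc.

```sas
/* Sketch of a function-style macro: it generates no SAS step,    */
/* it only returns text. The name nobs is made up for the example. */
%macro nobs(data);
  %local dsid nobs rc;
  %let dsid = %sysfunc(open(&data));        /* 0 if the open fails   */
  %if &dsid %then %do;
    %let nobs = %sysfunc(attrn(&dsid, nlobs)); /* logical row count  */
    %let rc   = %sysfunc(close(&dsid));
  %end;
  %else %let nobs = .;                      /* data set not readable */
  &nobs                                     /* returned value; no semicolon */
%mend nobs;

%put NOTE: sashelp.class has %nobs(sashelp.class) observations;
```

Because the macro resolves to a plain value, it can be used inside a %put, a %if condition, or the middle of another statement.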
PROC CONTENTS: as I said earlier, data structure is the key for me when I write a program; when I get the data structure straight, everything flows after that. SQL has a describe table statement, which allows me to write the data structure into my log. Those SQL dictionaries, that’s “sliced bread,” it really is, and when you understand how much information you can get out of the SQL dictionaries, it really rocks your boat. The SQL access into the macro language, select into a macro variable, is very powerful. But it can also be misused: there are a couple of tricks there, and if you’re not aware of them and you use them on big data sets, you’ll learn how to blow up SAS.
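Both ways of inspecting a data structure can be tried on a sample data set. This is a minimal sketch that assumes sashelp.class is available.

```sas
/* Write the data structure to the log as a create table statement */
proc sql;
  describe table sashelp.class;
quit;

/* List the variables and their attributes in creation order */
proc contents data=sashelp.class varnum;
run;
```

The describe table output goes to the log, while PROC CONTENTS writes a report to the output destination.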
PROC FREQ and PROC SUMMARY are the other two procedures; when I do write SAS code, those are the main procedures that I use.
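For readers who have not used these two procedures, here is a minimal sketch on the sample data set sashelp.class. The choice of variables and the output data set name work.class_stats are assumptions made for the example.

```sas
/* Count the levels of a categorical variable, including missing values */
proc freq data=sashelp.class;
  tables sex / missing;
run;

/* Summarize analysis variables within each level of the class variable */
proc summary data=sashelp.class nway;
  class sex;
  var height weight;
  output out=work.class_stats mean= / autoname;  /* height_Mean, weight_Mean */
run;

proc print data=work.class_stats;
run;
```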
Any closing words?
OK, so everybody believes that the main function of SAS macros is to write SAS code. So the first thing you need to be good at in writing a macro is having working SAS code that you want your macro to write. A lot of people, when they begin to learn macro, are confused: they’re not sure whether they’re writing SAS code or macro code, and they go back and forth. To clear up the confusion: I am doing exactly one task, and that is writing SAS code with a macro language. Like any language, it has its strengths: it has access to variables that are local and global, and it has access to all those system-allocated constants; most people don’t know that because they haven’t studied their configuration file. So to be a good macro programmer, you need to know your assets. I tell people that macros occur in three places. One is within the program; that’s not what I’m talking about. I’m talking about the two other places: you store the macro in the folder of the project you’re working on, or you store it in another folder which is available to all your projects. That’s what I said earlier about the autoexec: I’ve set up my autoexec so that every project has access to my centralized macro and other program folders.
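The advice to start from working SAS code can be illustrated in two steps. This is a sketch only; the macro name freq1 and its parameters are hypothetical.

```sas
/* Step 1: working SAS code that you want the macro to write */
proc freq data=sashelp.class;
  tables sex;
run;

/* Step 2: the same code wrapped in a macro, with the parts that vary */
/* replaced by parameters (freq1, data=, and var= are made-up names)  */
%macro freq1(data=, var=);
  proc freq data=&data;
    tables &var;
  run;
%mend freq1;

/* Each call writes one PROC FREQ step */
%freq1(data=sashelp.class, var=sex)
%freq1(data=sashelp.class, var=age)
```

Saved as freq1.sas in a project folder or in a central folder on the autocall path, the macro becomes available to every program that uses that autoexec.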