Validating a Questionnaire
Categories: Qualitative
By Dave Collingridge
I remember years ago walking the halls of the faculty offices at my university asking for help on validating a questionnaire. I repeatedly asked professors, “Can you tell me how to validate the questions in my survey?” The response was usually a polite, “I can’t, but have you tried talking to doctor so and so, he might be able to help.” Doctor so and so couldn’t help either. In fact, no one seemed able to help.
I found it strange that faculty in a psychology department were unable to tell a graduate student how to validate a survey. Aren’t questionnaires one of the most common methods of data collection in the social sciences? They are.
After graduating I was determined to find out how to validate surveys, so I checked out some books from my school's library and purchased others from Amazon. The books contained useful information on how to create questions and what response scales to use, but they lacked start-to-finish instructions on how to validate. So I took little pieces here and there from articles, books, and webpages and compiled them into my own comprehensive approach to validating questionnaires.
I have used this approach to publish questionnaire-based articles. I was also awarded a grant to validate a questionnaire evaluating clinicians' impressions of electronic decision-making tools. I guess journal editors and reviewers think that I know what I am doing, or maybe they are deferring to my "expertise" because, like my university professors, they are not sure what to do. It is kind of strange being called an expert on something that you did not learn from another expert.
Anyway, here is my approach in a nutshell. Nutshell means that I’ve left out a lot of details. Maybe I will post additional blogs addressing each subject. Or maybe I will write a book on how to validate surveys from start to finish, sit at home, and get rich off of 10% royalties on an academic book. Come to think of it, I’d have a better chance at becoming a statistical hero in tights that swooshes out of the sky and helps people power a Poisson regression.
Questionnaire Validation in a Nutshell
- Generally speaking, the first step in validating a survey is to establish face validity. There are two important parts to this process. First, have experts or people who understand your topic read through your questionnaire and evaluate whether the questions effectively capture the topic under investigation. You might have them pretend to fill out the survey while scribbling notes. Second, have a psychometrician (i.e., an expert in questionnaire construction) check your survey for common errors like double-barreled, confusing, and leading questions.
- The second step is to pilot test the survey on a subset of your intended population. Recommendations on sample size for pilot testing vary. Some academicians staunchly support rules like 20 participants per question. Well, if your survey has 30 questions, that means you'll need at least 600 respondents! I think this standard should be relaxed (perhaps it has been). Trust me, it is possible to validate with far fewer participants. I've taught classes where I gave students questionnaires containing a small proportion of somewhat irrelevant questions. After they filled out the form I pointed out which questions were weak; they had no idea. Then we ran the statistics on their responses, and guess what? The analysis revealed that the somewhat irrelevant questions should be dropped. It worked time and time again with about 35 students. (I don't recommend telling a journal reviewer that you only need 35 pilot-testing participants because some guy on Methodspace said so. The more participants the better, but if all you can get are 60 participants, it may be enough, especially if your survey is short [about 8-15 questions].)
- After collecting pilot data, enter the responses into a spreadsheet and clean the data. Here is an important tip: have one person read the values aloud while another enters the data. Having one person both read and enter data is highly prone to error. After entering the data you will want to reverse-code negatively phrased questions. When used sparingly, negatively phrased questions are very useful for checking whether participants filled out your survey in a reckless fashion. If respondents read the questions carefully, their answers to negatively phrased questions should be consistent with their answers to similar positively phrased questions. If they are not consistent, you might consider tossing out that person's responses. It is also wise to check maximum and minimum values for the entire dataset. If you used a 5-point Likert-style scale but find a response of 6, you have probably identified a data entry error. (A small worked sketch of the range check and reverse coding appears just after this list.)
- Identify underlying components using principal components analysis (PCA). Component loadings, or factor loadings as they are sometimes called, tell you which underlying factors your questions are measuring. Questions that measure the same thing should load onto the same factor. Loadings range from -1.0 to 1.0. When grouping questions by their loadings I usually look for values of ±0.60 or higher, although this varies depending on what the rest of the loadings look like. Sometimes there are surprises; occasionally a question will not load well onto any factor. The fun part is determining what the factors represent by looking for common themes in the questions that load onto the same factor. If you identify three factor-themes, you can be reasonably confident that your survey is measuring at least three things. Validity means measuring what you purport to measure, so this step establishes what your survey is really measuring. Finally, questions loading onto the same factor can be aggregated (i.e., combined) and compared during the final data analysis phase. (A word of caution: don't attempt PCA by yourself if you are inexperienced. Have someone skilled in PCA guide you through the process or keep good resources on hand. A brief sketch of extracting loadings appears below.)
- Check the internal consistency of questions loading onto the same factors. This step essentially checks the correlation among questions loading onto the same factor. It is a measure of reliability in that it checks whether the responses are consistent. A standard test of internal consistency is Cronbach's alpha (CA). Cronbach's alpha values range from 0 to 1.0. In most cases the value should be at least 0.70, although values between 0.60 and 0.70 are sometimes considered acceptable. What should you do if you have a low value? A nice feature in some programs is reporting the CA value after removing each question; IBM SPSS calls it "scale if item deleted." You might consider deleting a question if doing so dramatically improves your CA. (As with PCA, seek assistance from a statistician or a good resource if you are new to testing internal consistency. A short worked example appears below.)
- The final step is revising the survey based on information gleaned from the PCA and CA. Even if a question does not load adequately onto a factor, you might retain it because it is important; you can always analyze it separately. If the question is not important, you can remove it from the survey. Similarly, if removing a question greatly improves the CA for a group of questions, you might remove it from its factor group and analyze it separately. If your survey undergoes only minor changes, it is probably ready to go. If there are major changes, you may want to repeat the pilot testing process. Repeat pilot testing is warranted whenever you start with many more questions than are included in the final version (e.g., pilot testing 50 questions and then narrowing the field to 10).
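To make the data-cleaning step a little more concrete, here is a minimal sketch in Python/pandas of the range check and the reverse coding. The file name, the column names, and the choice of which items count as negatively phrased are all made up for illustration; substitute your own.

```python
import pandas as pd

# Hypothetical pilot data: questions q1-q10 on a 5-point Likert scale,
# with q3 and q9 negatively phrased. File and column names are made up.
df = pd.read_csv("pilot_responses.csv")

likert_min, likert_max = 1, 5
question_cols = [c for c in df.columns if c.startswith("q")]
negative_items = ["q3", "q9"]

# Check min/max per question: a 6 on a 5-point scale is almost certainly a typo.
print(df[question_cols].agg(["min", "max"]))
out_of_range = (df[question_cols] < likert_min) | (df[question_cols] > likert_max)
print("Rows with suspect values:", df.index[out_of_range.any(axis=1)].tolist())

# Reverse-code negatively phrased questions: on a 1-5 scale, 1 becomes 5, 2 becomes 4, etc.
df[negative_items] = (likert_min + likert_max) - df[negative_items]
```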
(I strongly recommend running PCA and CA again after completing the formal data collection phase [i.e., after you use your questionnaire to collect “real” data]. You want to make sure that you get the same factor loading patterns.)
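Here is a similarly rough sketch of pulling component loadings out of a PCA with scikit-learn, assuming the cleaned data frame and column names from the sketch above. The number of components to keep and the ±0.60 cutoff are judgment calls, and many analysts would also add a rotation (e.g., varimax), which this sketch leaves out.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumes the cleaned `df` and `question_cols` from the previous sketch (hypothetical names).
X = StandardScaler().fit_transform(df[question_cols])

pca = PCA(n_components=3)   # how many components to keep is a judgment call
pca.fit(X)

# Loadings: how strongly each question relates to each component.
loadings = pd.DataFrame(
    pca.components_.T * np.sqrt(pca.explained_variance_),
    index=question_cols,
    columns=[f"component_{i + 1}" for i in range(pca.n_components_)],
)
print(loadings.round(2))

# Keep only loadings at or above the rough +/-0.60 cutoff to spot the groupings.
print(loadings[loadings.abs() >= 0.60].round(2))
```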
When reporting the results of your study you can state that you used a questionnaire whose face validity was established by experts. You should also mention that it was pilot tested on a subset of participants. Report the results of the PCA and CA analyses. Should you report the results from the pilot testing or from the formal data collection? I think reporting PCA and CA results on the formal data is most useful. When reporting PCA results you might say something like, "Questions 4, 6, 7, 8, and 10 loaded onto the same factor, which we determined represents personal commitment to employer." When reporting CA results you might say something like, "The Cronbach's alpha for questions representing personal commitment to employer was 0.91, indicating excellent internal consistency in the responses."
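And here is a small sketch of Cronbach's alpha computed from its standard formula, using the hypothetical question grouping from the reporting example above. Statistical packages will give you the same number (plus conveniences like SPSS's "scale if item deleted"); the formula version just shows what is being calculated.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha from the standard formula, for items assumed to form one scale."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_score_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_score_variance)

# Hypothetical grouping borrowed from the reporting example above.
commitment_items = df[["q4", "q6", "q7", "q8", "q10"]]
print(f"Cronbach's alpha: {cronbach_alpha(commitment_items):.2f}")

# A rough "scale if item deleted" check: recompute alpha with each question dropped in turn.
for col in commitment_items.columns:
    print(col, "removed:", round(cronbach_alpha(commitment_items.drop(columns=col)), 2))
```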
Summary of Steps to Validate a Questionnaire
- Establish Face Validity
- Pilot test
- Clean Dataset
- Principal Components Analysis
- Cronbach’s Alpha
- Revise (if needed)
- Get a tall glass of your favorite drink, sit back, relax, and let out a guttural laugh celebrating your accomplishment. (OK, not really.)