Data Collection Techniques

Data gathering is the interaction between the software engineer (in this case, a business analyst) and the customers (including users). There are many techniques for gathering data, including interviews, meetings, observations, questionnaires, reviewing software, reviewing internal documents, and reviewing external documents. Read this section to differentiate among these techniques, paying attention to the main advantages and disadvantages of each.

5. Questionnaire

A questionnaire is a paper-based or computer-based form of interview. Questionnaires are used to obtain information from a large number of people. The major advantage of a questionnaire is anonymity, which tends to produce more honest answers than might be obtained through interviews. Also, standardized questions provide reliable data upon which decisions can be based.

Questionnaire items, like interview questions, can be either open-ended or closed-ended. Recall that open-ended questions have no specific response intended. Open-ended questions are less reliable for obtaining complete factual information and are subject to recall difficulties, selective perception, and distortion by the person answering. Because the analyst neither knows the specific respondent nor has contact with the respondent, open-ended questions that would normally lead to follow-up questions might go unanswered. An example of an open-ended question is: "List all new functions that you think the new application should perform."

A closed-ended question is one that asks for a yes/no answer or a specific graded response. For example, "Do you agree with the need for a history file?" would obtain either a yes or a no. Questionnaire construction is a learned skill that requires consideration of the reliability and validity of the instrument. Reliability is the extent to which a questionnaire is free of measurement errors. This means that if a reliable questionnaire were given to the same group several times, the same answers would be obtained. If a questionnaire is unreliable, repeated measurement would produce different answers each time.

Questionnaires that try to measure mood, satisfaction, and other emotional characteristics of the respondent tend to be unreliable because they are influenced by how the person feels that day. You improve reliability by testing the questionnaire. When the responses are tallied, statistical techniques are used to verify the reliability of related sets of questions.
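
As a concrete illustration of the kind of statistical check mentioned above, the sketch below computes Cronbach's alpha, one common reliability statistic, for a set of related questionnaire items. The item scores are hypothetical 5-point ratings invented for this example, not data from any real questionnaire.

```python
# A minimal sketch of one common reliability check (Cronbach's alpha)
# for a set of related questionnaire items. The responses below are
# hypothetical 5-point ratings from six respondents to three items.

def cronbach_alpha(item_scores):
    """item_scores: one inner list of per-respondent scores for each item."""
    k = len(item_scores)            # number of related items
    n = len(item_scores[0])         # number of respondents

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    item_var_sum = sum(variance(item) for item in item_scores)
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical coded responses to three related items.
responses = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [5, 4, 2, 4, 3, 5],
]
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```

A value near 1 indicates that the related items are answered consistently; a low value suggests the items are unreliable or not actually measuring the same thing.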

Validity is the extent to which the questionnaire measures what you think you are measuring. For instance, assume you want to know the extent to which a CASE tool is being used, in both frequency of use and number of functions used. Asking the question, "How well do you use the CASE tool?" might obtain a subjective assessment based on the individual's self-perception. If they perceive themselves as skilled, they might answer that they are extensive users. If they perceive themselves as novices, they might answer that they do not use the tool extensively. A better set of questions would be "How often do you use the CASE tool?" and "How many functions of the tool do you use? Please list the functions you use." These questions ask specifically for numbers, which are objective and not tied to an individual's self-perception. The list of functions verifies the numbers and provides the most specific answer possible.

Some guidelines for developing questionnaires are summarized in Table 4-6 and discussed here. First, determine the information to be collected: what facts are required and what feelings, lists of items, or other nonfactual information is desired. Group the items by type of information obtained, type of questions to be asked, or by topic area. Choose a grouping that makes sense for the specific project.

For each piece of information, choose the type of question that best obtains the desired response. Select open-ended questions for general information, lists, and nonfactual information. Select closed-ended questions to elicit specific factual information or single answers.

Compose a question for each item. For a closed-ended question, develop a response scale. The five-response Likert-like scale is the most frequently used. The low and high ends of the scale indicate the poles of responses, for instance, Totally Disagree and Totally Agree. The middle response is usually neutral, for instance, Neither Agree Nor Disagree. Examine the question and ask yourself if it has any words that might not be interpreted as you mean them. What happens if the respondent does not know the answer to your question? Do you need a response that says, I Don't Know? Is a preferred response hidden in the question? Are the response choices complete and ordered properly? Does the question have the same meaning for every department and possible respondent? If the answers to any of these questions indicate a problem, reword the question to remove the problem.
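
To make the response scale concrete, here is a minimal sketch of how a five-response Likert-like scale might be coded numerically for later tallying. The labels mirror the poles and neutral midpoint mentioned above; the separate "I Don't Know" option is a hypothetical addition for cases where respondents may not know the answer.

```python
# A minimal sketch of coding a five-response Likert-like scale so responses
# can be tallied later. The "I Don't Know" option is kept out of the numeric
# scale so it is not confused with the neutral midpoint.

LIKERT_SCALE = {
    "Totally Disagree": 1,
    "Disagree": 2,
    "Neither Agree Nor Disagree": 3,
    "Agree": 4,
    "Totally Agree": 5,
    "I Don't Know": None,  # excluded from averages and correlations
}

def code_response(label: str):
    """Return the numeric code for a response label (None for 'I Don't Know')."""
    return LIKERT_SCALE[label]

print(code_response("Agree"))          # -> 4
print(code_response("I Don't Know"))   # -> None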

If you have several questions that ask similar information, examine the possibility of eliminating one or more items. If you are doing statistical analysis of the answers, you might want similar questions to see if the responses are also similar (i.e., are correlated). If you are simply tallying the responses and acting on the information, try to use one question for each piece of information needed. The minimalist approach keeps the questionnaire shorter and easier to tally.
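
If you do keep two similar questions for a statistical check, correlating their coded responses is one way to see whether they behave alike. The sketch below computes a Pearson correlation between two hypothetical items coded on the scale above; the data are invented for illustration.

```python
# A minimal sketch of checking whether two similar questions are answered
# similarly, using a Pearson correlation on hypothetical coded responses.

def pearson(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

question_a = [4, 5, 3, 4, 2, 5]   # hypothetical coded responses
question_b = [4, 4, 3, 5, 2, 4]
print(f"Correlation: {pearson(question_a, question_b):.2f}")
```

A correlation near 1 suggests the two items capture the same information, so one of them could likely be dropped.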

TABLE 4-6 Guidelines for Questionnaire Development

1. Determine what facts are desired and which people are best qualified to provide them. 
2. For each fact, select either an open-ended or closed-ended question. Write several questions and choose the one or two that most clearly ask for the information. 
3. Group questions by topic area, type of question, or some context-specific criteria. 
4. Examine the questionnaire for problems: 

  • More than two questions asking the same information 
  • Ambiguous questions 
  • Questions for which respondents might not have the answer 
  • Questions that bias the response 
  • Questions that are open to interpretation by job function, level of organization, etc. 
  • Responses that are not comprehensive of all possible answers 
  • Confusing ordering of questions or responses 

5. Fix any problems identified above. 

6. Test the questionnaire on a small group of people (e.g., 5-10). Ask for both comments on the questions and answers to the questions. 

7. Analyze the comments and fix the wording ambiguities, biases, and other word problems they identify. 

8. Analyze the responses to ensure that they are the type desired. 

9. If the information is different from what you expected, the questions might not be direct enough and need rewording. If the responses provide no useful information beyond what you already know, reexamine the need for the questionnaire. 

10. Make final edits and print the questionnaire in an easy-to-read typeface. Prepare a cover letter. 

11. Distribute the questionnaire, addressing the cover letter to the person by name. Include specific instructions about returning the questionnaire. Provide a self-addressed, stamped envelope if mailing is needed. 

Pretest the questionnaire on a small group of representative respondents. Ask them to flag any items they do not understand, think are ambiguous or badly worded, or whose responses do not fit the item. Also ask them to complete the questionnaire. The group's answers should highlight any unexpected responses which, whether or not the group identified a problem, indicate that a question was not interpreted as intended. If the pretest responses do not provide you with new information needed to develop the project, the questionnaire might not be needed or might not ask the right questions. Reexamine the need for a questionnaire and revise it as needed. Finally, change the questionnaire based on the feedback from the test group. The pretest and revision activities increase the validity of the questionnaire.

Provide a cover letter for the questionnaire that briefly describes the purpose and type of information sought. Give the respondent a deadline for completing the questionnaire that is not too distant; for instance, three days is better than two weeks. The more distant the due date, the less likely the questionnaire will be completed. Include information about respondent confidentiality and voluntary completion, if appropriate. Ideally, the questionnaire is anonymous and voluntary. To the extent possible, address the letter to the individual respondent.

Give the respondent directions about returning the completed questionnaire. If mailing is required, provide a stamped, self-addressed envelope. If interoffice mail is used, provide your mail stop address. If you will pick up responses, tell the person where and when to have the questionnaire ready for pickup.