Examples of using Big data sources in English and their translations into Vietnamese
Big data sources do not mean the end of survey research.
Table 2.3: Examples of natural experiments using big data sources.
Far from distinctive, many big data sources have information that is sensitive.
Table 2.1: Studies of unexpected events using always-on big data sources.
As the work of Burke and Kraut illustrates, big data sources will not eliminate the need to ask people questions.
First, as I discussed in chapter 2, there are real problems with the accuracy, completeness, and accessibility of many big data sources.
Some researchers believe that big data sources, especially those from online sources, are pristine because they are collected automatically.
In fact, I think that many of the challenges and opportunities created by big data sources follow from just one “W”: Why.
To conclude, many big data sources are drifting because of changes in who is using them, in how they are being used, and in how the systems work.
Researchers could, of course, do this in the past, but in the digital age, the scale is completely different, a fact that has been proclaimed repeatedly by many fans of big data sources.
Before concluding, I think that it is worth considering that big data sources may have an important effect on the relationship between data and theory.
Although things are not yet settled, I expect that the third era of survey research will be characterized by non-probability sampling, computer-administered interviews, and the linkage of surveys to big data sources (table 3.1).
Two features of big data sources, their always-on nature and their size, greatly enhance our ability to learn from natural experiments when they occur.
I chose to write the book this way because I wanted to provide a comprehensive view of social research in the digital age, including big data sources, surveys, experiments, mass collaboration, and ethics.
But, as was described in chapter 2, big data sources may not be accurate, they may not be collected on a sample of interest, and they may not be accessible to researchers.
Further, although the earlier eras were characterized by their approaches to sampling and interviewing, I expect that the third era of survey research will also be characterized by the linkage of surveys with big data sources (Table 3.1).
In other words, even though some big data sources are non-reactive, they are not always free of social desirability bias, the tendency for people to want to present themselves in the best possible way.
As I will show in this chapter, the digital age creates many exciting opportunities for survey researchers to collect data more quickly and cheaply, to ask different kinds of questions, and to magnify the value of survey data with big data sources.
The remainder of the chapter begins by arguing that big data sources will not replace surveys and that the abundance of data increases, not decreases, the value of surveys (Section 3.2).
In their paper, Ansolabehere and Hersh go through a number of steps to check the results of these two steps (even though some of them are proprietary), and these checks might be helpful for other researchers wishing to link survey data to black-box big data sources.
In other words, even though some big data sources are nonreactive, they are not always free of social desirability bias, the tendency for people to want to present themselves in the best possible way.
De Waal, Puts, and Daas (2014) describe statistical data editing techniques developed for survey data and examine to what extent they are applicable to big data sources, and Puts, Daas, and Waal (2015) present some of the same ideas for a more general audience.
Third, when survey data collection is combined with big data sources (something that I think will become increasingly common, as I will argue later in this chapter), additional ethical issues can arise.
These four examples all show that a powerful strategy in the future will be to enrich big data sources, which are not created for research, with additional information that makes them more suitable for research (Groves 2011).
In fact, my hope is that big data sources will enable researchers to make more within-sample comparisons in many nonrepresentative groups, and my guess is that estimates from many different groups will do more to advance social research than a single estimate from a probabilistic random sample.
Given that more and more of our behavior is captured in big data sources, such as government and business administrative data, some people might think that asking questions is a thing of the past.
This chapter has three parts. First, in section 2.2, I describe big data sources in more detail and clarify a fundamental difference between them and the data that have typically been used for social research in the past. Then, in section 2.3, I describe ten common characteristics of big data sources.