Examples of using Big data sources in English and their translations into Malay
Surveys linked to big data sources (section 3.6).
Big data sources tend to have a number of characteristics in common;
Measurement is much less likely to change behavior in big data sources.
Far from distinctive, many big data sources have information that is sensitive.
Big data sources are everywhere, but using them for social research can be tricky.
In particular, I will focus on big data sources created by companies and governments.
The big data sources of today, and likely tomorrow, will tend to have 10 characteristics.
Third era: non-probability sampling; computer-administered; surveys linked to big data sources.
In conclusion, the big data sources of today (and tomorrow) generally have ten characteristics.
In fact, I think that many of the challenges and opportunities created by big data sources follow from just one “W”: Why.
To conclude, many big data sources are not representative samples from some well-defined population.
There is just too much to be gained by linking survey data to the big data sources discussed in chapter 2.
In some cases, big data sources enable you to do this counting relatively directly (as in the case of New York Taxis).
If true, this would seem to severely limit what can be learned from big data sources because many of them are nonrepresentative.
Most big data sources are incomplete, in the sense that they don't have the information that you will want for your research.
In fact, people who have worked with big data sources know that they are frequently dirty.
Finally, I will describe two research templates for linking survey data to big data sources (section 3.6).
As I described in chapter 2, most big data sources are inaccessible to researchers.
Many other big data sources also have information that is sensitive, which is part of the reason why they are often inaccessible.
As I'm describing these characteristics, you will notice that they often arise because big data sources were not created for the purpose of research.
In conclusion, big data sources, such as government and business administrative records, are generally not created for the purpose of social research.
For more on construct validity, see Westen and Rosenthal (2003), and for more on construct validity in big data sources, Lazer (2015) and Chapter 2 of this book.
To conclude, many big data sources are drifting because of changes in who is using them, in how they are being used, and in how the systems work.
Social scientists call this match construct validity, and it is a major challenge with using big data sources for social research (Lazer 2015).
Another way in which researchers can use big data sources in survey research is as a sampling frame for people with specific characteristics.
Most social scientists are already familiar with the process of cleaning large-scale social survey data, but cleaning big data sources seems to be more difficult.
Big data sources and surveys are complements, not substitutes, so as the amount of big data increases, I expect that the value of surveys will increase as well.
(Note that this same activity also appears in chapter 6.) This activity will give you practice in data wrangling and thinking about natural experiments in big data sources.
First, I will argue that big data sources will not replace surveys and that the abundance of big data sources increases, not decreases, the value of surveys (section 3.2).
Nowcasting projects such as Google Flu Trends also show what can happen if big data sources are combined with more traditional data that were created for the purposes of research.