Chair for Statistics and Data Science in Social Sciences and the Humanities (SODA)

Modernizing Data Collection and Measurement

Team-Lead: Malte Schierholz

Team: Jacob Beck, Felix Henninger, Markus Herklotz, Olga Kononykhina, Jan Simson

Overview

New Methods for Job and Occupation Classification

IAB-SMART: Collecting Data for Labor Market Research Through a Smartphone App

Concerns and Willingness to Use Smartphones for Data Collection

Reddit data as a new tool and source for social research

Supplementing and substituting survey data with big data

Modernizing Migration Measures: Combining Survey and Tracking Data Collection Among Asylum-Seeking Refugees

 

 

    New Methods for Job and Occupation Classification

    Currently, most surveys ask for occupation with open-ended questions. The verbatim responses are coded afterwards into a classification with hundreds of categories and thousands of jobs, which is error-prone, time-consuming, and costly. The project investigates how to improve this process by asking response-dependent questions during the interview. We developed an instrument for interview coding of occupations and successfully tested it in a CATI and a CAPI survey. Results are promising: between 55 and 85 percent of the text responses can be coded with the newly developed tool, and there is no evidence that using the tool places an additional burden on interviewers or respondents. During the second funding phase, we aim to further develop this method of occupation coding. An in-depth analysis of various error sources is required to inform and improve the quality and validity of measurement. The improvements to the instrument will reflect the specific requirements of computer-assisted personal interviews, computer-assisted telephone interviews, and web surveys. Results from analyzing interviewer behavior will inform the improvement of training documents to minimize the influence of interviewers on the survey results. The new version of the instrument allows for continuous improvement of the suggested answer options. The results will be made available as open-source software so that researchers can use our instrument for their own surveys.
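
    The idea of suggesting answer options from a verbatim response can be illustrated with a minimal Python sketch. It is not the project's instrument: the toy catalogue `CATEGORIES`, the function names `tokenize` and `suggest_categories`, and the token-overlap (Jaccard) ranking are all assumptions made for illustration only.

```python
import re

# Hypothetical toy catalogue (category label -> example job titles).
# Real classifications contain hundreds of categories; these entries are invented.
CATEGORIES = {
    "Nursing professionals": ["nurse", "registered nurse", "staff nurse"],
    "Primary school teachers": ["primary school teacher", "elementary teacher"],
    "Software developers": ["software developer", "programmer", "software engineer"],
    "Secondary school teachers": ["high school teacher", "secondary school teacher"],
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def suggest_categories(verbatim, max_options=3):
    """Rank categories by the best Jaccard overlap between the verbatim
    response and any example job title; return up to max_options labels."""
    answer_tokens = tokenize(verbatim)
    scored = []
    for label, titles in CATEGORIES.items():
        best = 0.0
        for title in titles:
            title_tokens = tokenize(title)
            union = answer_tokens | title_tokens
            if union:
                best = max(best, len(answer_tokens & title_tokens) / len(union))
        if best > 0:
            scored.append((best, label))
    # Highest score first; ties are broken alphabetically by label.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for _, label in scored[:max_options]]


if __name__ == "__main__":
    # An ambiguous answer yields several options to present as a follow-up question.
    print(suggest_categories("teacher"))
    # An answer without any match returns an empty list (manual coding needed).
    print(suggest_categories("astronaut"))
```

    In this sketch, an ambiguous answer such as "teacher" produces more than one candidate, which is the situation in which a response-dependent follow-up question would be asked; an answer with no match would be left for coding after the interview.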

    Project Team: Frauke Kreuter, Malte Schierholz, Olga Kononykhina, Jan Simson

    Publications:

    • Schierholz, Malte. 2019. "New methods for job and occupation classification". Dissertation, Mannheim. https://madoc.bib.uni-mannheim.de/50617/.
    • Schierholz, Malte, Miriam Gensicke, Nikolai Tschersich, and Frauke Kreuter. 2018. "Occupation Coding during the Interview." Journal of the Royal Statistical Society: Series A (Statistics in Society) 181 (2): 379–407. https://doi.org/10.1111/rssa.12297.
    • Schierholz, Malte. 2018. "Eine Hilfsklassifikation mit Tätigkeitsbeschreibungen für Zwecke der Berufskodierung." AStA Wirtschafts- und Sozialstatistisches Archiv 12 (3–4): 285–98. https://doi.org/10.1007/s11943-018-0231-2.
    • Schierholz, Malte; Brenner, Lorraine; Cohausz, Lea; Damminger, Lisa; Fast, Lisa; Hörig, Ann-Kathrin; Huber, Anna-Lena; Ludwig, Theresa; Petry, Annabell; Tschischka, Laura (2018): Eine Hilfsklassifikation mit Tätigkeitsbeschreibungen für Zwecke der Berufskodierung: Leitgedanken und Dokumentation. (IAB-Discussion Paper, 13/2018), Nürnberg, 43 pp.
    • Schierholz, Malte, and Matthias Schonlau. 2020. "Machine Learning for Occupation Coding—a Comparison Study." Journal of Survey Statistics and Methodology. https://doi.org/10.1093/jssam/smaa023.

    IAB-SMART: Collecting Data for Labor Market Research Through a Smartphone App

    Smartphones are multifunctional tools that can be used for personal communication, planning, entertainment, information search, and many other things in our daily lives. Many people cannot imagine a life without their smartphones, and they carry them around all the time. The omnipresence of smartphones makes these devices interesting for researchers who want to measure human behavior through the sensors built into a smartphone. Together with the Institute for Employment Research (IAB), we developed the IAB-SMART app to evaluate the opportunities and challenges of using smartphones for data collection in social research, and more specifically in labor market research. The IAB-SMART app passively collects mobile data, such as users' geolocation, activities, social interactions, and online behavior, and launches in-app surveys. In addition, given the user's consent, we are able to combine these data with survey data from a longstanding panel survey (PASS) and with administrative data from the IAB containing the employment history of users. The passive measures allow researchers to take a wider perspective on labor market-related behavior such as home office productivity and job search strategies. Furthermore, the combination of sensor, survey, and administrative data will help us to understand how (un)employment affects daily life. In addition to these substantive questions, this project helps us answer methodological research questions on the quality of the data collected through this method.

    Project Team: Frauke Kreuter, Florian Keusch, Georg-Christoph Haas, Mark Trappmann, Sebastian Bähr 

    Publications: 

    • Bähr, S., Haas, G.-C., Keusch, F., Kreuter, F. & Trappmann, M. (2020). Missing data and other measurement quality issues in mobile geolocation sensor data. Social Science Computer Review : SSCORE, 1–24. https://doi.org/10.1177/0894439320944118
    • Haas, G.-C., Kreuter, F., Keusch, F., Trappmann, M. & Bähr, S. (2020). Effects of incentives in smartphone data collection. In C. A. Hill (eds.), Big data meets survey science : a collection of innovative methods (pp. 387–414). Hoboken, NJ: John Wiley & Sons. https://doi.org/10.1002/9781118976357.ch13
    • Haas, G.-C., Trappmann, M., Keusch, F., Bähr, S. & Kreuter, F. (2020). Using geofences to collect survey data: Lessons learned from the IAB-SMART study. Survey Methods : Insights from the Field, 2020(10/12/20), 1–12. https://doi.org/10.13094/SMIF-2020-00023
    • Keusch, F., Bähr, S., Haas, G.-C., Kreuter, F. & Trappmann, M. (2020). Coverage error in data collection combining mobile surveys with passive measurement using apps: Data from a German national survey. Sociological Methods & Research : SMR. https://doi.org/10.1177/0049124120914924
    • Kreuter, F., Haas, G.-C., Keusch, F., Bähr, S. & Trappmann, M. (2019). Collecting survey and smartphone sensor data with an App: Opportunities and challenges around privacy and informed consent. Social Science Computer Review : SSCORE, 38(5), 533–549. https://doi.org/10.1177/0894439318816389
    • Bähr, S., Haas, G.-C., Keusch, F., Kreuter, F. & Trappmann, M. (2018). IAB-SMART-Studie: Mit dem Smartphone den Arbeitsmarkt erforschen. IAB-Forum : Das neue Onlinemagazin des Instituts für Arbeitsmarkt- und Berufsforschung, 2018, 09.01.2018.

    Concerns and Willingness to Use Smartphones for Data Collection

    Smartphone use is on the rise worldwide, and researchers are exploring novel ways to leverage the capabilities of smartphones for data collection. Mobile surveys, i.e., surveys that are filled out in a smartphone web browser or through an app, have already been studied extensively. Research on other smartphone features, which allow researchers to automatically measure a broader set of user characteristics and behaviors that goes far beyond mere self-reports, is still in its infancy. For example, smartphone users can now be asked to take pictures of receipts to better measure expenditure, to agree to tracking of their movements to create exact measures of mobility and transportation, or to allow automatic logging of app use, Internet searches, and phone calling and text messaging behavior to measure social interaction. These forms of data collection provide richer data (because they can be collected at much higher frequencies than self-reports) and have the potential to decrease respondent burden (because fewer survey questions need to be asked) and measurement error (because recall errors and social desirability bias are reduced). However, agreeing to engage in these forms of data collection from smartphones is an additional step in the consent process, and participants might feel uncomfortable sharing specific data with researchers due to security, privacy, and confidentiality concerns. Moreover, users' concerns might differ across types of data collection on smartphones, so they may be more willing to engage in some of these tasks than in others. In addition, participants might differ in their smartphone skills and thus feel more or less comfortable using smartphones for research, leading to bias due to differential nonparticipation of specific subgroups. In a series of studies, we measure concerns about and willingness to participate in smartphone data collection.

    Project Team: Frauke Kreuter, Florian Keusch, Bella Struminskaya, Mick Couper and Christopher Antoun

    Publications:

    • Keusch, F., Struminskaya, B., Kreuter, F. & Weichbold, M. (2020). Combining active and passive mobile data collection : A survey of concerns. In C. A. Hill (eds.), Big data meets survey science : a collection of innovative methods (pp. 657–682). Hoboken, NJ: John Wiley & Sons. https://doi.org/10.1002/9781118976357.ch22
    • Struminskaya, B. & Keusch, F. (2020). Editorial: From web surveys to mobile web to apps, sensors, and digital traces. Survey Methods : Insights from the Field, 2020(10/12/20), 1–7. https://doi.org/10.13094/SMIF-2020-00015
    • Struminskaya, B., Lugtig, P., Keusch, F. & Höhne, J. K. (2020). Augmenting surveys with data from sensors and apps: Opportunities and challenges. Social Science Computer Review : SSCORE. https://doi.org/10.1177/0894439320979951
    • Keusch, F., Struminskaya, B., Antoun, C., Couper, M. P. & Kreuter, F. (2019). Willingness to participate in passive mobile data collection. Public Opinion Quarterly : POQ, 83(S1), 210–235. https://doi.org/10.1093/poq/nfz007

    Reddit data as a new tool and source for social research

    The use of non-traditional data (i.e., data collected from non-probability sample surveys, passive data, or Big Data) to supplement or replace survey data is growing. However, these data are not without weaknesses; they suffer from their own sources of error, access challenges, and confidentiality concerns. This project uses survey data collected on Reddit.com and posts scraped from it to answer three research questions: 1) Can social media data be used to accurately assess social attitudes? 2) What are the sources of error in social media data? 3) What variability in the conclusions drawn from these data is introduced by the researcher's choice of analytic methods? In addition to the research questions, this project also offers descriptions of the data and of how to access them, so that future Reddit data users can further refine their budgets, timelines, and expectations.

    Project Team: Ruben Bach, Ashley Amaya, Frauke Kreuter, Florian Keusch and Vlad Achimescu

    Publications: 

    • Achimescu, V. & Chachev, P. D. (2021). Raising the flag: Monitoring user-perceived disinformation on reddit. Information, 12, 4. https://www.mdpi.com/2078-2489/12/1/4
    • Amaya, A., Bach, R. L., Keusch, F. & Kreuter, F. (2019). New data sources in social science research: Things to know before working with Reddit data. Social Science Computer Review : SSCORE, 1–10. https://doi.org/10.1177/0894439319893305
    • Amaya, A., Bach, R. L., Kreuter, F. & Keusch, F. (2020). Measuring the strength of attitudes in social media data. In Big data meets survey science : a collection of innovative methods (pp. 163–192). Hoboken, NJ: John Wiley & Sons. https://doi.org/10.1002/9781118976357.ch5

    Supplementing and substituting survey data with big data

    For many years, surveys were the standard tool to measure attitudes and behavior for social science research. In recent years, however, researchers have shifted their focus to new sources of data, especially in the online world. For instance, researchers have analyzed the potential of replacing or supplementing survey data with data from Twitter, smart devices (e.g., smartphones or fitness trackers), and other places where people leave digital traces. In this project, we explore the feasibility of using behavioral records of individuals' online activities to study political attitudes and behavior. Specifically, we explore the potential of online behavioral data to substitute for traditional survey data by inferring attitudes and behavior from the online data. In addition, we analyze how complete such data are, as users may switch off data collection during certain activities they do not want recorded. Moreover, we study how (social) media use shapes attitudes and behavior in the offline world. This project is done in collaboration with Ashley Amaya (RTI International).

    Project Team: Ruben Bach, Christoph Kern, Ashley Amaya, Frauke Kreuter, Florian Keusch, Jan Hecht and Jonathan Heinemann

    Publications:  

    • Amaya, A., Bach, R. L., Kreuter, F. & Keusch, F. (2020). Measuring the strength of attitudes in social media data. In Big data meets survey science : a collection of innovative methods (pp. 163–192). https://doi.org/10.1002/9781118976357.ch5
    • Bach, R. L. & Wenz, A. (2020). Studying health-related internet and mobile device use using web logs and smartphone records. PLOS ONE, 15, e0234663. https://doi.org/10.1371/journal.pone.0234663
    • Bach, R. L., Kern, C., Amaya, A., Keusch, F., Kreuter, F., Hecht, J. & Heinemann, J. (2019). Predicting voting behavior using digital trace data. Social Science Computer Review : SSCORE. https://doi.org/10.1177/0894439319882896
    • Cernat, A. & Keusch, F. (2020). Do surveys change behaviour? Insights from digital trace data. International Journal of Social Research Methodology : IJSRM. https://doi.org/10.1080/13645579.2020.1853878