Mar 21, 2022

A Research Conversation across Disciplines: History and Data Science with David Weir, Gerhard Wolf and Tim Buchen.

University of Sussex Campus: Sussex Humanities Lab, UoS, Silverstone Building, Arts Road, BN1 9RG

Speakers: David Weir, Gerhard Wolf and Tim Buchen.

Part of the series: SHL Open Workshop

This conversation brings together two perspectives. The first is provided by Gerhard Wolf and Tim Buchen, historians working on the history of mid-twentieth-century migration. The second is provided by David Weir, co-director of the Tag Lab in the School of Engineering and Informatics. Together they will explore both the unique body of evidence that forms the basis of Wolf and Buchen’s project and the approaches from informatics that could be applied to it.

The Project:
At the end of the Second World War, more than 12 million Germans found themselves on the wrong side of newly drawn borders and were expelled westwards. Some of them left Europe for good, with more than 50,000 ethnic Germans from Eastern Europe emigrating to the US under the 1948 Displaced Persons (DP) Act. To be eligible, however, they not only had to provide a job sponsor in the US but also had to undergo a thorough political screening process. Interestingly, this process was not entirely unfamiliar to them. When they had followed the call of the Nazis to leave their homes in Eastern Europe and settle in the occupied territories of Poland, they had also been presented with a questionnaire to establish their suitability for citizenship. One might think that the criteria the Nazi state applied to judge whether applicants could become members of the German Volksgemeinschaft would be radically different from those applied by a liberal US government, and one would not be entirely wrong. More unsettling, however, are the many similarities, which we will argue point to the population policies of modern states in the middle of the twentieth century.

The Tools:

A number of data science methods will be discussed that could potentially be applied to an available dataset of 10,000 questionnaires (8,000 successful and 2,000 unsuccessful applications). Perhaps the most fundamental of these involves the use of massive pre-trained language models as the basis for assessing the ‘semantic distance’ between words, phrases, sentences or documents. Such distance measures can be useful for clustering, classification and other tasks involving textual data. A number of text analysis tools could also be considered, performing tasks such as named entity recognition, named entity linking and geoparsing.
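
As a rough illustration of what ‘semantic distance’ between two answers could mean in practice, the sketch below computes cosine distance between text vectors. It is only a sketch under stated assumptions: the `embed` function is a toy word-count stand-in for a pre-trained language model (which would also capture similarity between different words), and the three example answers are invented for illustration and are not taken from the questionnaires.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a language-model embedding: lowercase word counts.

    Assumption: a real pipeline would replace this with vectors from a
    pre-trained language model; word counts only capture shared vocabulary.
    """
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat empty text as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Invented example answers (hypothetical, not from the dataset).
answers = [
    "farmer from a village near Lodz",
    "farmer from a village near Odessa",
    "clerk in a city office",
]

vectors = [embed(a) for a in answers]
d01 = cosine_distance(vectors[0], vectors[1])
d02 = cosine_distance(vectors[0], vectors[2])
print(f"distance(0, 1) = {d01:.2f}")
print(f"distance(0, 2) = {d02:.2f}")
```

A matrix of such pairwise distances is the kind of input that clustering or nearest-neighbour classification would then work from.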

This event will be held live, with limited seating, in the Sussex Humanities Lab and will also be available virtually via Zoom.
