Technology for Equitable Entity Disambiguation & Linkage

TEEDL Lab Mission

Our mission is to pioneer innovative computational strategies that effectively identify individual entities, thereby enabling seamless interconnection of their information across diverse databases. We are committed to promoting equity in these processes, ensuring fair and impartial data handling for all.


Who Leads the TEEDL Lab

Jinseok Kim, a Research Assistant Professor at the University of Michigan Institute for Social Research and School of Information, works with a group of data scientists and students from various disciplines to realize the TEEDL Lab's mission.


What TEEDL Lab Does

We are dedicated to the conception, development, and rigorous testing of computational methods and tools. These are primarily aimed at resolving ambiguities in names within expansive publication records, subsequently linking these clarified names to corresponding entities within various databases, including administrative records, patents, and funding information. A significant facet of our approach is the emphasis on equity. We meticulously ensure that fairness is ingrained in every step of the entity disambiguation and linkage procedures. 


Why TEEDL Matters

Current methods employed in named entity disambiguation often overlook the nuanced differences in name ambiguity arising from diverse factors such as gender, ethnicity, and cultural contexts. This oversight can yield misleading results. For instance, female scholars who adopt a different surname after marriage may have their productivity underestimated because their name is recorded differently across their publications. Similarly, East Asian scholars, who often share common names, risk having their identities merged with those of other scholars. Additionally, publication records of Hispanic scholars frequently feature inconsistent name forms, resulting in a single author being fragmented into several author entities.

Our objective is to create disambiguation and linkage models that are sensitive to these variations, thereby ensuring equal representation for names across genders, ethnicities, and cultures. This initiative is key to fostering a fair assessment of academic contributions and promoting equitable policy-making in the realm of science.
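
The two failure modes described above, one person split into several entities and several people merged into one, can be illustrated with a small sketch. This is not the lab's method: the records, author IDs, and the "first initial plus last name token" matching rule are all hypothetical, chosen only because that kind of rule is a simple baseline that makes both errors easy to see.

```python
from collections import defaultdict

# Hypothetical toy records: (true_author_id, name as printed on a paper).
# Every name and ID here is invented for illustration only.
RECORDS = [
    ("A1", "Maria Lopez"),
    ("A1", "Maria Lopez-Smith"),    # same person, surname changed
    ("A2", "Wei Wang"),
    ("A3", "Wen Wang"),             # different person, same initial and surname
    ("A4", "Juan Garcia Marquez"),
    ("A4", "Juan Garcia"),          # same person, second surname dropped
]


def naive_key(name: str) -> str:
    """Assumed baseline rule: first initial plus last name token, lowercased."""
    tokens = name.lower().split()
    return f"{tokens[0][0]} {tokens[-1]}"


def cluster(records):
    """Group true author IDs by the naive matching key."""
    clusters = defaultdict(set)
    for author_id, name in records:
        clusters[naive_key(name)].add(author_id)
    return dict(clusters)


def find_errors(clusters):
    """Return (merged keys, split authors) for a clustering.

    A key is 'merged' if it groups more than one true author.
    An author is 'split' if their records fall under more than one key.
    """
    merged = {key for key, ids in clusters.items() if len(ids) > 1}
    keys_per_author = defaultdict(set)
    for key, ids in clusters.items():
        for author_id in ids:
            keys_per_author[author_id].add(key)
    split = {a for a, keys in keys_per_author.items() if len(keys) > 1}
    return merged, split


merged, split = find_errors(cluster(RECORDS))
print("merged keys:", sorted(merged))    # keys that conflate distinct authors
print("split authors:", sorted(split))   # authors fragmented across keys
```

On this toy data the rule conflates the two authors named Wang under one key and fragments the authors with a changed or shortened surname, which is the pattern of unequal error that a group-sensitive model would aim to reduce.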


How We Do It

We produce high-quality ground truth data, meticulously integrating considerations of equity throughout the process. Utilizing state-of-the-art machine learning and natural language processing techniques, we craft bespoke disambiguation and linkage models. These models are tailored to accommodate diverse ambiguity patterns and characteristics, paving the way for more accurate and equitable data representation.