Workshop on Application of Big Data in Scientometrics


The Data4Impact-led workshop focused on applying big data techniques to improve the monitoring of research and innovation (R&I) activities and the assessment of their impact. Data4Impact aims to track the legacy and impact of research activities after public funding ends. To the best of our knowledge, it is the first project that attempts to track impact pathways and establish links between EU research activities and the health innovations and products currently on the market. Deriving indicators from a multitude of data sources, the project has developed an online analysis platform that aims to better suit the needs of research funders, policymakers, researchers and society at large. The Data4Impact platform shows a series of results and indicators for 40+ research programmes in the health domain. The prototype platform was presented for the first time at the ISSI 2019 conference in Rome.


Recent economic and societal developments have created strong demand for additional evidence on the performance of research and innovation (R&I) systems and on the societal impact they bring. However, current R&I indicators do not meet this demand: data collection typically stops when research funding ends, while most results and impacts materialise in the medium and long term of the R&I lifecycle.

Data4Impact is an H2020 research project funded by the European Commission. It addresses this problem by capitalising on the latest developments in big data technologies and analytics (Natural Language Processing and Machine Learning). To provide better R&I monitoring and impact assessment, Data4Impact has combined large volumes of structured and unstructured data from different sources and, over the course of the project, applied its methodology to publicly funded health research in EU member states. A series of indicators was developed on the performance and societal impact of 40+ research programmes in the health domain. The comprehensive set of data and indicators combines publication, patent, company/innovation, clinical guideline and project monitoring data with social media, media and other types of online data.

Data4Impact structured this workshop around its Analytical Model of Societal Impact Assessment (AMOSIA). The analytical model is constructed around four distinct phases of the research lifecycle: input, throughput, output, and impact, as shown in the figure below. Relying on novel big data techniques, Data4Impact has gathered data for each analytical phase. For a detailed overview of the linkages between the data collected across the three dimensions of impact (academic, economic and societal), please refer to the Data4Impact booklet.

During the workshop, participants were invited to an interactive session to reflect on the developed methodology and indicators, in particular as they relate to the functionality of the Data4Impact end product. The open data and visualisation tool was used to:
1. Identify and monitor the direct links between research activities and the resulting technological innovations and impact, and
2. Examine the evolution of different health-related topics over time as outcomes of research and technology, and with regard to the public interest.

The political momentum and the current R&I system present excellent opportunities to foster worldwide debate on research impact assessment. Expert perspectives are essential to this debate, so the Data4Impact consortium was excited to organise this interactive workshop to receive valuable feedback and advance the discussion on R&I monitoring.

Further information

For more information about the ISSI 2019 Conference, please visit the conference website.

Contact us

If you have any questions about the Data4Impact workshop, please contact Sonata Brokeviciute at sonata[at]