From education to innovation and development

This project is part of a broader study, at both national and international level, of the productive, scientific and technological development of nations, which we have carried out using industrial, scientific and patent production data disaggregated by industrial category, scientific field and technological code, respectively [1-5]. We now plan to extend these methodologies to a finer geographical level by studying what is happening in different areas within individual countries. For Italy, this means studying individual regions, for which we have already obtained data on industrial and technological production. To gain a broader and more precise view of the development prospects of the various Italian regions, we would like to couple the data on industrial and technological production with data on the production of human capital in the part of the population that enters higher education, which is necessarily the main actor in scientific and technological development and, therefore, in economic development as well. The main data sources to which we already have access are: Scival and Microsoft Academic Graph for scientific production, Patstat for patents, Istat for industrial production at national and regional level, and United Nations Comtrade for industrial production in different countries.

In this context, Almalaurea's data and analyses play a key role, because they make it possible to disaggregate the flows of production and use of knowledge not only across scientific and technological sectors but also across the geographical areas of the country. Almalaurea is an Interuniversity Consortium founded in 1994 that represents 76 universities and monitors about 90% of the graduates who leave the Italian university system every year. In particular, Almalaurea annually carries out two census surveys on the profile and employment status of graduates 1, 3 and 5 years after graduation, monitors students' study paths, and analyzes the characteristics and performance of graduates on both the academic and the employment front, allowing comparisons between different courses and places of study. To date, Almalaurea has built a database of more than 3 million graduates. Starting from these data, we would like to understand in particular which scientific, and more generally academic, sectors contribute most to the production of human capital in each geographical area, and which scientific and technological areas have the greatest growth potential at regional level in light of the flow of specific knowledge developed there.

The Almalaurea data act as a magnifying glass that allows us to understand the key role of training as a driving force for economic development. In particular, we intend to cross-reference the Almalaurea data with the data provided by the Statistical Office of Miur [1] on the structural characteristics of the different universities: this comparison can help us understand the weaknesses and strengths of the different universities and geographical areas. The Almalaurea data provide the socio-economic profiles of students and their distribution at the end of the university course and several years after graduation. Through them, we can relate the data on university students, the heterogeneity of the educational offer, and the technological and industrial situation of each geographical area to the different career opportunities and the employment situation. For this purpose we need access to the disaggregated and anonymised Almalaurea data in order to create groupings at regional and municipal level.


We intend to characterize the training and employment trajectories of students with the dual purpose of:

1) identifying the most relevant variables and combinations of variables for predicting student trajectories;

2) producing personalized recommendations on the choice of study path, conditioned on the starting conditions, professional aspirations and geographical context.

To this end, the Almalaurea data will be combined with multiple additional sources of data on socio-economic, geographical, innovation and university characteristics, which will help characterize in greater depth the starting context, the university pathway and entry into the labor market, broken down by geographical area and economic sector. A schematic representation of our planned approach to the problem is presented below.

From a methodological point of view, we intend to use statistical, network-theory and machine-learning approaches to select and build the combinations of variables that are most relevant in terms of their ability to predict trajectories.
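As a purely illustrative sketch of what a statistical variable-ranking step could look like, the snippet below scores each categorical variable by its empirical mutual information with an outcome variable. Mutual information is only one possible relevance measure, chosen here as an assumption for the example and not as the method the project commits to. The function names, the field names (`field`, `region`, `outcome`) and the four toy records are all hypothetical placeholders with no empirical meaning; they do not reflect the actual Almalaurea schema or data.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two categorical sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of x
    py = Counter(ys)            # marginal counts of y
    pxy = Counter(zip(xs, ys))  # joint counts of (x, y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_variables(records, outcome):
    """Rank every non-outcome variable by mutual information with the outcome."""
    variables = [k for k in records[0] if k != outcome]
    ys = [r[outcome] for r in records]
    scores = {
        v: mutual_information([r[v] for r in records], ys) for v in variables
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy records: invented placeholder values, not real data.
records = [
    {"field": "A", "region": "R1", "outcome": "X"},
    {"field": "A", "region": "R2", "outcome": "X"},
    {"field": "B", "region": "R1", "outcome": "Y"},
    {"field": "B", "region": "R2", "outcome": "Y"},
]

ranking = rank_variables(records, "outcome")
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy set the `field` placeholder determines the outcome completely while `region` is independent of it, so `field` ranks first. On real data, scores of this kind would need to be corrected for sample size and complemented by measures that capture combinations of variables, which a single-variable score cannot detect.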

The research group, based at CREF, will also include researchers from the CNR and from the Physics Department of Sapienza University. As a whole, this group has already conducted many studies characterizing the level of economic, technological and scientific development of various countries; a selection of references from the scientific literature is attached below.


In recent years we have coordinated several scientific projects both in Europe and nationally on these issues:

  • “GROWTHCOM: Growth and Innovation Policy-modeling: Applying Next Generation Tools, Data, And Economic Complexity Ideas”, ISC Coordinator: Luciano Pietronero / Andrea Gabrielli, Funding Body: European Commission FP7-ICT-2013-10 (Challenge 5.4 for Governance and Policy Modeling) (2013-2016, EUR 2M)
  • “CRISISLAB: Analytics for crisis prediction and management”, ISC Coordinator: Luciano Pietronero, National Coordinator: Luciano Pietronero, Funding Body: Italian Government (CNR Interest Projects), Partners: IMT – Institute for Advanced Studies Lucca (Italy)

Publications (selection)