From Education to Innovation and Development

This project is part of a broader study, at both national and international levels, of the productive, scientific, and technological development of nations. We have carried out that study using data on industrial, scientific, and patent production, disaggregated respectively by industrial categories, scientific fields, and technological codes [1–5]. We now plan to extend these methodologies to a finer geographical level and study what is happening in different areas within individual countries. For Italy, this means studying individual regions, for which we have already obtained data on industrial and technological production. To gain a broader and more precise view of the development prospects of the Italian regions, we would like to correlate the data on industrial and technological production with data on the production of human capital, that is, the segment of the population that pursues advanced studies and is necessarily the main actor in scientific and technological development and, therefore, in the related economic activity. The main data sources to which we already have access are: Scival and Microsoft Academic Graph for scientific production, Patstat for patents, Istat for industrial production at national and regional levels, and United Nations Comtrade for industrial production in various countries.

In this context, Almalaurea’s data and analyses play a key role because they make it possible to disaggregate the flows of production and use of knowledge not only across scientific and technological sectors but also across the geographical areas of the country. Founded in 1994, Almalaurea is an Interuniversity Consortium that represents 76 universities and monitors about 90% of all graduates who leave the Italian university system every year. In particular, Almalaurea annually carries out two census surveys, on the profile of graduates and on their employment status one, three, and five years after graduation. It also monitors the study paths of students and analyzes the characteristics and performance of graduates on both the academic and the employment front, enabling comparisons between courses and study locations. To date, Almalaurea has built a database of more than three million graduates. Starting from these data, we would like to understand two things in particular: which scientific sectors, and more generally which academic ones, contribute most to the production of human capital in each geographical area, and which scientific and technological areas have the greatest growth potential at the regional level, given the flow of specific knowledge developed there.

The Almalaurea data thus serve as the magnifying glass that lets us understand the key role of training as a driving force for economic development. In particular, we also intend to cross-reference the Almalaurea data with the data provided by the Statistical Office of Miur [1] on the structural characteristics of the various universities; this comparison can help us identify the weaknesses and strengths of the various universities and geographic regions. The Almalaurea data provide the socioeconomic profiles of students and their distribution at the end of the university course and several years after graduation. With them, we can correlate data on the student population, the heterogeneity of the educational offerings, and the state of the technological and industrial sectors of each geographical area with the different career outcomes and employment situations. For this purpose, we need access to Almalaurea's disaggregated and anonymized data in order to create groupings at the regional and municipal levels.


We intend to characterize the trajectories of training and employment of students for the dual purpose of:

1) Identifying the most relevant variables and combinations of variables to predict student trajectories

2) Producing personalized recommendations on the choice of study path, conditioned by the starting conditions, professional aspirations, and geographical contexts.

To this end, the Almalaurea data will be combined with multiple additional sources of data on socioeconomic, geographical, innovation, and university characteristics. These will serve to describe in more depth the starting context, the university pathway, and the entry into the labor market, broken down by geographical area and economic sector. A schematic representation of the approach we plan to apply to the problem is presented in the accompanying figure.

From a methodological point of view, we intend to use statistical, network-theory, and machine-learning approaches to select and build the combinations of variables that are most relevant for predicting trajectories.
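As a purely illustrative example of the statistical side of such variable selection, the minimal sketch below ranks categorical variables by their mutual information with an outcome. It is not the project's actual method: the records, the field names (`study_field`, `region`, `employed`), and the helper functions are all hypothetical and invented for this example, and no Almalaurea data are used.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length categorical sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of x
    py = Counter(ys)            # marginal counts of y
    pxy = Counter(zip(xs, ys))  # joint counts of (x, y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def rank_variables(records, outcome):
    """Rank every non-outcome field by its mutual information with the outcome."""
    ys = [r[outcome] for r in records]
    fields = [k for k in records[0] if k != outcome]
    scores = {f: mutual_information([r[f] for r in records], ys) for f in fields}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy records, invented for illustration only.
records = [
    {"study_field": "A", "region": "North", "employed": "yes"},
    {"study_field": "A", "region": "South", "employed": "yes"},
    {"study_field": "B", "region": "North", "employed": "no"},
    {"study_field": "B", "region": "South", "employed": "no"},
]

ranking = rank_variables(records, "employed")
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy data the outcome is fully determined by `study_field` and independent of `region`, so the former ranks first. A realistic analysis would need many more records, corrections for finite-sample bias, and scoring of combinations of variables rather than single ones.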

The research group, based at CREF, will also include researchers from the CNR and from the Physics Department of Sapienza University. As a whole, this group has already conducted many studies characterizing the level of economic, technological, and scientific development of various countries; a list of relevant references from the scientific literature is included.


In recent years, we have coordinated several scientific projects on these issues at both the European and national levels:

  • GROWTHCOM: “Growth and Innovation Policy-modeling: Applying Next Generation Tools, Data, And Economic Complexity Ideas”. Funding Body: European Commission FP7-ICT-2013-10 (Challenge 5.4 for Governance and Policy Modeling), 2013-2016, EUR 2M. ISC Coordinator: Luciano Pietronero / Andrea Gabrielli.
  • CRISISLAB: “Analytics for crisis prediction and management”. ISC Coordinator: Luciano Pietronero. National Coordinator: Luciano Pietronero. Funding Body: Italian Government (CNR Interest Projects). Partners: IMT – Institute for Advanced Studies Lucca (Italy).

Publications (selection)