Artificial Intelligence

Research

The socio-economic impact and technological innovation of artificial intelligence

Head of Research Line
Andrea Tacchella
Overview

Recent advances in artificial intelligence are rapidly changing how we process information and generate content. In particular, Large Language Models (LLMs) such as ChatGPT, GPT4 and LLaMA, and image-generating Diffusion Models (DMs) such as DALL-E 2, Midjourney and Stable Diffusion, have proven capable of producing content and responding to requests at levels comparable to those of a human being. These tools have the potential to profoundly change the way we manage, transform and produce information, content and, ultimately, work.

In this project, we will focus mainly on Language Models, which have the greatest transformative potential for the labour market and, at the same time, are best suited to being used as tools for economic and social analysis.

Language models like OpenAI’s GPT3, released in June 2020, have pushed the boundaries of natural language understanding and generation. With 175 billion parameters, GPT3 was one of the largest language models at its release and has demonstrated impressive capabilities in various tasks, such as translation, question answering and text completion. The GPT3 architecture is based on the transformer model, which uses self-attention mechanisms to capture long-range dependencies in text. In early 2023 OpenAI released GPT4, a multimodal model capable of working on both images and text. OpenAI has not disclosed its parameter count, and the widely circulated estimate of around 100 trillion parameters is unconfirmed.
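As a minimal sketch of the self-attention mechanism mentioned above, the following pure-Python example computes scaled dot-product attention over a toy sequence. It is a simplification under stated assumptions: real transformers apply learned query, key and value projections and use many attention heads, whereas here the token vectors are used directly and the numbers are made up.

```python
import math

def softmax(xs):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over a short sequence of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score the query against every position, however far away it is.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy sequence of three 2-dimensional token vectors (illustrative values).
# Assumption: queries, keys and values are the raw tokens, with no learned projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
```

Because every position is scored against every other position in a single step, the distance between two tokens does not limit how strongly one can influence the other.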

In the same months, several open-source LLMs were released: models whose parameters were made public, unlike OpenAI models, which remain proprietary. The main open-source model available at the moment is LLaMA. A key feature of these open-source models is the ability to fine-tune them to specific tasks, achieving cutting-edge results in sentiment analysis, named entity recognition, and text classification. This allows you to exploit the linguistic capabilities of these models in a huge variety of tasks.

Creative processes are closely related to the innovation dynamics observed in many complex systems. In both, one of the characterising elements is the expansion over time of the conceptual space, due both to the inclusion of new and unexpected elements and to autonomous processes linked to the exploration of the space of the possible. In general, modern Artificial Intelligence techniques have achieved extraordinary levels of recombination of the information in their knowledge base. ChatGPT and GPT4 currently manage to produce extremely convincing texts thanks to an enormous amount of training data and technological infrastructures of considerable size. However, although the generative capabilities of these systems are convincing, to date these machines do not possess a specific ability to include new information, or even simply to anticipate its arrival, in a context of continuous learning. We have tried to address this limitation through mixed techniques, such as neural networks combined with reinforcement learning, but this has not led to solutions that can be used in applications on non-stationary systems. The problem is particularly relevant in all applications in which AI must establish a connection with human creativity, whether expressed in an artistic sense or as a process of innovation and development of ideas, in order to serve as a useful and effective tool for supporting natural creative dynamics.

In the context of collaboration with AI tools, text production has a prominent role due to the results obtained in the field of Natural Language Processing (NLP), such as the suggestion of semantically correct alternatives, the analysis of text readability and the generation of texts starting from a user request. This latest technology promises to radically change the writing process by providing an engine for producing texts, summarising them and proposing alternative rhetorical forms. Despite having been opened to the public less than a year ago, it is already used on a wide scale and is already influencing the workflow of many writers.

However, these tools still treat text as a static object. When the NLP community talks about time, it most often means encoding the order of the textual elements within a work, not the writing process itself.

Purpose and Goals

In this project, we aim to analyze the transformative potential of these tools in three directions:

  • Their social and economic impact;
  • Their ability to increase creativity and innovativeness;
  • The possibility of using them as tools to increase our understanding of economic and social phenomena.

We will articulate the research along multiple lines:

  • Analysis of the impact on the labour market. We will study the dynamics of the skills required by the labour market and how they are affected by the availability of new AI tools. Thanks to the forecasting and continuous learning techniques described below, we intend to maintain a “real-time” analysis of these dynamics and integrate it with the production of forecasting scenarios.
  • Use of LLM for the analysis of economic and innovation dynamics. Thanks to the growing availability of open-source LLMs, the automated analysis of enormous quantities of freely available unstructured information, such as websites, scientific publications and patents, becomes possible. One of this project’s objectives is to use these tools, with appropriate fine-tuning, to extract dynamic signals on the processes of technological innovation and economic development.
  • Use of LLM to optimise green transition policies. We can use LLMs to connect policy prescriptions with the actual availability of capabilities in the territories. For example, it is possible to connect patents with European legislation on reducing industrial pollution, or it is possible to reconstruct the need for critical raw materials to implement these policies.
  • Exploring techniques for learning non-stationary systems. As mentioned above, including new information in AI systems is still an unresolved issue in the panorama of modern learning techniques. The main aim in this area is to create systems capable of implementing dynamic learning, which is conservative in memorising previously learned information while quickly including new things in continuous learning processes.
  • Support for artistic creativity. A class of AI systems equipped with the above characteristics can codify and include artistic innovation processes as a non-stationary evolutionary process. In this area, we want to create an AI system capable of understanding, at different levels of complexity, human creative procedures, in order to develop a constructive dialogue for bidirectional human-machine support for creativity.
  • Representation and prediction of evolutionary systems. The study of specific evolutionary areas, such as the diffusion of AI in different research areas or the development of technological codes in the case of technological patents, allows us to understand the historical nature of these evolutions and anticipate the next areas that these processes will involve. In this case, the AI, in addition to incorporating new items in the historical series, must be able to build probabilistic scenarios that are compatible with the past and plausible as future predictions.
  • Creativity in collective writing processes in a hybrid person-AI context. We will study the innovativeness of generative artificial intelligence and the consequences of its recurring use in writing texts, such as the reduction of surprise, a quality of a time series that describes how unexpected a given element is given the previous ones. We will also study work models for the hybrid person-AI creative process, in terms of the organisation of work phases and the distribution of objectives, aimed at improving the creative activity in the terms defined by the user (increasing the final quality of the product, enjoyment of the process, greater probability of innovation, etc.), and the creation of AI tools suitable for coordinating and curating collaborative writing processes.
  • AI to enhance economic analysis: e.g. BloombergGPT
  • LLM for innovation analysis (patents, papers)
  • LLM for web scraping of economic activity data (services, job ads) – with IFC, AirBnB, Translated.com
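The notion of surprise used in the collective-writing item above can be made concrete as surprisal: the negative log-probability of an element given the ones before it. The sketch below is an illustration only. It assumes an add-one-smoothed bigram model as a deliberately simple stand-in for a language model, and the function name and toy corpus are hypothetical.

```python
import math
from collections import Counter

def bigram_surprisal(tokens, training_tokens):
    """Surprisal, in bits, of each token given the token before it.

    Assumption: probabilities come from add-one-smoothed bigram counts,
    a toy substitute for the language models discussed in the text.
    """
    vocab = set(training_tokens) | set(tokens)
    pair_counts = Counter(zip(training_tokens, training_tokens[1:]))
    context_counts = Counter(training_tokens[:-1])
    result = []
    for prev, cur in zip(tokens, tokens[1:]):
        p = (pair_counts[(prev, cur)] + 1) / (context_counts[prev] + len(vocab))
        result.append((cur, -math.log2(p)))
    return result

# Made-up training text and test sentence.
training = "the cat sat on the mat the cat sat on the rug".split()
scores = bigram_surprisal("the cat sat on the piano".split(), training)
```

In this toy corpus, "piano" after "the" receives a higher surprisal than "cat" after "the", because the first pair was never observed. A text whose surprisal stays uniformly low is, in this sense, less unexpected.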

Contents and Methods

Analysis of technological and innovation dynamics. The process of knowledge exploration is widely documented through the production of scientific articles, patents and code repositories. The vastness and variety of this production make it impossible to trace the dynamics of innovation with traditional methods, except for macroscopic trends. However, detailed knowledge of these dynamics makes it possible to anticipate imminent discoveries. Using LLMs, we will build mathematical representations (embeddings) of the innovation concepts expressed in documents (scientific articles, patents, code repositories). We will study their dynamics and design prediction models based on semantic proximity.
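To illustrate what semantic proximity between documents means in practice, here is a minimal sketch that ranks documents by cosine similarity to a query. The `embed` function is a hypothetical stand-in: it builds bag-of-words count vectors, whereas the project described here would use embeddings produced by an LLM. The document ids and texts are invented.

```python
import math
from collections import Counter

def embed(text):
    # Hypothetical stand-in for an LLM embedding: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest(query, documents):
    """Return document ids ordered from most to least similar to the query."""
    q = embed(query)
    return sorted(documents,
                  key=lambda doc_id: cosine(q, embed(documents[doc_id])),
                  reverse=True)

# Invented examples of the three document types named in the text.
documents = {
    "patent_a": "lithium battery electrode for electric vehicles",
    "paper_b": "transformer language model for text generation",
    "repo_c": "battery management software for electric vehicles",
}
ranking = nearest("solid state lithium battery", documents)
```

With dense LLM embeddings the same ranking logic applies, but proximity would reflect meaning rather than shared words, so documents could be close without sharing any vocabulary.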

Mapping of green technological skills. Techniques for reducing polluting emissions are the subject of numerous regulations. In particular, the European legislation on the reduction of industrial emissions is considered a reference standard at a global level. We aim to use LLM to connect the technical specifications detailed in European legislation with the production of patents. This will allow detailed mapping of these processes’ technological footprints and the related skills’ geographical availability.

 

Critical raw materials. We aim to integrate studies on the sustainable transition with innovative artificial intelligence methodologies, building on recent studies that connect the production of individual technologies (in particular for adaptation to and mitigation of climate change; de Cunzo et al., 2022) with the export of products (Pugliese et al., 2019). In particular, we intend to study critical raw materials (CRMs) in technologies for adaptation to and mitigation of climate change by analysing the abstracts of patents in the Y02-Y04S technological fields with large language models: natural language processing (NLP) systems with billions of parameters that have demonstrated new abilities to answer reading comprehension questions, generate creative texts and solve mathematical problems. The diffusion of green technologies, e.g. solar panels, wind turbines, electric vehicles and energy-efficient lighting, is a fundamental step towards achieving policy objectives on climate change mitigation. Still, it involves a significant expansion of the production and trade of critical raw materials that are fundamental for their functioning and are currently irreplaceable (International Energy Agency, 2021; Herrington, 2021; Hund et al., 2020; Kowalski and Legendre, 2023). CRMs are raw materials of strategic importance and high supply risk identified by the European Commission, which regularly updates a list of these resources linked to today’s technology (European Commission, 2011, 2020a, 2020b). Our work could provide a comprehensive descriptive empirical analysis of the presence of CRMs in green technologies, initially using text mining techniques on green patent descriptions and then with more complex language models. Combining this result with information on the countries where patents are filed makes it possible to identify which green technologies rely most on CRMs and where they are employed.
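The initial text-mining step described above could look roughly like the following sketch, which counts mentions of a few materials in patent abstracts and groups them by filing country. The term list is an illustrative subset, not the European Commission's official list, and the patent records and function names are invented for the example.

```python
import re
from collections import Counter

# Illustrative subset of critical raw materials; NOT the official EU list.
CRM_TERMS = ["lithium", "cobalt", "graphite", "silicon", "rare earth"]

def crm_mentions(abstract):
    """Return the set of CRM terms mentioned in one patent abstract."""
    text = abstract.lower()
    return {term for term in CRM_TERMS
            if re.search(r"\b" + re.escape(term) + r"s?\b", text)}

def crm_by_country(patents):
    """Count, for each filing country, how many patents mention each material."""
    counts = {}
    for country, abstract in patents:
        counts.setdefault(country, Counter()).update(crm_mentions(abstract))
    return counts

# Made-up records: (filing country, abstract text).
patents = [
    ("DE", "A cathode comprising lithium and cobalt oxides for vehicle batteries."),
    ("DE", "Wind turbine generator using rare earth permanent magnets."),
    ("JP", "Anode material based on graphite and silicon composites."),
]
table = crm_by_country(patents)
```

Keyword matching of this kind cannot tell whether a material is an input to the technology or merely mentioned, which is one reason the text proposes moving to language models afterwards.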

Furthermore, taking into account production data for critical raw materials, it will be possible to better articulate the dependence of green technologies on these materials, for example by including a concentration index of their production, and to map the spatial distribution of these inputs and compare it with that of the countries where CRM-dependent green technologies are employed. Preliminary analyses indicate that materials such as lithium, silicon, rare earths, cobalt and graphite are critical components for the development of green technologies (in particular, energy generation and transmission technologies and those related to the production or processing of goods) and require careful monitoring because their production is poorly diversified. Finally, we see a clear divergence between the producers of the raw materials necessary for developing green technologies and the countries developing these technologies. More advanced AI methodologies will allow us to refine the preliminary results and to distinguish patents in which materials are inputs necessary for building the technology from patents in which materials are mentioned because the invention aims to remove environmental pollution or to recycle CRMs.
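The text does not say which concentration index is intended. As one common choice, and purely as an assumption, the sketch below computes the Herfindahl-Hirschman index (the sum of squared production shares) for two invented production profiles.

```python
def herfindahl_index(production_by_country):
    """Herfindahl-Hirschman index: sum of squared production shares.

    Ranges from 1/n (n equal producers) to 1 (a single producer).
    Assumption: this is one possible concentration index, not necessarily
    the one the project will adopt.
    """
    total = sum(production_by_country.values())
    if total <= 0:
        raise ValueError("total production must be positive")
    return sum((v / total) ** 2 for v in production_by_country.values())

# Made-up production figures in arbitrary units, for illustration only.
concentrated = {"A": 90.0, "B": 5.0, "C": 5.0}
diversified = {"A": 40.0, "B": 30.0, "C": 30.0}
hhi_concentrated = herfindahl_index(concentrated)
hhi_diversified = herfindahl_index(diversified)
```

A material whose production is dominated by one country scores close to 1, which is the situation the paragraph above describes as poor diversification.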


The impact of AI on the labour market. Job advertisement data is considered one of the best tools for observing the dynamics of employer demand for skills. These advertisements are collected in large databases of poorly structured data. Using language modelling tools, it is possible to extract information from these advertisements, such as the sector to which the advertisement relates, the job and the skills required. In this project, we will use these databases to study (also in real time) how the emergence and diffusion of AI tools is changing the demand for skills at a geographical and sectoral level.

 
Web scraping for services. Unlike data on trade in goods, data on trade in services are poorly available, not harmonised and generally unreliable. This severely limits economic analysis, especially in the area of economic complexity. In collaboration with Translated.com, the world’s largest company in the translation market, based in Rome, we intend to use LLM to develop an analytical pipeline to identify and geolocate service providers through systematic scraping of large databases of web pages (CommonCrawl). This activity will also involve the International Finance Corporation, with which the CREF has signed a Memorandum of Understanding and which has expressed interest in the possibilities of analysing the service economy in emerging markets. A first pilot project will be carried out in Cape Verde.

Dreaming Learning. This AI learning technique is currently being developed and finalised. It makes it possible to endow a generative neural network with the property of anticipating new information that may arrive at the input of such systems, allowing new features to be very easily incorporated into the existing knowledge base. Furthermore, its theoretical description will enable us to establish a precise link with Bayesian approaches for automatically describing appropriate priors. These characteristics allow the machine to generalise the training information in a very effective way while at the same time focusing attention on the relevant innovations in terms of their impact on the non-stationary temporal sequence under analysis.

Evolutionary neural systems. The advantage demonstrated in the application of Dreaming Learning leads us to believe that neural systems equipped with architectures less tied to the feed-forward model and in which information can circulate more freely within the network are more effective in understanding innovation dynamics flexibly and adaptively. In this direction, we intend to develop new AI systems based on evolutionary dynamics, in which the architecture can adapt effectively and naturally to the proposed problem.


WeWrite. An innovative collaborative non-linear writing platform that facilitates the translation of complex ideas into usable text forms, such as scientific articles and pieces of fiction, but also navigable text networks, as in the case of interactive games. The platform itself, or the skills developed in its construction, can form the basis for a study of the distribution of work in a hybrid writing environment, suggesting easily explorable alternatives and allowing us to reconstruct the history of the creative process and to credit the work of the various participants, human or AI.

Metrics for identifying creative phases in writing processes. New work on the writing process enables us to use time-ordered sequences of texts to identify the amount of exploration, in terms of ideas and rhetorical resolutions, that the writing team undertakes between one version of the text and the next. It is possible to develop a writing assistant that is aware of the creative process and able to provide guidance based on the writer’s path, using these techniques in conjunction with new generative artificial intelligence tools.
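As a rough illustration of measuring change across time-ordered versions of a text, the sketch below scores each pair of consecutive versions by how much of the word sequence was rewritten. This is a hypothetical proxy built on Python's standard `difflib` module, not the metric from the work referred to above, which would also need to account for ideas and rhetorical choices rather than surface edits alone.

```python
from difflib import SequenceMatcher

def exploration_profile(versions):
    """Word-level change between consecutive versions.

    Each score runs from 0 (identical) to 1 (nothing in common).
    Assumption: surface edit distance stands in for creative exploration.
    """
    profile = []
    for before, after in zip(versions, versions[1:]):
        similarity = SequenceMatcher(None, before.split(), after.split()).ratio()
        profile.append(1.0 - similarity)
    return profile

# Invented drafts: a small polish followed by a change of direction.
versions = [
    "The robot opens the door.",
    "The robot slowly opens the door.",
    "A storm tears the roof off the house.",
]
profile = exploration_profile(versions)
```

In this toy example the second transition scores much higher than the first, separating a light revision from a rewrite. A writing assistant could use such a signal to detect when a team has moved from refinement to exploration.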

IFC – International Finance Corporation

Translated.com

Mamacrowd

Sony CSL Rome participates in the STARTS-AIR project at the crossroads of Art and Science.

Aterballetto, for the development of performative artistic interaction technologies between performers and AI systems to support natural creativity.

Linguistics/Centre for Behaviour and Evolution (Christine Cuskley)

CSL-Rome is part of the French “ScientIA” project, which studies the impact of Artificial Intelligence on other scientific disciplines.

CREF Researchers

Andrea Tacchella, Angelica Sbardella, Francesco De Cunzo (CREF)

Sony Researchers 

Alessandro Londei, Vittorio Loreto, Ruggiero Lo Sardo (CREF – SONY)

De Cunzo, F., Petri, A., Zaccaria, A., & Sbardella, A. (2022). The trickle down from environmental innovation to productive complexity. Scientific Reports, 12(1), 22141.
Napolitano, L., Sbardella, A., Consoli, D., Barbieri, N., & Perruchas, F. (2022). Green innovation and income inequality: A complex system analysis. Structural Change and Economic Dynamics, 63, 224-240.
Caldarola, B., Grazzi, M., Occelli, M., & Sanfilippo, M. (2023). Mobile internet, skills and structural transformation in Rwanda. Accepted in Research Policy.
Bruno, M., Lambiotte, R., & Saracco, F. (2022). Brexit and bots: characterizing the behaviour of automated accounts on Twitter during the UK election. EPJ Data Science, 11(1), 17.
Patuelli, A., & Saracco, F. (2023). Sustainable development goals as unifying narratives in large UK firms’ Twitter discussions. Scientific Reports, 13(1), 7017. 

© Centro Studi e Ricerche Enrico Fermi. All rights reserved