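Modelos de sistemas de recomendaciones basados en lógica difusa, altmetría y aprendizaje automático. Aplicación al Boletín Oficial del Estado [Recommender system models based on fuzzy logic, altmetrics and machine learning. Application to the Boletín Oficial del Estado]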

  1. Bailón Elvira, Juan Carlos
Supervised by:
  1. Antonio Gabriel López Herrera Director

Defence university: Universidad de Granada

Defence date: 13 November 2024

Committee:
  1. Jesús Serrano Guerrero Chair
  2. María Luque Rodriguez Secretary
  3. Carlos Gustavo Porcel Gallego Committee member

Type: Thesis

Abstract

The exponential growth of digital resources available online has been driven largely by technological advances and by growing global accessibility, which allows immediate access to these resources. Most modern devices are integrated into an interconnected network known as the Internet of Things (IoT). In addition to fulfilling their primary functions, these devices generate and collect data that various systems process to extract valuable information, often exploited for commercial purposes. In this broad ecosystem of devices and systems, Information Retrieval Systems (IRS) play a crucial role. IRS underpin many daily activities, such as using search engines like Google or Yahoo, scheduling appointments with public administrations, or consulting documents on institutional websites or corporate intranets. These systems filter information based on user queries, comparing each query against large databases to generate ranked lists of relevant results. However, the sheer volume of information and the increasingly rapid rate at which it is generated pose a significant challenge for both IRS and users: a single search can return so many results that thoroughly reviewing every item becomes impractical.
To address this problem, Recommendation Systems (RS) have emerged, and they form the core of this research. RS automatically perform the filtering that the user would otherwise have to carry out manually. This thesis outlines the various types of RS and the techniques used to provide personalised and relevant recommendations to users. One of the main challenges faced by both IRS and RS is the application of excessively rigid filters, which can wrongly exclude or include results. In this context, fuzzy logic emerges as a promising solution, offering a more flexible way of managing filters.
Fuzzy logic allows intermediate degrees of relevance to be handled, meaning that a resource can be considered more or less interesting depending on its degree of relevance to a specific user. When the degree of relevance is high, the system prioritises that resource over less relevant ones, which improves the sensitivity of the filtering and adapts it better to the user’s needs. Moreover, the quality of the data used in the filtering process is essential in RS. Traditionally, these systems have relied on data intrinsic to the resources. However, social networks and online opinions now significantly influence consumption decisions. Platforms like YouTube, TikTok, Instagram, and X (formerly known as Twitter) generate a large volume of content subject to debate and public review. The opinions of other users (for instance, when searching for a restaurant during a trip) have a significant impact on our decisions. It has therefore become very important for RS to incorporate these additional data in order to offer more precise and pertinent recommendations. To face this challenge, this thesis proposes the integration of altmetrics, a concept introduced in 2011 in the field of bibliometrics, which advocates the use of metrics that are alternatives to traditional ones such as citations and impact indices. In line with this philosophy, the thesis suggests enriching information retrieval and recommendation systems with data obtained from various additional sources, in order to improve the filtering and ranking of resources. Furthermore, the thesis proposes the design of a multipurpose RS based on a multi-agent approach combined with fuzzy logic and altmetrics. In this model, agents are independent modules that the user can activate or deactivate to customise their own RS. Each agent has specific configuration parameters that allow the system to be adapted to the individual needs of the user and reconfigured as requirements change.
Combined with the flexibility offered by fuzzy logic, this modular design allows the recommendation results to adapt better to the user’s needs. The objects in the system are enriched with data extracted from various external systems, which supply value metrics for those objects. This RS model is applied to the Official State Gazette (Boletín Oficial del Estado, BOE), the official source of legislative publications at the state level, responsible for publishing all decisions approved by the Congress of Deputies. The BOE was chosen as a case study because of the documentary problems it presents, which this thesis addresses by applying machine learning algorithms to improve the documentary descriptions of its documents and, in turn, to enhance the recommendations and optimise access to the published information.