Classification techniques for airfare prediction (original title: Técnicas de clasificación para la predicción de tarifas aéreas)

  1. Barrón Ortiz, Marco Antonio
Supervised by:
  1. Sebastián Ventura Soto (supervisor)
  2. José María Luna Ariza (co-supervisor)

Defense university: Universidad de Córdoba (ESP)

Defense date: 30 March 2023

Examination committee:
  1. Pedro González García (chair)
  2. Amelia Zafra Gómez (secretary)
  3. José María Moyano Murillo (member)

Type: Thesis

Abstract

1. Introduction and motivation of the thesis: This work focuses on two multifactorial problems that commercial airlines face: price wars and the creation of a dynamic discount table. First, within the airline industry, pricing and revenue teams spend a considerable amount of time analyzing and interpreting their competitors' actions. Most of the time, these analysts must rely on their own skills to carry out a series of ad hoc analyses that let them interpret or find patterns in airfares. Implementing automatic methodologies is key to reducing this time and avoiding human error. This thesis proposes a new methodology to predict, analyze, and interpret airline fares that is able to imitate the manual processes carried out by pricing teams. To face the price war, a gene expression programming algorithm is proposed that imitates the manual process carried out by analyst teams by automatically adding new features or attributes. To demonstrate the capability of the methodology, a real scenario was considered using fares published by Air Canada between December 2019 and January 2020, corresponding to a travel period between December 2019 and April 2020. Second, the thesis addresses the problem of creating a dynamic offer table: historically, airlines around the world have used static pricing structures, which are restricted to discrete price points and offer limited segmentation among passengers.
Because of these limitations and restrictions, there is a strong need for novel methods to estimate willingness to pay and to identify potential passengers whose probability of booking a flight increases if they receive a discount, with the aim of increasing revenue through higher airfare sales. A grammatical evolution algorithm is proposed that works as a feature selector to extract the best subgroups by analyzing passengers' booking behavior. The experimental analysis considered a real scenario using private data from a world-class commercial airline.
2. Research content: This doctoral thesis proposes a methodology aimed at the effective classification of airfares and at obtaining a highly interpretable model. The methodology has been tested experimentally with real fares released to the public by Air Canada. Automating a classification model that is easy to interpret can generate revenue, avoid human error, and reduce the time spent on this task manually, giving pricing analysts a clear view of what is happening in the market. This thesis also proposes a second methodology, focused on creating a dynamic offer recommendation system that serves as a highly interpretable model based on the subgroup discovery (SD) task. This model identifies subgroups of interest for price adjustment based on passengers' specific characteristics, with the main goal of increasing flight bookings made through a website. The recommendation system has been tested experimentally with real, private data belonging to a commercial airline.
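
As a rough illustration of the feature-construction idea in the first methodology, the following Python sketch decodes a gene written in Karva notation (the breadth-first encoding commonly used in gene expression programming) into an expression tree and evaluates it on a fare record to produce a new attribute. The abstract does not describe the thesis's actual encoding, function set, or attributes; the function set and the attribute names `own_fare` and `rival_fare` are assumptions made only for this example.

```python
from collections import deque
import operator

# Assumed function set: symbol -> (arity, implementation). Not taken from the thesis.
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def decode_karva(gene):
    """Decode a Karva-notation gene breadth-first into a [symbol, children] tree."""
    symbols = iter(gene)
    root = [next(symbols), []]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Terminals (attribute names) have arity 0 and take no children.
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols), []]
            node[1].append(child)
            queue.append(child)
    return root


def evaluate(node, record):
    """Evaluate an expression tree on one record (a dict of attribute values)."""
    symbol, children = node
    if symbol in FUNCTIONS:
        _, fn = FUNCTIONS[symbol]
        return fn(*(evaluate(child, record) for child in children))
    return record[symbol]


# Head of length 3 followed by a tail of 4 terminals; the last two symbols
# are not reached during decoding and stay in the non-coding region.
gene = ["/", "-", "own_fare", "own_fare", "rival_fare", "rival_fare", "own_fare"]
tree = decode_karva(gene)  # encodes (own_fare - rival_fare) / own_fare

fare = {"own_fare": 200.0, "rival_fare": 150.0}  # hypothetical fare record
relative_gap = evaluate(tree, fare)
print(relative_gap)
```

In a setup like the one the abstract outlines, each evolved expression of this kind would add one derived column to the dataset before it is passed to the rule-based classifier; how candidate genes are scored and selected is not covered by this sketch.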
3. Conclusion: The main conclusions drawn from the work carried out in this thesis are the following:
1. A search and review of the literature on the methods used to deal with price wars and on dynamic pricing methods in the airline industry turned up a number of interesting works. This indicates that both problems addressed in this thesis are current and of great importance to commercial airlines. In fact, implementing automated methodologies capable of producing highly interpretable models is vitally important for pricing and revenue analyst teams, because they can reduce the number of human errors, increase the capacity and speed of analysis, and support the extraction of interesting patterns or rules.
2. Predicting airfares and extracting knowledge from them is very difficult, mainly for two reasons: the large number of fares to analyze and the constant changes that occur daily. To overcome these two difficulties, this thesis proposes a methodology that integrates a gene expression programming (GEP) algorithm, which is able to imitate the data cleaning and transformation tasks that pricing analysts at an airline often perform. The algorithm thus creates transformed datasets, which feed a classification algorithm that predicts the class each fare belongs to and extracts a set of rules, creating an easily interpretable model that pricing analysts can use. The methodology showed an improvement in both the classification metrics and the interpretability metric, and was able to generate a highly interpretable model.
Esta metodología fue probada con un conjunto de datos reales que han sido publicados por la aerolínea bandera de Canadá (Air Canada). 3. La identificación de individuos cuya probabilidad de realizar una reserva de vuelo aumente si recibe un descuento mientras hace una búsqueda de tarifas en una página web es una tarea de enorme interés para las aerolíneas. En esta tesis doctoral, se ha propuesto una metodología que tiene como finalidad la creación de un repositorio de reglas que permitan identificar subgrupos de pasajeros, los cuales la probabilidad de efectuar una reserva de vuelo aumenta si estos reciben una oferta. Dentro de esta metodología, se ha propuesto un algoritmo de gramáticas evolutivas (GE) el cual funciona como un selector de características, creando diversos conjuntos de datos que alimentan a un algoritmo de SD para la extracción de reglas, generando un repositorio con los mejores subgrupos el cual funciona como una tabla de ofertas dinámicas y como un modelo de interpretación a la vez. La metodología demostró ser capaz de generar un repositorio de reglas únicas, evitando la redundancia de las mismas; así mismo, estos subgrupos descubiertos muestran pasajeros cuya probabilidad de efectuar una reserva aumenta si estos reciben una oferta. La metodología fue probada en un escenario real utilizando datos privados de una aerolínea comercial 4. Las dos metodologías propuestas demostraron que la utilización de técnicas de DM, como son los algoritmos de clasificación basados en reglas y los métodos de SD, en conjunto con algún tipo de algoritmo evolutivo, pueden ser muy eficientes para resolver problemas a los cuales se enfrentan las aerolíneas actualmente. 5. Este trabajo explica como se pueden crear modelos de alta interpretación, utilizando métodos de clasificación y métodos de SD, los cuales pueden ayudar a la generación de ganancias dentro de un entorno comercial altamente competitivo. 4. Bibliografía: [1] Abellán, J. and Moral, S. (2003). 
Building classification trees using the total uncertainty criterion. International Journal of Intelligent Systems, 18(12):1215¿1225. [2] Aggarwal, C. C. (2015). Data mining: the textbook. Springer. [3] Aggarwal, C. C., Bhuiyan, M. A., and Hasan, M. A. (2014). Frequent pattern mining algorithms: A survey. In Frequent pattern mining, pages 19¿64. Springer. [4] Aha, D. W., Kibler, D., and Albert, M. K. (1991). Instance-based learning algorithms. Machine learning, 6(1):37¿66. [5] Alshammari, M. and Mezher, M. (2020). A comparative analysis of data mining techniques on breast cancer diagnosis data using weka toolbox. (IJACSA) International Journal of Advanced Computer Science and Applications, 8:224¿229. [6] Angeline, P. J. (1994). Genetic programming: On the programming of computers by means of natural selection: John r. koza, a bradford book, mit press, cambridge ma, 1992, isbn 0-262-11170-5, xiv+ 819pp., us 55,00. [7] Archak, N., Ghose, A., and Ipeirotis, P. G. (2011). Deriving the pricing power of product features by mining consumer reviews. Management science, 57(8):1485¿1509. [8] Arnoud, V. (2015). den boer. Dynamic pricing and learning: Historical origins, current research, and new directions. Surveys in Operations Research and Management Science, 20(1):1¿18. [9] Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., García, S., Gil-López, S., Molina, D., Benjamins, R., et al. (2020). Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges toward responsible ai. Information fusion, 58:82¿115. [10] Atzmueller, M. (2015). Subgroup discovery. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 5(1):35¿49. [11] Atzmueller, M., Atzmueller, M. M., and Java, S. (2021). Package `rsubgroup¿. [12] Atzmueller, M. and Puppe, F. (2006). Sd-map¿a fast algorithm for exhaustive subgroup discovery. In European Conference on Principles of Data Mining and Knowledge Discovery, pages 6¿17. 
Springer. [13] Bay, S. D. and Pazzani, M. J. (2001). Detecting group differences: Mining contrast sets. Data mining and knowledge discovery, 5(3):213¿246. 100 Bibliografía [14] Begum, S. A. and Devi, O. M. (2011). Fuzzy algorithms for pattern recognition in medical diagnosis. Assam University Journal of Science and Technology, 7(2):1¿12. [15] Bel, L., Allard, D., Laurent, J.-M., Cheddadi, R., and Bar-Hen, A. (2009). Cart algorithm for spatial data: Application to environmental and ecological data. Computational Statistics & Data Analysis, 53(8):3082¿3093. [16] Bernardo, J., Bayarri, M., Berger, J., Dawid, A., Heckerman, D., Smith, A., West, M., et al. (2003). The variational bayesian em algorithm for incomplete data: with application to scoring graphical model structures. Bayesian statistics, 7(453-464):210. [17] Bertsimas, D. and Tsitsiklis, J. (1993). Simulated annealing. Statistical science, 8(1):10¿ 15. [18] Beyer, H.-G. and Schwefel, H.-P. (2002). Evolution strategies¿a comprehensive introduction. Natural computing, 1(1):3¿52. [19] Bhargava, N., Jain, A., Kumar, A., and Le, D.-N. (2017). Detection of malicious executables using rule based classification algorithms. In ICITKM, pages 35¿38. [20] Bousquet, O., von Luxburg, U., and Rätsch, G. (2004). Advanced lectures on machine learning. vol. 3176 of lecture notes in computer science. [21] Boutorh, A. and Guessoum, A. (2014). Grammatical evolution association rule mining to detect gene-gene interaction. In BIOINFORMATICS, pages 253¿258. [22] Brabazon, A. (2018). Grammatical evolution in finance and economics: A survey. Handbook of Grammatical Evolution, pages 263¿288. [23] Brabazon, A., Matthews, R., O¿Neill, M., and Ryan, C. (2002). Grammatical evolution and corporate failure prediction. In Proceedings of the 4th Annual Conference on Genetic and Evolutionary Computation, pages 1011¿1018. [24] Breiman, L., Friedman, J., Stone, C., and Olshen, R. (1984). Classification and regression trees chapman & hall. 
New York. [25] Bremermann, H. J. (1958). The evolution of intelligence: The nervous system as a model of its environment. University of Washington, Department of Mathematics. [26] Bremermann, H. J. et al. (1962). Optimization through evolution and recombination. Self-organizing systems, 93:106. [27] Bremermann, H. J. and Rogson, M. (1964). An evolution-type search method for convex sets. Technical report, CALIFORNIA UNIV BERKELEY. [28] Cannon, W. et al. (1932). The wisdom of the body norton. New York, NY. [29] Chaudhari, A. and Mulay, P. (2019). A bibliometric survey on incremental clustering algorithm for electricity smart meter data analysis. Iran Journal of Computer Science, 2(4):197¿206. Bibliografía 101 [30] Chen, L., Mislove, A., and Wilson, C. (2016). An empirical analysis of algorithmic pricing on amazon marketplace. In Proceedings of the 25th international conference on World Wide Web, pages 1339¿1349. [31] Chen, M. K. and Sheldon, M. (2016). Dynamic pricing in a labor market: Surge pricing and flexible work on the uber platform. Ec, 16:455. [32] Chen, Y., Li, F., and Fan, J. (2015). Mining association rules in big data with ngep. Cluster Computing, 18(2):577¿585. [33] Chokkalingam, S. P. and Komathy, K. (2013). Comparison of different classifier in weka for rheumatoid arthritis. In 2013 International Conference on Human Computer Interactions (ICHCI), pages 1¿6. [34] Clark, P. and Niblett, T. (1989). The cn2 induction algorithm. Machine learning, 3(4):261¿ 283. [35] Cohen, W. W. (1995). Fast effective rule induction. Machine learning proceedings 1995, pages 115¿123. [36] Cramer, N. L. (1985). A representation for the adaptive generation of simple sequential programs. In proceedings of an International Conference on Genetic Algorithms and the Applications, pages 183¿187. [37] Damanik, I. S., Windarto, A. P., Wanto, A., Andani, S. R., Saputra, W., et al. (2019). Decision tree optimization in c4. 5 algorithm using genetic algorithm. 
In Journal of Physics: Conference Series, volume 1255, page 012012. IOP Publishing. [38] Dehuri, S. and Cho, S.-B. (2008). Multi-objective classification rule mining using gene expression programming. In 2008 Third International Conference on Convergence and Hybrid Information Technology, volume 2, pages 754¿760. IEEE. ESTUDIOS DE DOCTORADO Página 5 de 12 [39] Den Boer, A. V. (2015). Dynamic pricing and learning: historical origins, current research, and new directions. Surveys in operations research and management science, 20(1):1¿18. [40] Dong, G. and Bailey, J. (2012). Contrast data mining: concepts, algorithms, and applications. CRC Press. [41] Došilovic, F. K., Br ¿ ci¿ c, M., and Hlupi ¿ c, N. (2018). Explainable artificial intelligence: A ¿ survey. In 2018 41st International convention on information and communication technology, electronics and microelectronics (MIPRO), pages 0210¿0215. IEEE. [42] Duan, L., Tang, C., Tang, L., Zuo, J., and Zhang, T. (2009). An effective microarray data classifier based on gene expression programming. In 2009 Fifth International Conference on Natural Computation, volume 4, pages 523¿527. IEEE. [43] Duffy, J. and Engle-Warnick, J. (2002). Using symbolic regression to infer strategies from experimental data. In Evolutionary computation in Economics and Finance, pages 61¿82. Springer. [44] Eftimov, T. and Korošec, P. (2019). A novel statistical approach for comparing metaheuristic stochastic optimization algorithms according to the distribution of solutions in the search space. Information Sciences, 489:255¿273. 102 Bibliografía [45] Elmaghraby, W. and Keskinocak, P. (2003). Dynamic pricing in the presence of inventory considerations: Research overview, current practices, and future directions. Management science, 49(10):1287¿1309. [46] Farris, P. W., Bendle, N., Pfeifer, P. E., and Reibstein, D. (2010). Marketing metrics: The definitive guide to measuring marketing performance. Pearson Education. [47] Faruqui, A. 
and Palmer, J. (2011). Dynamic pricing and its discontents. Regulation, 34:16. [48] Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., et al. (1996). Knowledge discovery and data mining: Towards a unifying framework. In KDD, volume 96, pages 82¿88. [49] Ferreira, C. (2002a). Discovery of the boolean functions to the best density-classification rules using gene expression programming. In European Conference on Genetic Programming, pages 50¿59. Springer. [50] Ferreira, C. (2002b). Gene expression programming in problem solving. In Soft computing and industry, pages 635¿653. Springer. [51] Ferreira, C. (2006). Gene expression programming: mathematical modeling by an artificial intelligence, volume 21. Springer. [52] Fiig, T., Le Guen, R., and Gauchet, M. (2018). Dynamic pricing of airline offers. Journal of Revenue and Pricing Management, 17(6):381¿393. [53] Fogel, D. B. (1995). Evolutionary computation: Toward a new philosophy of machine intelligence, institute of electrical and electronics engineers. Inc, New York, pages 155¿171. [54] Fogel, D. B. (1998a). Artificial intelligence through simulated evolution. Wiley-IEEE Press. [55] Fogel, D. B. (1998b). Evolutionary computation. the fossil record. selected readings on the history of evolutionary computation. Classifier Systems. [56] Fogel, D. B. (2006). Foundations of evolutionary computation. In Modeling and Simulation for Military Applications, volume 6228, page 622801. International Society for Optics and Photonics. [57] Fogel, D. B., Fogel, L. J., and Atmar, J. W. (1991). Meta-evolutionary programming. In Conference record of the twenty-fifth asilomar conference on signals, systems & computers, pages 540¿541. IEEE computer Society. [58] Fogel, G. B. and Corne, D. W. (2003). An introduction to evolutionary computation for biologists. In Evolutionary Computation in Bioinformatics, pages 19¿38. Elsevier. [59] Fournier-Viger, P., Lin, J. C.-W., Kiran, R. U., Koh, Y. S., and Thomas, R. (2017). 
A survey of sequential pattern mining. Data Science and Pattern Recognition, 1(1):54¿77. [60] Fournier-Viger, P., Yang, P., Kiran, R. U., Ventura, S., and Luna, J. M. (2021). Mining local periodic patterns in a discrete sequence. Information Sciences, 544:519¿548. [61] Frank, E. and Witten, I. H. (1998). Generating accurate rule sets without global optimization. Bibliografía 103 [62] Fraser, A. (1968). The evolution of purposive behavior. purposive systems. [63] Friedman, J., Hastie, T., Tibshirani, R., et al. (2000). Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Annals of statistics, 28(2):337¿407. [64] Fürnkranz, J. and Kliegr, T. (2015). A brief overview of rule learning. In International symposium on rules and rule markup languages for the semantic web, pages 54¿69. Springer. [65] Fürnkranz, J., Gamberger, D., and Lavrac, N. (2012). Foundations of Rule Learning. Cognitive Technologies. Springer. [66] García, S., Fernández, A., Luengo, J., and Herrera, F. (2009). A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Computing, 13(10):959. [67] García, S., Luengo, J., and Herrera, F. (2015). Data Preprocessing in Data Mining, volume 72 of Intelligent Systems Reference Library. Springer. [68] Georgoulas, G., Gavrilis, D., Tsoulos, I. G., Stylios, C., Bernardes, J., and Groumpos, P. P. (2007). Novel approach for fetal heart rate classification introducing grammatical evolution. Biomedical Signal Processing and Control, 2(2):69¿79. [69] Goldberg, D. E. and Holland, J. H. (1988). Genetic algorithms and machine learning. [70] Goodman, B. and Flaxman, S. (2017). European union regulations on algorithmic decisionmaking and a ¿right to explanation¿. AI magazine, 38(3):50¿57. [71] Gross, J. and Groß, J. (2003). Linear regression, volume 175. Springer Science & Business Media. [72] Gunning, D. (2017). 
Explainable artificial intelligence (xai). Defense advanced research projects agency (DARPA), nd Web, 2(2):1. [73] Hall, P. (2018). On the art and science of machine learning explanations. arXiv preprint arXiv:1810.02909. [74] Haris NA, Abdullah M, O. A. R. F. (2014). Optimization and data mining for decision making. In In 2014 World Congress on Computer Applications and Information Systems (WCCAIS), volume 10, pages 1¿4. IEEE. [75] Harsoor, A. S. and Patil, A. (2015). Forecast of sales of walmart store using big data applications. International Journal of Research in Engineering and Technology, 4(6):51¿59. [76] Hegland, M. (2007). The apriori algorithm¿a tutorial. Mathematics and computation in imaging science and information processing, pages 209¿262. [77] Heil, O. P. and Helsen, K. (2001). Toward an understanding of price wars: Their nature and how they erupt. International Journal of Research in Marketing, 18(1-2):83¿98. [78] Hernández Orallo, J., Ramírez Quintana, M. J., and Ferri Ramírez, C. (2004). Introducción a la Minería de Datos. Pearson Educación. 104 Bibliografía [79] Herrera, F., Carmona, C. J., González, P., and Del Jesus, M. J. (2011). An overview on subgroup discovery: foundations and applications. Knowledge and information systems, 29(3):495¿525. [80] Hofer, C., Windle, R. J., and Dresner, M. E. (2008). Price premiums and low cost carrier competition. Transportation Research Part E: Logistics and Transportation Review, 44(5):864¿882. [81] Holland, J. H. (1992). Genetic algorithms. Scientific american, 267(1):66¿73. ESTUDIOS DE DOCTORADO Página 7 de 12 [82] Holland, J. H. et al. (1992). Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press. [83] Holte, R. C. (1993). Very simple classification rules perform well on most commonly used datasets. Machine learning, 11(1):63¿90. [84] Hornik, K., B. C. . Z. A. (2009). Open-source machine learning: R meets weka. 
Computational Statistics, pages 225¿232. [85] Hossin, M. and Sulaiman, M. N. (2015). A review on evaluation metrics for data classification evaluations. International journal of data mining & knowledge management process, 5(2):1. [86] Hu, J. and Guo, W. (2019). Flexibility analysis in waste-to-energy systems based on decision rules and gene expression programming. In 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), pages 988¿993. IEEE. [87] Hu, J. and Mojsilovic, A. (2007). High-utility pattern mining: A method for discovery of high-utility item sets. Pattern Recognition, 40(11):3317¿3324. [88] Huang, J. and Deng, C. (2009). A novel multiclass classification method with gene expression programming. In 2009 International Conference on Web Information Systems and Mining, pages 139¿143. IEEE. [89] Iba, W. and Langley, P. (1992). Induction of one-level decision trees. In Machine Learning Proceedings 1992, pages 233¿240. Elsevier. [90] Iqbal, M., Azam, M., Naeem, M., Khwaja, A., and Anpalagan, A. (2014). Optimization classification, algorithms and tools for renewable energy: A review. Renewable and Sustainable Energy Reviews, 39:640¿654. [91] J¿edrzejowicz, J. and J¿edrzejowicz, P. (2010). Cellular gep-induced classifiers. In International Conference on Computational Collective Intelligence, pages 343¿352. Springer. [92] John, R. (1992). Koza. genetic programming: On the programming of computers by means of natural selection. [93] Jdrzejowicz, J. and Jdrzejowicz, P. (2008). Gep-induced expression trees as weak classifiers. In Industrial Conference on Data Mining, pages 129¿141. Springer. [94] Kapania, N. R., Subosits, J., and Christian Gerdes, J. (2016). A sequential two-step algorithm for fast generation of vehicle racing trajectories. Journal of Dynamic Systems, Measurement, and Control, 138(9). Bibliografía 105 [95] Karakasis, V. K. and Stafylopatis, A. (2008). 
Efficient evolution of accurate classification rules using a combination of gene expression programming and clonal selection. IEEE transactions on evolutionary computation, 12(6):662¿678. [96] Kavšek, B. and Lavrac, N. (2006). Apriori-sd: Adapting association rule learning to ¿ subgroup discovery. Applied Artificial Intelligence, 20(7):543¿583. [97] Kavurucu, Y., Senkul, P., and Toroslu, I. H. (2009). Ilp-based concept discovery in multi-relational data mining. Expert Systems with Applications, 36(9):11418¿11428. [98] Khan, R. A., Suleman, T., Farooq, M. S., Rafiq, M. H., and Tariq, M. A. (2017). Data mining algorithms for classification of diagnostic cancer using genetic optimization algorithms. Ijcsns, 17(12):207. [99] Kim, B., Khanna, R., and Koyejo, O. O. (2016). Examples are not enough, learn to criticize! criticism for interpretability. Advances in neural information processing systems, 29. [100] Komarek, P. (2004). Logistic regression for data mining and high-dimensional classification. Carnegie Mellon University. [101] Koza, J. R. et al. (1994). Genetic programming II, volume 17. MIT press Cambridge, MA. [102] Kralj, P., Lavrac, N., Gamberger, D., and Krstacic, A. (2007). Supporting factors to improve the explanatory potential of contrast set mining: Analyzing brain ischaemia data. In 11th Mediterranean Conference on Medical and Biomedical Engineering and Computing 2007, pages 157¿161. Springer. [103] Krämer, A., Friesen, M., and Shelton, T. (2018). Are airline passengers ready for personalized dynamic pricing? a study of german consumers. Journal of Revenue and Pricing Management, 17(2):115¿120. [104] Lantseva, A., Mukhina, K., Nikishova, A., Ivanov, S., and Knyazkov, K. (2015). Datadriven modeling of airlines pricing. Procedia Computer Science, 66:267¿276. [105] Lavrac, N., Kavsek, B., Flach, P., and Todorovski, L. (2004). Subgroup discovery with cn2-sd. J. Mach. Learn. Res., 5(2):153¿188. [106] Leibold, A. O. S. and López, Y. A. (2016). 
Low cost carriers in mexico. In The Low Cost Carrier Worldwide, pages 119¿132. Routledge. [107] Lemmerich, F., Rohlfs, M., and Atzmueller, M. (2010). Fast discovery of relevant subgroup patterns. In Twenty-Third International FLAIRS Conference. [108] Li, J., Liu, H., Downing, J. R., Yeoh, A. E.-J., and Wong, L. (2003). Simple rules underlying gene expression profiles of more than six subtypes of acute lymphoblastic leukemia (all) patients. Bioinformatics, 19(1):71¿78. [109] Li, J. and Wong, L. (2002). Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns. Bioinformatics, 18(5):725¿734. 106 Bibliografía [110] Links, A. F. and BookDust, I. M. (2002). Alex s. fraser. IEEE Transactions on Evolutionary Computation, 6(5). [111] Liu, B., Hsu, W., Ma, Y., et al. (1998). Integrating classification and association rule mining. In Kdd, volume 98, pages 80¿86. [112] Liu, X., Cai, Z., and Gong, W. (2008). An improved gene expression programming for fuzzy classification. In International Symposium on Intelligence Computation and Applications, pages 520¿529. Springer. [113] Liu, Z., Wynter, L., and Xia, C. (2002). Pricing information services in a competitive market: avoiding price wars. PhD thesis, INRIA. [114] Luna, J. M., Ondra, M., Fardoun, H. M., and Ventura, S. (2018). Optimization of quality measures in association rule mining: an empirical study. Int. J. Comput. Intell. Syst., 12(1):59¿78. [115] Luna, J. M., Pechenizkiy, M., Duivesteijn, W., and Ventura, S. (2020). Exceptional in so many ways - discovering descriptors that display exceptional behavior on contrasting scenarios. IEEE Access, 8:200982¿200994. [116] Marghny, M. and El-Semman, I. (2005). Extracting fuzzy classification rules with gene expression programming. In ALML 2005 Conference. Citeseer. [117] Maroco, J., Silva, D., Rodrigues, A., Guerreiro, M., Santana, I., and de Mendonça, A. (2011). 
Data mining methods in the prediction of dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests. BMC research notes, 4(1):1¿14. [118] Márquez-Vera, C., Cano, A., Romero, C., and Ventura, S. (2013). Predicting student failure at school using genetic programming and different data mining approaches with high dimensional and imbalanced data. Applied intelligence, 38(3):315¿330. [119] Mauceri, S., Sweeney, J., and McDermott, J. (2021). One-class subject authentication using feature extraction by grammatical evolution on accelerometer data. In Heuristics for Optimization and Learning, pages 393¿407. Springer. ESTUDIOS DE DOCTORADO Página 9 de 12 [120] Mazouni, R. and Rahmoun, A. (2015). Agge: A novel method to automatically generate rule induction classifiers using grammatical evolution. In Intelligent Distributed Computing VIII, pages 279¿288. Springer. [121] McAfee, R. P. and Te Velde, V. (2006). Dynamic pricing in the airline industry. Handbook on economics and information systems, 1:527¿67. [122] Miller, T. (2018). Explanation in artificial intelligence: Insights from the social sciences. [123] Mitchell, M. (1998). An introduction to genetic algorithms. MIT press. [124] Molnar, C., Casalicchio, G., and Bischl, B. (2020). Interpretable machine learning¿a brief history, state-of-the-art and challenges. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 417¿431. Springer. Bibliografía 107 [125] Niinistö, J. (2020). Utilizing machine learning in data-driven pricing. [126] Noaman, A. Y., Luna, J. M., Ragab, A. H., and Ventura, S. (2016). Recommending degree studies according to students¿ attitudes in high school by means of subgroup discovery. International Journal of Computational Intelligence Systems, 9(6):1101¿1117. [127] Novak, P. K., Lavrac, N., and Webb, G. I. (2009). 
Supervised descriptive rule discovery: ¿ A unifying survey of contrast set, emerging pattern and subgroup mining. Journal of Machine Learning Research, 10(2). [128] O¿Neill, M. and Ryan, C. (2001). Grammatical evolution. IEEE Transactions on Evolutionary Computation, 5(4):349¿358. [129] Otero, D. F. and Akhavan-Tabatabaei, R. (2015). A stochastic dynamic pricing model for the multiclass problems in the airline industry. European Journal of Operational Research, 242(1):188¿200. [130] O¿Neil, M. and Ryan, C. (2003). Grammatical evolution. In Grammatical evolution, pages 33¿47. Springer. [131] P. Belobaba, A. O. and Barnhart, C. (2016). The global airline industry. In Soft computing and industry, page 82. 2nd ed. John Wiley Sons, Ltd. [132] Padillo, F., Luna, J. M., and Ventura, S. (2020a). LAC: library for associative classification. Knowl. Based Syst., 193:105432. [133] Padillo, F., Luna, J. M., and Ventura, S. (2020b). Lac: Library for associative classification. Knowledge-Based Systems, 193:105432. [134] Pels, E. and Rietveld, P. (2004). Airline pricing behaviour in the london¿paris market. Journal of Air Transport Management, 10(4):277¿281. [135] Pérez, J. M., Muguerza, J., Arbelaitz, O., Gurrutxaga, I., and Martín, J. I. (2007). Combining multiple class distribution modified subsamples in a single tree. Pattern Recognition Letters, 28(4):414¿422. [136] Pitfield, D. (2005). A time series analysis of the pricing behaviour of directly competitive¿low-cost¿airlines. International Journal of Transport Economics/Rivista internazionale di economia dei trasporti, pages 15¿39. [137] Quinlan, J. R. and Cameron-Jones, R. M. (1995). Induction of logic programs: Foil and related systems. New Generation Computing, 13(3):287¿312. [138] Rai, D., Thoke, A., and Verma, K. (2012). Enhancement of associative rule based foil and prm algorithms. In 2012 Students Conference on Engineering and Systems, pages 1¿4. IEEE. [139] Rajab, K. D. (2019). 
New associative classification method based on rule pruning for classification of datasets. IEEE Access, 7:157783¿157795. [140] Raju, J. and Zhang, Z. (2010). Smart Pricing: How Google, Priceline, and Leading Businesses Use Pricing Innovation for Profitabilit (paperback). Pearson Prentice Hall. 108 Bibliografía [141] Ramraj, S., Uzir, N., Sunil, R., and Banerjee, S. (2016). Experimenting xgboost algorithm for prediction and classification of different datasets. International Journal of Control Theory and Applications, 9:651¿662. [142] Rish, I. et al. (2001). An empirical study of the naive bayes classifier. In IJCAI 2001 workshop on empirical methods in artificial intelligence, volume 3, pages 41¿46. [143] Rojas, R. (1996). The backpropagation algorithm. In Neural networks, pages 149¿182. Springer. [144] Rokach, L. and Maimon, O. (2005). Decision trees. In Data mining and knowledge discovery handbook, pages 165¿192. Springer. [145] Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206¿215. [146] Rusell, S. and Norvig, P. (2003). Artificial intelligence: A modern approach. Pretice Hall Series in Artificial Intelligence, 1:649¿789. [147] Ryan, C., Collins, J. J., and Neill, M. O. (1998). Grammatical evolution: Evolving programs for an arbitrary language. In European Conference on Genetic Programming, pages 83¿96. Springer. [148] Salzberg, S. L. (1994). C4. 5: Programs for machine learning by j. ross quinlan. morgan kaufmann publishers, inc., 1993. Machine learning. [149] Santos, F. A. d. N., Mayer, V. F., and Marques, O. R. B. (2020). Dynamic pricing and price fairness perceptions: a study of the use of the uber app in travels. Turismo: Visão e Ação, 21:239¿264. [150] Selcuk, A. M. and Avs.ar, Z. M. (2019). Dynamic pricing in airline revenue management. Journal of mathematical analysis and applications, 478(2):1191¿1217. [151] Sengpoh, L. (2015). 
The competitive pricing behaviour of low cost airlines in the perspective of Sun Tzu Art of War. Procedia-Social and Behavioral Sciences, 172:741–748.
[152] Shiri, J., Sadraddini, A. A., Nazemi, A. H., Kisi, O., Landeras, G., Fard, A. F., and Marti, P. (2014). Generalizability of gene expression programming-based approaches for estimating daily reference evapotranspiration in coastal stations of Iran. Journal of Hydrology, 508:1–11.
[153] Shukla, N., Kolbeinsson, A., Otwell, K., Marla, L., and Yellepeddi, K. (2019). Dynamic pricing for airline ancillaries with customer context. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2174–2182.
[154] Simovici, D. A. (2012). Linear algebra tools for data mining. World Scientific.
[155] Siu, K. K., Butler, S. M., Beveridge, T., Gillam, J., Hall, C., Kaye, A. H., Lewis, R. A., Mannan, K., McLoughlin, G., Pearson, S., et al. (2005). Identifying markers of pathology in SAXS data of malignant tissues of the brain. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 548(1-2):140–146.
[156] Song, H. S., Kim, J. K., and Kim, S. H. (2001). Mining the change of customer behavior in an internet shopping mall. Expert Systems with Applications, 21(3):157–168.
[157] Spears, W. M., De Jong, K. A., Bäck, T., Fogel, D. B., and De Garis, H. (1993). An overview of evolutionary computation. In European Conference on Machine Learning, pages 442–459. Springer.
[158] Sumner, M., Frank, E., and Hall, M. (2005). Speeding up logistic model tree induction. In European Conference on Principles of Data Mining and Knowledge Discovery, pages 675–683. Springer.
[159] Swinburne, R. (2004). Bayes' theorem. Revue Philosophique de la France Et de l, 194(2).
[160] Taunk, K., De, S., Verma, S., and Swetapadma, A. (2019).
A brief review of nearest neighbor algorithm for learning and classification. In 2019 International Conference on Intelligent Computing and Control Systems (ICCS), pages 1255–1260. IEEE.
[161] Turing, A. M. (2009). Computing machinery and intelligence. In Parsing the Turing Test, pages 23–65. Springer.
[162] Vamathevan, J., Clark, D., Czodrowski, P., Dunham, I., Ferran, E., Lee, G., Li, B., Madabhushi, A., Shah, P., Spitzer, M., et al. (2019). Applications of machine learning in drug discovery and development. Nature Reviews Drug Discovery, 18(6):463–477.
[163] Ventura, S., Luna, J. M., et al. (2018). Supervised descriptive pattern mining. Springer.
[164] Wang, K., Tang, L., Han, J., and Liu, J. (2002). Top down FP-growth for association rule mining. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 334–340. Springer.
[165] Webb, G. I. (1999). Decision tree grafting from the all-tests-but-one partition. In IJCAI, volume 2, pages 702–707.
[166] Webb, G. I. (2010). Association discovery. In ACM SIGKDD Workshop on Useful Patterns, page 7.
[167] Weiss, S. M. and Indurkhya, N. (1998). Predictive data mining: a practical guide. Morgan Kaufmann.
[168] West, D. M. (2018). The future of work: Robots, AI, and automation. Brookings Institution Press.
[169] Whigham, P. A. (1996). Search bias, language bias, and genetic programming. Genetic Programming, 1996:230–237.
[170] Witten, I. H., Frank, E., and Hall, M. A. (2011). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann, Amsterdam, 3rd edition.
[171] Wittman, M. D. and Belobaba, P. P. (2018). Customized dynamic pricing of airline fare products. Journal of Revenue and Pricing Management, 17(2):78–90.
[172] Wohlfarth, T., Clémençon, S., Roueff, F., and Casellato, X. (2011). A data-mining approach to travel price forecasting.
In 2011 10th International Conference on Machine Learning and Applications and Workshops, volume 1, pages 84–89. IEEE.
[173] Wong, T.-T. and Tseng, K.-L. (2005). Mining negative contrast sets from data with discrete attributes. Expert Systems with Applications, 29(2):401–407.
[174] Wrobel, S. (1997). An algorithm for multi-relational discovery of subgroups. In European Symposium on Principles of Data Mining and Knowledge Discovery, pages 78–87. Springer.
[175] Wu, X., Kumar, V., Quinlan, J. R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G. J., Ng, A., Liu, B., Philip, S. Y., et al. (2008). Top 10 algorithms in data mining. Knowledge and Information Systems, 14(1):1–37.
[176] Wu, Z. and Yao, M. (2009). A new GEP algorithm based on multi-phenotype chromosomes. In 2009 Second International Workshop on Computer Science and Engineering, volume 1, pages 204–209. IEEE.
[177] Xiao, R., Li, M.-J., and Zhang, H.-J. (2004). Robust multipose face detection in images. IEEE Transactions on Circuits and Systems for Video Technology, 14(1):31–41.
[178] Yadav, J. and Sharma, M. (2013). A review of k-mean algorithm. Int. J. Eng. Trends Technol, 4(7):2972–2976.
[179] Yasodha P, A. N. (2014). Comparative study of diabetic patient data using classification algorithm in WEKA tool. Int. J. Comput. Appl. Technol. Res., pages 554–558.
[180] Yin, X. and Han, J. (2003). CPAR: Classification based on predictive association rules. In Proceedings of the 2003 SIAM International Conference on Data Mining, pages 331–335. SIAM.
[181] Yu, C.-S., Lin, Y.-J., Lin, C.-H., Wang, S.-T., Lin, S.-Y., Lin, S. H., Wu, J. L., and Chang, S.-S. (2020). Predicting metabolic syndrome with machine learning models using a decision tree algorithm: Retrospective cohort study. JMIR Med Inform, 8(3):e17110.
[182] Zhong, J., Feng, L., and Ong, Y.-S. (2017a). Gene expression programming: A survey. IEEE Computational Intelligence Magazine, 12(3):54–72.
[183] Zhong, J., Feng, L., and Ong, Y.-S. (2017b).
Gene expression programming: A survey. IEEE Computational Intelligence Magazine, 12(3):54–72.
[184] Zhou, C., Xiao, W., Tirpak, T. M., and Nelson, P. C. (2003). Evolving accurate and compact classification rules with gene expression programming. IEEE Transactions on Evolutionary Computation, 7(6):519–531.
[185] Zhu, J., Liapis, A., Risi, S., Bidarra, R., and Youngblood, G. M. (2018). Explainable AI for designers: A human-centered perspective on mixed-initiative co-creation. In 2018 IEEE Conference on Computational Intelligence and Games (CIG), pages 1–8. IEEE.
[186] Zuo, J., Tang, C., and Zhang, T. (2002). Mining predicate association rule by gene expression programming. In International Conference on Web-Age Information Management, pages 92–103. Springer.