Data Balancing to Improve Prediction of Project Success in the Telecom Sector

  1. Nuño Basurto 1
  2. Alfredo Jiménez 2
  3. Secil Bayraktar 3
  4. Álvaro Herrero 1
  1. 1 Universidad de Burgos

    Burgos, Spain

    ROR https://ror.org/049da5t36

  2. 2 Kedge Business School

    Talence, France

    ROR https://ror.org/00wk3s644

  3. 3 Toulouse Business School

    Toulouse, France

    ROR https://ror.org/0349y2q65

Book:
15th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2020): Burgos, Spain ; September 2020
  1. Álvaro Herrero (coord.)
  2. Carlos Cambra (coord.)
  3. Daniel Urda (coord.)
  4. Javier Sedano (coord.)
  5. Héctor Quintián (coord.)
  6. Emilio Corchado (coord.)

Publisher: Springer Switzerland

ISBN: 978-3-030-57801-5 978-3-030-57802-2

Year of publication: 2021

Pages: 366-373

Conference: International Conference on Soft Computing Models in Industrial and Environmental Applications SOCO (15. 2020. Burgos)

Type: Conference paper

Abstract

Investments in the telecom industry are often made through private participation projects, which allow a group of investors to build and/or operate large infrastructure projects in the host country. As governments have progressively removed barriers to foreign ownership in this sector, these investment consortia have become increasingly international. An accurate and early prediction of the success of such projects is therefore very useful, and soft computing can help address this challenge. However, classifiers that try to forecast project success yield a high error rate because of the class imbalance (success vs. failure). To overcome this problem, this paper proposes applying classifiers (Support Vector Machines and Random Forest) to data improved by means of data balancing techniques (both oversampling and undersampling). Results have been obtained on a real-life, publicly available dataset from the World Bank.
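
The abstract does not say which balancing algorithms were used, so the following is only a minimal sketch of the two general ideas it mentions, random oversampling and random undersampling, written in plain Python. The function names, the toy rows, and the 8-to-2 class ratio are hypothetical and are not taken from the paper or from the World Bank dataset. In practice the balanced data would then be passed to a classifier such as a Support Vector Machine or Random Forest, and balancing is normally applied to the training split only.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label (insertion order kept)."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        # Sample with replacement to fill the gap to the majority size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of larger classes down to the smallest class size."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        # Sample without replacement so no row is repeated.
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 8 "success" projects and 2 "fail" projects,
# each described by two made-up numeric features.
toy_rows = [[i, i * 0.5] for i in range(10)]
toy_labels = ["success"] * 8 + ["fail"] * 2

over_rows, over_labels = random_oversample(toy_rows, toy_labels, seed=1)
under_rows, under_labels = random_undersample(toy_rows, toy_labels, seed=1)

print(Counter(toy_labels))    # class counts before balancing
print(Counter(over_labels))   # class counts after oversampling
print(Counter(under_labels))  # class counts after undersampling
```

Oversampling keeps every original row at the cost of repeated minority examples, while undersampling discards majority rows; which trade-off works better for a given classifier is an empirical question of the kind the paper reports studying.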