Data Science for International Business with R
Statistics and Modelling
Introduction
This book, “Data Science for International Business with R,” is designed to provide a comprehensive introduction to the application of data science techniques in the context of international business. It aims to equip readers with the skills and knowledge necessary to analyze and interpret complex data sets, enabling them to make informed decisions in a global business environment.
Why R? Why not Python or Julia? R is a powerful language for statistical computing and graphics, making it particularly well-suited for data analysis tasks. It has a rich ecosystem of packages and libraries that facilitate data manipulation, visualization, and modeling. Additionally, R’s strong community support and extensive documentation make it an ideal choice for both beginners and experienced data scientists.
It is also, in my humble view, easier to learn for students and researchers in international business, because it was designed with statistics and data analysis in mind. R’s syntax is intuitive for those familiar with statistical concepts, and its focus on data visualization aligns well with the needs of international business professionals, who often need to present data-driven insights. We tap into the great work of people at Posit, in particular the tidyverse collection of packages, which includes dplyr for data manipulation and ggplot2 for visualization. We will also build models with tidymodels, a framework that simplifies the process of building and evaluating statistical models. R will be our main tool in this book; afterwards, I would encourage you to explore Python or Julia. These three languages are what we call the “big three” in data science, and each has its strengths. R is particularly strong in statistical analysis and visualization, Python excels in general-purpose programming and machine learning, and Julia is known for its high performance in numerical computing. All three support a functional style of programming, in which code is organized around functions that take inputs and produce outputs, a natural fit for data analysis tasks.
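To make the idea of functions that take inputs and produce outputs concrete, here is a minimal sketch in base R. The data frame `trade`, its values, and the helper `trade_balance()` are hypothetical and invented purely for illustration; they do not come from any real dataset or from later chapters.

```r
# Hypothetical toy data: these values are invented for illustration only.
trade <- data.frame(
  country = c("A", "B", "C"),
  exports = c(120, 80, 200),
  imports = c(100, 90, 150)
)

# A function takes inputs and returns an output; it does not alter its inputs.
trade_balance <- function(exports, imports) {
  exports - imports
}

# The output of one function becomes the input of the next step.
trade$balance <- trade_balance(trade$exports, trade$imports)
surplus <- subset(trade, balance > 0)

# Functions can also be passed as arguments to other functions.
averages <- sapply(trade[c("exports", "imports")], mean)

print(surplus)
print(averages)
```

The same steps could be written with dplyr verbs and a pipe; base R is used here only so that the sketch runs without installing any package.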
The book covers a range of topics, including data wrangling, exploratory data analysis, visualization, statistical modeling, and machine learning, all using the R programming language. It emphasizes practical applications and real-world examples to illustrate how data science can be leveraged to address challenges in international business.
The book is structured to facilitate learning, with each chapter building on the previous ones. It is suitable for students, researchers, and professionals in international business who are looking to enhance their data science skills.
Book structure
This book is structured into several modules, each focusing on different aspects of data science in international business. The modules are designed to be self-contained, allowing readers to explore specific topics in depth while also building a comprehensive understanding of the field.
The modules include:
Introduction to Data Science: An overview of data science concepts and their relevance to international business.
Data Wrangling: Techniques for cleaning and preparing data for analysis.
Exploratory Data Analysis: Methods for exploring and visualizing data to uncover patterns and insights.
Statistical Modeling: Introduction to statistical models and their application in international business contexts.
Machine Learning: An overview of machine learning techniques and their use in predictive analytics.
Book format
This book is available in multiple formats, including HTML, PDF, and ePub. The HTML version is designed for online reading, while the PDF and ePub versions are suitable for offline use. The book is also designed to be interactive, with embedded code examples and exercises that allow readers to practice their skills.
License
This book is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This means that you are free to share and adapt the material, provided that you give appropriate credit, do not use it for commercial purposes, and distribute any derivative works under the same license.
Citing this book
The full reference is:
BibTeX:
@book{gsdsqr,
author = {Thierry Warin},
year = 2025,
title = {Data Science for International Business with R},
publisher = {Forthcoming},
address = {Forthcoming},
URL = {https://warin.ca/ds4ibr}
}
Acknowledgements
Special thanks go to my MSc students at HEC Montreal, whose insights, enthusiasm, and questions during our sessions have greatly enriched this book. Your contributions, whether through discussion, feedback, or collaboration, have been invaluable, and I am deeply grateful for your support.