about coding stuff
EpiBibR Github. EpiBibR stands for “epidemiology-based bibliography for R.” It is the second-largest dataset on global coronavirus research and the largest one available in R. The R package is released under the MIT License and is a free resource built on open science principles (reproducible research, open data, open code). It may be used by researchers in scientometrics as well as by researchers from other disciplines. For instance, the Artificial Intelligence and Data Science communities may use this package to accelerate new research insights about COVID-19. The package follows, with some differences, the methodology put in place by the Allen Institute and its partners to create the CORD-19 dataset. The latter is accessible through downloads of subsets or through a REST API. The data provide important information such as authors, methods, data, and citations, making it easier for researchers to find contributions relevant to their research questions. Our package provides 22 features for 139,724 references (as of April 16, 2021), and access to the data has been made as easy as possible so that it integrates efficiently into almost any researcher’s pipeline (Warin T, “Global Research on Coronaviruses: An R Package”, J Med Internet Res 2020;22(8):e19615, DOI: 10.2196/19615, PMID: 32730218, PMCID: 7423387).
oxfoR Github. oxforR is based on the Oxford COVID-19 Government Response Tracker (OxCGRT) and lets users retrieve its latest data in an R-ready format. The tracker records governmental responses to COVID-19 through 17 indicators for all countries.
statcanR CRAN Github. Easily connect to Statistics Canada’s Web Data Service with R. Open economic data (formerly known as CANSIM tables, now identified by Product IDs (PID)) are accessible as a data frame, directly in the user’s R environment.
iriR CRAN Github. The IRI Scoreboard aims at providing robust data and analyses on the contribution of private-sector R&D to sustainable competitiveness and “prosperity”. With iriR, we want to make the IRI Scoreboard’s data readily available. We have also compiled the yearly scoreboards through time to create a cross-section time-series dataset. Researchers and analysts have access to more than 7,500 innovative companies worldwide, which are or have been part of the top 1,000 innovative companies.
spiR CRAN Github. In 2015, the 17 United Nations Sustainable Development Goals were adopted. ‘spiR’ is a wrapper for several open datasets published by the Social Progress Imperative (https://www.socialprogress.org/), including the Social Progress Index (a synthetic measure of human development across the world). The goal of ‘spiR’ is to provide data that help policymakers and researchers prioritize actions that accelerate social progress across the world in the context of the Sustainable Development Goals. The Social Progress Index offers a new perspective on social challenges and on the efforts needed to accelerate social progress in line with these goals. In this context, ‘spiR’ offers an easy connection from R to the Social Progress Index datasets in order to benefit from the “power of crowds.”
gvcR Github. The R package gvcR provides data on risks of disruption or passage restriction at major choke points worldwide. When possible, a list of examples of disruptions and transit delays that have occurred in these locations since 2002 is given. A maximum of three incidents are noted for each risk category. The choke point risks are classified into three categories (weather and climate risk; security and conflict risk; and political and institutional risk) and further divided into subcategories, such as ‘haze and fog’ or ‘trade and transit controls’. Each risk also has a code defined as follows: the first letter of the category, a dash, then the first letter of the risk. If that risk letter is already in use, it is followed by a slash and the first letter of the second word of the risk name.
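The coding rule described above can be sketched in a few lines of base R. This is only one reading of the rule, not code taken from gvcR: the function name `make_risk_codes` is hypothetical, and the second risk name in the example is made up to force a clash.

```r
# One reading of the rule: first letter of the category, a dash, then the
# first letter of the risk. If that code is already taken, append a slash
# and the first letter of the second word of the risk name.
# `make_risk_codes` is a hypothetical helper, not part of gvcR.
make_risk_codes <- function(category, risks) {
  cat_letter <- toupper(substr(category, 1, 1))
  codes <- character(0)
  for (risk in risks) {
    words <- strsplit(risk, " +")[[1]]
    code <- paste0(cat_letter, "-", toupper(substr(words[1], 1, 1)))
    # Clash: disambiguate with the first letter of the second word
    if (code %in% codes && length(words) >= 2) {
      code <- paste0(code, "/", toupper(substr(words[2], 1, 1)))
    }
    codes <- c(codes, code)
  }
  stats::setNames(codes, risks)
}

# "territorial disputes" is an invented risk name, for illustration only
codes <- make_risk_codes(
  "political and institutional",
  c("trade and transit controls", "territorial disputes")
)
print(codes)
```

Under this reading, the first risk receives the plain code and the second one receives the slash form because its first letter is already in use.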
corridoR Github. corridoR is an R wrapper to easily access the Northern Corridor project database. The project was developed to analyse the potential impact of the Northern Corridor on the Canadian economy and on global maritime traffic. To do so, we took more than 20,000 ship voyages passing through the Panama Canal and calculated their marine distances. To analyse the applicability and profitability of the Canadian Northwest Passage, we took the same 20,000 routes and rerouted them hypothetically through the Canadian Arctic. We then compared the distances to see which trips were shorter via the Northern Corridor. Distances are in nautical miles.
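The distance comparison described above can be illustrated with a minimal base R sketch. The data frame, its column names (`panama_nm`, `arctic_nm`) and the values are all hypothetical; they are not the corridoR schema or project results.

```r
# Made-up voyages; both distances are in nautical miles
voyages <- data.frame(
  voyage_id = c("V1", "V2", "V3"),
  panama_nm = c(9500, 4200, 11000),  # distance via the Panama Canal
  arctic_nm = c(7800, 6100, 11000)   # same trip via the Canadian Arctic
)

# Positive saving means the Arctic route is shorter
voyages$saving_nm <- voyages$panama_nm - voyages$arctic_nm
voyages$shorter_via_arctic <- voyages$saving_nm > 0

# Share of voyages that would be shorter via the Arctic
share_shorter <- mean(voyages$shorter_via_arctic)
print(voyages)
```

A tie is counted here as not shorter, which is a design choice of this sketch rather than something the project description specifies.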
shapeR Github. shapeR aims to simplify GIS mapping by giving access through simple R functions to shapefiles from different sources.
EpiBibR ExploR here
statcanR ExploR here
de Marcellis-Warin, N., Marty, F., Thelison, E., Warin, Th. (2021) “Anti-Trust Index”, AI Transparency Institute.
including an interactive coding platform:
Économie industrielle avec R
Quantitative methods in International Business with R
Machine Learning for International Business with R
Foundations in quantitative analysis for International Business with R
R Shiny Docker-based platform for web app developments
drhector: a data science-dedicated platform that I designed for my students in my Industrial Analysis course.
quantum simulations
Sports Analytics [Murphy-P3] (2017-2019): Passionate about road cycling, I designed a sports analytics platform that I used with a professional cycling team to predict races during three Tours de France. This experiment was stopped by the pandemic. As an aside, it gave me a chance to ride some cols (Izoard, Galibier, Tourmalet, Ventoux, Alpe d’Huez, etc.).
Alongside the development of these data packages, I have written several tutorials on how to query the packages and other APIs. Visit the API tutorials.
To promote the learning of data science, I have developed online interactive data science quizzes that allow for self-assessment.
Some examples here: