class: center, middle, inverse, title-slide

.title[
# Class 3: Data Science and Research
]
.author[
### Thierry Warin, PhD
]

---

<style>
.col2 {
  columns: 2 200px;         /* number of columns and width in pixels */
  -webkit-columns: 2 200px; /* chrome, safari */
  -moz-columns: 2 200px;    /* firefox */
}
</style>

## Social Data Science

<img src="./figures/fig1.png" width="800px" style="display: block; margin: auto;" />

---

## AI: Methodologies

<img src="./figures/fig2.png" width="800px" style="display: block; margin: auto;" />

---

## Interdisciplinary Methodologies

### From the 4th Industrial Revolution to Society 5.0

<img src="./figures/fig3.png" width="600px" style="display: block; margin: auto;" />

---

## Social Data Science: The Missing Link

<img src="./figures/fig4.png" width="500px" style="display: block; margin: auto;" />

---

## Research Projects Using Data Science

### Structured Data Industrial Organization / Monetary Policy

<div class="col2">
<img src="./figures/fig5.png" width="300px" style="display: block; margin: auto;" />
<ul>
<li>RQ: How can we detect systematic differences in price dispersion across sectors? What are the reasons for such differences?</li>
<li>Methodology: Econometrics, Web Scraping</li>
<li>Category: Nowcasting, Inflation, Price Dispersion</li>
<li>The Wow Effect: From millions of prices on about 30,000 products, real-time measures of Market Thickness and the Value of Information</li>
<li>Literature:</li>
- "Are Online and Offline Prices Similar? Evidence from Large Multi-Channel Retailers", Alberto Cavallo, American Economic Review, January 2017, Vol. 107 (1)
- "The Noisy Monopolist: Imperfect Information, Price Dispersion and Price Discrimination", Steven Salop, Review of Economic Studies, 1977, vol.
44, issue 3, 393-406
</ul>
</div>

---

## Research Projects Using Data Science

### Structured Data Finance

<div class="col2">
<img src="./figures/fig6.png" width="400px" style="display: block; margin: auto;" />
<ul>
<li>RQ: How can we develop low-cost solutions for middle-class investors?</li>
<li>Methodology: Monte Carlo, Bayesian Networks</li>
<li>Partner Authors and Institutions: CIRANO, HEC Montreal, AMF (Financial Markets Regulator)</li>
<li>The Wow Effect: Use of robots to provide a dynamic and tailored portfolio allocation to each customer instead of a generic risk profile</li>
<li>Category: Risk Portfolio</li>
</ul>
</div>

---

## Research Projects Using Data Science

### Structured Data Finance

<div class="col2">
<img src="./figures/fig7.png" width="400px" style="display: block; margin: auto;" />
<ul>
<li>RQ: Using boards of directors as a proxy for knowledge pipelines between financial firms, how can we proxy systemic risk in the financial industry?</li>
<li>Methodology: Network Analysis</li>
<li>Partner Authors and Institutions: CIRANO, HEC Montreal, AMF (Financial Markets Regulator)</li>
<li>The Wow Effect: 43,499 directors, 2,209 financial firms, and 52 countries can be visualized to reveal social ties across countries</li>
<li>Category: Systemic Risk, Financial Industry, Governance</li>
<li>Literature:</li>
- Kogut, B. and Colomer, J. (2012) ‘Is there a global small world of owners and directors?’, in Kogut, B. (Ed.): The Small Worlds of Corporate Governance, pp.259–299, The MIT Press, Cambridge.
</ul>
</div>

---

## Research Projects Using Data Science

### Structured Data Economic Integration

<div class="col2">
<img src="./figures/fig8.png" width="300px" style="display: block; margin: auto;" />
<ul>
<li>RQ: Can we observe a regional specialisation (or convergence) dynamic in Europe through cluster life cycles?</li>
<li>The Wow Effect: More than 5M cluster data points collected; 553,007 European observations used (67 clusters, 279 regions, 36 countries, 20 years, 6 indicators); over 10,000 maps created; regional specialisation dynamic in developed economies (Western Europe); convergence dynamic in developing economies (Eastern Europe) due to a catching-up effect</li>
<li>Category: Regional Integration, Clusters, Convergence, Geographical Economy, Data Science</li>
<li>Literature:</li>
- Sala-i-Martin, Xavier. 1996a. “Regional Cohesion: Evidence and Theories of Regional Growth and Convergence.” European Economic Review 40:1325-1352.
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data Political Science

<div class="col2">
<img src="./figures/fig9.png" width="400px" style="display: block; margin: auto;" />
<ul>
<li>RQ: In countries where institutions do not meet the highest standards, can we use social media to gather information on an upcoming election?</li>
<li>Methodology: Econometrics, Text Mining, 3.8 million tweets</li>
<li>The Wow Effect: Two days before the election, a shift in the electoral dynamics was detected, which turned out to anticipate the election results</li>
<li>Category: Political Risk</li>
<li>Literature:</li>
- Khemani, S. (2015): “Buying Votes vs. Supplying Public Services: Political Incentives to Under-Invest in Pro-Poor Policies,” Journal of Development Economics, 117, 84–93.
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data Political Science

<div class="col2">
<img src="./figures/fig10.png" width="500px" style="display: block; margin: auto;" />
<ul>
<li>RQ: Despite being third in vote intention at the start of the campaign, Justin Trudeau won a majority victory. During the 2015 General Election in Canada, how was each political leader perceived?</li>
<li>Methodology: Unsupervised Learning, Social Media Analysis, Linguistics</li>
<li>The Wow Effect: Scandals at the start of the campaign lasted for a month and were constantly associated with the incumbent candidate</li>
<li>Category: Political Risk, Reputational Risk</li>
<li>What about corporate reputation?</li>
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data Migrations

<div class="col2">
<img src="./figures/fig11.png" width="400px" style="display: block; margin: auto;" />
<ul>
<li>RQ: Mapping the conversation in Europe about the refugee crisis.</li>
<li>Methodology: Geographic Information Systems, Text Analysis, Social Media Analysis</li>
<li>Partner Authors and Institutions: Jeffry Frieden, Harvard University, SKEMA Business School</li>
<li>The Wow Effect: Ability to add a spatial dimension to the conversations (unstructured data) and to combine them with traditional data such as the number of refugees, the country of origin, or the route of destination.</li>
<li>Literature:</li>
- "Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis", Ross, Michael Rist, Guillermo Carbonell, Benjamin Cabrera, Nils Kurowsky, Michael Wojatzki, arXiv, 2017.
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data Monetary Policy

<div class="col2">
<img src="./figures/fig12.png" width="450px" style="display: block; margin: auto;" />
<ul>
<li>RQ: How do the European Central Bank and its Presidents react to events occurring in the Eurozone?</li>
<li>Methodology: Linguistics, Text Analysis, LDA</li>
<li>The Wow Effect: War and Peace + The Wealth of Nations</li>
<li>Category: Communication of Institutions, Europe, Central Bank</li>
<li>Literature:</li>
- Amaya, J.-Y. Filbien "The similarity of ECB's communication" Social Science Research Network, Rochester, NY (2015)
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data International Trade

<div class="col2">
<img src="./figures/fig13.png" width="400px" style="display: block; margin: auto;" />
<ul>
<li>RQ: Regionalization or Globalization?</li>
<li>Methodology: Content Analysis, Network Analysis, Clustering</li>
<li>The Wow Effect: All preferential trade agreements (PTAs) between countries</li>
<li>Category: International Trade</li>
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data Innovation

<div class="col2">
<img src="./figures/fig14.png" width="350px" style="display: block; margin: auto;" />
<ul>
<li>RQ: Do innovations in China's pharmaceutical industry differ from innovations elsewhere in terms of the nature and value of patents?</li>
<li>Methodology: Text Analysis, Unsupervised Learning, LDA</li>
<li>The Wow Effect: More than 100,000 patents are analyzed to reveal the most important firms and institutions in the pharmaceutical industry in China, as well as the nature of patents.
</li>
<li>Category: Innovation, Emerging Markets</li>
<li>Literature:</li>
- "Innovation assessment through patent analysis", BP Abraham, SD Moitra, Technovation, 2001
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data Innovation

<div class="col2">
<img src="./figures/fig15.png" width="400px" style="display: block; margin: auto;" />
<ul>
<li>RQ: Regarding Artificial Intelligence, where is the innovation coming from? What are the new developments in a disruptive industry?</li>
<li>Methodology: Unsupervised Learning, LDA</li>
<li>The Wow Effect: 55,109 patents related to AI were analyzed; new trends in computational intelligence were revealed; leading firms (IBM, Microsoft, Google) are among the main contributors to AI patents</li>
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data GIS, Drones and Economic Development

<div class="col2">
<img src="./figures/fig16.png" width="250px" style="display: block; margin: auto;" />
<ul>
<li>Port-au-Prince, Spring 2018</li>
</ul>
</div>

---

## Research Projects Using Data Science

### Unstructured Data Frackmap Project

<div class="col2">
<img src="./figures/fig17.png" width="500px" style="display: block; margin: auto;" />
<br>
<ul>
<li>RQ: In countries embracing a booming controversial industry (fracking), is there resonance or dissonance between the public’s risk perception and the state of academic research?</li>
<li>The Wow Effect: Analysing 60,000 geo-located tweets and (10,000; 600) peer-reviewed articles revealed a very high acceptability despite the major concerns of the scientific community (seismicity, health and environmental impacts, occupational health)</li>
<li>Category: Industrial Risks, Risk Perception, Public Health</li>
</ul>
</div>