An empirical analysis of the data can be thought of as consisting of two parts: (1) summary statistics of the variables in the data, and (2) testing the hypotheses of a theoretical model using data.
 

REGRESSION
Regression analysis is a statistical technique which facilitates the study of the relation between a ‘dependent’ or ‘explained’ variable on one hand and a set of ‘independent’ or ‘explanatory’ variables on the other. Depending on the type of the explained variable considered and the nature of the research question (e.g., causality versus correlation), there are specific types of regression analyses.
ORDINARY LEAST SQUARES (OLS)
This is the most basic and widely used regression technique. Most other regression techniques are essentially modifications of the OLS setup in some particular way. The ‘least squares’ technique gets its name from the fact that the parameters of the regression equation are estimated by minimizing the sum of the squared deviations of the dependent variable from the regression function.
More on OLS
Results: Stata Output
Interpreting Regression Results
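As a minimal sketch of the least-squares idea described above, the following Python snippet fits a line with one explanatory variable using the closed-form solution. The data and the helper names (`ols_fit`, `sum_squared_residuals`) are hypothetical and chosen only for illustration; statistical packages such as Stata report the same coefficients along with standard errors.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared deviations of y from the line intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical illustrative data, not taken from any real dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = ols_fit(x, y)
best_ssr = sum_squared_residuals(x, y, intercept, slope)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, SSR={best_ssr:.4f}")
```

Any other choice of intercept or slope gives a larger sum of squared residuals on the same data, which is what ‘least squares’ means.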
REGRESSION WITH DUMMY VARIABLE
Dummy variables, also known as indicator variables, take the value 0 or 1 to denote mutually exclusive binary categories such as yes/no or absence/presence. When one or more of the explanatory variables is a dummy, the standard OLS regression technique can still be used. However, a categorical (e.g., binary) dependent variable calls for a different regression technique, e.g., logistic regression (logit) or probit.
More on:
 Logit
 Probit
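To illustrate the case of a dummy explanatory variable, here is a small Python sketch. With a single 0/1 regressor, the OLS intercept equals the mean outcome of the 0 group and the slope equals the difference in group means. The variable names (`has_degree`, `wage`) and the numbers are hypothetical.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical data: a 0/1 dummy and an outcome for six individuals.
has_degree = [0, 0, 0, 1, 1, 1]
wage = [10.0, 12.0, 14.0, 15.0, 17.0, 19.0]

intercept, slope = ols_fit(has_degree, wage)

# Group means computed directly, for comparison with the OLS coefficients.
mean_without = mean([w for d, w in zip(has_degree, wage) if d == 0])
mean_with = mean([w for d, w in zip(has_degree, wage) if d == 1])

print(f"intercept={intercept:.2f} (mean of group 0: {mean_without:.2f})")
print(f"slope={slope:.2f} (difference in means: {mean_with - mean_without:.2f})")
```

This sketch covers only a dummy on the right-hand side; a 0/1 dependent variable would instead be modelled with a logit or probit, which is not shown here.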
REGRESSION WITH COUNT VARIABLE
When the dependent variable is a nonnegative count variable, e.g., the number of telephone calls made by a call-centre agent in an hour, the number of deaths in a war, or the number of victims of smallpox during an epidemic, the standard OLS regression technique should not be used. Instead, one can choose among a Poisson regression, a Negative Binomial regression, or a Zero-Inflated Negative Binomial regression.
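A minimal sketch of a Poisson regression with one explanatory variable is given below, assuming the usual log link, log E[y | x] = a + b x, and estimating (a, b) by Newton-Raphson on the log-likelihood. The function name `poisson_fit` and the data are hypothetical; in practice one would use a statistical package, and the Negative Binomial variants are not covered here.

```python
import math


def poisson_fit(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson for the model log E[y|x] = a + b*x; returns (a, b)."""
    a, b = math.log(sum(y) / len(y)), 0.0  # start from the constant-only fit
    for _ in range(max_iter):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score: gradient of the Poisson log-likelihood.
        g_a = sum(yi - mi for yi, mi in zip(y, mu))
        g_b = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h_aa = sum(mu)
        h_ab = sum(mi * xi for xi, mi in zip(x, mu))
        h_bb = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h_aa * h_bb - h_ab ** 2
        # Newton step: inverse information matrix times the score.
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


# Hypothetical count data (e.g., calls per hour) against one regressor.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 1, 2, 4, 6, 11]

a, b = poisson_fit(x, y)
fitted = [math.exp(a + b * xi) for xi in x]
print(f"a={a:.3f}, b={b:.3f}")
```

Unlike an OLS line, the fitted means `exp(a + b*x)` are always positive, which is one reason this model suits count outcomes.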
QUANTILE REGRESSION
Usually, regression coefficients can be thought of as the marginal impact of the explanatory variable on the mean of the dependent variable. However, the researcher might be interested in the marginal impact of the explanatory variable on certain quantiles of the distribution of the dependent variable, e.g., what is the impact of the food stamp program on the consumption of the bottom 10% of the consumption distribution? Such questions are addressed by implementing a quantile regression.
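The following Python sketch illustrates the idea in its simplest form, a model with only a constant. Quantile regression minimizes the asymmetric ‘check’ loss instead of squared error, and with a constant only the minimizer is a sample quantile. The helper names and the data are hypothetical; a full quantile regression with explanatory variables needs a linear-programming solver and is not shown.

```python
from statistics import mean


def check_loss(residual, tau):
    """Asymmetric absolute loss used in quantile regression."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fit_constant_quantile(y, tau):
    """Constant q minimizing total check loss; a data point always attains the minimum."""
    return min(sorted(y), key=lambda q: sum(check_loss(yi - q, tau) for yi in y))


# Hypothetical consumption data with one very large value.
y = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = fit_constant_quantile(y, 0.5)   # 50th percentile
bottom_fit = fit_constant_quantile(y, 0.1)   # 10th percentile
print(f"mean={mean(y):.1f}, median fit={median_fit}, 10% quantile fit={bottom_fit}")
```

The mean is pulled up by the large value, while the fitted quantiles describe the middle and the bottom of the distribution separately.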
INSTRUMENTAL VARIABLE REGRESSION (IV)
When there is reason to believe that the explanatory variable is correlated with the error term in the OLS regression (in other words, the explanatory variable is endogenous), it cannot be claimed that the explanatory variable causes the change in the dependent variable, only that a correlation exists. One way to argue for causality is to use instrumental variable regression, in which one uses an exogenous variable (called the instrument) that is uncorrelated with the error term in the OLS regression but correlated with the endogenous explanatory variable.
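As a hedged illustration, the sketch below builds a small artificial dataset in which the explanatory variable `x` is correlated with the error `u`, while the instrument `z` is not. With one instrument and one endogenous regressor, the IV estimate is the ratio of the covariance of `z` with `y` to the covariance of `z` with `x`. All names and numbers are hypothetical, and the true coefficient is set to 2 by construction.

```python
from statistics import mean


def cross_product(u, v):
    """Sum of products of deviations from the mean (a scaled covariance)."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))


# Artificial data: z is the instrument, u the unobserved error term.
z = [1.0, 1.0, -1.0, -1.0]
u = [1.0, -1.0, 1.0, -1.0]                    # uncorrelated with z in this sample
x = [zi + ui for zi, ui in zip(z, u)]         # endogenous: depends on u
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]  # true effect of x is 2

beta_ols = cross_product(x, y) / cross_product(x, x)
beta_iv = cross_product(z, y) / cross_product(z, x)
print(f"OLS estimate={beta_ols}, IV estimate={beta_iv}")
```

In this constructed example OLS overstates the effect because `x` moves together with the error, while the IV ratio recovers the coefficient used to generate the data.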
PANEL REGRESSION
When data are available over time for the same individuals, a panel regression is run over the two dimensions of cross-sectional and time-series variation. Panel regression is essentially an OLS regression with some added properties and interpretation, such as fixed effects, random effects, pooled cross-sections, etc.
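One of these variants, the fixed-effects (‘within’) estimator, can be sketched as follows: subtract each individual's own mean from the variables, then run OLS on the demeaned data. The Python code below is a minimal sketch on hypothetical data for two individuals who share a slope of 2 but have different intercepts; the helper names are illustrative only.

```python
from collections import defaultdict
from statistics import mean


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / sum((xi - x_bar) ** 2 for xi in x)


def within_transform(ids, values):
    """Subtract each individual's mean from that individual's observations."""
    groups = defaultdict(list)
    for i, v in zip(ids, values):
        groups[i].append(v)
    group_mean = {i: mean(vs) for i, vs in groups.items()}
    return [v - group_mean[i] for i, v in zip(ids, values)]


# Hypothetical panel: individuals A and B observed in three periods each.
ids = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [12.0, 14.0, 16.0, 8.0, 10.0, 12.0]  # A: y = 10 + 2x, B: y = 0 + 2x

pooled_slope = ols_slope(x, y)
fe_slope = ols_slope(within_transform(ids, x), within_transform(ids, y))
print(f"pooled OLS slope={pooled_slope:.3f}, fixed-effects slope={fe_slope:.3f}")
```

Here pooled OLS even gets the sign wrong because it ignores the individual-specific intercepts, while the within estimator recovers the common slope.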
DIFFERENCE IN DIFFERENCE (DIFF-IN-DIFF OR DID)
To argue for the causal impact of a treatment on a dependent variable, the technique of Difference-in-Difference (or double difference) is used, where the impact of the treatment is defined as the difference in average outcome in the treatment group before and after treatment minus the difference in average outcome in the control group before and after treatment: it is literally a ‘difference of differences’.
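The definition above translates directly into a few lines of Python. The sketch below uses hypothetical outcome data for the four group-by-period cells; the function name `diff_in_diff` is illustrative.

```python
from statistics import mean


def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """(Change in treatment-group mean) minus (change in control-group mean)."""
    treat_change = mean(treat_after) - mean(treat_before)
    control_change = mean(control_after) - mean(control_before)
    return treat_change - control_change


# Hypothetical outcomes for each group and period.
treat_before = [10.0, 12.0]
treat_after = [18.0, 20.0]
control_before = [9.0, 11.0]
control_after = [12.0, 14.0]

did = diff_in_diff(treat_before, treat_after, control_before, control_after)
print(f"DiD estimate = {did}")  # (19 - 11) - (13 - 10)
```

The control group's change stands in for what would have happened to the treatment group without treatment, which is an assumption (often called parallel trends) rather than something the calculation itself can verify.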
REGRESSION DISCONTINUITY (RD)
Regression discontinuity design is useful for estimating the causal effect of an explanatory variable when there is an observable jump or discontinuity in the level of that variable, typically at a known cutoff of some other observed variable.
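A minimal sketch of a ‘sharp’ design is given below. It assumes that treatment switches on when an observed running variable crosses a cutoff, fits a separate line on each side within a bandwidth, and takes the gap between the two fitted values at the cutoff as the estimated effect. The names (`running`, `outcome`, `rd_estimate`), the cutoff, and the bandwidth are all hypothetical choices for illustration.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / sum((xi - x_bar) ** 2 for xi in x)
    return y_bar - slope * x_bar, slope


def rd_estimate(running, outcome, cutoff, bandwidth):
    """Jump in fitted outcome at the cutoff, using separate linear fits per side."""
    left = [(r - cutoff, o) for r, o in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, o) for r, o in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    left_intercept, _ = ols_fit([r for r, _ in left], [o for _, o in left])
    right_intercept, _ = ols_fit([r for r, _ in right], [o for _, o in right])
    # With the running variable centred at the cutoff, each intercept is the
    # fitted outcome at the cutoff approached from that side.
    return right_intercept - left_intercept


# Hypothetical data: outcome = 1 + 0.5 * running, plus 4 when running >= 0.
running = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
outcome = [-0.5, 0.0, 0.5, 5.5, 6.0, 6.5]

jump = rd_estimate(running, outcome, cutoff=0.0, bandwidth=3.0)
print(f"estimated jump at the cutoff = {jump}")
```

In applied work the result can be sensitive to the bandwidth and the functional form on each side, so this should be read as an outline of the logic rather than a recommended estimator.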
DATA
I. Development and Global Health
United Nations Data
World Bank Data
 DataCatalog (Social, economic, financial, natural resources, environment etc.)
 The Gender Data Portal
 World Development Indicators (WDI)
 Africa Development Indicators
 Global Bilateral Migration Database
Maternal and Child Health Data
Demographic and Health Surveys
 Health, HIV, Nutrition etc. – Developing countries: https://dhsprogram.com Extract via: http://statcompiler.com/
Global Health Observatory (GHO)
 SDG health and health-related target indicators: https://www.who.int/gho/en/
 Mortality & Burden of Disease
 Health Equity
Education
II. International Trade
Canada Trade Data
 Trade Statistics
 Trade Agreements
 Trade Profile
 Tariff Profile
 Regional Trade Agreements
 Antidumping Data
Global Antidumping Data
 Article: “Let’s Inject Antitrust Principles into Antidumping Law”
 Global Antidumping Database
World Bank Data
World Trade Organization (WTO) Antidumping Data
 Statistics
 Sector Codes
 Initiations by Sector
Antidumping Sectoral Distribution
Countervailing Sectoral Distribution
Regional Trade Agreements
 Participation Map
 WTO Database
International Trade Centre Data
World Bank Data
 World Integrated Trade Solution (WITS)
 UNCTAD TRAINS Database: detailed tariff data at the 8-digit HS product level for all countries, 1989-2014 (MFN tariffs, AD, specific duties, etc.)
Country Profiles (WTO Data)
 Trade Profiles (trade situation of members, observers and other selected economies)
 Tariff Profiles (market access situation of members, observers and other selected economies)
 Services Profiles (detailed statistics on key infrastructure services (transportation, telecommunications, finance and insurance) for selected economies)
Aid for Trade Profiles (download complete set of profiles)
Time Series Data
Trade Policy Measures
III. Country Level
Chinese Data
IV. UBC
 British Columbia Inter-University Research Data Centre
Available datasets:
 Canadian Community Health Survey (CCHS)
 Ethnic Diversity Survey (EDS)
 General Social Survey (GSS, selected cycles)
 Longitudinal Survey of Immigrants to Canada (LSIC)
 National Longitudinal Survey of Children and Youth (NLSCY)
 National Population Health Survey (NPHS)
 Survey of Labour and Income Dynamics (SLID)
 Workplace and Employee Survey (WES)
 Youth in Transition Survey and the Programme for International Student Assessment (YITS-PISA)