Causality in econometrics: methods in conversation with practice

Guido Imbens was born in 1963 in the Netherlands. He received his PhD in 1991 from Brown University, Providence, USA. He is the Applied Econometrics Professor and Professor of Economics at Stanford University, USA. So please welcome Guido.

I want to thank the committee for the prize, and in particular for highlighting the importance of credible causal inference. The estimation of causal effects, that is, comparisons of outcomes under different treatments or policies, from observational, non-randomized data is important for providing advice to policy makers. In the last three decades there has been an explosion of work in this area, in economics as well as many other disciplines, and I see the prize as a recognition for all of this work.

I grew up in a small town in the Netherlands, but in high school I was faced with the decision what major to choose for college. My economics teacher lent me a book by the 1969 Nobel laureate and Dutch econometrician Jan Tinbergen, which appealed to me with its mixture of mathematics and practical relevance. I was particularly impressed with Tinbergen's ability to combine high-level academic work with involvement in policy advice, and I decided to enroll in the econometrics program in Rotterdam that Tinbergen had founded in the 1960s. After a detour through an exchange program with the University of Hull, I did my PhD at Brown University in the US. Since then I've moved around from coast to coast before settling in 2012 at Stanford University in California.

During my undergraduate and graduate days I was not exposed to much work that explicitly focused on causality. Although there had been case studies going back as far as the 1860s, including the Snow study on the causes of cholera, and econometrics has implicitly always focused on causality, the explicit use of the term causality was rare through the 1980s and early 90s. It only started increasing sharply after 1995, with currently over 50 percent of working papers in the National Bureau of Economic Research working paper series using the term, as the figure on the right, based on the work by Currie, Kleven, and Zwiers, shows. Looking back at my own work from that time, Josh and I did not use the term causal in a 1994 paper on the local average treatment effect, but two years later used it over 100 times in a 1996 paper with Donald Rubin.

Nowadays causal inference is a fast-growing, vibrant, and interdisciplinary field, with researchers in statistics, political science, economics, computer science, epidemiology, and other areas working on common problems, coming from different perspectives with different tools, and interacting closely in conferences and seminars. During the pandemic I started, together with other researchers in this area, an online inter-university seminar on causal inference that weekly attracts hundreds of attendees. In the last year the Society for Causal Inference has been founded to further bring this community together.

To put the themes of this talk in context, let me start by being clear about what I mean by som

Okay, so in this video we are talking about how to use Excel to run basic single and multiple regressions, and I'm going to do all the models for Homework 5.

Let's start with the growth file. This is the Excel file. For a regression, by default we have the variable names in the first row. We want every variable name to be only one word, so if you have two words, you use an underscore to connect them so that it is only one word, with no blank in between. For example, for country name you can use an underscore to connect the two words. These are some of the very basic rules that we follow. The reason is that when we use programming languages, if a variable name contains more than one word, the programming language might think the words are different variables. So all the variable names have to be only one word. For example, for trade share you can just combine the two words into one word; for years of school you can drop the "of" because you still understand what it means. I don't remember what rev_coups means.

All of these are the values of these variables, and what are these values for? They are for different countries. The first column is always an identifier for each observation. It could be different school districts: school district one, school district two. In this case it's countries. In other cases it could be individuals or households: household one, household two. You can have the name of the household, and you can also have the ID of the household. We usually don't run the regression with the first column, because these are names and don't have any numeric meaning, but we will run regressions using the second through the last columns.

What's our dependent variable? Our dependent variable here is growth, the percentage of GDP growth. So we naturally imagine that the last few columns will be our x variables. You can see that these variables look continuous, and column C looks like a dummy variable, or it turns out it doesn't have any information. You can see it here: if you select column C, the average is zero and the count is 66. Count 66 means that we have 65 countries, because the first row is the variable name. So it seems like oil doesn't have any information.

Anyway, let's go to the models. The models are not given; I wrote them down based on our homework. You are required to practice all of them before our homework class, so that when I explain the homework you can look directly at your regression results. So let's start with the first model. The first
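The naming rule and the count check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the header strings and the helper names `to_single_word` and `is_constant` are hypothetical and are not taken from the homework file.

```python
import re

def to_single_word(name):
    """Join the words of a column header with underscores so it reads as one token."""
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "_".join(word.lower() for word in words)

def is_constant(values):
    """True if a column never varies, so it carries no information for a regression."""
    return len(set(values)) <= 1

# Hypothetical headers, similar in spirit to the ones in the spreadsheet
headers = ["Country Name", "growth", "oil", "trade share"]
clean_headers = [to_single_word(h) for h in headers]
print(clean_headers)

# The count of 66 includes the header row, so it corresponds to 65 observations
cell_count = 66
n_observations = cell_count - 1
print(n_observations)

# A dummy column that is zero for every country has no variation
oil = [0] * n_observations
print(is_constant(oil))
```

A column flagged by `is_constant` would normally be left out of the x variables, since a regressor without variation cannot be distinguished from the intercept.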

Today we're going to talk about basic regression analysis with time series data. Where are we in the course? We have just talked about heteroskedasticity, and this material is based on chapter 10 of Wooldridge's book, Introductory Econometrics.

Let me start with these three examples of time series plots. What you always see is time on the horizontal axis: here, here, and here. You see that you can show time series data in various ways, either with bars or with a line, but at the end of the day what you see is development over time. For instance, this dark red line is the total supply of oil, and this dark blue line is the price of oil over time. Same here: supply in dark red, price in light blue. What's interesting in the right plot is that we see not only the supply, or production, of oil in Libya, but also, in red, various things that have happened and that at least partly explain why it changed over time. These three plots are taken from The Economist, as you can see at the bottom.

This example is from financial markets. What the authors of this paper have done is look at a lot of announcements on the stock market, basically when the announcement takes place, and what they studied was how the stock price reacted to positive news. On the horizontal axis you see the days before and after the announcement. It is an announcement about the dividend, so zero is the dividend announcement day, minus two is two days before, then three days before, four days before, and so on. What you see plotted is what is called cumulative returns. You take the returns on one day, so from here you have five percent, and then you start here and add, from this point, the returns on the day after the announcement, and you keep adding to construct this line. What you see is that there is a big systematic movement that is driven by the dividend announcement, and before and after it's kind of flat. What this tells us is that there is otherwise no systematic movement: 12 days before the announcement and two days before the announcement, the stock price is more or less the same. And after the good news about the dividend has been shared, the stock price jumps up, and 12 days later it is more or less at the same level as it was right after jumping up.

The third example is taken from Shiller's contribution from 1981. What you see are two time series. One is what the stock market did, and that is the blue line, denoted by p. Then you have a dashed line, and you can think of the dashed line as the fundamental underlying value of what is traded on the stock market. Shiller's point is always that there is some extra variance on the stock market, driven by all kinds of ways in which actors i
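The cumulative-returns line from the dividend announcement example can be sketched as a running sum of daily returns. This is a minimal sketch, assuming returns are simply added as described in the lecture (not compounded); the return numbers and the name `cumulative_returns` are hypothetical and are not data from the paper.

```python
def cumulative_returns(daily_returns):
    """Running sum of daily returns, one value per event day."""
    total = 0.0
    path = []
    for r in daily_returns:
        total += r
        path.append(total)
    return path

# Hypothetical daily returns in percent for event days -3 to +3 (day 0 = announcement)
event_days = list(range(-3, 4))
returns = [0.1, -0.1, 0.0, 5.0, 0.1, -0.1, 0.0]

path = cumulative_returns(returns)
for day, cum in zip(event_days, path):
    print(f"day {day:+d}: cumulative return {cum:.1f}%")
```

With numbers like these the line is flat before day 0, jumps on the announcement day, and stays near the new level afterwards, which is the pattern described for the plot.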

Hi everyone, and welcome again to NEDL, the go-to place to learn about business, finance, economics, and much, much more. Please don't forget to subscribe to our channel and click the bell notification button below so that you never miss fresh videos and tutorials you might be interested in. Many thanks to our current Patreon supporters and YouTube members for making this video possible. We would also greatly appreciate it if you considered supporting us as well, so please check the link in the description or click the join button below for more details.

My name is Savva, and today we're investigating a foundational concept and model in financial econometrics, and more broadly in time series econometrics in general: the moving average model, or MA model for short. Don't let the title confuse you. It has no relation whatsoever to the moving average technical analysis indicator or to the moving average convergence divergence strategy. It's a thing in itself, and today we'll learn how to apply it to financial time series.

Moving average models are generally applied to stationary time series, and the most common stationary time series we have in finance are asset returns. So let's take 10 years' worth of S&P 500 data, calculate the returns, and try to model the time series dependence of the S&P 500, or market inefficiency if you will, the predictability of S&P 500 returns, using the MA model of order 4. I'll touch upon the mathematics and the specification of the model, why one might use it, and what challenges it is associated with a little bit later. First, let's calculate daily returns of the index: index value today over the index value yesterday, minus one, as usual, applied throughout the whole sample. Then, to start with, let's calculate the average of all daily returns, which will be our starting specification, and count the values as well to calculate the degrees of freedom for hypothesis testing.

Let's investigate the peculiarity of the moving average model and how it differs from, for example, simple autoregressive models or conventional multiple regression. If we look at this general formula, we see that the model represents returns at time t, or any time series value at time t, as a constant plus a random disturbance term (error term, or residual term) epsilon t. But here is the catch: it also adds a weighted sum of lagged residuals, weighted by their individual coefficients beta i, and it goes up to n lags, where n is the order of the moving average model. The simplest would perhaps be MA(1), the moving average model of order one, where you would have just one lagged residual, epsilon t minus one, the residual for the previous day, and just one beta. However, to make it a little less trivial and a little more generalizable, we'll consider a moving average model of order four, to potentially model a more long-term dependence in the behavior of the S&P 500 and potentially obtain a more accurate forecast. Here we have got four betas multiplied by four lak
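The specification described above, r_t = c + eps_t + beta_1 * eps_(t-1) + ... + beta_n * eps_(t-n), can be illustrated with a short Python sketch. It is not the spreadsheet procedure from the video: the prices and beta values are made up, the helper names are hypothetical, and residuals before the start of the sample are assumed to be zero. It only shows how residuals can be backed out recursively once a constant and a set of betas are given.

```python
def simple_returns(prices):
    """Daily simple returns: today's value over yesterday's value, minus one."""
    return [today / yesterday - 1.0 for yesterday, today in zip(prices, prices[1:])]

def ma_residuals(returns, const, betas):
    """Back out MA(q) residuals recursively, with q = len(betas):
    eps_t = r_t - const - sum over lags of beta_lag * eps_(t-lag).
    Residuals before the start of the sample are assumed to be zero."""
    eps = []
    for t, r in enumerate(returns):
        lagged = sum(
            beta * eps[t - lag]
            for lag, beta in enumerate(betas, start=1)
            if t - lag >= 0
        )
        eps.append(r - const - lagged)
    return eps

# Hypothetical index levels and coefficients, for illustration only
prices = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0, 102.5]
returns = simple_returns(prices)
const = sum(returns) / len(returns)   # average return as the starting constant
betas = [0.10, -0.05, 0.02, 0.01]     # made-up MA(4) weights
eps = ma_residuals(returns, const, betas)

# One-step-ahead forecast: constant plus the weighted sum of the last four residuals
forecast = const + sum(beta * eps[-lag] for lag, beta in enumerate(betas, start=1))
print(forecast)
```

Because each residual depends on earlier residuals that are themselves computed from the model, the betas cannot be obtained with a single ordinary regression; they are typically chosen by an iterative optimizer that minimizes the sum of squared residuals or maximizes a likelihood.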

Okay, good afternoon class. Welcome to ECN 305, Introduction to Econometrics. We are going to be treating the second topic today. In the last topic we talked about the definition and meaning of econometrics, the uses of econometrics, and the relationship between econometrics and mathematics, economic theory, and statistics. We defined econometrics as the application of mathematics, economic theory, and statistics in estimating the parameters of a model for the purpose of decision making. It means that the application of econometrics involves solving a problem in order to make decisions. In that respect, there are various stages that need to take place before we can apply econometrics to solve any problem, and that is what we are going to be looking at today: the stages of an econometric research study.

The first one is problem identification, or statement of the problem. This is the first step in conducting any research: to state what the problem is. It can equally be called conception of ideas. Looking at your environment, looking around you, or looking at any economic situation, if something is going wrong, you can identify that this is a particular problem that begs for a solution. That is the first stage of any research.

Then what can we call a problem? A problem is something that you have perceived or seen in the environment that needs to be solved, that is not going well, that is contrary to what is supposed to be happening, what ought to happen in the economy. For example, there is a constant increase in the prices of goods and services in the economy, which can be tagged inflation. An economist, or anybody, can ask the question: what are the causes, what are the factors influencing the increase in the price level of commodities in the market? This is a problem that has been identified. This is an idea that has been conceived in the mind of a researcher who wants to provide solutions, or answers, to this problem.

So we have talked about inflation and the factors that influence or determine the increases in these prices. That is the first stage of econometric research. The second stage is to formulate hypotheses. In the first stage we identified inflation as a continuous increase in the prices of goods and services. In that case, factors such as the exchange rate, the level of production, money supply in the economy, and so on might be seen as possible factors influencing or causing the increase in the price level in the economy. The second stage of econometric research is the formulation of hypotheses. A hypothesis is a conjectural statement that needs to be validated. A hypothesis is just a guess, a mere statement that needs to undergo a scientific process of validation. So now what you have conceived in your mind, what you have conc

video from CrunchEconometrics. I'll be showing you some simple tricks on how you can reshape your data that is in a wide format to a long format, and also how you can generate IDs for your observations for panel data analysis. Whenever we download data from external databases like the World Bank or the IMF, it is often in wide format, which cannot be fed into Stata in its raw form. It needs to be prepped and cleaned up within the Excel interface, so I'm going to show you some simple tricks and simple commands in Excel that you can use.

In this example I have 46 countries, each of them having 17 variables across 36 years. If I manually generate IDs, that is, identifier numbers, for these observations, there's a likelihood that mistakes might occur, and as a researcher I'm not willing to take chances. So I'm going to teach you some tricks by which you can reshape this data within the Excel interface.

Number one, we have to generate IDs, identifiers, for each of the 46 countries. How do we do that? We begin by clicking right here on column A, right-click, and generate a new column. I call that new column c_id because I'm generating an ID for each country, each of the 46 countries. So I type in 1 for cell A2, and then I have to give Stata, sorry, Excel, an instruction: =IF(B3=B2, A2, A2+1). That is, if the observation in B3 is the same as the observation in B2, assign the number in A2; otherwise assign A2 plus 1. Let's see what Excel will do. Excel has generated 1 for me, meaning Angola is the same in cell B3 as the one in B2. What I do next is bring my mouse to the corner of cell A3 until I see the plus sign and double-click on that, and Excel has generated these IDs for me. Remember I said I have 46 countries, so let me do Ctrl+End, which takes me to the end, to see that I have number 46 for Zimbabwe. So this is correctly done: Zimbabwe has 46.

Because column A is generated with a formula, I cannot feed it into Stata that way. I have to remove the formula. So what I do next is create another column. I call that column c_id as well, because I'm still working on country IDs. I click on the B column, I copy it, I click on column A, then I go to Paste and click on Paste Special. The Paste Special dialog box opens up, I click on Values, and OK. Now Stata, sorry, Excel, has generated the numbering for me, and I have done better by removing the formula. So I delete column B. If you click on column A now, each of these countries is known by its number, and it's as if I did it manually, but Excel did it for me.

Now I need to generate numbering for each of the series, that is, each of the variables, but I need to sort them first. So I select the entire worksheet, go to Sort & Filter, and sort by series alphabetically, A to Z. You can see that my series, my variables, have all been sorted correctly, so it should be easy now to generate IDs for them. I right-click, I create a new column, and I call this s unde
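The same two steps, numbering grouped rows and turning wide data into long data, can be sketched in plain Python. This is an illustrative sketch, not the Excel or Stata workflow from the video: the function names, column names, and the tiny data set are all hypothetical. `assign_ids` applies the same rule as the spreadsheet formula (same label as the row above keeps the ID, otherwise the ID goes up by one), so it assumes the rows are already grouped by label.

```python
def assign_ids(labels):
    """Number runs of identical labels 1, 2, 3, ... (rows must already be grouped)."""
    ids = []
    for i, label in enumerate(labels):
        if i == 0:
            ids.append(1)
        elif label == labels[i - 1]:
            ids.append(ids[-1])        # same as the row above: keep the ID
        else:
            ids.append(ids[-1] + 1)    # new label: next ID
    return ids

def wide_to_long(rows, id_cols, year_cols):
    """One output row per (input row, year) instead of one column per year."""
    long_rows = []
    for row in rows:
        for year in year_cols:
            record = {col: row[col] for col in id_cols}
            record["year"] = int(year)
            record["value"] = row[year]
            long_rows.append(record)
    return long_rows

# Hypothetical wide data: one row per country and series, one column per year
wide = [
    {"country": "Angola", "series": "gdp", "1980": 1.0, "1981": 1.1},
    {"country": "Angola", "series": "pop", "1980": 8.0, "1981": 8.2},
    {"country": "Zimbabwe", "series": "gdp", "1980": 2.0, "1981": 2.1},
]

c_ids = assign_ids([row["country"] for row in wide])
for row, c_id in zip(wide, c_ids):
    row["c_id"] = c_id

long_rows = wide_to_long(wide, id_cols=["c_id", "country", "series"],
                         year_cols=["1980", "1981"])
print(long_rows[0])
```

Generating the IDs programmatically serves the same purpose as the formula in the video: it avoids the typing mistakes that creep in when hundreds of identifiers are entered by hand.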

In this video I want to talk about what we actually mean by econometrics. Econometrics is, in general, a statistical toolset which helps us to evaluate some sort of relationship of interest. An example might be that we're interested in, for individuals, the effect of an individual's level of education on the average wage which that individual might expect to obtain. So if an individual's level of education increases, we might expect that the level of wages which that individual obtains on average might increase. If I was to plot a graph of the level of education of individuals on the x-axis against the wages which a group of individuals have obtained on the y-axis, then we might hope to see some sort of positive correlation between these two variables. That's not to say that it is necessarily a causal relationship, only that there is some sort of positive relationship between these two variables.

Econometrics helps us to quantify this degree of correlation by, in a sense, drawing a line through the centre of all those points. And by drawing a line through the centre of all those points, we are hoping to capture the average effect of education on wages. So, on average, an individual who has 10 years of education might expect to obtain a wage of, let's say, four hundred dollars, whereas an individual with 11 years' worth of education might expect their wages to go up by a hundred dollars, so they now earn five hundred dollars. Econometrics is a toolset for finding out what the strength of this relationship is: how much do wages actually go up by?
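Drawing a line through the centre of the points is usually done by ordinary least squares. The sketch below is a minimal illustration under assumptions: the education and wage numbers are invented to echo the four hundred and five hundred dollar example, and the function name `ols_line` is hypothetical.

```python
def ols_line(x, y):
    """Least-squares intercept and slope for a line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: years of education and wages in dollars
education = [9, 10, 10, 11, 11, 12]
wage = [310, 390, 410, 480, 520, 590]

intercept, slope = ols_line(education, wage)
print(f"each extra year of education is associated with about {slope:.0f} dollars more")
```

The slope measures the average association between education and wages in the sample. As noted above, it is not by itself evidence of a causal effect.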
And this type of relationship, where we're concerned with the relationship for individual people or individual firms, is the subject of microeconometrics. It's called microeconometrics by analogy with microeconomics. Another sort of microeconometric relationship we might be interested in might be, 'What is the effect of TV advertising on a company's level of sales?' If I was to draw a graph of a company's level of sales over time, then we might have something which looks something like this, if there is some sort of seasonality. Perhaps this is coffee sales or ice cream sales. And we might be interested in whether these peaks which we observe in the data are caused by TV advertising. TV advertising might look something like these bars that I have drawn below. So econometrics is a way of understanding, for this time series data, does this TV advertising here cause sales to go up? And similarly, does this TV advertising here cause sales to go up? This is slightly different to the previous example in that we are dealing with what we call time series data, whereas the original data was what we call cross-sectional data. But it's still what we call microeconometric data, because we are dealing with data for a particular firm. Another type of econometrics is the subject of macroeconometrics, and macroeconometrics, as its name suggests, is to deal with ma

Hi everyone, my name is Kevin. Today I want to show you how you can do forecasting in Microsoft Excel. As full disclosure before we jump into this, I work at Microsoft as a full-time employee. So why would you ever want to do forecasting in Microsoft Excel, and what does that even mean? Well, you might have a whole bunch of data. Let's say you work at a workplace, or maybe you have a YouTube channel, and you have a bunch of data. You could look at how you've performed in the past and use that to predict, or forecast, what the future might look like. All right, why don't we jump on Excel and I'll show you step by step how you could do this. It's a neat thing to learn how to do.

All right, here I am on my PC. We're going to have some fun here: I've downloaded my views on YouTube over the last year, so this goes back to May 7, 2019, all the way up to May 5, 2020, and you can see how my views have changed over time. Let's say I want to forecast what my views will look like in 30 days, or 90 days, or maybe even a year from now. That's where forecasting comes into play. Here I can see that last year I was at 5,000 views a day, and here I'm at about 111,000 views per day. So, based on this growth, can Excel help me forecast what the future looks like? The answer is yes, Excel can do that.

So how do we do this? I'm going to show two different methods, or techniques, that we could use to forecast. The first one is pretty simple, and it's one that I've used for a long time. We're going to go up to the ribbon up here and click on Insert, and on Insert we want to insert a line chart. I'm just going to insert a very simple 2D line chart, and let's go ahead and throw this in. Now this gives me a nice visualization of what my views have looked like over time. It started out, there was a little bit of growth, and then especially recently it's grown quite a bit more. So this is what the past year looks like for me.

Now, to be able to forecast into the future, what does the future hold? I'm going to click on this line with my left mouse button, and now I'm going to right-click on it, and here I can add something called a trendline. So let's go ahead and throw that in. Within Trendline I have a whole bunch of different options. We're going to walk through what all these different options mean and which ones you should use. You'll see this trendline here, and the trendline tries to match my data as closely as possible. Right now the default is just a linear trendline, and a linear line is just a straight line. This is the best attempt at matching the data. One of the things that you'll see, though, is it doesn't perf
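A linear trendline of the kind described above is a straight line fitted by least squares against the time index, and forecasting amounts to extending that line past the last observation. The sketch below is an assumption-laden illustration: the view counts are invented, the function name `linear_trend_forecast` is hypothetical, and it is not a reproduction of Excel's internal calculation.

```python
def linear_trend_forecast(values, horizon):
    """Fit a straight line to values observed at t = 0, 1, ..., n-1 by least squares
    and return the fitted line extended `horizon` steps past the last observation."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    numerator = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    slope = numerator / denominator
    intercept = v_mean - slope * t_mean
    return [intercept + slope * (n - 1 + step) for step in range(1, horizon + 1)]

# Hypothetical daily view counts with accelerating growth
views = [5000, 5200, 5600, 6500, 8000, 11000, 16000, 25000]

next_three_days = linear_trend_forecast(views, horizon=3)
print([round(v) for v in next_three_days])
```

When growth accelerates, as in these made-up numbers, a straight line tends to sit above the early observations and below the most recent ones, which is why other trendline shapes are often worth trying.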

