Quant Mashup - Eran Raviv

Understanding False Discovery Rate [Eran Raviv]

False Discovery Rate is an unintuitive name for a very intuitive statistical concept. The math involved is as elegant as possible. Still, it is not an easy concept to actually understand. Hence I thought it would be a good idea to write this short tutorial. We reviewed this important topic in the

*- 6 years ago, 4 Apr 2017, 07:14am -*

Understanding K-Means Clustering [Eran Raviv]

Google “K-means clustering”, and you usually find ugly explanations and math-heavy sensational formulas*. It is my opinion that you can only understand those explanations if you don’t need them, meaning you are already familiar with the topic. Therefore, this is a more gentle introduction

*- 6 years ago, 12 Mar 2017, 07:00pm -*

Outliers and Loss Functions [Eran Raviv]

A few words about outliers: in statistics, outliers are as thorny a topic as it gets. Is it legitimate to treat the observations seen during the global financial crisis as outliers? Or are those simply a feature of the system, and as such an integral part of a very fat-tailed distribution? I recently read a

*- 6 years ago, 19 Feb 2017, 10:11pm -*

Density Confidence Interval [Eran Raviv]

Density estimation belongs to the literature on non-parametric statistics. Using simple bootstrapping techniques we can obtain confidence intervals (CI) for the whole density curve. Here is a quick and easy way to obtain CIs for different risk measures (VaR, expected shortfall) and using what

*- 6 years ago, 26 Jan 2017, 11:18am -*

Most popular posts - 2016 [Eran Raviv]

Another year. Looking at my Google Analytics reports I can’t help but wonder how it is that I am so bad at predicting which posts will catch the audience’s attention. Anyhow, the top three for 2016 are: On the 60/40 portfolio mix; The case for Regime-Switching GARCH; Most popular machine learning R packages

*- 6 years ago, 28 Dec 2016, 08:51am -*

Optimism of the Training Error Rate [Eran Raviv]

We all use models. We are all continuously working to improve and validate our models. Constant effort is made to estimate how good our model actually is. A general term for this estimate is error rate. A low error rate is better than a high error rate; it means our model is more accurate. By far

*- 6 years ago, 5 Dec 2016, 09:38am -*

Central Moments [Eran Raviv]

Sometimes I read academic literature, and oftentimes those papers contain some proofs. I usually gloss over some innocent-looking assumptions on moments’ existence, invariably popping up before derivations of theorems or lemmas. Here is one among countless examples, actually taken from Making and

*- 7 years ago, 14 Nov 2016, 10:07am -*

Extreme Value Theory [Eran Raviv]

Extreme Value Theory (EVT) is concerned with understanding the behavior of the distribution in the extremes. The extremes determine the average, not the reverse. If you understand the extremes, the average follows. But getting the extremes right is extremely difficult. By construction, you have very few

*- 7 years ago, 20 Sep 2016, 04:09am -*

Multivariate Volatility Forecast Evaluation [Eran Raviv]

The evaluation of volatility models is gracefully complicated by the fact that, unlike other time series, even the realization is not observable. Two researchers would never disagree about what yesterday’s stock price was, but they can easily disagree about what yesterday’s stock volatility was.

*- 7 years ago, 1 Sep 2016, 03:31am -*

Why bad trading strategies may perform well [Eran Raviv]

You probably know that even a trading strategy which is actually no different from a random walk (RW henceforth) can perform very well. Perhaps you chalk it up to short-run volatility. But in fact there is a deeper reason for this to happen, in force. If you insist on using and continuously testing

*- 7 years ago, 12 Aug 2016, 02:59am -*

Human significance, economic significance and statistical significance [Eran Raviv]

We are now collecting a lot of data. This is a good thing in general. But data collection and data storage capabilities have evolved fast. Much faster than statistical methods to go along with those voluminous numbers. We are still using good ole fashioned Fisherian statistics. Back then, when you

*- 7 years ago, 3 Jul 2016, 12:12pm -*

Forecast combinations in R [Eran Raviv]

A few weeks back I gave a talk at the R/Finance 2016 conference about forecast combinations in R. Here are the slides.

*- 7 years ago, 13 Jun 2016, 10:23am -*

Most popular machine learning R packages [Eran Raviv]

The good thing about using open-source software is the community around it. There are a great many R packages online, and recently CRAN package download logs were released. This means we can have a look at the number of downloads for each package, so as to get a good feel for their relative popularity. I

*- 7 years ago, 16 May 2016, 11:21am -*

Forecast averaging example [Eran Raviv]

Especially in economics/econometrics, modellers do not believe their models reflect reality as it is. No, the yield curve does NOT follow a three-factor Nelson-Siegel model, the relation between a stock and its underlying factors is NOT linear, and volatility does NOT follow a GARCH(1,1) process,

*- 7 years ago, 3 May 2016, 02:34am -*

Measurement error bias [Eran Raviv]

What is measurement error bias? An errors-in-variables, or measurement error, situation arises when your right-hand-side variable(s), your $x$ in a $y_t = \alpha + \beta x_t + \varepsilon_t$ model, is measured with error. If $x$ represents the price of a liquid stock, then it is accurately measured because

*- 7 years ago, 25 Apr 2016, 04:42am -*

The case for Regime-Switching GARCH [Eran Raviv]

GARCH models are very responsive in the sense that they allow the fit of the model to adjust rather quickly with incoming observations. However, this adjustment depends on the parameters of the model, and those may not be constant. Parameter estimation of a GARCH process is not as quick as those

*- 7 years ago, 5 Apr 2016, 12:30am -*

On the 60/40 portfolio mix [Eran Raviv]

Not sure why that is, but traditionally we consider 60% stocks and 40% bonds to be a good portfolio mix, one which strikes a decent balance between risk and return. I don’t want to blabber here about the notion of risk. However, I do note that I feel uncomfortable interchanging risk with volatility

*- 7 years ago, 24 Mar 2016, 03:44am -*

ASA statement on p-values [Eran Raviv]

There are many problems with p-values, and I too have chipped in at times. I recently sat in a presentation of an excellent paper, to be submitted to the highest ranked journal in the field. The authors did not conceal their ruthless search for those mesmerizing asterisks indicating significance. I

*- 7 years ago, 14 Mar 2016, 05:03am -*

Multivariate volatility forecasting, part 6 - sparse estimation [Eran Raviv]

First things first. What do we mean by sparse estimation? Sparse: thinly scattered or distributed; not thick or dense. In our context, the term ‘sparse’ sits at the intersection between machine learning and statistics. Broadly speaking, it refers to a situation where a solution to a

*- 7 years ago, 15 Feb 2016, 01:12pm -*

Linear regression assumes nothing about your data [Eran Raviv]

We often see statements like “linear regression makes the assumption that the data is normally distributed”, “Data has no or little multicollinearity”, or other such blunders (you know who you are..). Let’s set the whole thing straight: linear regression assumes nothing about your data. It

*- 7 years ago, 26 Jan 2016, 08:58am -*

Curse of dimensionality part 1: Value at Risk [Eran Raviv]

The term ‘curse of dimensionality’ is now standard in advanced statistical courses, and refers to the disproportional increase in data which is needed to allow only slightly more complex models. This is true in high-dimensional settings. Here is an illustration of the ‘Curse of

*- 7 years ago, 17 Jan 2016, 10:35pm -*

Most popular posts – 2015 [Eran Raviv]

The top three for the year are: Out-of-sample data snooping; Code for my yield curve forecasting paper; Review of a couple of books. I personally most enjoyed writing a few words on ML estimation, and about those great statistical discoveries. Since the last post did not involve any code or images

*- 7 years ago, 4 Jan 2016, 08:15am -*

Present-day great statistical discoveries [Eran Raviv]

Some time during the 19th century the biologist and geologist Louis Agassiz said: “Every great scientific truth goes through three stages. First, people say it conflicts with the Bible. Next they say it has been discovered before. Lastly they say they always believed it”. Nowadays I am not sure

*- 7 years ago, 21 Dec 2015, 08:48pm -*

Orthogonal GARCH [Eran Raviv]

In multivariate volatility forecasting (4), we saw how to create a covariance matrix which is driven by a few principal components, rather than a complete set of tickers. The advantages of using such factor volatility models are plentiful. First, you don't model each ticker separately, you can

*- 7 years ago, 7 Dec 2015, 01:19am -*

'predictions', 'forecasts' or 'projections'? [Eran Raviv]

Perhaps it is the different jargon used in different disciplines, I am not sure. But for some reason, the terms ‘predictions’, ‘forecasts’ and ‘projections’ are frequently used interchangeably. There should be at least some distinction; here is what I entertain: The word ‘predictions’

*- 8 years ago, 3 Dec 2015, 12:41am -*

Correlation and correlation structure (3), estimate tail dependence using regression [Eran Raviv]

What is tail dependence really? Say the market had a red day and saw a drawdown which belongs to the 5% worst days (from now on simply call it a drawdown). [Figure: weekly SPY returns] One can ask: given that the market is in the blue region, what is the probability of a drawdown in a specific stock?

*- 8 years ago, 11 Nov 2015, 01:47am -*

Multivariate volatility forecasting (4), factor models [Eran Raviv]

To be instructive, I always use very few tickers to describe how a method works (and this tutorial is no different). Most of the time is spent on methods that we can easily scale up. Even if exemplified using only say 3 tickers, a more realistic 100 or 500 is not an obstacle. But, is it really

*- 8 years ago, 20 Oct 2015, 09:34pm -*

Multivariate volatility forecasting (3), Exponentially weighted model [Eran Raviv]

Broadly speaking, complex models can achieve great predictive accuracy. Nonetheless, a winner in a Kaggle competition is required only to attach code for the replication of the winning result. She is not required to teach anyone the built-in elements of her model which give the specific edge over

*- 8 years ago, 13 Oct 2015, 03:39am -*

Correlation and correlation structure [Eran Raviv]

This post is about copulas and heavy tails. In a previous post we discussed the concept of correlation structure. The aim is to characterize the correlation across the distribution. Prior to the global financial crisis many investors were under the impression that they were diversified, and they

*- 8 years ago, 21 Sep 2015, 02:38am -*

Multivariate volatility forecasting [Eran Raviv]

Last time we showed how to estimate a CCC and DCC volatility model. Here I describe an advancement developed by Engle and Kelly (2012) bearing the name: Dynamic equicorrelation. The idea is nice and the paper is well written. Picking up where the previous post ended, once we have (say) the DCC

*- 8 years ago, 29 Aug 2015, 09:38am -*

Correlation and correlation structure [Eran Raviv]

Given a constant speed, time and distance are fully correlated. Provide me with the one, and I’ll give you the other. When two variables have nothing to do with each other, we say that they are not correlated. You wish that would be the end of it. But it is not so. As it is, things are perilously

*- 8 years ago, 19 Aug 2015, 09:21pm -*

Show yourself (look under the hood of a function in R) [Eran Raviv]

Open source software has many virtues. Being free is not the least of which. However, open source comes with “ABSOLUTELY NO WARRANTY” and with no power comes no responsibility (I wonder..). Since no one is paying, by definition it is your sole responsibility to make sure the code does what it is

*- 8 years ago, 10 Aug 2015, 03:45am -*

Mastering R for Quantitative Finance [Eran Raviv]

I have recently reviewed a couple of books. The first of which is actually a give-away if you promise to review it: Global Asset Allocation: A Survey of the World’s Top Asset Allocation Strategies. Simply register here and get a kindle version after a few days. The review: Relatively short book which

*- 8 years ago, 20 Jul 2015, 02:21am -*

Multivariate volatility forecasting [Eran Raviv]

Introduction: when hopping from univariate volatility forecasts to multivariate volatility forecasts, we need to understand that now we have to forecast not only the univariate volatility element, which we already know how to do, but also the covariance elements, which we do not know how to do, yet.

*- 8 years ago, 13 Jul 2015, 02:43am -*

How regression statistics mislead experts [Eran Raviv]

This post concerns a paper I came across while checking the nominations for best paper published in the International Journal of Forecasting (IJF) for 2012-2013. The paper bears the annoyingly irresistible title: “The illusion of predictability: How regression statistics mislead experts”, and was written

*- 8 years ago, 29 Jun 2015, 04:48am -*

PCA as regression [Eran Raviv]

In a previous post on this subject, we related the loadings of the principal components (PCs) from the singular value decomposition (SVD) to regression coefficients of the PCs onto the X matrix. This is normal given the fact that the factors are supposed to condense the information in X, and

*- 8 years ago, 17 Jun 2015, 05:03am -*

Quasi-Maximum Likelihood [Eran Raviv]

Beauty.. really? Well, beauty is in the eye of the beholder. One of the most striking features of the Maximum Likelihood (ML) method is that merely applying it conveniently provides you with the asymptotic distribution of the estimators. It can’t get more general than that. The

*- 8 years ago, 16 May 2015, 01:30pm -*

Univariate Volatility Forecast Evaluation [Eran Raviv]

Contents: Regression-based test (Mincer-Zarnowitz regression); Pairwise comparison (Diebold-Mariano test); Jarque-Bera test; References and credits. Why this post? Open-source software is constantly evolving and improving. Code related to some of my previous posts is not

*- 8 years ago, 25 Mar 2015, 07:25pm -*

Yield curve forecasting [Eran Raviv]

One of my Ph.D. papers was published recently. It deals with yield curve forecasting. Here is the code for applying the Nelson-Siegel model to any yield curve: `create_NS_residuals = function(dat, horiz = 12, initobs = 108, tau = 0.0609){ n = NROW(dat) fc =`

*- 8 years ago, 21 Mar 2015, 04:29am -*

Out-of-sample data snooping [Eran Raviv]

In this day and age, paralleling and mining big data, I like to think about the new complications that follow this abundance. By way of analogy, Alzheimer’s dementia is an awful condition, but we are only familiar with it since medical advances allow for higher life expectancy. Better abilities

*- 8 years ago, 20 Mar 2015, 04:36am -*

Energy idiosyncratic volatility [Eran Raviv]

Recently, volatility has been on the up. Generally, we associate rising volatility with a bear regime, but we also know there is a percolating oil shock. Is the volatility we see in the stock market broad-based, or is it the effect brought about by the sharp drop in oil prices (so related to the

*- 8 years ago, 29 Jan 2015, 03:03pm -*

Most popular posts - 2014 [Eran Raviv]

Well.. better late than never. The solid winner this year is: R vs MATLAB (Round 3). Followed by a far second: Mom, are we bear yet? (2), and third: Detecting bubbles in real time. And my own personal favorite for the year:

*- 8 years ago, 13 Jan 2015, 03:49pm -*