Quant Mashup - Eran Raviv

Boundary corrected kernel density [Eran Raviv]

Density estimation is now a trivial one-liner script in all modern software. What is not so easy is to become comfortable with the result: how well is my density estimated? We rarely know. One reason is the lack of ground truth. Density estimation falls under unsupervised learning, we don’t

*- 1 month ago, 30 Jul 2020, 07:44pm -*

R + Python = Rython [Eran Raviv]

Enough! Enough with that pointless R versus Python debate. I find it almost as pointless as the Bayesian vs Frequentist “dispute”. I advocate here what I advocated there (“..don’t be a Bayesian, nor be a Frequentist, be opportunist”). Nowadays even marginally tedious computation is being

*- 2 months ago, 5 Jul 2020, 11:14pm -*

Machine learning is simply statistics - part 2 [Eran Raviv]

Another opinion piece. If you can’t explain it simply you don’t understand it well enough. (Albert Einstein) Rant in progress A bit on Deep Learning What is so deep about deep learning? Nothing. There is nothing deep about it. If you read through the excellent Deep Learning book you can see (p.

*- 3 months ago, 2 Jun 2020, 09:45am -*

Curse of Dimensionality part 4: Distance Metrics [Eran Raviv]

Many machine learning algorithms rely on distances between data points as their input, sometimes the only input, especially so for clustering and ranking algorithms. The celebrated k-nearest neighbors (KNN) algorithm is our chief example, but distances are also frequently used as an input in the

*- 5 months ago, 15 Apr 2020, 10:39am -*

Understanding Pointwise Mutual Information [Eran Raviv]

The term mutual information is drawn from the field of information theory. Information theory is busy with the quantification of information. For example, a central concept in this field is entropy, which we have discussed before. If you google the term “mutual information” you will land at some

*- 7 months ago, 27 Jan 2020, 10:01am -*

Most popular posts – 2019 [Eran Raviv]

As every year, I checked my analytics so that I can let you know what was popular. This year I have also experimented with a survey where I asked one question at the end of each relevant post. About 120 replies were received, but the free Survey Monkey account (the survey provider I went with) only lets

*- 8 months ago, 6 Jan 2020, 09:12am -*

CUR matrix decomposition for improved data analysis [Eran Raviv]

I have recently been reading about more modern ways to decompose a matrix. Singular value decomposition is a popular way, but there are more. I went down the rabbit hole. After a couple of “see references therein” I found something which looks to justify spending time on this. An excellent

*- 11 months ago, 20 Oct 2019, 09:17am -*

Understanding Variance Explained in PCA [Eran Raviv]

Principal component analysis (PCA) is one of the earliest multivariate techniques. Yet not only has it survived, but it is arguably the most common way of reducing the dimension of multivariate data, with countless applications in almost all sciences. Mathematically, PCA is performed via linear algebra

*- 1 year ago, 4 Sep 2019, 09:17pm -*

Robust Moving Average [Eran Raviv]

Moving average is one of the most commonly used smoothing methods, basically the go-to. It helps us detect trend in the data by smoothing out short-term fluctuations. The computation is trivial: take the most recent k points and simple-average them. Here is how it looks: Moving average example with

*- 1 year ago, 1 Aug 2019, 08:11pm -*
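
The computation described in the Robust Moving Average teaser above (take the most recent k points and average them) can be sketched in a few lines. This is only an illustration of the plain moving average, not the robust variant the post goes on to discuss, and not the author's code; the function name `moving_average` and the choice to skip the first k-1 positions are assumptions made here.

```python
def moving_average(values, k):
    """Simple moving average: mean of the most recent k points.

    Hypothetical helper for illustration. Positions with fewer than
    k points of history are skipped, so the result has
    len(values) - k + 1 entries (or none if the input is too short).
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    averages = []
    for end in range(k - 1, len(values)):
        window = values[end - k + 1 : end + 1]  # the most recent k points
        averages.append(sum(window) / k)
    return averages


if __name__ == "__main__":
    print(moving_average([1, 2, 3, 4, 5], 3))
```

Because every point in the window gets equal weight, a single extreme observation shifts the average for k consecutive periods, which is presumably the motivation for a robust alternative.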

Portfolio construction tilting towards higher moments [Eran Raviv]

When you build your portfolio you must decide what your risk profile is. A pension fund’s risk profile is different than that of a hedge fund, which is different than that of a family office. Everyone’s goal is to maximize returns given the risk. Sinfully but commonly risk is defined as the

*- 1 year ago, 10 Jun 2019, 09:55am -*

Adaptive Huber Regression [Eran Raviv]

Many years ago, when I was still trying to beat the market, I used to pair-trade. In principle it is quite straightforward to estimate the correlation between two stocks. The estimator for beta is very important since it determines how much you should long the one and how much you should short the

*- 1 year ago, 19 May 2019, 07:56am -*

Day of the week and the cross-section of returns [Eran Raviv]

I just finished reading an interesting paper by Justin Birru titled: “Day of the week and the cross-section of returns” (reference below). The story is much too simple to be true, but it looks to be so. In fact, I would probably altogether skip it without the highly ranked Journal of Financial

*- 1 year ago, 15 Mar 2019, 11:57am -*

Most popular machine learning R packages [Eran Raviv]

In a previous post: Most popular machine learning R packages, trying to hash out what are the most frequently used machine learning packages, I simply chose a few names from my own memory. However, there is a CRAN task views web page which “aims to provide some guidance which packages on CRAN are

*- 1 year ago, 10 Feb 2019, 08:55pm -*

R tips and tricks – higher-order functions [Eran Raviv]

A higher-order function is a function that takes one or more functions as arguments, and/or returns a function as its result. This can be super handy in programming when you want to tilt your code towards readability and still keep it concise. Consider the following code: # Generate some fake data

*- 1 year ago, 28 Jan 2019, 12:06am -*

Most popular posts – 2018 [Eran Raviv]

2019 is well underway. 2018 was personally difficult, so I am happy it’s behind us. Without further ado, here is what my analytics report shows to be the three most popular posts for 2018: – Create own Recession Indicator using Mixture Models (3:53 minutes average time on page) – Portfolio

*- 1 year ago, 13 Jan 2019, 09:24pm -*

Reproducible Finance with R - Book Review [Eran Raviv]

Reproducible Finance with R is a clever book, with a modern treatment of classical concepts. Here below is what I liked and disliked about the book. Back when I was practicing Judo, there was a guy in my group who mastered that one exercise (called Uchi Mata). He could go fighting 20 consecutive

*- 1 year ago, 5 Jan 2019, 12:13pm -*

Create own Recession Indicator using Mixture Models [Eran Raviv]

Broadly speaking, we can classify financial market conditions into two categories: Bull and Bear. The first is a “todo bien” market, tranquil and generally upward sloping. The second describes a market with a downward trend, usually more volatile. It is thought that those bull\bear terms

*- 1 year ago, 27 Nov 2018, 09:31pm -*

Price Movement Prediction [Eran Raviv]

Just finished reading the paper Stock Market’s Price Movement Prediction With LSTM Neural Networks. The abstract attractively reads: “The results that were obtained are promising, getting up to an average of 55.9% of accuracy when predicting if the price of a particular stock is going to go up

*- 1 year ago, 15 Oct 2018, 02:09am -*

Test of Equality Between Two Densities [Eran Raviv]

Are returns this year actually different than what can be expected from a typical year? Is the variance actually different than what can be expected from a typical year? Those are fairly light, easy to answer questions. We can use tests for equality of means or equality of variances. But how about

*- 1 year ago, 9 Oct 2018, 04:56pm -*

Visualizing Time Series Data [Eran Raviv]

This post has two goals. I hope to make you think about your graphics, and think about the future of data-visualization. An example is given using some simulated time series data. A very quick read. In visualization, like in programming, presenting or any other skill, there is much to learn. Also

*- 2 years ago, 17 Sep 2018, 10:52am -*

Market intraday momentum [Eran Raviv]

I recently spotted the following intriguing paper: Market intraday momentum. From the abstract of that paper: Based on high frequency S&P 500 exchange-traded fund (ETF) data from 1993–2013, we show an intraday momentum pattern: the first half-hour return on the market as measured from the

*- 2 years ago, 29 Jul 2018, 10:15am -*

R in Finance [Eran Raviv]

The yearly R in Finance conference is one of my favorites: 1. Titans of the R community are there every year. This year the founder of Rstudio (but much more really), JJ Allaire was a keynote speaker. He gave a talk about Machine Learning with TensorFlow and R. 2. Single track. I like everything,

*- 2 years ago, 27 Jun 2018, 10:47pm -*

Curse of dimensionality part 3: Higher-Order Comoments [Eran Raviv]

Higher moments such as Skewness and Kurtosis are not as explored as they should be. These moments are crucial for managing portfolio risk. At least as important as volatility, if not more. Skewness relates to asymmetry risk and Kurtosis relates to tail risk. Despite their great importance, those

*- 2 years ago, 20 Jun 2018, 12:59pm -*

Portfolio Construction with R [Eran Raviv]

Constructing a portfolio means allocating your money among a few chosen assets. The simplest thing you can do is evenly split your money among those assets. Simple as it is, good research shows it is just fine, and even better than other more sophisticated methods (for example Optimal Versus

*- 2 years ago, 10 Apr 2018, 10:26pm -*

Machine learning is simply statistics [Eran Raviv]

Note: I usually write more technical posts, this is an opinion piece. And you know what they say: opinions are like feet, everybody’s got a couple. Machine learning is simply statistics A lot of buzz words nowadays. Data Science, business intelligence, machine learning, deep learning, statistical

*- 2 years ago, 7 Mar 2018, 11:09am -*

Bitcoin exponential growth [Eran Raviv]

Is bitcoin a bubble? I don’t know. What defines a bubble? The price should drastically overestimate the underlying fundamentals. I simply don’t know enough about blockchain to have an opinion there. A related characteristic is a run-away price. Going up fast just because it is going up fast. How

*- 2 years ago, 29 Jan 2018, 11:09am -*

Most popular posts – 2017 [Eran Raviv]

Writing this, I can’t believe how quickly the year 2017 has gone by. Also weird, we are already three weeks into 2018, unreal. Time flies when you’re having fun I guess. The analytics report shows that the three most popular posts for 2017 are: – Understanding False Discovery Rate (4 minutes

*- 2 years ago, 19 Jan 2018, 10:07am -*

R vs MATLAB - round 4 [Eran Raviv]

This is another comparison between R and MATLAB (Python also in the mix this time). In previous rounds we discussed the differences in 3d visualization, differences in syntax and input-output differences. Today is about computational speed. Spoiler alert: MATLAB wins by a knockout. A genuinely fair

*- 3 years ago, 6 Sep 2017, 11:35am -*

Visualizing Tail Risk [Eran Raviv]

Tail risk conventionally refers to the risk of a large and sharp draw down of the portfolio. How large is subjective and depends on how you define what is a tail. A lot of research is directed towards having a good estimate of the tail risk. Some fairly new research also now indicates that investors

*- 3 years ago, 8 Aug 2017, 05:50am -*

Lasso, Lasso, Lasso (and friends) [Eran Raviv]

LASSO stands for Least Absolute Shrinkage and Selection Operator. It was first introduced 21 years ago by Robert Tibshirani (Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B). In 2004 the four statistical masters: Efron, Hastie, Johnstone and

*- 3 years ago, 5 Jul 2017, 11:47pm -*

Density Estimation Using Regression [Eran Raviv]

Density estimation using regression? Yes we can! I like regression. It is one of those simple yet powerful statistical methods. You always know exactly what you are doing. This post is about density estimation, and how to get an estimate of the density using (Poisson) regression. The “go-to”

*- 3 years ago, 26 Jun 2017, 03:17am -*

Computer Age Statistical Inference [Eran Raviv]

If you consider yourself Econometrician\Statistician or one of those numerous buzz word synonyms that are floating around these days, Computer Age Statistical Inference: Algorithms, Evidence and Data Science by Bradley Efron and Trevor Hastie is a book you can’t miss, and now nor should you. You

*- 3 years ago, 5 Jun 2017, 09:09am -*

Random Books [Eran Raviv]

It seems like a very long while since my bachelor. Checking my bookshelf the other day I was thinking to flag some of those books which helped or inspired me along the way. Here they are in no particular order. Risk: Elements of Financial Risk Management Clear and to the point, 5 stars. Value at

*- 3 years ago, 4 Jun 2017, 02:31am -*

Shrinkage in statistics [Eran Raviv]

Shrinkage in statistics has increased in popularity over the decades. Now statistical shrinkage is commonplace, explicitly or implicitly. But when is it that we need to make use of shrinkage? At least partly it depends on signal-to-noise ratio. Introduction The term shrinkage, I think, is the most

*- 3 years ago, 11 May 2017, 02:50am -*

Machine Trading from @ChanEP - Book Review [Eran Raviv]

In trading and in trading-related research one could be quickly overwhelmed with the sea of ink devoted to trading strategies and the like. It is essential that you “pick your battles” so to speak. I recently finished reading Machine Trading, by Ernest Chan. Here is what I think about the book.

*- 3 years ago, 3 May 2017, 04:45am -*

Understanding False Discovery Rate [Eran Raviv]

False Discovery Rate is an unintuitive name for a very intuitive statistical concept. The math involved is as elegant as possible. Still, it is not an easy concept to actually understand. Hence I thought it would be a good idea to write this short tutorial. We reviewed this important topic in the

*- 3 years ago, 4 Apr 2017, 07:14am -*

Understanding K-Means Clustering [Eran Raviv]

Google “K-means clustering”, and you usually find ugly explanations and math-heavy sensational formulas*. It is my opinion that you can only understand those explanations if you don’t need them; meaning you are already familiar with the topic. Therefore, this is a more gentle introduction

*- 3 years ago, 12 Mar 2017, 07:00pm -*

Outliers and Loss Functions [Eran Raviv]

A few words about outliers In statistics, outliers are as thorny a topic as it gets. Is it legitimate to treat the observations seen during the global financial crisis as outliers? Or are those simply a feature of the system, and as such an integral part of a very fat-tailed distribution? I recently read a

*- 3 years ago, 19 Feb 2017, 10:11pm -*

Density Confidence Interval [Eran Raviv]

Density estimation belongs with the literature of non-parametric statistics. Using simple bootstrapping techniques we can obtain confidence intervals (CI) for the whole density curve. Here is a quick and easy way to obtain CI’s for different risk measures (VaR, expected shortfall) and using what

*- 3 years ago, 26 Jan 2017, 11:18am -*

Most popular posts - 2016 [Eran Raviv]

Another year. Looking at my Google Analytics reports I can’t help but wonder how it is that I am so bad at predicting which posts would catch audience attention. Anyhow, top three for 2016 are: – On the 60/40 portfolio mix – The case for Regime-Switching GARCH – Most popular machine learning R packages

*- 3 years ago, 28 Dec 2016, 08:51am -*

Optimism of the Training Error Rate [Eran Raviv]

We all use models. We are all continuously working to improve and validate our models. Constant effort is made trying to estimate how good our model actually is. A general term for this estimate is error rate. Low error rate is better than high error rate, it means our model is more accurate. By far

*- 3 years ago, 5 Dec 2016, 09:38am -*

Central Moments [Eran Raviv]

Sometimes I read academic literature, and often times those papers contain some proofs. I usually gloss over some innocent-looking assumptions on moments’ existence, invariably popping before derivations of theorems or lemmas. Here is one among countless examples, actually taken from Making and

*- 3 years ago, 14 Nov 2016, 10:07am -*

Extreme Value Theory [Eran Raviv]

Extreme Value Theory (EVT) is busy with understanding the behavior of the distribution in the extremes. The extremes determine the average, not the reverse. If you understand the extremes, the average follows. But getting the extremes right is extremely difficult. By construction, you have very few

*- 4 years ago, 20 Sep 2016, 04:09am -*

Multivariate Volatility Forecast Evaluation [Eran Raviv]

The evaluation of volatility models is gracefully complicated by the fact that, unlike other time series, even the realization is not observable. Two researchers would never disagree about what was yesterday’s stock price, but they can easily disagree about what was yesterday’s stock volatility.

*- 4 years ago, 1 Sep 2016, 03:31am -*

Why bad trading strategies may perform well [Eran Raviv]

You probably know that even a trading strategy which is actually no different from a random walk (RW henceforth) can perform very well. Perhaps you chalk it up to short-run volatility. But in fact there is a deeper reason for this to happen, in force. If you insist on using and continuously testing

*- 4 years ago, 12 Aug 2016, 02:59am -*

Human significance, economic significance and statistical significance [Eran Raviv]

We are now collecting a lot of data. This is a good thing in general. But data collection and data storage capabilities have evolved fast. Much faster than statistical methods to go along with those voluminous numbers. We are still using good ole fashioned Fisherian statistics. Back then, when you

*- 4 years ago, 3 Jul 2016, 12:12pm -*

Forecast combinations in R [Eran Raviv]

A few weeks back I gave a talk at the R/Finance 2016 conference, about forecast combinations in R. Here are the slides.

*- 4 years ago, 13 Jun 2016, 10:23am -*

Most popular machine learning R packages [Eran Raviv]

The good thing about using open-source software is the community around it. There are very many R packages online, and recently CRAN package download logs were released. This means we can have a look at the number of downloads for each package, so to get a good feel for their relative popularity. I

*- 4 years ago, 16 May 2016, 11:21am -*

Forecast averaging example [Eran Raviv]

Especially in economics/econometrics, modellers do not believe their models reflect reality as it is. No, the yield curve does NOT follow a three factor Nelson-Siegel model, the relation between a stock and its underlying factors is NOT linear, and volatility does NOT follow a Garch(1,1) process,

*- 4 years ago, 3 May 2016, 02:34am -*

Measurement error bias [Eran Raviv]

What is measurement error bias? An errors-in-variables, or measurement error, situation arises when your right-hand-side variable(s), your x in a y_t = \alpha + \beta x_t + \varepsilon_t model, is measured with error. If x represents the price of a liquid stock, then it is accurately measured because

*- 4 years ago, 25 Apr 2016, 04:42am -*