
Download historical data from Yahoo Finance

Introduction

Most people use Yahoo Finance to obtain up-to-date quotes for securities traded all around the world. However, it can also be a useful source of financial data for research purposes: historical price series are easy to obtain from Yahoo. The service was suspended for a few years, but it now seems to work again, albeit with a slightly different procedure.

Tickers

The first problem is to identify the ticker symbol of the desired securities. The ticker symbol is a unique string that identifies each security. A global selection of tickers is available as an Excel file from investexcel.

For the New York Stock Exchange (NYSE), the updated list of tickers can be automatically downloaded in CSV format from the screening interface. Replace 'nyse' with 'nasdaq' to obtain the list of securities traded on NASDAQ. The downloaded file lists:

  1. the ticker symbol
  2. the name of the company
  3. the price of the last transaction
  4. the market capitalization
  5. the IPO year
  6. the Sector, as defined by the stock exchange
  7. the Industry, a further division of the Sectors
  8. the link to the company page on the stock exchange website

A useful subset of stocks is the set of components of the S&P500 index. They are liquid and substantially capitalized securities. The list of tickers composing the index can be obtained, for instance, from datahub.

Obtain historical data

Once the ticker of the security of interest is known, the data can be downloaded using a special URL, as in the following example:

https://query1.finance.yahoo.com/v7/finance/download/[TICKER]?period1=[EPOCHSTART]&period2=[EPOCHEND]&interval=1d&events=history&includeAdjustedClose=true

[TICKER] is mandatory and is the ticker of the desired security. [EPOCHSTART] and [EPOCHEND] define the range of dates for which data are downloaded. They are expressed in "epoch" time, the Unix standard that represents a point in time as the number of seconds elapsed since 00:00 UTC on 1-1-1970. Conversion between common date formats and epoch format can be done with an online epoch converter. For instance, 1 AM UTC on the first of January 1990 is expressed as "631155600". The interval variable sets the frequency of the data. In the example it is 1 day (1d). It can be 1wk for weekly data and 1mo for monthly data. events=history selects historical prices. The last flag, includeAdjustedClose=true, includes adjusted closing prices. Prices are already adjusted for splits; the "Adjusted Close" price is further adjusted for dividends.
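The epoch values can also be computed locally rather than with an online converter. The following is a minimal sketch that assumes GNU date (the -d option has a different meaning on BSD and macOS); the helper name to_epoch is introduced here only for illustration.

```shell
# Convert a UTC timestamp to epoch seconds.
# Assumes GNU date: -u selects UTC, -d parses the given date string.
to_epoch() {
    date -u -d "$1" +%s
}

EPOCHSTART=$(to_epoch "1990-01-01 01:00:00")
EPOCHEND=$(to_epoch "2021-01-01 00:00:00")
echo "$EPOCHSTART $EPOCHEND"   # prints 631155600 1609459200
```

The two values are the ones used for [EPOCHSTART] and [EPOCHEND] in the scripting example further down.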

If you insert the URL above in any browser, the data are saved in a file named according to the ticker symbol. The file contains the following comma-separated fields:

  1. Date (yyyy-mm-dd)
  2. Open
  3. High
  4. Low
  5. Close
  6. Adj Close
  7. Volume
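Given the field order above, a single column is easy to extract with awk. The sketch below prints the date and the adjusted close of each row; the function name adj_close is hypothetical, and the filter on the literal string "null" is an assumption about how missing values appear in the file, worth checking against your own downloads.

```shell
# Print "date adjusted-close" pairs from a price file with the layout above.
# Field 1 is Date, field 6 is Adj Close; NR > 1 skips the header line.
# Rows whose adjusted close is the string "null" are skipped (assumption).
adj_close() {
    awk -F',' 'NR > 1 && $6 != "null" { print $1, $6 }' "$1"
}

# Usage, for a file saved as TICKER.csv:
#   adj_close TICKER.csv
```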

Scripting

Generally we want to download data about several securities at once. In this case scripting is necessary.

For instance, in order to download the price data at daily frequency for all NYSE traded companies in the Technology sector we can proceed as follows. From the screener interface we select NYSE and the Technology sector and download the CSV file. Then we collect all the tickers in a variable

ticks=$( awk -F',' 'NR>1{print $1}' nasdaq_screener.csv )

where nasdaq_screener.csv is the name of the previously downloaded file. Finally, we cycle over all tickers and download the price data, saving them in appropriately named files:

USERAGENT="Mozilla/5.0"
EPOCHSTART="631155600"
EPOCHEND="1609459200"
BASEURL="https://query1.finance.yahoo.com/v7/finance/download/"
for tick in $ticks; do
    # Build the download URL for the current ticker
    URL="${BASEURL}${tick}?period1=${EPOCHSTART}&period2=${EPOCHEND}&interval=1d&events=history&includeAdjustedClose=true"
    # Save the data in a file named after the ticker
    wget --user-agent="${USERAGENT}" "$URL" -O "${tick}.csv"
done

In the example, dates range from 01-01-1990 to 31-12-2020. One needs to specify a well-known user agent because, I guess, Yahoo implements some generic protection against scrapers.
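When a request fails, wget -O still creates the output file, so a failed download typically leaves an empty file behind. A minimal sketch of a check to run after the loop follows; the function name missing_tickers is introduced here for illustration, and it assumes the files are in the current directory and named as in the loop above.

```shell
# Print every ticker whose CSV file is missing or empty.
# [ -s FILE ] is true only when FILE exists and has a size greater than zero.
missing_tickers() {
    for tick in "$@"; do
        [ -s "${tick}.csv" ] || echo "$tick"
    done
}
```

Calling missing_tickers $ticks after the download loop lists the tickers that may need to be retried.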

S&P 500

Using the previous technique, I have downloaded historical prices for the stocks composing the S&P500 index from 01-01-1990 to 31-12-2020. They are collected in this archive. Files are named according to the respective tickers.

Author: Giulio Bottazzi

Created: 2021-01-15 Fri 12:11
