[Python] Get millions of data points with the Oanda FX API (78 currency pairs/15 years)

This article is for those who want to build a large time-series dataset. We retrieve 15 years of FX data for 78 currency pairs. Machine learning requires a large amount of data, so if you need hundreds of thousands to millions of data points, this article should help. We use Oanda's API.

Open a demo account with Oanda

Open a demo account to use the API. Creating one is easy and takes about 5 minutes.

  1. Visit Oanda
  2. Open a demo account
  3. Issue an API key

Click the item underlined in red on the screen and follow the page that opens; the steps should be clear from there. Make a note of the API key you obtain here, because the code uses it.
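If you would rather not paste the key directly into the script, one option is to read it from an environment variable. The following is only a sketch; the variable name OANDA_API_KEY is an assumption made for this example, not something Oanda or oandapy defines.

```python
import os

# OANDA_API_KEY is a hypothetical variable name chosen for this sketch
api_key = os.environ.get("OANDA_API_KEY", "")

if not api_key:
    # Fall back to pasting the key by hand, as in the article's code
    print("OANDA_API_KEY is not set; enter the key issued for your demo account")
```

The resulting api_key can then be passed to oandapy.API in the same way as the hard-coded string.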

Whole Code

# coding:utf-8
# Install oandapy (the leading "!" runs pip from a Jupyter/Colab cell)
!pip install git+https://github.com/oanda/oandapy.git

# import Library
import time
import oandapy
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import pytz

# oanda API
api_key = '' # Input API Key
oanda = oandapy.API(environment = "practice", access_token = api_key)

# Function to retrieve 15 years of FX data and output a file (period and interval must be specified)
def get_histry_data(file_path, kind, duration, year_start, year_end, month_start, month_end):
    ys = year_start
    ye = year_start
    ms = month_start
    me = month_start + 1
    # If the start month is December, the end month rolls over to January of the next year
    if me == 13:
        me = 1
        ye += 1
    res = pd.DataFrame(None)
    first_stock = 1
    while ye < year_end or (ye == year_end and me <= month_end) :
        fmt = '%Y-%m-%dT%H:%M:00.000000Z'
        # Convert year and month data to be retrieved into strings that can be used by oandapy's api
        start1 = datetime(year=ys, month=ms, day=10,hour=12, minute=5, second=0).strftime(fmt)
        end1   = datetime(year=ys, month=ms, day=25,hour=12, minute=0, second=0).strftime(fmt)
        start2 = datetime(year=ys, month=ms, day=25,hour=12, minute=5, second=0).strftime(fmt)
        end2   = datetime(year=ye, month=me, day=10,hour=12, minute=0, second=0).strftime(fmt)

        # Data acquisition using oandapy
        res1 = oanda.get_history(instrument = kind,start = start1,end = end1,granularity = duration)
        res2 = oanda.get_history(instrument = kind,start = start2,end = end2,granularity = duration)

        # Print the time for which data is to be acquired
        #print(start1 + " " + end1)
        #print(start2 + " " + end2)

        # Convert the candlesticks into DataFrames
        res1 = pd.DataFrame(res1['candles'])
        res2 = pd.DataFrame(res2['candles'])

        # Parse the UTC timestamps, convert them to Japan time, and format them as strings
        # (a window that returned no candles has no 'time' column, so it is skipped)
        for df in (res1, res2):
            if df.empty:
                continue
            df['time'] = (pd.to_datetime(df['time'], utc=True)
                          .dt.tz_convert('Asia/Tokyo')
                          .dt.strftime('%Y/%m/%d %H:%M:%S'))

        # Advance to the next month
        # When a month reaches 13, roll over to January of the next year
        ms += 1
        me += 1
        if ms == 13:
            ms = 1
            ys += 1
        if me == 13:
            me = 1
            ye += 1

        # Combine the two sets of acquired data
        # (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead)
        res = pd.concat([res1, res2])

        # Export to file, writing the header only the first time
        if first_stock == 1:
            res.to_csv(file_path)
            first_stock = 0
        else:
            res.to_csv(file_path, mode='a', header=False)

#main ---------------------------------------------------------------------------------------------------------------
# Where to save files (Google Drive also works)
path = './'

# Currency pair to retrieve
kind = 'USD_JPY'

# Acquisition interval
duration = 'M5'

# Print the path of the saved file
file_path =  path + kind + '_' + duration +'.txt'
print(file_path)

# get_histry_data(file_path, kind, duration, year_start, year_end, month_start, month_end)
# (data is retrieved up to the 10th of the end month)
get_histry_data(file_path,kind,duration,2005,2020,1,1)

# Load and print saved data
data = pd.read_csv(file_path)
print(data)
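Because each batch is written with its own row index, the first column of the saved file restarts from 0 many times. For analysis it is usually more convenient to index the rows by time instead. The sketch below shows one way to do that; it reads a small in-memory stand-in rather than a real download, and the column name closeBid is a placeholder assumed for illustration (only the time column and its format come from the code above). Dropping duplicate timestamps is a precaution, not something the API is known to require.

```python
import io
import pandas as pd

# Stand-in for a file written by get_histry_data; 'closeBid' is a hypothetical column name
sample = io.StringIO(
    ",time,closeBid\n"
    "0,2005/01/10 21:05:00,104.50\n"
    "1,2005/01/10 21:10:00,104.52\n"
    "0,2005/01/10 21:10:00,104.52\n"
)

# The first column is the per-batch row index written by to_csv
data = pd.read_csv(sample, index_col=0)

# Parse the Japan-time strings produced by the download function
data["time"] = pd.to_datetime(data["time"], format="%Y/%m/%d %H:%M:%S")

# Index by time, dropping any repeated timestamps and sorting chronologically
data = data.drop_duplicates(subset="time").set_index("time").sort_index()
print(data)
```

For a real file, replace sample with file_path.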

How to use the code

The get_histry_data function lets you specify the currency pair and the time frame to retrieve.

get_histry_data(file_path, kind,duration,year_start,year_end,month_start,month_end)

  • file_path: path of the output file
  • kind: currency pair (for example, 'USD_JPY')
  • duration: acquisition interval (for example, 'M5')
  • year_start: start year
  • year_end: end year
  • month_start: start month
  • month_end: end month
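To see which periods a given set of year and month parameters covers, the loop inside get_histry_data can be reproduced without calling the API. The helper below is a sketch written for this purpose; month_windows is a hypothetical name, and it only mirrors the date arithmetic of the function above.

```python
from datetime import datetime

def next_month(year, month):
    """Return the (year, month) that follows the given one."""
    return (year, month + 1) if month < 12 else (year + 1, 1)

def month_windows(year_start, year_end, month_start, month_end):
    """Yield (start1, end1, start2, end2) for each month, mirroring the download loop."""
    ys, ms = year_start, month_start
    ye, me = next_month(ys, ms)
    while ye < year_end or (ye == year_end and me <= month_end):
        yield (
            datetime(ys, ms, 10, 12, 5),  # first batch: from the 10th, 12:05
            datetime(ys, ms, 25, 12, 0),  # ... to the 25th, 12:00
            datetime(ys, ms, 25, 12, 5),  # second batch: from the 25th, 12:05
            datetime(ye, me, 10, 12, 0),  # ... to the 10th of the next month, 12:00
        )
        ys, ms = ye, me
        ye, me = next_month(ys, ms)

windows = list(month_windows(2005, 2020, 1, 1))
print(len(windows), "months")
print(windows[0][0], "->", windows[-1][3])
```

With the parameters used in this article (2005, 2020, 1, 1), the covered period runs from 10 January 2005 to 10 January 2020.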

By changing these parameters and calling the get_histry_data function, you can retrieve 15 years of data. The prices are actually fetched by oanda.get_history, which returns at most 5,000 candlesticks per request, so this program retrieves each month of data in two batches. As a result, the shortest interval it can handle is 5 minutes; one-minute data cannot be acquired. The code also keeps two commented-out print statements just before the conversion to a DataFrame. Uncomment them to print the period being acquired on each get_history call.
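The 5-minute lower bound follows from simple arithmetic, which the sketch below works through. It computes an upper bound on the number of candlesticks in the longest half-month batch, ignoring weekends and market closures (so real counts are lower). The function name max_candles and the table of seconds are assumptions made for this example; the 5,000 limit is the one stated above.

```python
from datetime import datetime

# Seconds per candlestick; M1 is included only for comparison
SECONDS = {"M1": 60, "M5": 300, "M10": 600, "M30": 1800, "H1": 3600, "H2": 7200}

MAX_CANDLES = 5000  # per-request limit described in the article

def max_candles(start, end, granularity):
    """Upper bound on candlesticks between start and end (no weekends or closures removed)."""
    return int((end - start).total_seconds() // SECONDS[granularity]) + 1

# Longest batch: from the 25th of a 31-day month to the 10th of the next month
start = datetime(2005, 1, 25, 12, 5)
end = datetime(2005, 2, 10, 12, 0)

for g in ("M5", "M1"):
    n = max_candles(start, end, g)
    print(g, n, "fits" if n <= MAX_CANDLES else "exceeds the limit")
```

A 16-day batch at 5-minute intervals stays under 5,000 candlesticks, while the same batch at 1-minute intervals would need more than four times the limit.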

Various data acquisition

Oanda publishes the list of currency pairs that can be obtained with its API. The available currency pairs and acquisition intervals are listed below.

List of Currency Types

The currency pairs are listed below; please check the Oanda website for details.

'USD_JPY','EUR_JPY','AUD_JPY','GBP_JPY','NZD_JPY','CAD_JPY','CHF_JPY','ZAR_JPY',
'EUR_USD','GBP_USD','NZD_USD','AUD_USD','USD_CHF','EUR_CHF','GBP_CHF','EUR_CHF',
'EUR_GBP','AUD_NZD','AUD_CAD','AUD_CHF','CAD_CHF','EUR_AUD','EUR_CAD','EUR_DKK',
'EUR_NOK','EUR_NZD','EUR_SEK','GBP_AUD','GBP_CAD','GBP_NZD','NZD_CAD','NZD_CHF',
'USD_CAD','USD_DKK','USD_NOK','USD_SEK','AUD_HKD','AUD_SGD','CAD_HKD','CAD_SGD',
'CHF_HKD','CHF_ZAR','EUR_CZK','EUR_CZK','EUR_HKD','EUR_HUF','EUR_HUF','EUR_PLN',
'EUR_SGD','EUR_TRY','EUR_ZAR','GBP_HKD','GBP_PLN','GBP_SGD','GBP_ZAR','HKD_JPY',
'NZD_HKD','NZD_SGD','SGD_CHF','SGD_HKD','SGD_JPY','TRY_JPY','USD_CNH','USD_CZK',
'USD_HKD','USD_HUF','USD_INR','USD_MXN','USD_PLN','USD_SAR','USD_SGD','USD_THB',
'USD_TRY','USD_ZAR'

List of acquisition intervals

The available acquisition intervals are as follows.

  • M: Monthly information
  • W: Weekly information
  • D: Daily information
  • H2: 2-hour information
  • H1: Hourly information
  • M30: 30 minute information
  • M10: 10 minute information
  • M5: 5-minute information
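To get a feel for how much data each interval produces, the number of rows per currency pair can be estimated roughly. The sketch below assumes a market that trades 24 hours a day, 5 days a week, 52 weeks a year; that is a simplification made for this estimate, and real files will contain fewer rows because of holidays and periods without quotes. The name estimate_rows is hypothetical.

```python
# Seconds per candlestick for the intraday intervals listed above
SECONDS_PER_CANDLE = {"H2": 7200, "H1": 3600, "M30": 1800, "M10": 600, "M5": 300}

def estimate_rows(granularity, years=15, trading_days_per_week=5):
    """Rough row count for one currency pair, assuming 24-hour trading on each trading day."""
    trading_seconds = years * 52 * trading_days_per_week * 24 * 3600
    return trading_seconds // SECONDS_PER_CANDLE[granularity]

for g in SECONDS_PER_CANDLE:
    print(g, estimate_rows(g))
```

Under these assumptions, 15 years of 5-minute data comes to a little over one million rows per currency pair.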

Get all data for 78 currency pairs in 8 timeframes

It may be a little greedy, but the following code retrieves all of the data for 15 years. It takes a long time to run, and it probably puts a considerable load on Oanda's servers. If you run it on Google Colab, save the data to Google Drive so that you only have to run it once.

# path = '/content/drive/My Drive/'
# List of Available Currencies
kind = ['USD_JPY','EUR_JPY','AUD_JPY','GBP_JPY','NZD_JPY','CAD_JPY','CHF_JPY','ZAR_JPY',
        'EUR_USD','GBP_USD','NZD_USD','AUD_USD','USD_CHF','EUR_CHF','GBP_CHF','EUR_CHF',
        'EUR_GBP','AUD_NZD','AUD_CAD','AUD_CHF','CAD_CHF','EUR_AUD','EUR_CAD','EUR_DKK',
        'EUR_NOK','EUR_NZD','EUR_SEK','GBP_AUD','GBP_CAD','GBP_NZD','NZD_CAD','NZD_CHF',
        'USD_CAD','USD_DKK','USD_NOK','USD_SEK','AUD_HKD','AUD_SGD','CAD_HKD','CAD_SGD',
        'CHF_HKD','CHF_ZAR','EUR_CZK','EUR_CZK','EUR_HKD','EUR_HUF','EUR_HUF','EUR_PLN',
        'EUR_SGD','EUR_TRY','EUR_ZAR','GBP_HKD','GBP_PLN','GBP_SGD','GBP_ZAR','HKD_JPY',
        'NZD_HKD','NZD_SGD','SGD_CHF','SGD_HKD','SGD_JPY','TRY_JPY','USD_CNH','USD_CZK',
        'USD_HKD','USD_HUF','USD_INR','USD_MXN','USD_PLN','USD_SAR','USD_SGD','USD_THB',
        'USD_TRY','USD_ZAR']

# List of acquisition intervals
duration = ['M','W','D','H2','H1','M30','M10','M5']

# Repeat for each currency pair
for k in kind:
    # Repeat for each acquisition interval
    for d in duration:

        # Build and print the path of the file to save
        file_path = path + k + '_' + d + '.txt'
        print(file_path)

        # get_histry_data(file_path, kind, duration, year_start, year_end, month_start, month_end)
        # (data is retrieved up to the 10th of the end month)
        get_histry_data(file_path, k, d, 2005, 2020, 1, 1)

        # Load and print saved data
        data = pd.read_csv(file_path)
        print(data)
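Since the full run is long and the concern about server load is real, it may help to skip work that is unnecessary. The sketch below is one possible wrapper, not part of the original code: it drops repeated entries from the currency list (the list above contains a few, such as 'EUR_CHF'), skips files that already exist from an earlier run, and pauses between downloads. The name download_all, the download callback, and the one-second pause are all assumptions made for this example.

```python
import os
import time

def download_all(kinds, durations, download, path="./", pause_seconds=1.0):
    """Call download(file_path, kind, duration) once per unique pair and interval."""
    done = []
    for k in dict.fromkeys(kinds):            # drops repeated entries, keeps the order
        for d in durations:
            file_path = os.path.join(path, k + "_" + d + ".txt")
            if os.path.exists(file_path):     # already saved by an earlier run
                continue
            download(file_path, k, d)
            done.append(file_path)
            time.sleep(pause_seconds)         # pause between downloads to ease the load
    return done

# Dry run with a stand-in callback that only reports what would be downloaded
planned = download_all(
    ["EUR_CHF", "EUR_CHF", "USD_JPY"], ["M5", "H1"],
    download=lambda file_path, k, d: print("would download", file_path),
    pause_seconds=0,
)
print(len(planned), "files")
```

To use it with the function from this article, pass a callback such as lambda fp, k, d: get_histry_data(fp, k, d, 2005, 2020, 1, 1) together with the kind and duration lists.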

Acquisition of virtual currency data

Acquiring virtual currency data is covered in a separate article. Please read it if you are interested.

Conclusion

Now you can acquire millions of FX rates and build a large dataset. This data can be used for backtesting and machine learning.

If you are interested, please register with OANDA as well.