Part 1: Build your own crypto daily up-to-date database with R

Share:
Indexes
  1. How to export >100x faster .csv files from R for big data
  2. Develop your own financial dashboard with Flask and Plotly
  3. Are subreddits a good predictor for stock returns?
  4. Build your own mobile friendly website with Flask.
  5. How to increase productivity and health in daily coding life.

TLDR; In this tutorial series, I show you how you can build your own crypto database in json format, which you can update daily. In later tutorials, I will change it to a sqlite format, show you what you can analyze with these type of data and build a shiny app.

Crypto Closes in USD, image by author

API

We will use the API from kraken.com. It is a public available API, so you do not need any authentication methods (link to documentation). You can simply replace the pair variable with your desired trading pair. Below it is Bitcoin (XBT) and USD $ as fiat currency. The pair is required, but you can also define interval in minutes and the start date (since). Every request is limited to 720 rows.

curl "https://api.kraken.com/0/public/OHLC?pair=XBTUSD"
API documentation, Screenshot from kraken.com

Fetch the data

#import the libraries we need
library(jsonlite)
library(glue)
library(tidyverse)
## your interval
format = "1D"
year_ago <- today() - 720
## format as unix
unix <- as.numeric(as.POSIXct(year_ago))
## API Adress
url <- glue("https://api.kraken.com/0/public/OHLC?pair={pair}&interval=1440&since=unix")
## rename your column names
columnNames <- c('unix', 'open', 'high', 'low', 'close', 'vwap',
'volume', 'tradecount', 'misc')
## retrieve the data
mydata <- fromJSON(url)
## format as tibble
df <-
as.data.frame(mydata['result']) %>%
as_tibble()
##rename columns
colnames(df) <- columnNames # rename the columns
# remove the misc column
df <-
df %>%
select(-misc)
## Show df
df output, image by author

We successfully received data from kraken as you can see in the output above. But, we have to make some format changes since the numeric columns are still character and we want to have a readable datetime column.

Format your data

# change columns to floats

df <-
df %>%
mutate(across(c('open', 'high', 'low', 'close', 'vwap','volume',
'tradecount', 'unix'), as.numeric))
# multiply close price * volume to get USD
df <-
df %>%
mutate(USD_Volume = volume * close)
#get readable datetime column
df <-
df %>%
mutate(datetime = as.POSIXct(as.numeric(unix), tz="UTC",
origin = "1970-01-01"))
## Show df
df

There you go! You received two years of Bitcoin data. Honestly, I just wanted a yea, but I think, because it less than 720 rows, they just give it you. Now we want to store our data in json format, this is a simple approach and really less in memory space.

Save your data

fileName <- glue("Data/{pair}-{format}.json")if (file.exists(filename)){
orig_date <- fromJSON(filename, flatten=TRUE) %>%
mutate(datetime = as.POSIXct(as.numeric(unix), tz="UTC", origin = "1970-01-01")) %>%
mutate(across(c('open', 'high', 'low', 'close', 'vwap', 'volume',
'tradecount', 'unix'), as.numeric)) %>%
as_tibble()

df <-
orig_date %>%
bind_rows(df) %>%
distinct(datetime, .keep_all = TRUE)
}

write_json(df, path = filename)

Write a function and download more pairs

Now, we wrap everything into a function and perform the whole for a list of pairs, we are interested in. Below you find the code for that.

There you go! Below in your data folder, you should see the data files.

Thanks for reading my article. I hope you liked it. Please feel free to like, share, and comment on it. Follow me on more content about cryptos, stocks, data analysis and web development.

Read my other article about stocks and reddit:

Here you find the code:

If you want to learn more about Data Science in R, I recommend this book*.