Asynchronous Financial Data processing using F#

Hi everyone,

I was looking through a book for an algorithm to detect clusters in financial data. I found a decent one, and then wanted to test it on some real data.

Hence, I had to get the data from somewhere, and I chose Yahoo! Finance. I wanted to do this in an automated way so that I could reuse the module in later projects.

I remembered Luca Bolognese’s presentation of F# a few years back (certainly one of the best presentations I’ve ever seen) and I thought it was a good time to give it a try.

The idea is quite simple: you can import data from Yahoo! Finance by parsing the CSV files the website produces. These files are actually generated automatically, which means that you can access them by querying the right URL with the right parameters.

The following module shows how to generate the right query to get the CSV file from Yahoo! Finance:

module QueryModule
open System
// Yahoo! Finance expects a zero-based month (0-11), hence Month - 1.
let dateToList (d:DateTime) = [d.Month - 1; d.Day; d.Year]

let parameters =['a';'b';'c';'d';'e';'f']

let parametersToString (d1:DateTime) (d2:DateTime)=
    let datesList = (dateToList d1) @ (dateToList d2)
    let rec innerFunc (dList: int list) pars =
        match pars with
        | [] -> ""
        | x::[] -> x.ToString() + "=" + dList.Head.ToString()
        | x::y -> x.ToString() + "=" + dList.Head.ToString() + "&" + (innerFunc dList.Tail y)

    innerFunc datesList parameters

let getFileUrl (ticker:string) start stop =
    "http://ichart.finance.yahoo.com/table.csv?s="+ticker+"&"+(parametersToString start stop)+"&ignore=.csv"
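
To make the query format concrete, here is a minimal, self-contained sketch of the same idea. The `buildQuery` helper is hypothetical (it is not part of the module above), and it assumes that Yahoo! Finance wants zero-based months, as a reader points out in the comments.

```fsharp
open System

// Hypothetical helper, not part of the original module.
// Assumption: months are zero-based (0-11) in the query string.
let buildQuery (ticker: string) (start: DateTime) (stop: DateTime) =
    let values =
        [start.Month - 1; start.Day; start.Year
         stop.Month - 1; stop.Day; stop.Year]
    // Pair each key (a..f) with its value and join the pairs with "&".
    let pairs =
        List.zip ["a"; "b"; "c"; "d"; "e"; "f"] values
        |> List.map (fun (key, value) -> sprintf "%s=%d" key value)
    sprintf "http://ichart.finance.yahoo.com/table.csv?s=%s&%s&ignore=.csv"
        ticker (String.concat "&" pairs)

let url = buildQuery "MSFT" (DateTime(2000, 1, 1)) (DateTime(2010, 1, 1))
// url = "http://ichart.finance.yahoo.com/table.csv?s=MSFT&a=0&b=1&c=2000&d=0&e=1&f=2010&ignore=.csv"
```

Here a, b and c carry the start month, day and year, and d, e and f carry the end month, day and year.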

The next step is to write a module that downloads this data. Doing so synchronously would be pretty straightforward, and I wouldn’t waste your time with more code just for that.

What I’m really after here is downloading these files asynchronously. Doing this in C# is possible, but it takes considerably more code and is easy to get wrong. F# (and functional programming languages in general) makes it much easier.

Let’s see how to implement an asynchronous function in F# then:

module DownloadModule
open QueryModule
open Microsoft.FSharp.Control.WebExtensions
open System
open System.Net
open System.IO

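// Skips the header row, keeps only well-formed 7-column rows,
// and returns (date, adjusted close) pairs.
let parseData (rawData:string) =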
    rawData.Split('\n')
    |> Array.toList
    |> List.tail
    |> List.map (fun x -> x.Split(','))
    |> List.filter (fun x -> x.Length = 7)
    |> List.map (fun x -> (Convert.ToDateTime(x.[0]),float x.[6]))

let getCSV ticker dStart dEnd =
    async {
        let query = getFileUrl ticker dStart dEnd
        let req = WebRequest.Create(query)
        use! resp = req.AsyncGetResponse()
        use stream = resp.GetResponseStream()
        use reader = new StreamReader(stream)
        let content = reader.ReadToEnd()
        let ts = parseData content
        return ts
    }

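The whole trick in this code lies in the “async” block and the “use!” keyword. Basically, “use!” tells F# to wait for the response without blocking the thread: the workflow is suspended until the result arrives, and the thread is free to do other work in the meantime. Like a plain “use”, it also disposes of the response when the block ends.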
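
If you want to see what “use!” does in isolation, here is a small sketch that needs no network access. The `Resource` type and the `acquire` function are made up for the illustration; `acquire` simply stands in for `AsyncGetResponse`.

```fsharp
open System

// Hypothetical resource that records its own disposal in a shared log.
type Resource(name: string, events: ResizeArray<string>) =
    member this.Name = name
    interface IDisposable with
        member this.Dispose() = events.Add("disposed " + name)

// Stands in for AsyncGetResponse: yields a disposable value after a short delay.
let acquire (name: string) (events: ResizeArray<string>) =
    async {
        do! Async.Sleep 10
        return new Resource(name, events)
    }

let demo (events: ResizeArray<string>) =
    async {
        // use! awaits the async value, binds it, and disposes it when the block ends.
        use! resource = acquire "response" events
        events.Add("using " + resource.Name)
        return resource.Name.Length
    }

let events = ResizeArray<string>()
let length = demo events |> Async.RunSynchronously
// events now holds "using response" followed by "disposed response"
```

The disposal entry is logged after the resource is used, which is the same guarantee `getCSV` relies on for the web response.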

You can now run this function in parallel to download multiple tickers as follows:

let testPrices =
    ["MSFT";"YHOO"]
    |> List.map (fun x -> getCSV x (DateTime.Parse("01.01.2000")) (DateTime.Parse("01.01.2010")))
    |> Async.Parallel
    |> Async.RunSynchronously;;

This works fine, and it’s much faster than processing the list sequentially, because the downloads overlap instead of running one after another.
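
To get a feel for the difference without hitting the network, here is a rough sketch that replaces the download with a fixed delay. `fakeDownload` and `timeIt` are hypothetical helpers, and the timings will vary from machine to machine.

```fsharp
open System.Diagnostics

// Hypothetical stand-in for a download: waits 300 ms, then returns a dummy value.
let fakeDownload (ticker: string) =
    async {
        do! Async.Sleep 300
        return ticker.Length
    }

// Runs f and returns its result together with the elapsed milliseconds.
let timeIt f =
    let sw = Stopwatch.StartNew()
    let result = f ()
    result, sw.ElapsedMilliseconds

let tickers = ["MSFT"; "YHOO"; "GOOG"]

// One after another: the delays add up.
let sequentialResult, sequentialMs =
    timeIt (fun () -> tickers |> List.map (fakeDownload >> Async.RunSynchronously))

// All at once: the delays overlap.
let parallelResult, parallelMs =
    timeIt (fun () -> tickers |> List.map fakeDownload |> Async.Parallel |> Async.RunSynchronously)

printfn "sequential: %d ms, parallel: %d ms" sequentialMs parallelMs
```

With three tickers, the sequential version should take roughly three times the delay, while the parallel one should take about one delay.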

I pushed the example a little further by thinking: “What if I want to do some operation on the time series now?”.

Well, I could wait until the parallel execution is done and then run the following function to get the returns synchronously:

module AnalysisModule
open System

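// Simple return between consecutive prices: p(i) / p(i-1) - 1, dated at p(i).
// List.pairwise walks the list once instead of calling List.nth for every index.
let getReturns (prices:(DateTime * float) list) =
    prices
    |> List.pairwise
    |> List.map (fun ((_, previous), (date, current)) -> (date, current / previous - 1.0))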
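
As a quick sanity check of the formula, here is a tiny worked example on plain numbers. `simpleReturns` is a hypothetical helper that drops the dates and keeps only the arithmetic.

```fsharp
// Hypothetical helper: simple returns p(i) / p(i-1) - 1 on a bare list of prices.
let simpleReturns (prices: float list) =
    prices
    |> List.pairwise
    |> List.map (fun (previous, current) -> current / previous - 1.0)

let returns = simpleReturns [100.0; 110.0; 99.0]
// returns is approximately [0.10; -0.10]:
// 110 / 100 - 1 = 0.10 and 99 / 110 - 1 = -0.10
```

Note that a list of n prices yields n - 1 returns, since the first price has nothing to be compared with.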

That wouldn’t be optimal; once the download of MSFT is done, I don’t need to wait until the download of YHOO is done as well before starting to process the returns…

So, how can I do this easily, without modifying any of the functions I wrote previously?

let testReturns =
    ["MSFT";"YHOO"]
    |> List.map (fun ticker ->
        async {
            let! prices = getCSV ticker (DateTime.Parse("01.01.2000")) (DateTime.Parse("01.01.2010"))
            return getReturns prices
        })
    |> Async.Parallel
    |> Async.RunSynchronously;;

This way, each ticker’s returns are computed as soon as its own download is done, everything runs in parallel, and I get really great performance!
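
You can check the ordering for yourself with the following sketch, which again uses delays instead of real downloads. All the names here (`record`, `fakeDownload`, `fakeProcess`, `pipeline`) are hypothetical and only serve the illustration.

```fsharp
// Shared log of events; the lock keeps concurrent writes safe.
let events = ResizeArray<string>()
let record (msg: string) = lock events (fun () -> events.Add(msg))

// Hypothetical download: sleeps for the given delay, then returns dummy prices.
let fakeDownload (name: string) (delayMs: int) =
    async {
        do! Async.Sleep delayMs
        record ("downloaded " + name)
        return [1.0; 2.0; 4.0]
    }

// Hypothetical processing step standing in for getReturns.
let fakeProcess (name: string) (prices: float list) =
    record ("processed " + name)
    List.sum prices

// Download and processing are chained per item, as in testReturns.
let pipeline (name, delayMs) =
    async {
        let! prices = fakeDownload name delayMs
        return fakeProcess name prices
    }

let sums =
    [("FAST", 50); ("SLOW", 400)]
    |> List.map pipeline
    |> Async.Parallel
    |> Async.RunSynchronously

events |> Seq.iter (printfn "%s")
```

The log should show the fast item being processed before the slow one has even finished downloading, which is exactly the point of chaining the two steps inside each workflow.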

Hope you enjoyed this little demo!

See you next time!


Comments

3 responses to “Asynchronous Financial Data processing using F#”

  1. This F# code was very helpful. Thank you for posting it.

  2. Great script, very useful and fast! One small oversight, though. Yahoo! Finance’s URL parameters take the month as an integer between 0 and 11 as opposed to 1 through 12, so you need to change

    let dateToList (d:DateTime) = [d.Month;d.Day;d.Year]

    to

    let dateToList (d:DateTime) = [d.Month-1;d.Day;d.Year]

    in line 4 of your first code block.

  3. Very nice, thanks for sharing!! How long you been using f#?
