Category Archives: Computer Science

Functional approach to portfolio modeling

Good evening everybody, I’ve been paying attention to portfolio modelling for the past few months. When you tackle such a problem, you first try to think about how you could represent a portfolio as an object, so that you can dive into your C#/C++/Java code and start making money ASAP. However, you’ll soon find yourself cornered by numerous problems, especially when you want to backtest different allocation strategies.

The object-oriented approach

Usually, when people model a portfolio, they see it as a mapping between assets and weights associated with a date. This, however, misinterprets what a portfolio is. What the common programmer described above has in his model is in fact a snapshot of the portfolio state at some time t. If you were to make a change between two dates, all the subsequent instances of the portfolio would then be erroneous, and the programmer would have to recompute them all to get the right simulation. Let’s take an example. Say we have a portfolio going through time t=1,2,3. We assume the portfolio has two assets, Microsoft (MSFT) and Yahoo (YHOO), and that the allocation strategy is to have an equally weighted portfolio (weights={0.5,0.5}). Here’s what the implementation would look like (in C#):

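using System;
using System.Collections.Generic;
using System.Linq;

class Portfolio
{
     public DateTime date;
     public Dictionary<Asset,double> allocation = new Dictionary<Asset,double>();

     public Portfolio() {}

     // Equal-weight allocation: every asset gets 1/n.
     public void Optimize()
     {
          if (allocation.Count == 0) return;
          // Use 1.0: the integer division 1/n would truncate to 0.
          double w = 1.0 / allocation.Count;
          // Iterate over a copy of the keys: a dictionary cannot be modified while it is enumerated.
          foreach (var asset in allocation.Keys.ToList())
          {
                allocation[asset] = w;
          }
      }
}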

class History
{
    public Dictionary<DateTime,Portfolio> history = new Dictionary<DateTime,Portfolio>();

    public void Add(DateTime t, Portfolio p)
    {
         history.Add(t,p);
    }
}

Now assume you want to add another stock (STO) to the portfolio at time 2. The previous implementation needs to be extended as follows:

class History
{
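    public Dictionary<DateTime,Portfolio> history = new Dictionary<DateTime,Portfolio>();

    public void Add(DateTime t, Portfolio p)
    {
         history.Add(t,p);
         // Strictly later states only: t itself has just been added, so ">=" would always be true.
         if (history.Any(x=>x.Key>t))   // Any() requires "using System.Linq;"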
         {
              /* Compare the new portfolio composition with the subsequent states
               * Take the necessary operation to adjust the portfolio.
               * OUCH!!!! This is complicated.
              */
         }
    }
}

As you can see in the comments I added, backward changes require recomputing the portfolio at times 2 and 3. This is computationally intensive and the kind of function you really do not fancy writing. I’m not even discussing the probability that some bug will creep in, or the changes you would have to make to support more complicated backward operations. Furthermore, if you want to see how the portfolio behaved before a change, all the information about the previous simulation is lost. This is because the class actually stores the state of the portfolio, not really the portfolio itself. “Thanks for the heads-up Einstein! You got anything better to do?” Well, as a matter of fact, yes.

The functional approach

I would like to introduce a different way of representing a portfolio; a way which is especially meaningful in a functional environment (F#, Scala). First of all, I define a portfolio snapshot as a list of tuples, each pairing a weight (a float) with an asset. To me, a portfolio is a strategy more than an object. In terms of allocation, the strategy outputs portfolio snapshots with weights, and these weights can in turn generate buy/sell orders to adjust the position of a “real” portfolio (a basket of real assets). But the whole point here is that the portfolio is in fact just a set of operations. In my opinion, the correct way of representing an operation in a programming language is a function. This includes adding an asset to the portfolio, optimizing a portfolio and so on. The portfolio is hence a list of tuples of type DateTime*Operation (the date being the time at which the operation should occur). Let’s give these concepts formal definitions (in F#):

type Action<'a>='a->'a

type Change<'a> = System.DateTime * Action<'a>

type History<'a> = Change<'a> list

An action is a function taking a value of some type as input and returning a modified version of it (actually, a new instance with the modification included). A change is an action happening at a certain time. Finally, a history is a list of changes. Simple. Now, how do we apply this to portfolio modeling? First, some more definitions:

type Weight=float

type Asset = string

type AssetAlloc=Weight * Asset

type PortfolioAlloc= AssetAlloc list

type PortfolioAction=Action<PortfolioAlloc>

type PortfolioChange = Change<PortfolioAlloc>

type Portfolio = History<PortfolioAlloc>

Thanks to the F# syntax, the code is pretty much self-explanatory. Now, let’s define a simple portfolio action which consists of adding an asset to the portfolio:

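let addAsset (ass:Asset) (w:Weight) (pFolioAlloc: PortfolioAlloc) : PortfolioAlloc =
    match List.tryFind (fun p -> snd p = ass) pFolioAlloc with
    // Asset already present: drop the old entry, then prepend the new one.
    // The parentheses matter: (::) binds tighter than (|>), so without them
    // the freshly added entry would be filtered out as well.
    |Some _ -> (w,ass) :: (pFolioAlloc |> List.filter (fun pa -> snd pa <> ass))
    |None -> (w,ass) :: pFolioAlloc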

For those of you not familiar with functional programming, this might look complicated, but if you look a bit into the language (particularly Pattern Matching), you’ll see it’s actually quite trivial. Let’s continue with the optimization of the portfolio:

let equWeightPortfolio (pFAlloc:PortfolioAlloc) : PortfolioAlloc =
    let w:Weight = 1.0/(float pFAlloc.Length)
    pFAlloc |> List.map (fun alloc -> (w, snd alloc))

Trivial. Finally, we need a function that evaluates the portfolio. This requires applying all the changes in succession to an initial portfolio allocation (most of the time, an empty portfolio). This kind of operation is well known in functional programming: it simply consists of folding a list:

let getPortfolioAllocFromInit (pFolio:Portfolio) (t : System.DateTime) (init:PortfolioAlloc) =
    pFolio |> List.filter (fun pc -> fst pc <= t)
    |> List.sortBy (fun pc -> fst pc)
    |> List.map (fun pc -> snd pc)
    |> List.fold (fun alloc action -> action(alloc)) init

let getPortfolioAlloc (pFolio:Portfolio) (t : System.DateTime) = getPortfolioAllocFromInit pFolio t []

The first function implements the general logic: first keeping the operations dated up to t, then sorting them and applying them sequentially. The second function just specializes the first one by giving it an initial empty portfolio. The model is applied as follows:

let addSPAction : PortfolioAction = addAsset "S&P500" 0.0;;
let addMSAction : PortfolioAction = addAsset "MSFT" 0.0;;
// Dates are built from (year, month, day) so that the code does not depend on
// the machine's culture settings, unlike parsing "31.01.2011".
let myPortfolio:Portfolio = [(System.DateTime(2011, 1, 1),addSPAction);
                            (System.DateTime(2011, 1, 31),equWeightPortfolio);
                            (System.DateTime(2011, 2, 1),addMSAction);
                            (System.DateTime(2011, 2, 28),equWeightPortfolio);
                            ];;
let myPortfolioAlloc t = getPortfolioAlloc myPortfolio t;;

let endAlloc = myPortfolioAlloc System.DateTime.Now;;

let janAlloc = myPortfolioAlloc (System.DateTime(2011, 2, 1));;

which outputs:

val endAlloc : PortfolioAlloc = [(0.5, "MSFT"); (0.5, "S&P500")]
val janAlloc : PortfolioAlloc = [(0.0, "MSFT"); (1.0, "S&P500")]
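
For readers less familiar with F#, here is a minimal Python sketch of the same evaluation. It is only an illustration of the idea, not part of the model above: the names Alloc, Action, Portfolio, add_asset, equal_weight and allocation_at are hypothetical counterparts of the F# definitions, and functools.reduce plays the role of List.fold.

```python
from datetime import date
from functools import reduce
from typing import Callable, List, Optional, Tuple

# Hypothetical Python counterparts of the F# types (names are assumptions).
Alloc = List[Tuple[float, str]]          # (weight, asset) pairs
Action = Callable[[Alloc], Alloc]        # a pure function on allocations
Portfolio = List[Tuple[date, Action]]    # dated actions, in any order


def add_asset(asset: str, weight: float) -> Action:
    """Return an action that adds (or replaces) an asset with the given weight."""
    def action(alloc: Alloc) -> Alloc:
        return [(weight, asset)] + [(w, a) for (w, a) in alloc if a != asset]
    return action


def equal_weight(alloc: Alloc) -> Alloc:
    """Give every asset the same weight; an empty allocation stays empty."""
    if not alloc:
        return []
    w = 1.0 / len(alloc)
    return [(w, a) for (_, a) in alloc]


def allocation_at(portfolio: Portfolio, t: date,
                  init: Optional[Alloc] = None) -> Alloc:
    """Fold every action dated on or before t, in date order, over init."""
    due = sorted((c for c in portfolio if c[0] <= t), key=lambda c: c[0])
    return reduce(lambda alloc, change: change[1](alloc), due, list(init or []))


portfolio: Portfolio = [
    (date(2011, 1, 1), add_asset("S&P500", 0.0)),
    (date(2011, 1, 31), equal_weight),
    (date(2011, 2, 1), add_asset("MSFT", 0.0)),
    (date(2011, 2, 28), equal_weight),
]

print(allocation_at(portfolio, date(2011, 2, 1)))    # snapshot on 1 February
print(allocation_at(portfolio, date(2011, 12, 31)))  # snapshot after all changes
```

The list of dated actions is never mutated; each call recomputes a snapshot from scratch, which is exactly what makes backdated changes cheap to express.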

We hence see how trivial it is to compute the state of the portfolio at different times. With this representation, altering a portfolio at time t=2 just means adding a function to a list; it does not change the subsequent operations. The state of the portfolio (the snapshot) is computed on demand. Let’s try doing so:

let addYHOOAction:PortfolioAction = addAsset "YHOO" 0.0
// Built from (year, month, day) to stay independent of the machine's culture settings.
let myPortfolio2:Portfolio= (System.DateTime(2011, 1, 30),addYHOOAction)::myPortfolio
let myPortfolioAlloc2 t = getPortfolioAlloc myPortfolio2 t
let endAlloc2 = myPortfolioAlloc2 System.DateTime.Now
let janAlloc2 = myPortfolioAlloc2 (System.DateTime(2011, 2, 1));;

which outputs:

val endAlloc2 : PortfolioAlloc =
  [(0.3333333333, "MSFT"); (0.3333333333, "YHOO"); (0.3333333333, "S&P500")]
val janAlloc2 : PortfolioAlloc =
  [(0.0, "MSFT"); (0.5, "YHOO"); (0.5, "S&P500")]

As you can see, no rocket science to add backward operations! Note that we could have done so for any portfolio action… The model could be improved of course, but the idea is there. Keeping track of the old simulation would just mean storing the creation date of each change alongside it. I hope you enjoyed the ride, please feel free to comment! See you next time!
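
P.S. On keeping track of the old simulation: one possible way to do it, sketched in Python under my own assumptions, is to give each change a second date recording when it was entered. The VersionedChange type, its created field and the known_at parameter are hypothetical names introduced for this illustration; they are not part of the F# model above.

```python
from datetime import date
from functools import reduce
from typing import Callable, List, NamedTuple, Tuple

Alloc = List[Tuple[float, str]]      # (weight, asset) pairs
Action = Callable[[Alloc], Alloc]


class VersionedChange(NamedTuple):
    effective: date   # when the action applies in the simulation
    created: date     # when the change was recorded (assumed extra field)
    action: Action


def add_asset(asset: str, weight: float) -> Action:
    """Return an action that adds (or replaces) an asset with the given weight."""
    return lambda alloc: [(weight, asset)] + [(w, a) for (w, a) in alloc if a != asset]


def equal_weight(alloc: Alloc) -> Alloc:
    """Give every asset the same weight; an empty allocation stays empty."""
    return [(1.0 / len(alloc), a) for (_, a) in alloc] if alloc else []


def allocation_at(changes: List[VersionedChange], t: date, known_at: date) -> Alloc:
    """Snapshot at time t, using only the changes recorded on or before known_at."""
    due = sorted(
        (c for c in changes if c.effective <= t and c.created <= known_at),
        key=lambda c: c.effective,
    )
    return reduce(lambda alloc, c: c.action(alloc), due, [])


first_run = date(2011, 3, 1)     # the original simulation was entered on this day
second_run = date(2011, 3, 15)   # the backdated YHOO change was entered later

changes = [
    VersionedChange(date(2011, 1, 1), first_run, add_asset("S&P500", 0.0)),
    VersionedChange(date(2011, 1, 31), first_run, equal_weight),
    VersionedChange(date(2011, 2, 1), first_run, add_asset("MSFT", 0.0)),
    VersionedChange(date(2011, 2, 28), first_run, equal_weight),
    VersionedChange(date(2011, 1, 30), second_run, add_asset("YHOO", 0.0)),
]

# The same snapshot date, seen before and after the backdated change was entered.
print(allocation_at(changes, date(2011, 2, 1), known_at=first_run))
print(allocation_at(changes, date(2011, 2, 1), known_at=second_run))
```

Both simulations live in the same list, so nothing is lost when a backdated change is added; the known_at filter simply selects which version of history to replay.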

Heads up on PHP5 byValue/byReference function parameters

Hi everybody!

It’s been quite a long time since I wrote on my blog because I’ve been busy engineering several data models and the software architecture for a risk management system.

During my work, I came back to the first real programming language I actually learnt, PHP, which I intend to use to create my UI.

When I started programming in PHP 9 years ago, I was using PHP4, which did not provide the advanced object-oriented features that PHP5 does.

I don’t want to discuss the new features in PHP5 but there is an important change between the two versions that I was not completely familiar with in this language: the way PHP passes parameters to functions.

Back in PHP4, all parameters were passed by value; that is, if you modified a parameter within the scope of the function, the changes occurred only within that scope and the variable remained unchanged in the calling scope. In order to get the changes echoed in the calling scope, you had to add “&” in front of the variable so that a “pointer” was passed to the function.

In PHP5, this is not true anymore and this is what I want to discuss in this post.

In the latest version, instances of a class are passed by reference; changes will echo in the upper scope. (Strictly speaking, PHP5 passes a copy of the object handle, so modifying the object’s properties affects the caller, while assigning a new object to the parameter does not.) “Native” types, however, are still passed by value. This is the same behavior as in most object-oriented programming languages such as Java or C#.

However, PHP is not a strongly typed programming language. It is hence difficult to know what is a “native” type and what isn’t.

Let’s consider the first example, an integer as in the following snippet:

function increment($i) {
    $i+=1;
}

$myVar=1;

increment($myVar);

echo $myVar; //Prints "1"

As you can see, the integer passed as a parameter hasn’t been altered by the function.

In order to make sure that the parameter changes in the upper scope, you have to specify in the function definition that you want it to be passed by reference, as follows:

function incrementByRef(&$i) {
    $i+=1;
}

incrementByRef($myVar);

echo $myVar; //Prints "2"

This is the behavior we would expect in any object-oriented programming language. However, what happens when we try with more “enhanced” types such as strings or arrays? In Java and C#, strings (as opposed to char) are objects, but immutable ones: a function receives a reference to the string, yet it cannot alter the caller’s string, so strings effectively behave like values.

In PHP, they also behave as “native” types, as the following snippet demonstrates:

function concatstring($str1, $str2) {
    $str1.=$str2;
}

$string="Hello";

concatstring($string,"World");

echo $string; //Prints "Hello"

The same applies to PHP arrays, which are very, very powerful structures comparable to C# dictionaries:

function changeIndex($array,$key,$value) {
    $array[$key]=$value;
}

$myArray=array("Hello"=>"Nothing");

changeIndex($myArray,"Hello","World");

print_r($myArray); //Prints "Array ( [Hello] => Nothing )"

What about classes then? Well as I already mentioned, instances of classes are passed by reference in PHP5 as shown in the following code:

class MyClass {
    public $MyInt=1;
    public $MyString="Hello";
    public $MyArray=array("Hello"=>"Nothing");
}

function increment2($class) {
    $class->MyInt+=1;
}

function concatstring2($class, $str2) {
    $class->MyString.=$str2;
}

function changeIndex2($class,$key,$value) {
    $class->MyArray[$key]=$value;
}

$myClass=new MyClass;

increment2($myClass);
concatstring2($myClass,"World");
changeIndex2($myClass,"Hello","World");

print_r($myClass);
// Prints "MyClass Object ( [MyInt] => 2 [MyString] => HelloWorld [MyArray] => Array ( [Hello] => World ) )"

Finally, one could argue that “manually” passing arguments by reference might increase performance. However, after some testing I concluded that this assertion is wrong. First of all, the difference between the two is minimal and, most importantly, running the tests several times does not give the same method as the quickest every time!

Hence, forget about manually passing your parameters by reference when they are instances of classes in PHP5; it doesn’t look useful anymore.

See you next time!

Quantitative Finance and Computer Science: quick comparison between R and MATLAB

Hi everybody,

It’s been a few weeks without a post and I apologize for that; I am at the moment looking for a job in Geneva, and it’s a bit time-consuming, as you can imagine.

Anyway, I kept working a bit on different projects and had the idea for this post. What encouraged me were several e-mails I received from former classmates and some questions I recently had to ask. I noticed that most of us, “quants”, “analysts” or whatever you call your job, usually wonder what technology to use in specific situations.

Most of the time, we end up using the programming languages we know best. For example, students used to MATLAB will use it instead of R, and programmers used to .Net (C#, Visual Basic) will use it instead of the JVM (Java). If you have a whole desk of highly experienced people next to you, then you might get good hints to guide your technology decisions, but here are a few remarks I gathered during my work.

MATLAB vs R

First of all, R is free and open-source, as opposed to MATLAB, which has an initial fee plus an additional cost for each additional package.

R being open-source, you can browse the Comprehensive R Archive Network (CRAN) to find the packages you like and start using them. However, you will sometimes find that the documentation is not very clear, and hence you might have to ask several questions on the R mailing lists, which are not very user-friendly themselves.

With MATLAB, however, you get packages on different subjects (such as Fixed Income, Derivatives, Data Handling and so on) which are quite complete and have pretty good documentation. I’d say that in terms of support, it’s money well spent.

If you wish to integrate this technology within one of your applications, I’ve heard (but not tried) that integration is pretty easy. This might spare you some time you’d otherwise have spent on data marshaling or on implementing libraries in the native language.

As far as R is concerned, you will find some very useful packages such as xts for time series handling. You will also be able to find handy financial packages to perform computations on bond pricing. However, you might have some trouble finding the package that suits your needs: there is a trade-off between the number of packages available and their quality. Once you’ve found what you need, though, you’ll be pretty happy. In terms of integration, I had to tackle the problem myself, and I’d say that Dirk Eddelbuettel’s blog and libraries will help you a lot. If you wish to use R within Excel, you can use RExcel from Statcon, which is pretty easy to work with. The following book might help you get started:

R Through Excel: A Spreadsheet Interface for Statistics, Data Analysis, and Graphics (Use R)

For those of you who wish to dig deep into R foundations, the following book is excellent:

Software for Data Analysis: Programming with R (Statistics and Computing)

In terms of performance, I’d say that MATLAB seems to be quicker at finding optimization results. Again, this might be due to the fact that I did not find the right R package, but who cares? In my different research projects, and especially for the game-theoretic approach to ice hockey penalties, I had to use MATLAB to get the program to converge before losing patience.

That’s all for now, but I’ll soon be back to compare the JVM and the .Net framework for finance computations and MATLAB or R integrations.

Until then,

Have fun!