Thanks for the detailed answer, Anton. The only thing I can think of to add to this thread is to wonder whether there is any easy way, or easy new feature, that could give users an idea of where their backtesting bottlenecks are.
For example, when I've done performance profiling in the past, I've often added one or two lines of code in key locations to record the entry/exit timestamps for important (i.e. time-consuming) pieces of code. Then, after a program execution (or after many, if I wrote the stats to disk with another few lines of code), I could see where the time was spent, and in what relative percentages, across these key performance sections.
Do you think it would be possible for you to add a couple of lines of metric timestamp code to the framework at key points, so you could show some relative numbers in the simulation stats? I can imagine one collection point around database read operations (to measure total IO cost). Maybe another is possible around FIX operations (as you mentioned in your previous post).
I see that you list four areas (database IO, the strategy engine, portfolio calculations, and FIX). If you could identify four very high-level timestamp collection points, one per area, and print the results in the simulation stats, it would tell users where the bottlenecks are.
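Just to make the idea concrete, here is a minimal sketch in Python (I realize the framework itself likely isn't Python; the SectionTimer class, the section names, and the sleep stand-ins are all hypothetical, just to illustrate the shape of the thing):

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class SectionTimer:
    """Accumulate wall-clock time per named section of code."""
    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def section(self, name):
        # Entry/exit timestamps around the wrapped block.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def report(self):
        # Absolute and relative time per section, largest first.
        grand = sum(self.totals.values()) or 1.0
        for name, secs in sorted(self.totals.items(), key=lambda kv: -kv[1]):
            print(f"{name:<24}{secs:>10.3f} s  {100 * secs / grand:>5.1f} %")

timer = SectionTimer()

# Stand-ins for the four areas; the real framework would wrap its
# actual database reads, strategy callbacks, portfolio updates, and
# FIX calls at these four collection points.
with timer.section("database IO"):
    time.sleep(0.05)
with timer.section("strategy engine"):
    time.sleep(0.02)
with timer.section("portfolio calculations"):
    time.sleep(0.01)
with timer.section("FIX"):
    time.sleep(0.005)

timer.report()  # would be printed alongside the simulation stats
```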
Timestamp collection and summarization is relatively cheap, fast, and easy to implement, so I wonder if you'd be kind enough to think about the collection points in the background. Maybe it's an easy thing to do on your dev path, and it would be quite interesting (and probably productive) to compare the sequential stats with your new parallel implementation. It seems to me this represents a small, easy-to-implement bit of work that offers high bang per buck. (Not to heap yet more work on Alex and Sergey, of course...)
