Comments (12)

cjdsellers commented on June 3, 2024

Hey @dkharrat

Thanks for the detailed report and MRE. I looked briefly at the data and I see:

5 minute bars:

2024-03-22 09:50:00-04:00,18508.5,18518.25,18500.0,18503.75,8664
2024-03-22 09:55:00-04:00,18504.0,18540.5,18504.0,18538.25,7980

1 minute bars:

2024-03-22 09:50:00-04:00,18509.0,18514.75,18503.25,18503.75,1351
2024-03-22 09:51:00-04:00,18504.0,18515.75,18504.0,18511.25,1320
2024-03-22 09:52:00-04:00,18511.5,18515.25,18507.5,18512.25,849
2024-03-22 09:53:00-04:00,18512.0,18531.0,18511.5,18528.75,2128
2024-03-22 09:54:00-04:00,18529.25,18537.5,18529.0,18532.75,2168
2024-03-22 09:55:00-04:00,18532.5,18540.5,18532.25,18538.25,1515

For 5 minute bars, at 09:55:00 the open and low of the interval were both 18504.0, so the TP BUY order executes at 18505.0.

Bar execution is quite simplistic: as each bar hits the matching engine, its OHLC prices are converted to ticks and iterated, which is why the above happens.
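To make the effect concrete, here is a minimal sketch in plain Python (not the Nautilus matching engine). It assumes the TP BUY is a limit order at 18505.0 and that a bar is simply expanded into its four OHLC prices; the function names are hypothetical.

```python
def bar_to_price_path(open_, high, low, close):
    # Assumption: the bar is expanded into four prices in O-H-L-C order.
    return [open_, high, low, close]


def buy_limit_fills(limit_price, path):
    # A buy limit is marketable once any simulated price trades at or below it.
    return any(price <= limit_price for price in path)


limit = 18505.0
five_min_0955 = (18504.0, 18540.5, 18504.0, 18538.25)
one_min_0955 = (18532.5, 18540.5, 18532.25, 18538.25)

print(buy_limit_fills(limit, bar_to_price_path(*five_min_0955)))  # True
print(buy_limit_fills(limit, bar_to_price_path(*one_min_0955)))   # False
```

The 5 minute bar stamped 09:55:00 fills the order because its open and low sit below the limit, while the 1 minute bar with the same timestamp never trades that low.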

At the same time this comes back to a previous conversation we've had on bar execution, how do we know we didn't open at 09:50:00 @ 18509.0, and then traded to a high of 18540.5, low of 18504.0, and close of 18538.25 at around 09:54:59.999999999?

With a single bar time frame we don't know; only the additional granularity of the 1 minute bars tells us this wasn't the case. The logic for keeping track of this has not been added to the platform, though, which I wouldn't actually consider a bug as such. It's working to spec, just not supporting multiple time frame bar execution for the same instrument as accurately as you're expecting.

Do you require the 5 minute bars for signals, and the 1 minute bars you're using for finer grained execution? Otherwise, an immediate workaround is to stick to a single bar time frame per instrument.

We could keep track of the finest granularity bar for each instrument and ignore the rest for execution, but are you aware of another event-driven trading platform which accounts for this in bar execution? If so, it would be interesting to see how it is implemented.

from nautilus_trader.

cjdsellers commented on June 3, 2024

For multiple time frame bars, I'm aware of another trader who sets bar_execution to false, and also provides trade ticks for the backtest (with internal bar aggregation - but this would also work for external aggregation too).

You could also use the same method and pre-process the bar data to trade ticks.

https://github.com/nautechsystems/nautilus_trader/blob/develop/nautilus_trader/backtest/engine.pyx#L373

cjdsellers commented on June 3, 2024

Hey @dkharrat

I've added an implementation to keep track of the lowest time frame bar per instrument for execution, pushed to develop branch. There is an outstanding edge case that the very initial bars may still exhibit the behaviour you reported until all bar types are seen.

Let me know how this goes for you when you can.

dkharrat commented on June 3, 2024

At the same time this comes back to a previous conversation we've had on bar execution, how do we know we didn't open at 09:50:00 @ 18509.0, and then traded to a high of 18540.5, low of 18504.0, and close of 18538.25 at around 09:54:59.999999999?

That's why I'm providing the 1-min bar data. And the main reason why I'm also providing the 5-min bar data is because Nautilus doesn't seem to automatically aggregate. However, even if we only had 5-min bar data, I think Nautilus could use reasonable heuristics to determine if the fill makes sense. For example, for a large green bar, it's statistically more likely that the low occurred before the close/high (though, this is probably a separate topic for this particular issue).
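As an illustration of the kind of heuristic meant here, the sketch below orders the four OHLC prices by bar direction. This is a hypothetical rule for discussion, not something Nautilus is known to implement, and `ohlc_path` is a made-up name.

```python
def ohlc_path(open_, high, low, close):
    # Hypothetical heuristic: a rising (green) bar is assumed to visit its low
    # before its high; a falling (red) bar visits its high before its low.
    if close >= open_:
        return [open_, low, high, close]
    return [open_, high, low, close]


# Green 5-min bar stamped 09:55:00 from the data above.
print(ohlc_path(18504.0, 18540.5, 18504.0, 18538.25))
# Red 5-min bar stamped 09:50:00 from the data above.
print(ohlc_path(18508.5, 18518.25, 18500.0, 18503.75))
```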

Do you require the 5 minute bars for signals, and the 1 minute bars you're using for finer grained execution?

Yes, my strategy uses 5-min bar data to calculate indicators and signals. I suppose I could manually aggregate the 1-min bar into 5-mins within the strategy and manually update the indicators, but it would be cleaner and more ideal if Nautilus handled this natively.

I think one potential solution to this would be to feed Nautilus the highest-granularity bar data (e.g. 1-min bars) only and Nautilus would internally perform the higher-timeframe aggregation based on whatever timeframe the strategy subscribes to. This way, the strategy can use whatever timeframe it requires, while Nautilus can use the higher-granularity data for execution. I believe that's what most backtesting platforms do to address this issue.
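A rough sketch of that aggregation in plain Python, using the 1-min bars quoted earlier in this thread. It assumes bars are stamped with their close time, so the 1-min bars from 09:51 to 09:55 make up the 5-min bar stamped 09:55:00, which is what the data appears to show. The `Bar` class and `aggregate` function are illustrative only, not Nautilus types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: int


def aggregate(bars):
    # Combine consecutive lower time frame bars into one higher time frame bar.
    if not bars:
        raise ValueError("need at least one bar to aggregate")
    return Bar(
        open=bars[0].open,
        high=max(b.high for b in bars),
        low=min(b.low for b in bars),
        close=bars[-1].close,
        volume=sum(b.volume for b in bars),
    )


one_minute = [
    Bar(18504.0, 18515.75, 18504.0, 18511.25, 1320),   # 09:51
    Bar(18511.5, 18515.25, 18507.5, 18512.25, 849),    # 09:52
    Bar(18512.0, 18531.0, 18511.5, 18528.75, 2128),    # 09:53
    Bar(18529.25, 18537.5, 18529.0, 18532.75, 2168),   # 09:54
    Bar(18532.5, 18540.5, 18532.25, 18538.25, 1515),   # 09:55
]

print(aggregate(one_minute))
# Bar(open=18504.0, high=18540.5, low=18504.0, close=18538.25, volume=7980)
```

The result matches the supplied 5-min bar stamped 09:55:00, so with this data the 5-min series carries no information that the 1-min series lacks.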

For multiple time frame bars, I'm aware of another trader who sets bar_execution to false, and also provides trade ticks for the backtest (with internal bar aggregation - but this would also work for external aggregation too).

I'm not sure if I completely follow. Could you please provide some example or code pointers?

dkharrat commented on June 3, 2024

I've added an implementation to keep track of the lowest time frame bar per instrument for execution

Thanks for the quick fix! I'll give it a try and let you know how it goes. I believe the edge case you mentioned could be handled by making sure I add the 1-min bars before the 5-min bars, correct?

dkharrat commented on June 3, 2024

@cjdsellers I just tried the same test strategy I pasted above and it didn't seem to make a difference. I get the same output.

cjdsellers commented on June 3, 2024

I updated the logic, please give it another try when you get a chance.

cjdsellers commented on June 3, 2024

And the main reason why I'm also providing the 5-min bar data is because Nautilus doesn't seem to automatically aggregate

What behavior would you expect for this?

Nautilus does aggregate "any" bar type from trade or quote ticks, and the DataEngine will arrange the necessary subscriptions (also see the aggregation module). Aggregating higher time frame bars from lower time frame bars would be a further enhancement. It could be done but is not part of the spec currently (we've generally catered more for order book / top-of-book type data).

I'm not sure if I completely follow. Could you please provide some example or code pointers?

The general idea here is:

  • Use the wranglers or your own function to pre-process your bar data into either trade or quote ticks
  • Add these ticks to the backtest engine
  • Specify your subscription(s) bar type as INTERNAL aggregation, and LAST price type for trades

The benefit of this is that you can set bar_execution to False so that bar data points aren't involved with execution, and your strategy logic will also work between backtest and live trading too (where real-world, you would either subscribe for bars aggregated by the venue EXTERNAL, or per this example let Nautilus handle the aggregation INTERNAL).

[edit] Having said all of that, I'm hoping the recent additions to OrderMatchingEngine.process_bar will have things working as you were expecting anyway (for backtesting).

dkharrat commented on June 3, 2024

Use the wranglers or your own function to pre-process your bar data into either trade or quote ticks

That's an interesting idea. This is also a good way to customize how bars are interpreted and used for execution. A few questions:

  1. To convert a bar to ticks, do I just generate 4 ticks for each bar (one for each of open, high, low, close)? What about side? And I assume for volume, I would divide it by 4?
  2. To convert a bar to quotes, how do I choose the bid vs ask? Do I just set both bid/ask to the same value?
  3. Which one do you recommend? Since both are just simulating ticks, is there an advantage of choosing to convert to ticks vs quotes?

The benefit of this is that you can set bar_execution to False so that bar data points aren't involved with execution, and your strategy logic will also work between backtest and live trading too (where real-world, you would either subscribe for bars aggregated by the venue EXTERNAL, or per this example let Nautilus handle the aggregation INTERNAL).

Nice! I was not aware of the EXTERNAL vs INTERNAL distinction and that Nautilus will automatically aggregate INTERNAL data. This seems like a reasonable approach.

dkharrat commented on June 3, 2024

Having said all of that, I'm hoping the recent additions to OrderMatchingEngine.process_bar will have things working as you were expecting anyway (for backtesting).

I just tried the latest commit in the develop branch using the debug_strategy.py above, but it still didn't work for me. I got the same output as before.

cjdsellers commented on June 3, 2024

This appears to be working for me on latest develop. Double checking you're recompiling when you pull the latest?

The logic is quite simple. Note that I've temporarily added an extra log line and changed the log levels to warning so that it was easier to debug:

        # First bar seen for this instrument: use its bar type for execution
        if execution_bar_type is None:
            execution_bar_type = bar_type
            self._execution_bar_types[instrument_id] = bar_type
            self._execution_bar_deltas[bar_type] = bar_type.spec.timedelta

        # A different bar type arrived: compare its interval to the current one
        if execution_bar_type != bar_type:
            bar_type_timedelta = self._execution_bar_deltas.get(bar_type)
            if bar_type_timedelta is None:
                bar_type_timedelta = bar_type.spec.timedelta
                self._execution_bar_deltas[bar_type] = bar_type_timedelta
            if self._execution_bar_deltas[execution_bar_type] >= bar_type_timedelta:
                # New bar type is at least as fine: switch execution to it
                self._execution_bar_types[instrument_id] = bar_type
            else:
                # Coarser bar type: skip it for execution
                self._log.warning(f"Not regarding {bar_type}")  # TODO!
                return

        if is_logging_initialized():
            self._log.warning(f"Processing {repr(bar)}")

You can see the 1-MINUTE bar is not used for execution; only 1-SECOND bars are processed:

2023-01-05T19:20:50.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.14,126.15,126.14,126.15,321,1672946449000000000)
2023-01-05T19:20:53.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.14,126.14,126.14,126.14,200,1672946452000000000)
2023-01-05T19:20:55.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.14,126.14,126.12,126.12,1148,1672946454000000000)
2023-01-05T19:20:56.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.12,126.12,126.12,126.12,100,1672946455000000000)
2023-01-05T19:20:58.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Processing Bar(AAPL.XNAS-1-SECOND-LAST-EXTERNAL,126.13,126.13,126.13,126.13,500,1672946457000000000)
2023-01-05T19:21:00.000000000Z [WARN] BACKTESTER-001.OrderMatchingEngine(XNAS): Not regarding AAPL.XNAS-1-MINUTE-LAST-EXTERNAL
2023-01-05T19:21:00.000000000Z [INFO] BACKTESTER-001.EMACrossLongOnly: Bar(AAPL.XNAS-1-MINUTE-LAST-EXTERNAL,126.19,126.21,126.12,126.13,12627,1672946400000000000)
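The selection logic can be tried outside the engine with a simplified stand-in. The sketch below is plain Python with hypothetical names (`ExecutionBarFilter`, `should_process`) and tracks only the interval per instrument rather than full bar types. It also shows the edge case mentioned earlier: a coarser bar that arrives first is still processed until a finer one is seen.

```python
from datetime import timedelta


class ExecutionBarFilter:
    """Keeps the finest bar interval seen so far for each instrument."""

    def __init__(self):
        self._execution_interval = {}  # instrument_id -> timedelta

    def should_process(self, instrument_id, interval):
        current = self._execution_interval.get(instrument_id)
        if current is None or interval <= current:
            # First bar seen, or at least as fine as the current one: use it.
            self._execution_interval[instrument_id] = interval
            return True
        # Coarser than the current execution interval: skip for execution.
        return False


bar_filter = ExecutionBarFilter()
# Edge case: the 1-minute bar arrives first, so it is still processed.
print(bar_filter.should_process("AAPL.XNAS", timedelta(minutes=1)))  # True
print(bar_filter.should_process("AAPL.XNAS", timedelta(seconds=1)))  # True
# From here on only the 1-second bars drive execution.
print(bar_filter.should_process("AAPL.XNAS", timedelta(minutes=1)))  # False
```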

I understand it would be better for you if I used your MRE but I happened to be working on this bars example so used this instead.

Also of note, the bar at 59 is missing because Databento bars are only "printed" when there is a trade within the interval.

cjdsellers commented on June 3, 2024

To convert a bar to ticks, do I just generate 4 ticks for each bar (one for each of open, high, low, close)? What about side? And I assume for volume, I would divide it by 4?

Yes, essentially that's what the wranglers and the matching engine are doing. It looks like aggressor side is just alternating between buyer and seller - there could probably be a more accurate heuristic created for that.
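For a do-it-yourself version of that conversion, a minimal sketch in plain Python might look like this. `SimTrade` and `bar_to_trades` are hypothetical names rather than Nautilus types, and starting the alternation with the buyer is an arbitrary assumption.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class SimTrade:
    price: float
    size: float
    aggressor: str


def bar_to_trades(open_, high, low, close, volume):
    # One simulated trade per OHLC price, each with a quarter of the volume.
    sides = cycle(["BUYER", "SELLER"])  # assumption: alternate, buyer first
    size = volume / 4
    return [SimTrade(price, size, next(sides)) for price in (open_, high, low, close)]


for trade in bar_to_trades(18504.0, 18540.5, 18504.0, 18538.25, 7980):
    print(trade)
```

Each simulated trade would then still need to be mapped onto the platform's own trade tick type, with timestamps spread inside the bar interval, before being added to the backtest engine.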

To convert a bar to quotes, how do I choose the bid vs ask? Do I just set both bid/ask to the same value?

Your bar data is probably based on trades, unless you specifically have bid and ask side bars (there are examples of combining bid and ask side bars into quote ticks in the repo - see the examples and wranglers).

Which one do you recommend? Since both are just simulating ticks, is there an advantage of choosing to convert to ticks vs quotes?

We refer to both quotes and trades as "tick" data i.e. QuoteTick and TradeTick, although this terminology can be a little loose as to what a tick actually is. I'd just stick with trades for this use case.
