
mbok / elasticsearch-linear-regression


A machine learning plugin for Elasticsearch providing aggregations to compute multiple linear regression on search results in real-time for predictive analytics.

License: Apache License 2.0

Java 100.00%
elasticsearch linear-regression elasticsearch-plugin machine-learning predictive-analytics

elasticsearch-linear-regression's Issues

Compatibility with 6.1.3

Hi mate, thank you for your great plugin! It's really nice! Are you planning to extend compatibility to Elasticsearch 6.x.x? Right now we can't use elasticsearch-linear-regression because it is not compatible with the latest Elasticsearch version.

Provide aggregation to indicate breakouts regarding an estimated linear regression "channel"

Breakouts (e.g. documents whose response variable value lies outside the upper and lower hyper-planes, spaced a specified number of standard deviations above and below the middle linear regression hyper-plane) in time series data may indicate an anomaly.
A concrete concept still has to be defined.
Real-world use cases include e.g. stock markets, see https://www.dailyfx.com/forex/education/trading_tips/daily_trading_lesson/2014/10/24/Trend-Following-with-Regression-Channels.html.
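
Since the concept is still open, the following is only a minimal sketch of one possible reading of the idea for a single predictor: fit a least squares line, take the standard deviation of the residuals, and flag points that fall outside a band of `num_std` deviations around the line. The function names, the `num_std` parameter and the sample series are hypothetical and are not part of the plugin.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def breakouts(xs, ys, num_std=2.0):
    """Return the indices of points outside the regression channel.

    The channel is the fitted line plus/minus num_std population
    standard deviations of the residuals (an assumed definition).
    """
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    band = num_std * pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) > band]


# Hypothetical series: y = 2x + 1 with one spike at index 5.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[5] += 20
flagged = breakouts(xs, ys)
print(flagged)
```

Note that a single large outlier also widens the band, because it is included in the residual standard deviation; a real aggregation would have to decide whether that is acceptable.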

Use case: predict the time of the next purchase

Hello,
this is probably not the best place to ask for comments on the use case mentioned above, but I'll give it a try.
I am trying to come up with an optimal way to forecast the purchase time for Product P and User U.
We currently index these events in ES (pushed by the e-commerce system):

orderId,user,product,quantity,time,days
"order1","U","P",1,"2017-01-01",17167
"order2","U","P",2,"2017-01-29",17195
"order3","U","P",3,"2017-04-02",17258
"order4","U","P",1,"2017-07-06",17353
"order5","U","P",2,"2017-08-03",17381

where days is just an integer giving the number of days since 1970-01-01 for the event time.
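
As a quick illustration, the days value can be derived from the event date like this. It is a sketch that assumes the time field is an ISO date string; the helper name `days_since_epoch` is made up for the example.

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def days_since_epoch(iso_day):
    """Convert an ISO date string such as "2017-01-01" to days since 1970-01-01."""
    return (date.fromisoformat(iso_day) - EPOCH).days


print(days_since_epoch("2017-01-01"))
```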

What I want to predict is the time of the next purchase and its quantity.
The quantity is the last purchased quantity, in this case 2,
and the forecast time should be somewhere in October.

I've played with this plugin,
and it works well if each event has an additional calculated field "lag" that denotes the time period until the NEXT purchase, so the data above would then look like:

orderId,user,product,quantity,time,days,lag
"order1","U","P",1,"2017-01-01",17167,28
"order2","U","P",2,"2017-01-29",17195,63
"order3","U","P",3,"2017-04-02",17258,95
"order4","U","P",1,"2017-07-06",17353,29
"order5","U","P",2,"2017-08-04",17382,?

GET /buying_habit/_search?size=0
{
  "query": { "match_all": {} },
  "aggs": {
    "demand_p": {
      "linreg_predict": {
        "fields": ["quantity", "lag"],
        "inputs": [2]
      }
    }
  }
}
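
To make the idea concrete outside Elasticsearch, here is a small offline sketch. It derives the lag values from the days column of the table above and fits an ordinary least squares line of lag on quantity. It is only an illustration of the arithmetic, not the plugin's implementation, and the variable and function names are assumptions.

```python
from datetime import date, timedelta
from statistics import mean

# "days" and "quantity" values copied from the table with the lag column.
days = [17167, 17195, 17258, 17353, 17382]
quantities = [1, 2, 3, 1, 2]

# lag[i] = days until the NEXT purchase; the last event has no lag yet.
lags = [later - earlier for earlier, later in zip(days, days[1:])]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Train on the events that already have a lag, predict for the last quantity.
intercept, slope = fit_line(quantities[:-1], lags)
predicted_lag = intercept + slope * quantities[-1]
forecast_date = date(1970, 1, 1) + timedelta(days=days[-1] + round(predicted_lag))
print(lags, round(predicted_lag, 2), forecast_date)
```

For this data the fit predicts a lag of roughly 62 days, which puts the next purchase in early October 2017, in line with the expectation stated above.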

Of course, in this index I will have 10K different products and 1M different users.
My first question is: how do I update this field for the LAST event when a new event comes in?
Is it possible to do that at index time?
Does this make sense at all, or is there a better way?
By the way, if there is only one purchase, I'd use the default lifecycle of the product in days (which also comes from the e-commerce system). But for cases where there is a buying pattern (at least 2 events), I'd need to use user-specific data.
I plan to run the forecast query for each User/Product pair every hour to calculate the next forecast time (effectively, when the user SHOULD run out of supply).
What would be the way to optimize that (to avoid doing this one by one)?
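
One way to picture the batch variant is a single pass that groups events by User/Product pair and applies the fallback rule described above. The sketch below is a hypothetical client-side illustration, not an Elasticsearch feature: it swaps the regression for the plain mean of the observed lags to stay short, and the function name and the `default_lifecycle_days` mapping are assumptions.

```python
from collections import defaultdict


def forecast_next_purchase(events, default_lifecycle_days):
    """Forecast the next purchase day for every (user, product) pair.

    events: iterable of (user, product, days) tuples.
    default_lifecycle_days: assumed mapping product -> lifecycle in days,
    used when a pair has only one purchase.
    """
    days_by_pair = defaultdict(list)
    for user, product, days in events:
        days_by_pair[(user, product)].append(days)

    forecasts = {}
    for (user, product), days in days_by_pair.items():
        days.sort()
        if len(days) < 2:
            # No buying pattern yet: fall back to the product lifecycle.
            expected_lag = default_lifecycle_days[product]
        else:
            # Simplification: mean of the observed lags instead of a regression.
            lags = [later - earlier for earlier, later in zip(days, days[1:])]
            expected_lag = sum(lags) / len(lags)
        forecasts[(user, product)] = days[-1] + round(expected_lag)
    return forecasts


events = [
    ("U", "P", 17167), ("U", "P", 17195), ("U", "P", 17258),
    ("U", "P", 17353), ("U", "P", 17382),
    ("V", "P", 17300),  # hypothetical second user with a single purchase
]
result = forecast_next_purchase(events, {"P": 30})
print(result)
```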

Thanks very much in advance,
Milan
