ikatsov / tensor-house

A collection of reference Jupyter notebooks and demo AI/ML applications for enterprise use cases: marketing, pricing, supply chain, smart manufacturing, and more.

License: Apache License 2.0

Jupyter Notebook 99.60% Python 0.40%
data-science ai models marketing supply-chain machine-learning reinforcement-learning deep-learning customer-analysis llm

tensor-house's Introduction

What is TensorHouse?

TensorHouse is a collection of reference Jupyter notebooks and demo AI/ML applications for enterprise use cases: marketing, pricing, supply chain, smart manufacturing, and more. The goal of the project is to provide a toolkit for rapid readiness assessment, exploratory data analysis, and prototyping of various modeling approaches for typical enterprise AI/ML/data science projects.

TensorHouse provides the following resources:

  • A well-documented repository of reference notebooks and demo applications (prototypes).
  • Readiness assessment and requirement gathering questionnaires for typical enterprise AI/ML projects.
  • Datasets, data generators, and simulators for rapid prototyping and model evaluation.

TensorHouse focuses mainly on industry-proven solutions that leverage deep learning, reinforcement learning, and causal inference methods and models. Most of these solutions were originally developed either by industry practitioners or by academic researchers who worked in collaboration with leading companies in technology, retail, manufacturing, and other sectors.

How Does TensorHouse Help?

TensorHouse helps to accelerate the following steps of the solution development:

  1. Evaluate readiness for specific use cases faster from the data, integration, and process perspectives using questionnaires and causal inference templates.
  2. Choose candidate methods and models for solving your use cases, evaluate and tailor them using simulators and sample datasets.
  3. Evaluate candidate methods and models on your data, build prototypes, and present preliminary results to stakeholders.

What Libs Does TensorHouse Use?

All prototypes and templates are implemented in Python using a limited set of standard libraries:

  • Deep learning: mostly TensorFlow, some prototypes use PyTorch
  • Reinforcement learning: RLlib
  • Causal inference: DoWhy, EconML
  • Probabilistic programming / Bayesian inference: PyMC
  • Generative AI: LangChain
  • Traditional ML: statsmodels, scikit-learn, LightGBM
  • Basic libs: NumPy, pandas, matplotlib, seaborn

Illustrative Examples

Strategic price optimization using reinforcement learning

DQN learns a Hi-Lo pricing policy that switches between regular and discounted prices:

Supply chain optimization using reinforcement learning

DQN learns how to control procurement and logistics in a simulated environment:

Supply chain management using large language models

An LLM dynamically writes a Python script that invokes multiple APIs to answer the user's question:

Anomaly detection in images using autoencoders

Deep autoencoders produce image reconstructions that facilitate detection of defect locations:

List of Prototypes and Templates

The artifacts listed in this section can help to rapidly evaluate different solution approaches and build prototypes using your datasets. Artifacts are marked with the following qualifiers:

  • 🧪 - artifacts that are particularly suitable for exploratory data analysis, evaluating the strength of causal effects in your data, and determining whether the data are suitable for solving a certain use case
  • 🚀 - conceptual prototypes that use advanced methods and are not necessarily suitable for productization
  • 📚 - notebooks that demonstrate basic algorithms and are intended mainly for educational purposes

Promotions, Offers, and Advertisements

These notebooks can be used to analyze the behavior of individual customers, calculate customer propensity (affinity) scores, and personalize offers, content, or digital experience.

  • Customer Scoring and Lifetime Value
    • Customer Propensity Scoring Using Deep Learning (LSTM with Attention) (notebook)
    • Customer-level Uplift Modeling Based On Observational Data Using Causal Inference (notebook) (🧪)
    • Customer Lifetime Value (LTV) Estimation Using Markov Chains (notebook)
    • Customer Lifetime Value (LTV) Estimation Using Bayesian Buy-Till-You-Die (BTYD) Model (notebook)
  • Decision Automation
    • Dynamic Content Personalization Using Contextual Bandits (LinUCB) (notebook)
    • Next Best Action Model Using Reinforcement Learning (Fitted Q Iteration) (notebook)

Marketing, Customer, and Content Analytics

The notebooks can be used to perform aggregated analysis of the customer population or segments, get insights from user-generated content, and optimize marketing budgets.

  • Content Analytics
    • Sentiment Analysis Using Basic Transformers (notebook)
    • Virtual Focus Groups Using LLMs (notebook)
  • Customer Behavior Analytics and Embeddings
    • Recency, Frequency, and Monetary Value (RFM) Analysis of Customer Purchases (notebook) (🧪)
    • Analysis of Customer Behavior Patterns Using LSTM/Transformers (notebook)
    • Item2Vec Using Word2vec (notebook)
    • Customer2Vec Using Doc2vec (notebooks: simulator, prototype)
  • Media Mix, Attribution, and Budget Optimization
    • Campaign Effect Estimation In Observational Data Using Causal Inference (notebook) (🧪)
    • Media Mix Modeling: Adstock Model for Campaign/Channel Attribution (notebook)
    • Media Mix Modeling: Bayesian Model with Carryover and Saturation Effects (notebook) (🧪)
    • Multitouch Channel Attribution Model Using Deep Learning (LSTM with Attention) (notebook)

Search

These notebooks can be used to create enterprise search, product catalog search, and visual search solutions.

  • Text Search
    • Latent Semantic Analysis (LSA) (notebook) (📚)
    • Retrieval-augmented Generation (RAG) Using LLMs (notebook)
  • Visual Search
    • Visual Search by Artistic Style (VGG16) (notebook)
    • Visual Search Based on Product Type (EfficientNetB0) (notebook)
    • Visual Search Using Variational Autoencoders (notebook)
    • Image Search Using a Language-Image Model (CLIP) (notebook)
  • Structured Data Search
    • Relational Data Querying Using LLMs (notebook)
  • Data Preprocessing
    • Product Attribute Discovery, Extraction, and Harmonization Using LLMs (notebook)

Recommendations

These notebooks can be used to prototype product recommendation solutions.

  • Basic Collaborative Filtering
    • Nearest Neighbor User-based Collaborative Filtering (notebook) (📚)
    • Nearest Neighbor Item-based Collaborative Filtering (notebook) (📚)
  • Deep and Hybrid Recommenders
    • Neural Collaborative Filtering - Prototype (notebook) (📚)
    • Neural Collaborative Filtering - Hybrid Recommender (notebook)
    • Behavior Sequence Transformer (notebook)
    • Graph Recommender Using Node2Vec (notebook)

Demand Forecasting

These notebooks can be used to create demand and sales forecasting pipelines. These pipelines can further be used to solve inventory planning, price management, workforce optimization, and financial planning use cases.

  • Traditional Methods
    • Demand Forecasting for a Single Entity Using Exponential Smoothing (ETS) (notebook)
    • Demand Forecasting for a Single Entity Using Autoregression (ARIMA/SARIMAX) (notebook)
    • Demand Forecasting and Price Effect Estimation for Multiple Entities Using Generalized Linear Models (notebook) (🧪)
  • Deep Learning Methods
    • Demand Forecasting for Multiple Entities Using DeepAR (notebook)
    • Demand Forecasting for a Single Entity Using NeuralProphet (notebook)
  • Dynamic Learning
  • Data Preprocessing

Pricing and Assortment

These notebooks can be used to create price optimization, promotion (markdown) optimization, and assortment optimization solutions.

  • Static Price, Promotion, and Markdown Optimization
    • Market Response Functions (notebook) (📚)
    • Price Optimization for Multiple Products (notebook)
    • Price Optimization for Multiple Time Intervals (notebook)
  • Dynamic Pricing
    • Dynamic Pricing Using Thompson Sampling (notebook)
    • Dynamic Pricing with Limited Price Experimentation (notebook)
    • Price Optimization Using Reinforcement Learning (DQN) (notebook) (🚀)

Supply Chain

These notebooks and applications can be used to develop procurement and inventory allocation solutions, as well as provide supply chain managers with advanced decision support and automation tools.

  • Single-echelon Inventory Optimization Using (s,Q) and (R,S) Policies (notebook)
  • Inventory Allocation Optimization (notebook)
  • Multi-echelon Inventory Optimization Using Reinforcement Learning (DDPG, TD3) (notebook) (🚀)
  • Supply Chain Simulator for Reinforcement Learning Based Optimization (PPO) (notebook) (🚀)
  • Supply Chain Control Tower Using LLMs (notebook) (🚀)

Smart Manufacturing

These notebooks can be used to prototype visual quality control and predictive maintenance solutions.

  • Noise Reduction in Multivariate Time Series Using Linear Autoencoder (PCA) (notebook)
  • Remaining Useful Life Prediction Using Convolution Networks (notebook)
  • Anomaly Detection in Time Series (notebook)
  • Anomaly Detection in Images Using Autoencoders (notebook)

List of Questionnaires

These questionnaires can be used to assess readiness for typical AI/ML projects and collect the requirements for creating roadmaps and estimates.

More Documentation

  • The most basic models are described in Introduction to Algorithmic Marketing.
  • More advanced models that use deep learning and reinforcement learning techniques are described in The Theory and Practice of Enterprise AI.
  • Templates for basic data science and ML tasks are available in the TensorHouseBasic repository.
  • Most notebooks contain references to specific research papers, industrial reports, and real-world case studies.
  • Follow LinkedIn and X (Twitter) for notifications about new developments and releases.

Contribution

We warmly welcome contributions, such as implementations of new use cases, advanced features and usability improvements for existing use cases, or enhancements to documentation.

tensor-house's Issues

Add requirements.txt

Hi, could you please add the Python packages as a requirements.txt file to your repository?

Wrong Calculation of Total Profit [price-optimization-using-dqn-reinforcement-learning]

Filename : pricing/price-optimization-using-dqn-reinforcement-learning.ipynb

At t=0, the function below evaluates profit_t with p[0] and p[-1] as arguments, which seems incorrect to me, because p[-1] in Python refers to the last element of the array.

def profit_total(p, unit_cost, q_0, k, a, b):
  return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(map(lambda t: profit_t(p[t], p[t-1], q_0, k, a, b, unit_cost), range(len(p))))

To fix this, we can use range(1, len(p)):

def profit_total(p, unit_cost, q_0, k, a, b):
  return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(map(lambda t: profit_t(p[t], p[t-1], q_0, k, a, b, unit_cost), range(1,len(p))))

@ikatsov Do let me know if I am wrong or misunderstood something.
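
The indexing concern raised in this issue can be checked in isolation. The sketch below is only an illustration: profit_t here is a hypothetical stand-in (linear demand with a penalty for price increases and a bonus for price decreases), not the notebook's implementation, and the price vector and parameters are made up. It shows that iterating over range(len(p)) adds one extra term that pairs p[0] with p[-1].

```python
def profit_t(p_t, p_prev, q_0, k, a, b, unit_cost):
    # Hypothetical stand-in for the notebook's profit_t (assumption):
    # demand falls with price, is penalized by price increases (a)
    # and boosted by price decreases (b).
    delta = p_t - p_prev
    demand = max(0.0, q_0 - k * p_t - a * max(delta, 0.0) + b * max(-delta, 0.0))
    return demand * (p_t - unit_cost)


def profit_total_original(p, unit_cost, q_0, k, a, b):
    # As quoted in the issue: t starts at 0, so p[t-1] is p[-1] (the last price).
    return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(
        profit_t(p[t], p[t - 1], q_0, k, a, b, unit_cost) for t in range(len(p))
    )


def profit_total_fixed(p, unit_cost, q_0, k, a, b):
    # Proposed change: start at t=1 so every term uses the true previous price.
    return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(
        profit_t(p[t], p[t - 1], q_0, k, a, b, unit_cost) for t in range(1, len(p))
    )


prices = [100, 150]  # made-up price schedule
params = dict(unit_cost=50, q_0=5000, k=20, a=300, b=100)

extra_term = profit_t(prices[0], prices[-1], params["q_0"], params["k"],
                      params["a"], params["b"], params["unit_cost"])
difference = profit_total_original(prices, **params) - profit_total_fixed(prices, **params)
print(difference, extra_term)
```

The two printed values coincide, which is consistent with the reading that the original loop counts the first period twice, once against the last price of the schedule.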

Questions about channel-attribution-lstm

Hi Ilya,

First, thanks so much for putting together this really helpful repo!

I've been trying to understand channel-attribution-lstm, and I've got a couple of questions about your features_for_lstm function.

  1. This code here:
    df_proj = df[['jid', 'campaigns', 'cats', 'click', 'cost', 'time_since_last_click_norm', 'timestamp_norm', 'conversion']]
    x2d = df_proj.values
    x3d_list = np.split(x2d[:, 1:], np.cumsum(np.unique(x2d[:, 0], return_counts=True)[1])[:-1])

Won't this split only work as intended if df_proj is sorted by jid first, given that the goal is to separate out the sub-arrays for individual jids? Or am I missing the point?

  2. journey_matrix = journey_matrix[ journey_matrix[:, 5].argsort() ] # sort impressions by timestamp

Why 5 here? The timestamp_norm field is next to last in the journey_matrix array, so shouldn't it be journey_matrix.shape[1]-2?

  3. Finally, I had to convert y_train, y_val, and y_test into arrays for the model to run; it wouldn't work for me as written because they were lists.

I'd really appreciate your help! Thank you again --
Natalia
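
The first question can be explored with a toy array. The sketch below is an illustration under assumptions: the data are made up, and split_by_first_column is a hypothetical helper, not a function from the notebook. It suggests that the np.unique/np.cumsum split groups rows by position, so the rows need to be ordered by the key column (here with a stable sort) for each chunk to hold a single journey.

```python
import numpy as np


def split_by_first_column(x2d):
    # Hypothetical helper: stable-sort rows by the key in column 0,
    # then split the remaining columns at the group boundaries.
    order = np.argsort(x2d[:, 0], kind="stable")
    x_sorted = x2d[order]
    _, counts = np.unique(x_sorted[:, 0], return_counts=True)
    return np.split(x_sorted[:, 1:], np.cumsum(counts)[:-1])


# Made-up data: column 0 plays the role of jid, column 1 is a payload value.
x2d = np.array([[2, 10], [1, 20], [2, 30], [1, 40]])

# Split as in the quoted code, without sorting first.
_, counts = np.unique(x2d[:, 0], return_counts=True)
naive = np.split(x2d[:, 1:], np.cumsum(counts)[:-1])

groups = split_by_first_column(x2d)
print([g.ravel().tolist() for g in naive])   # chunks by position, journeys mixed
print([g.ravel().tolist() for g in groups])  # one chunk per journey
```

On the second question: with the eight columns listed above and jid dropped by x2d[:, 1:], seven columns remain, so index 5 and shape[1]-2 refer to the same column, provided each listed field occupies exactly one column. For the third point, np.asarray(y_train) is the usual way to turn a list of labels into an array.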

Next best action example

I would like to try the Next Best Action model for a seller recommendation.
Can you please list steps I need to follow to try the Next Best Action model with your dataset?

name 'hist_all' is not defined

code in promotions (https://github.com/ikatsov/tensor-house/blob/master/promotions/channel-attribution-lstm.ipynb)

'hist_all' is not defined

In [198]:

# Data exploration

def journey_lenght_histogram(df):
    counts = df.groupby(['jid'])['uid'].count().reset_index(name="count").groupby(['count']).count()
    return counts.index, counts.values / df.shape[0]

hist_x, hist_y = journey_lenght_histogram(df4)

plt.plot(range(len(hist_all)), hist_all, label='all journeys')
plt.yscale('log')
plt.xlim(0, 120)
plt.xlabel('Journey length (number of touchpoints)')
plt.ylabel('Fraction of journeys')
plt.show()

@ikatsov (want to know where hist_all is defined, it's not used in any block of code before)
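
Judging only from the excerpt above, hist_all is never assigned, while hist_x and hist_y are computed and then left unused. One plausible reading, not confirmed by the author, is that the plot was meant to use those two values. The sketch below runs the quoted function on a made-up DataFrame (df_toy is an assumption) to show what it returns; the function name keeps the spelling used in the notebook.

```python
import pandas as pd


def journey_lenght_histogram(df):
    # Count touchpoints per journey, then count journeys per length.
    counts = (
        df.groupby(['jid'])['uid'].count()
        .reset_index(name="count")
        .groupby(['count']).count()
    )
    # Index: journey lengths; values: number of journeys divided by row count.
    return counts.index, counts.values / df.shape[0]


# Made-up data: three journeys with 2, 1, and 3 touchpoints.
df_toy = pd.DataFrame({'jid': [1, 1, 2, 3, 3, 3], 'uid': [7, 7, 8, 9, 9, 9]})
hist_x, hist_y = journey_lenght_histogram(df_toy)
print(list(hist_x), hist_y.shape)

# Under the assumption above, the plotting call would become:
# plt.plot(hist_x, hist_y, label='all journeys')
```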

Getting negative values calculating time steps for m > 4 in book-enterprise-ai-edition-2/recipe-10/dynamic-pricing-limited-experimentation.ipynb

Fantastic job on the book and notebooks in this repository.

I was working with book-enterprise-ai-edition-2/recipe-10/dynamic-pricing-limited-experimentation.ipynb and saw one minor issue.
I wanted to see what the timesteps looked like if I used a larger value of m (a maximum of 6 price steps rather than 4).
When I changed m to a higher value, the logx function eventually attempts to calculate the log of a value < 1, which returns a negative number.

This is the current function:
def logx(x, n):
    for i in range(0, n):
        x = math.log(x) if x>0 else 0
    return x

To avoid that, I believe this line:
x = math.log(x) if x>0 else 0
should be changed to:
x = math.log(x) if x>1 else 0

From section 3.2 (Notations) here: https://dspace.mit.edu/bitstream/handle/1721.1/119156/Pricing_v4_a.pdf?sequence=1&isAllowed=y

"We use log^(m) T to represent m iterations of the logarithm, log(log(... log(T))), where m is the number of price changes.
For convenience, we let log(x) = 0 for all 0 ≤ x < 1, so the value of log^(m) T is defined for all T ≥ 1."
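
The effect of the proposed change can be seen with a small side-by-side comparison. This is a sketch of the two variants quoted above; the names logx_original and logx_clamped and the test value T = 100 are chosen here for illustration only.

```python
import math


def logx_original(x, n):
    # Variant quoted in the issue: takes the log of any positive value,
    # so values in (0, 1) produce a negative result.
    for _ in range(n):
        x = math.log(x) if x > 0 else 0
    return x


def logx_clamped(x, n):
    # Proposed variant: values below 1 map to 0, following the quoted
    # convention log(x) = 0 for 0 <= x < 1. Since log(1) is 0 anyway,
    # testing x > 1 gives the same result as testing x >= 1.
    for _ in range(n):
        x = math.log(x) if x > 1 else 0
    return x


T = 100
for m in range(1, 7):
    print(m, logx_original(T, m), logx_clamped(T, m))
```

For T = 100 the original variant turns negative at the fourth iteration, while the clamped variant stays at zero from that point on.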

Related with next-best-action-rl.ipynb - With multiple offerings at same time

Hello,

I was trying to replicate your RL approach in another, similar use case where the requirement is to make multiple offers at the same time, for example both the small and the big discount in your setting. Could you suggest how to handle that with your approach? What would the F array be, given that we would have multiple values in the same time step?

Thanks
San

Update World of supply (WoS) to latest ray module

Hi Ilya,

I have been experimenting with your nicely implemented module to understand 'Multi-agent deep reinforcement learning' and combinatorial optimization for 'multi-echelon supply chain'.

I am raising this issue because the latest RLlib module is not compatible with the code that you have created. It would be very helpful for everyone if you could provide the modifications required to make it compatible with the latest RLlib and Ray packages, as multiple things have changed there.

Thanks
Ankit

ModuleNotFoundError: No module named 'ray.rllib.agents'

In tensor-house/supply-chain/supply-chain-reinforcement-learning.ipynb
the code

import ray.rllib.agents.ddpg as ddpg

gives the error

ModuleNotFoundError: No module named 'ray.rllib.agents'

After changing the line to

import ray.rllib.algorithms.ddpg as ddpg

there is still the problem that the config does not work

AttributeError: module 'ray.rllib.algorithms.ddpg' has no attribute 'DEFAULT_CONFIG'

I use
ray version 2.4.0
Python 3.10.9
Ubuntu
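
One way to cope with the module move described above is to probe for both import paths and both configuration styles. The sketch below is an assumption-laden illustration, not the notebook's code: it assumes that Ray 2.x releases which ship DDPG under ray.rllib.algorithms.ddpg expose a DDPGConfig class whose to_dict() method returns the default settings, which should be verified against the documentation of the installed Ray version. The helper names are hypothetical.

```python
import importlib


def load_ddpg_module():
    """Return the first importable DDPG module, or None if neither path exists."""
    for name in ("ray.rllib.algorithms.ddpg", "ray.rllib.agents.ddpg"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def default_ddpg_config():
    """Return DDPG defaults as a plain dict, or None if DDPG is unavailable."""
    module = load_ddpg_module()
    if module is None:
        return None
    if hasattr(module, "DDPGConfig"):
        # Newer style (assumed): a config object replaces DEFAULT_CONFIG.
        return module.DDPGConfig().to_dict()
    if hasattr(module, "DEFAULT_CONFIG"):
        # Older style: a module-level dictionary of defaults.
        return dict(module.DEFAULT_CONFIG)
    return None


config = default_ddpg_config()
print("DDPG defaults found" if config is not None else "DDPG not available")
```

If neither import path works, pinning Ray to the version the notebook was written against may be the simpler route.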

Input tensor

#image-artistic-style-similarity

After the "# compute styles" step, I am facing an issue when computing "style":

InvalidArgumentError: Must provide as many biases as the last dimension of the input tensor: [3] vs. [1,400,400,4] [Op:BiasAdd] name: style_model_7/BiasAdd/.

I found a possible reason. Initially, if I run this:

# compute styles
image_style_embeddings = {}
for image_path in tqdm(image_paths):
    image_tensor = load_image(image_path)
    style = style_to_vec(image_to_style(image_tensor))
    image_style_embeddings[ntpath.basename(image_path)] = style

Then, for some reason, the shape is [1,400,400,4] while the tensor is empty, but once the tensor starts to fill, the shape changes to [1,400,400,3].

But if I run this:

# compute styles
image_style_embeddings = {}
for image_path in tqdm(image_paths):
    image_tensor = load_image(image_path)
    if image_tensor.shape[3] != 4:
        style = style_to_vec(image_to_style(image_tensor))
        image_style_embeddings[ntpath.basename(image_path)] = style

It works fine. Is it OK to do this?

Thanks in advance!
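
A last dimension of 4 usually indicates an image with an alpha channel (RGBA), although that is an assumption about this dataset. Skipping such images works but silently drops them from the embeddings; an alternative is to keep them and discard the alpha channel. The sketch below illustrates this with NumPy arrays; drop_alpha is a hypothetical helper, and the same slicing syntax is also accepted by TensorFlow tensors.

```python
import numpy as np


def drop_alpha(image_tensor):
    # Hypothetical helper: keep only the first three channels (RGB)
    # when the last dimension has four channels (assumed RGBA).
    if image_tensor.shape[-1] == 4:
        return image_tensor[..., :3]
    return image_tensor


rgba = np.zeros((1, 400, 400, 4), dtype=np.float32)  # made-up RGBA batch
rgb = np.zeros((1, 400, 400, 3), dtype=np.float32)   # made-up RGB batch

print(drop_alpha(rgba).shape, drop_alpha(rgb).shape)
```

In the loop above, this would mean calling drop_alpha(load_image(image_path)) before image_to_style, so that every image contributes an embedding.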
