nvidia-merlin / models

Merlin Models is a collection of deep learning recommender system model reference implementations

Home Page: https://nvidia-merlin.github.io/models/main/index.html

License: Apache License 2.0

Python 99.69% Makefile 0.18% Shell 0.13%

deep-learning machine-learning pytorch recommendation-system recommender-system tensorflow dask gpu rapidsai recsys

models's Issues

CPU support (GPU Accelerated) - GPU Dataloaders to accelerate training recsys models

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-471

Will leverage dataset schema to create models without boilerplate of defining per feature layers

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-474

Polish and add tests for DCN-V2

Currently blocked by:

Retrieval Models

Support common retrieval models:

Two Tower
Matrix Factorization
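Both models listed above score a user-item pair by the similarity of two vectors. In matrix factorization the vectors are rows of learned embedding tables; in a two-tower model they are the outputs of a user network and an item network. A minimal sketch of the shared retrieval step, assuming dot-product scoring and using hypothetical hand-picked vectors (`retrieve_top_k` and the item ids are illustrative names, not part of Merlin Models):

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(user_vector, item_vectors, k):
    """Return the k item ids with the highest dot-product score, best first."""
    scored = ((dot(user_vector, vec), item_id) for item_id, vec in item_vectors.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Hypothetical embeddings; a real model would learn these during training.
item_vectors = {
    "item_a": [1.0, 0.0],
    "item_b": [0.0, 1.0],
    "item_c": [0.7, 0.7],
}
user_vector = [0.9, 0.1]

top_items = retrieve_top_k(user_vector, item_vectors, k=2)
```

Scoring every item this way is only practical for small catalogs; larger ones typically rely on an approximate nearest-neighbor index instead of an exhaustive scan.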

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-489

Matrix Factorization

Aha! Link: https://nvaiinfra.aha.io/requirements/MERLIN-489-2

Migrate tensorflow/pytorch inference export from nvtabular

Dataloaders

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-496

Update model creation code in NVT example notebooks to use these models

[FEA] Validate Models on large datasets with extra columns

The issue is: how can we validate models on large datasets and break the results down by additional columns?

In our examples, we predict each batch and store the predictions in a separate tensor, along with a copy of the targets. After iterating over all batches, we concatenate them and calculate the performance.

What happens when the dataset is so large that the predictions and targets do not fit in memory?
What happens if we want to calculate the performance of the models per category (e.g. only new users)? These categories may not be used during training (not part of X).
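One possible direction, sketched here under assumptions rather than taken from the library: for metrics that decompose into sums (accuracy, log loss, squared error), keep running counters per batch instead of storing all predictions, and key the counters by the extra column. The class name `StreamingAccuracy`, the `"__all__"` key, and the toy batches are hypothetical.

```python
from collections import defaultdict


class StreamingAccuracy:
    """Running binary accuracy, overall and per group, without storing predictions."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.correct = defaultdict(int)
        self.total = defaultdict(int)

    def update(self, predictions, targets, groups):
        """Fold one batch into the counters; memory use is O(number of groups)."""
        for pred, target, group in zip(predictions, targets, groups):
            hit = int((pred >= self.threshold) == bool(target))
            for key in ("__all__", group):
                self.correct[key] += hit
                self.total[key] += 1

    def result(self):
        return {key: self.correct[key] / self.total[key] for key in self.total}


# Hypothetical batches: (predictions, targets, extra column that is not a model input).
batches = [
    ([0.9, 0.2, 0.6], [1, 0, 0], ["new", "returning", "new"]),
    ([0.1, 0.8], [0, 1], ["returning", "new"]),
]

metric = StreamingAccuracy()
for preds, targets, user_type in batches:
    metric.update(preds, targets, user_type)

report = metric.result()
```

Rank-based metrics such as AUC do not decompose this way; they would need an approximation (for example, bucketed score histograms) or an out-of-core sort.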

Triton Inference Server integration / Workflow config / Feature eng. + DL ranking framework

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-482

Add functionality to EmbeddingFeatures to make exporting vectors to parquet easier

low-level/mid-level API. Components to build custom models

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-475

Add schemas fixtures for unit tests

Feature 1

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-463

[Examples] Ranking models notebook

DLRM
DCN-v2

[Examples] - Create schema for MovieLens example

POC Feature 1

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-463

f1-requirement 1

Aha! Link: https://nvaiinfra.aha.io/requirements/MERLIN-463-1

Refactor & add tests to TabularFeatures Block to be functional

Create common recsys models (ranking/retrieval) - Single GPU and Multi-GPU

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-476

f2-requirement 2

Aha! Link: https://nvaiinfra.aha.io/requirements/MERLIN-463-2

[RMP] Ranking - TF : Wide and Deep, DLRM, DeepFM

The initial version should support some common ranking models like

DLRM
Wide and Deep
Factorization Machines

DeepFM - Factorization Machine - https://arxiv.org/abs/1703.04247

Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or require expertise feature engineering. In this paper, we show that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide & Deep model from Google, DeepFM has a shared input to its "wide" and "deep" parts, with no need of feature engineering besides raw features. Comprehensive experiments are conducted to demonstrate the effectiveness and efficiency of DeepFM over the existing models for CTR prediction, on both benchmark data and commercial data.

Aha! Link: https://nvaiinfra.aha.io/features/MERLIN-727
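The "shared input" in the DeepFM abstract means that the field embeddings fed to the deep network are the same vectors the factorization machine uses for its pairwise interactions. A minimal sketch of that second-order FM term only, assuming one active feature per field with value 1 and omitting the first-order term and the deep part; the function names and sample embeddings are hypothetical:

```python
from itertools import combinations


def fm_pairwise_naive(embeddings):
    """Sum of dot products over all pairs of field embeddings: O(n^2 * k)."""
    return sum(
        sum(a * b for a, b in zip(vi, vj))
        for vi, vj in combinations(embeddings, 2)
    )


def fm_pairwise_fast(embeddings):
    """Same value via 0.5 * sum_k((sum_i v_ik)^2 - sum_i v_ik^2): O(n * k)."""
    total = 0.0
    for column in zip(*embeddings):  # iterate over embedding dimensions
        column_sum = sum(column)
        total += column_sum * column_sum - sum(v * v for v in column)
    return 0.5 * total


# Hypothetical embeddings for three categorical fields, embedding size k = 2.
field_embeddings = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]

naive = fm_pairwise_naive(field_embeddings)
fast = fm_pairwise_fast(field_embeddings)
```

The linear-time form is the reason the FM component stays cheap as the number of fields grows.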

from merlin_models.tensorflow.models.retrieval import YouTubeDNN


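# Pass one activation per hidden layer: the constructor raises a ValueError
# when "hidden_dims" and "activations" differ in length.
YouTubeDNN(continuous_columns=[],
           categorical_columns=[],
           embedding_dims=[],
           hidden_dims=[128, 128],
           activations=["relu", "relu"])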

Solution:

import tensorflow as tf

from merlin_models.tensorflow.layers import DenseFeatures


class YouTubeDNN(tf.keras.Model):
    """
    https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf
    See model architecture diagram in Figure 3
    """

    def __init__(self, continuous_columns, categorical_columns, embedding_dims, hidden_dims=None, activations=None):

        super().__init__()

        # Treat missing arguments as "no hidden layers".
        hidden_dims = hidden_dims or []
        activations = activations or []

        # Each hidden layer needs exactly one activation entry.
        if len(hidden_dims) != len(activations):
            raise ValueError('"hidden_dims" and "activations" must be the same length.')

        channels = self.channels(continuous_columns, categorical_columns, embedding_dims)

        self.input_layer = DenseFeatures(channels["categorical"] + channels["continuous"])

        self.hidden_layers = []
        for dim, activation in zip(hidden_dims, activations):
            self.hidden_layers.append(
                tf.keras.layers.Dense(
                    dim,
                    activation=activation,
                    activity_regularizer="l2",
                )
            )

    def channels(self, continuous_columns, categorical_columns, embedding_dims):
        if not isinstance(embedding_dims, dict):
            embedding_dims = {col.name: embedding_dims for col in categorical_columns}

        embedding_columns = [
            tf.feature_column.embedding_column(col, embedding_dims[col.name])
            for col in categorical_columns
        ]

        return {"categorical": embedding_columns, "continuous": continuous_columns}

    def call(self, inputs, training=False):
        x = self.input_layer(inputs)
        for layer in self.hidden_layers:
            x = layer(x)
        return x

Commit to solve: 1c7bc3d

[ ] MF
[ ] Two-tower
