This is a study note for

Metric | Recommendation focus | Example industries |
---|---|---|
novelty | different items | News, Video, Games |
personalization | similar items | Music, Search |
diversity (novelty + personalization) | different + similar items | E-commerce, Apps |

Pathway | Output contents | Matching algorithm | Problem addressed |
---|---|---|---|
u2i | personalized items | query from log database | Gray Sheep |
i2i | similar items | embedding(item2vec/DSSM) + KNN_TopN | Synonymy |
u2u | similar users | embedding(user2vec/DSSM) + KNN_TopN | cold start |
u2u2i | similar users, personalized items | embedding(ALS_MF/NCF/DSSM) + KNN_TopN | cold start, Gray Sheep, Data Sparsity |
u2i2i | personalized items, similar items | embedding(ALS_MF/NCF/DSSM) + KNN_TopN | Gray Sheep, Synonymy, Data Sparsity |
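
The "embedding + KNN_TopN" matching used by the i2i and u2i2i pathways above can be sketched as follows. This is a minimal illustration in plain Python, assuming cosine similarity as the distance measure and brute-force search; the item ids, the vectors, and the helper names `cosine`, `knn_top_n`, and `u2i2i` are hypothetical, and a production system would more likely use vectors from item2vec or DSSM with an approximate nearest-neighbor index.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_top_n(query_id, embeddings, n=2):
    """i2i: return the n ids most similar to query_id, excluding itself."""
    query = embeddings[query_id]
    scored = (
        (cosine(query, vec), key)
        for key, vec in embeddings.items()
        if key != query_id
    )
    return [key for _, key in heapq.nlargest(n, scored)]


def u2i2i(user_id, interaction_log, embeddings, n=2):
    """u2i2i: take the user's logged items (u2i), then expand each via i2i."""
    seen = interaction_log.get(user_id, [])
    candidates = []
    for item in seen:
        for neighbor in knn_top_n(item, embeddings, n):
            if neighbor not in seen and neighbor not in candidates:
                candidates.append(neighbor)
    return candidates


# Toy embeddings with made-up values, standing in for item2vec/DSSM output.
item_vecs = {
    "item_a": [1.0, 0.0],
    "item_b": [0.9, 0.1],
    "item_c": [0.0, 1.0],
    "item_d": [0.7, 0.7],
}
# Toy interaction log, standing in for the log database queried by u2i.
log = {"user_1": ["item_a"]}

similar_to_a = knn_top_n("item_a", item_vecs, n=2)
recs_for_user_1 = u2i2i("user_1", log, item_vecs, n=2)
```

The same `knn_top_n` helper would cover u2u if it were given user embeddings instead of item embeddings.
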

Pathway | Output contents | Matching algorithm | Problem addressed |
---|---|---|---|
u2tag | same tag | query from profile database | Gray Sheep |
tag2tag | similar tag | embedding(tag2vec) + KNN_TopN | Novelty, Data Sparsity |
tag2i | similar items | query from content database | cold start |
u2tag2tag2i | similar tag, similar items | embedding(ALS_MF/NCF/DSSM) + KNN_TopN | cold start, Gray Sheep, novelty, Data Sparsity |
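
The chained tag pathway u2tag2tag2i can be read as three lookups applied in sequence. The sketch below assumes small in-memory dictionaries in place of the profile database, the tag2vec neighbor lists, and the content database; all names and data are hypothetical. It also assumes that only the similar tags, not the user's own tags, are used to fetch items, which is one possible reading of the table.

```python
# Hypothetical stand-ins for the three data sources in the table.
user_tags = {"user_1": ["jazz"]}                       # u2tag: profile database
similar_tags = {"jazz": ["blues", "swing"]}            # tag2tag: tag2vec + KNN_TopN
tag_items = {                                          # tag2i: content database
    "jazz": ["item_1"],
    "blues": ["item_2", "item_3"],
    "swing": ["item_3", "item_4"],
}


def u2tag2tag2i(user_id, top_n=3):
    """Chain u2tag -> tag2tag -> tag2i and return up to top_n distinct items."""
    own_tags = user_tags.get(user_id, [])

    # tag2tag: expand each of the user's tags to its similar tags.
    expanded = []
    for tag in own_tags:
        for neighbor in similar_tags.get(tag, []):
            if neighbor not in own_tags and neighbor not in expanded:
                expanded.append(neighbor)

    # tag2i: collect the items attached to the expanded tags, keeping order.
    items = []
    for tag in expanded:
        for item in tag_items.get(tag, []):
            if item not in items:
                items.append(item)
    return items[:top_n]


recs = u2tag2tag2i("user_1")
```

Because the user's own tag is skipped, the result contains items the user's profile would not reach directly, which is how this pathway targets novelty.
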

Hybridization technique | Description |
---|---|
ordering | given a priority order of strategies, present only the results of the best-ranked strategy |
average | \(S_i = \frac{\sum_{s} S_{s,i}}{\#s}\), where \(S_{s,i}\) is the score that matching component \(s\) gives item \(i\) and \(\#s\) is the number of components |
weighted average | given a weight \(W_s\) for each matching component, \(S_i = \frac{\sum_{s} S_{s,i} \times W_s}{\sum_s W_s}\) |
dynamic weighting | compute a KPI, such as click-through rate (CTR) or average transaction value (ATV), for each matching component in real time, update the weights accordingly, then apply the weighted average |
algorithm-based weighting | use a model (LR, FM, Embedding+MLP, AFM, IAFM, Wide&Deep, FNN, NFM, DeepFM, DCN, xDeepFM, PNN, OENN, OANN, FGCNN, FiBiNET) to predict click-through rate and assign weights, then apply the weighted average |
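
The weighted average and dynamic weighting rows can be illustrated with a short sketch. It assumes that each matching component returns a score per item, that an item missing from a component's output contributes a score of 0, and that the dynamic weight is simply the component's observed CTR; the function names and all numbers are hypothetical.

```python
def weighted_average(scores_by_strategy, weights):
    """Blend per-item scores: S_i = sum_s(S_{s,i} * W_s) / sum_s(W_s).

    scores_by_strategy: {strategy: {item: score}}
    weights:            {strategy: weight}
    """
    total_weight = sum(weights[s] for s in scores_by_strategy)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")

    items = set()
    for scores in scores_by_strategy.values():
        items.update(scores)

    blended = {}
    for item in items:
        # Assumption: a strategy that did not return the item scores it 0.
        weighted_sum = sum(
            weights[s] * scores.get(item, 0.0)
            for s, scores in scores_by_strategy.items()
        )
        blended[item] = weighted_sum / total_weight
    return blended


def ctr_weights(clicks, impressions):
    """Dynamic weighting: use each strategy's observed CTR as its weight."""
    return {
        s: (clicks.get(s, 0) / impressions[s]) if impressions[s] else 0.0
        for s in impressions
    }


# Made-up scores from two matching components.
scores = {
    "i2i": {"item_a": 0.8, "item_b": 0.4},
    "u2u2i": {"item_a": 0.2, "item_c": 1.0},
}
# Made-up click and impression counts per component.
weights = ctr_weights(
    clicks={"i2i": 30, "u2u2i": 10},
    impressions={"i2i": 1000, "u2u2i": 1000},
)
blended = weighted_average(scores, weights)
ranking = sorted(blended, key=blended.get, reverse=True)
```

Passing equal weights to `weighted_average` reduces it to the plain average row of the table.
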

Model | Advantages | Disadvantages | Data |
---|---|---|---|
traditional Matrix Factorization | latent semantic indexing | no feature crossing for either user or item features; high memory complexity | x: user-item interaction matrix; y: user latent vector, item latent vector |
YouTube's user-embedding DNN | crosses user features | does not cross item features | x: user behavior; y: user-item interaction matrix |
NCF, Neural Collaborative Filtering | embeds user and item variables separately to enable feature crossing | the MLP applied after the embeddings to cross user and item features does not help fit the user-item interaction matrix | x: user features, item features; y: user-item interaction matrix |
NeuMF (NMF), Neural Matrix Factorization | crosses user and item features | crossing user and item features increases computation latency during serving | x: user features, item features; y: user-item interaction matrix |
DSSM, Deep Semantic Similarity Model | improves on NCF by removing the MLP after the embeddings: 1) embeds user and item variables separately to enable feature crossing; 2) during serving, keeping the user and item embeddings separate reduces computation cost because item embeddings can be precomputed | asynchrony problem when the model versions differ due to high-frequency incremental learning | x: user features, item features; y: user-item interaction matrix |
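
The serving pattern described in the DSSM row, with item embeddings computed ahead of time and only the user embedding computed per request, might look like the sketch below. It is an illustration under assumptions: the `ItemIndex` class, the version strings, and the vectors are hypothetical, the dot product stands in for whatever similarity the trained model uses, and the version check is one possible guard against the asynchrony problem rather than a method taken from the note.

```python
from dataclasses import dataclass


@dataclass
class ItemIndex:
    """Precomputed item embeddings, tagged with the model version that produced them."""
    model_version: str
    vectors: dict  # item_id -> item embedding (list of floats)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def serve_top_n(user_vec, user_model_version, index, n=2):
    """Score every precomputed item against the user embedding and return the top n."""
    # Hypothetical guard: embeddings from different model versions live in
    # different vector spaces, so their dot products are not meaningful.
    if user_model_version != index.model_version:
        raise ValueError(
            f"user tower {user_model_version} != item index {index.model_version}"
        )
    ranked = sorted(
        index.vectors,
        key=lambda item_id: dot(user_vec, index.vectors[item_id]),
        reverse=True,
    )
    return ranked[:n]


# Made-up item embeddings, computed offline by the item tower of model "v2".
index = ItemIndex(
    model_version="v2",
    vectors={
        "item_a": [1.0, 0.0],
        "item_b": [0.0, 1.0],
        "item_c": [0.6, 0.6],
    },
)

# Made-up user embedding, computed at request time by the user tower of "v2".
top_items = serve_top_n([0.9, 0.2], "v2", index, n=2)
```

Only the user tower runs at request time; the item side is a lookup, which is where the serving cost reduction comes from.
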