# Problem background

## Research background

We live in an era of information explosion. From reading books and newspapers to watching live streams, scrolling Weibo, and watching short videos, the volume of information keeps growing, and it is presented in more and more forms. We enjoy the satisfaction of easy access to information, but information overload can also make us anxious. As an effective means of addressing information overload, **recommendation systems** are now widely used in e-commerce, movies and music, social networking, and other fields.

## Research significance

Why research on poisoning attacks against recommendation systems is necessary:

- On the one hand, it tests the robustness of recommendation algorithms.
- On the other hand, it "promotes defense through attack", providing data security guidance for recommendation algorithms.

## What is a recommendation system

A recommendation system predicts possible future connections (for example, between users and items) based on existing connections.

## Attack goals and methods against recommendation systems

Goal: the attacker promotes or suppresses certain items for their own ends.

Method: data poisoning attack (this course is about data poisoning attacks, so only this kind is covered). The attacker creates fake user data in bulk and poses as users who share the interests of the victim user. The recommendation results are then swayed by the sheer number of these colluding fake accounts.
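To make the idea of "posing as users who share the victim's interests" concrete, here is a minimal sketch against a user-based nearest-neighbor recommender. Everything in it is an illustrative assumption rather than the article's own setup: the toy matrix `R`, the helper names `cosine_sim` and `predict`, the choice of cosine similarity with `k=2` neighbors, and the premise that the attacker already knows the victim's ratings.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity over the items both users have rated (NaN = unrated)."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if not both.any():
        return 0.0
    x, y = a[both], b[both]
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def predict(R, user, item, k=2):
    """Similarity-weighted mean rating of `item` over the k most similar raters."""
    neighbours = [
        (cosine_sim(R[user], R[v]), R[v, item])
        for v in range(R.shape[0])
        if v != user and not np.isnan(R[v, item])
    ]
    top = sorted(neighbours, reverse=True)[:k]
    weight = sum(s for s, _ in top)
    return sum(s * r for s, r in top) / weight if weight > 0 else float("nan")

# Hypothetical ratings: 4 users x 4 items, scale 1..5, NaN = not rated.
R = np.array([
    [5, 4, 1, np.nan],  # victim: has not rated item 3 yet
    [5, 5, 1, 1],
    [4, 4, 2, 2],
    [1, 2, 5, 5],
])
victim, target = 0, 3
before = predict(R, victim, target)

# Each fake user copies the victim's known ratings and gives the target top marks.
fake = R[victim].copy()
fake[target] = 5.0
R_poisoned = np.vstack([R, np.tile(fake, (2, 1))])
after = predict(R_poisoned, victim, target)

print(f"predicted rating before: {before:.2f}, after: {after:.2f}")
```

In this toy setting the two fake users become the victim's nearest neighbors, so the predicted rating for the target item is driven entirely by their forged scores. Real systems use more neighbors and more data, so an attacker would need correspondingly more fake profiles.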

## Data poisoning / fraud attacks in recommendation systems

The rating matrix of the recommendation system is as follows:

Suppose the attacker has identified these targets. Attack method: forge users similar to them in order to achieve the attack goal.

The specific idea is as follows:

## Classification of methods

The algorithms used in recommendation systems can be divided into collaborative filtering, content-based recommendation, hybrid recommendation, and so on. Each class can be further subdivided according to its characteristics. For example, collaborative filtering can be divided into:

- Based on nearest neighbor
- Based on association rules
- Based on decision tree
- Based on Naive Bayes
- Based on matrix factorization (hidden factor model)
- ...

Corresponding attack models can be devised according to the characteristics of each recommendation algorithm.

# Technical examples

## Precision strike

The most basic attack approach uses hand-crafted rules to select filler items in different ways and to design rating vectors for the fake users. Three common attacks:

- Random attack
- Average attack
- Popular attack (bandwagon attack)
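The three attacks differ mainly in how filler items are rated. The sketch below follows the definitions commonly given for these attacks, which the article itself does not spell out, so treat the details as assumptions: random attack rates fillers around the global mean, average attack rates them around each item's mean, and bandwagon attack additionally gives top marks to the most frequently rated items. The function name `make_fake_profiles`, its parameters, and the toy matrix are hypothetical. In every case the target item gets the maximum rating (a "push" attack).

```python
import numpy as np

def make_fake_profiles(R, target, n_fake, n_filler, kind,
                       r_min=1.0, r_max=5.0, n_popular=2, seed=0):
    """Build fake user rows for a push attack; NaN marks an unrated item."""
    if kind not in ("random", "average", "bandwagon"):
        raise ValueError(f"unknown attack kind: {kind}")
    rng = np.random.default_rng(seed)
    n_items = R.shape[1]
    counts = (~np.isnan(R)).sum(axis=0)          # how often each item was rated
    global_mean = np.nanmean(R)
    global_std = np.nanstd(R)
    # Per-item mean, falling back to the global mean for never-rated items.
    item_mean = np.where(counts > 0,
                         np.nansum(R, axis=0) / np.maximum(counts, 1),
                         global_mean)
    others = np.array([j for j in range(n_items) if j != target])
    popular = others[np.argsort(-counts[others], kind="stable")[:n_popular]]

    fake = np.full((n_fake, n_items), np.nan)
    for f in range(n_fake):
        fillers = rng.choice(others, size=n_filler, replace=False)
        if kind == "average":
            scores = rng.normal(item_mean[fillers], global_std)
        else:  # random and bandwagon: fillers drawn around the global mean
            scores = rng.normal(global_mean, global_std, size=n_filler)
        fake[f, fillers] = np.clip(np.rint(scores), r_min, r_max)
        if kind == "bandwagon":
            fake[f, popular] = r_max             # ride on popular items
        fake[f, target] = r_max                  # push the target item
    return fake

# Hypothetical ratings: 4 users x 5 items, scale 1..5.
R = np.array([
    [5, 3, np.nan, 1, np.nan],
    [4, np.nan, np.nan, 1, 2],
    [1, 1, np.nan, 5, 4],
    [np.nan, 2, 1, 4, np.nan],
])
fake = make_fake_profiles(R, target=2, n_fake=3, n_filler=2, kind="bandwagon")
R_poisoned = np.vstack([R, fake])
print(R_poisoned.shape)
```

None of these profiles uses any knowledge of the recommender's algorithm; they only need aggregate statistics of the rating data.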

The methods above generally do not require knowledge of the recommendation system's specific algorithm and apply to almost all recommendation scenarios. The advantage of such general-purpose methods is that they are easy to implement, but they fall short when it comes to "precision strikes". One example of a "precision strike": **data poisoning attacks against the hidden factor model**

## Hidden factor model

The name is also translated as "latent factor model", which is the more common English term. It is considered one of the most advanced approaches in recommendation systems. It uses well-known dimensionality reduction methods to fill in the missing entries.

- Dimensionality reduction is widely used in other areas of data analysis to obtain a low-dimensional representation of the original data.

### Intuition

The rating matrix R is decomposed as R ≈ UV^T, where U is an m × k matrix and V is an n × k matrix; m is the number of users and n is the number of rated items.

**For example:**

**Note:** In the example above, matrix R is fully known, so the decomposition is not particularly helpful. In a real recommendation system, R is not completely known. If the hidden factors U and V can still be estimated, the method becomes very useful.
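As a minimal sketch of the fully known case, the NumPy snippet below builds a small rank-2 rating matrix and recovers an exact factorization with a truncated SVD. The matrices `U_true` and `V_true` and their values are made up for illustration, and SVD is just one convenient way to obtain such a factorization; the article does not say which method it has in mind.

```python
import numpy as np

# Hypothetical ground truth: 4 users, 3 items, k = 2 hidden factors.
U_true = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0],
                   [2.0, 1.0]])        # m x k
V_true = np.array([[5.0, 1.0],
                   [4.0, 1.0],
                   [1.0, 5.0]])        # n x k
R = U_true @ V_true.T                  # m x n, fully observed, rank 2

# Truncated SVD: keep the k largest singular values and split them evenly.
k = 2
P, s, Qt = np.linalg.svd(R, full_matrices=False)
U = P[:, :k] * np.sqrt(s[:k])          # m x k
V = Qt[:k, :].T * np.sqrt(s[:k])       # n x k

print(U.shape, V.shape)
print(np.allclose(U @ V.T, R))         # exact because R has rank k
```

The recovered `U` and `V` generally differ from `U_true` and `V_true`: the factorization is only unique up to an invertible k × k transformation, and only the product UV^T is pinned down.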

## The optimization problem to be solved

Once the matrices U and V have been estimated, the whole rating matrix can be estimated in one shot as UV^T, which yields all the missing ratings.
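The article's own objective function is not preserved here, so the following is a sketch under a standard assumption: minimize the squared error over the observed entries only, plus an L2 penalty `lam` on U and V, solved with alternating least squares (ALS). The function name `factorize`, the hyperparameters, and the toy matrix are hypothetical.

```python
import numpy as np

def factorize(R, k=2, lam=0.1, n_iters=50, seed=0):
    """Estimate U (m x k) and V (n x k) from the observed entries of R.

    Minimizes sum over observed (i, j) of (R[i, j] - U[i] . V[j])**2
    plus lam * (||U||^2 + ||V||^2). NaN marks a missing rating.
    """
    rng = np.random.default_rng(seed)
    m, n = R.shape
    observed = ~np.isnan(R)
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    reg = lam * np.eye(k)
    for _ in range(n_iters):
        # Fix V, solve a small ridge regression for each user row.
        for i in range(m):
            cols = observed[i]
            if cols.any():
                Vi = V[cols]
                U[i] = np.linalg.solve(Vi.T @ Vi + reg, Vi.T @ R[i, cols])
        # Fix U, solve a small ridge regression for each item row.
        for j in range(n):
            rows = observed[:, j]
            if rows.any():
                Uj = U[rows]
                V[j] = np.linalg.solve(Uj.T @ Uj + reg, Uj.T @ R[rows, j])
    return U, V

# Hypothetical incomplete ratings: 4 users x 5 items, scale 1..5.
R = np.array([
    [5, 3, np.nan, 1, np.nan],
    [4, np.nan, np.nan, 1, 2],
    [1, 1, np.nan, 5, 4],
    [np.nan, 2, 1, 4, np.nan],
])
U, V = factorize(R)
R_hat = U @ V.T                        # every entry is now filled in
observed = ~np.isnan(R)
rmse = np.sqrt(np.mean((R_hat[observed] - R[observed]) ** 2))
print(np.round(R_hat, 2))
print("RMSE on observed entries:", round(float(rmse), 3))
```

Because the estimate depends only on the observed ratings, anyone who can add rows to `R` can shift `V`, and with it the predictions for every real user. That dependence is what poisoning attacks on this model exploit.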

## Poisoning attack against the hidden factor model: untargeted

To get the target item recommended, the attacker can forge entries of the rating matrix.

## Poisoning attack against the hidden factor model: targeted

This attack likewise works by forging the rating matrix.

## Poisoning attack against the hidden factor model: hybrid attack

(To be continued)

# Summary

Copyright notice

This article was written by [sec0nd]. Please include a link to the original when reposting. Thank you.

https://cdmana.com/2022/134/202205141312285103.html