Background
We live in an era of information explosion. From reading books and newspapers to watching live streams, scrolling through microblogs, and watching short videos, the volume of information keeps growing and it is presented in more and more forms. We enjoy the satisfaction of easy access to information, but information overload can also make us anxious. As an effective way to deal with information overload, recommendation systems are now widely used in e-commerce, movies and music, social networking, and other fields.
Why poisoning attacks on recommendation systems are worth studying:
- On the one hand, they test the robustness of recommendation algorithms.
- On the other hand, they "promote defense through attack": studying attacks yields data security suggestions for recommendation algorithms.
What is a recommendation system
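A recommendation system predicts connections that are likely to form in the future (for example, between users and items) based on the connections that already exist.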
Goals and methods of attacks on recommendation systems
Goal: the attacker promotes or suppresses certain items for some purpose of their own.
Method: data poisoning attacks (this course is about data poisoning, so only this kind of attack is covered). The attacker creates fake user data in bulk and makes the fake users pose as people who share the interests of the users being deceived. The recommendation results are then swayed by the combined effect of this large number of supporting and accompanying fake accounts.
Data poisoning / shilling attacks in recommendation systems
The scoring matrix of the recommendation system is as follows:
Suppose these targets have been identified. Attack: forge users similar to the genuine ones in order to achieve the purpose of the attack.
The specific approach is as follows:
Recommendation algorithms can be divided into collaborative filtering, content-based recommendation, hybrid recommendation, and so on. Each kind can be further subdivided according to its characteristics. For example, collaborative filtering can be divided into:
- Nearest-neighbor based
- Association-rule based
- Decision-tree based
- Naive Bayes based
- Matrix-factorization based (hidden factor model)
A corresponding attack model can be designed around the characteristics of each recommendation algorithm.
The most basic attack method uses hand-crafted rules to select filler items in different ways and to design the scoring vectors of the forged users. Three common examples:
- Random attack
- Average attack
- Popular attack (also called bandwagon attack)
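The three rule-based attacks above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function name `make_fake_profiles`, its parameters, the 1 to 5 rating scale, and the convention that 0 means "unrated" are all assumptions made for the example. The average attack here reuses the global standard deviation for its noise, and the popular attack only rates the most frequently rated items, both of which are simplifications.

```python
import numpy as np

def make_fake_profiles(R, target_item, n_fake, n_filler, strategy="random",
                       r_min=1.0, r_max=5.0, seed=0):
    """Build fake user rating vectors (0 = unrated) that push `target_item`.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n_items = R.shape[1]
    rated = R > 0
    global_mean = R[rated].mean()
    global_std = R[rated].std()
    # Per-item mean over observed ratings; fall back to the global mean.
    counts = rated.sum(axis=0)
    item_mean = np.where(counts > 0, R.sum(axis=0) / np.maximum(counts, 1),
                         global_mean)

    candidates = np.array([j for j in range(n_items) if j != target_item])
    profiles = np.zeros((n_fake, n_items))
    for f in range(n_fake):
        if strategy == "popular":
            # Fillers are the most frequently rated items, given the top score.
            order = candidates[np.argsort(-counts[candidates], kind="stable")]
            fillers = order[:n_filler]
            profiles[f, fillers] = r_max
        else:
            fillers = rng.choice(candidates, size=n_filler, replace=False)
            if strategy == "random":
                centre = np.full(n_filler, global_mean)  # around global mean
            elif strategy == "average":
                centre = item_mean[fillers]              # around each item's mean
            else:
                raise ValueError(f"unknown strategy: {strategy}")
            noise = rng.normal(0.0, global_std, size=n_filler)
            profiles[f, fillers] = np.clip(np.rint(centre + noise), r_min, r_max)
        # Every fake user gives the target item the highest score.
        profiles[f, target_item] = r_max
    return profiles

# Toy scoring matrix: rows are users, columns are items, 0 means unrated.
R = np.array([[5, 3, 0, 1, 0],
              [4, 0, 0, 1, 2],
              [1, 1, 0, 5, 0],
              [0, 1, 5, 4, 0]], dtype=float)
fake = make_fake_profiles(R, target_item=2, n_fake=3, n_filler=2,
                          strategy="average")
print(fake)
```

To suppress an item instead of promoting it, the same profiles could give the target `r_min` rather than `r_max`.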
The methods above need almost no knowledge of the specific algorithm behind the recommendation system and apply to nearly all recommendation scenarios. The advantage of such general methods is that they are easy to implement, but they fall short when it comes to "precision strikes". One example of a "precision strike" is a data poisoning attack against the hidden factor model.
Hidden factor model
The term is also translated as "latent factor model". It is considered one of the most advanced methods in recommendation systems, and it uses well-known dimensionality reduction methods to fill in the missing entries.
- Dimensionality reduction is often used in other areas of data analysis to obtain a low-dimensional representation of the original data.
The scoring matrix R is decomposed as R ≈ UV^T, where U is an m × k matrix and V is an n × k matrix; m is the number of users and n is the number of scored items.
Note: in an example where R is completely known, the decomposition is not particularly helpful. In a real recommendation system R is only partially known, and the method becomes very useful if the hidden factors U and V can still be estimated.
The optimization problem to be solved
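The formula itself is not reproduced here, so the following is only the commonly used regularized form of the matrix factorization objective, given as a plausible reading. The symbols Ω (the set of observed user-item pairs), u_i and v_j (row i of U and row j of V), and λ (the regularization weight) are assumptions introduced for this sketch.

```latex
\min_{U,\,V}\;
\sum_{(i,j)\in\Omega} \left( R_{ij} - u_i^{\top} v_j \right)^2
\;+\; \lambda \left( \lVert U \rVert_F^2 + \lVert V \rVert_F^2 \right)
```

Only the observed entries of R contribute to the error term, which is what allows U and V to be estimated when R is incomplete.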
Once the matrices U and V have been estimated, the whole scoring matrix can be estimated in one step as UV^T, which yields all the missing scores.
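As a rough illustration of this completion step, the sketch below fits U and V to the observed entries with plain full-batch gradient descent and then fills in the whole matrix as UV^T. The function name `factorize`, the hyperparameters (`k`, `lam`, `lr`, `n_iters`), and the toy matrix are assumptions chosen for the example; real systems typically use stochastic gradient descent or alternating least squares on far larger data.

```python
import numpy as np

def factorize(R, mask, k=2, lam=0.1, lr=0.01, n_iters=5000, seed=0):
    """Estimate U (m x k) and V (n x k) from the observed entries of R.

    Hypothetical helper: `mask` is True where a score is observed.
    """
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((m, k))
    V = 0.1 * rng.standard_normal((n, k))
    for _ in range(n_iters):
        # Prediction error, counted on observed entries only.
        E = mask * (R - U @ V.T)
        # Simultaneous gradient step on squared error plus L2 penalty
        # (constant factors are absorbed into the learning rate).
        U_next = U + lr * (E @ V - lam * U)
        V = V + lr * (E.T @ U - lam * V)
        U = U_next
    return U, V

# Toy scoring matrix: rows are users, columns are items, 0 means missing.
R = np.array([[5, 3, 0, 1, 0],
              [4, 0, 0, 1, 2],
              [1, 1, 0, 5, 0],
              [0, 1, 5, 4, 0],
              [5, 4, 1, 0, 1]], dtype=float)
mask = R > 0
U, V = factorize(R, mask)
R_hat = U @ V.T            # every entry, including the missing ones
print(np.round(R_hat, 2))
```

The entries of `R_hat` at positions where `mask` is False are the predicted scores that the recommendation system would use for ranking.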
Poisoning attack against the hidden factor model: non-targeted
To get the target item recommended, the attacker can forge the scoring matrix.
Poisoning attack against the hidden factor model: targeted
Here the attacker also forges the scoring matrix.
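One way to picture "forging the scoring matrix" is to append fake user rows to R, refit the factor model, and compare the predicted score of the target item for genuine users before and after. The sketch below does this with a small alternating least squares routine. Everything in it is an assumption made for illustration: the helper names (`als`, `forge`, `mean_target_score`), the hyperparameters, the toy data, and the simple hand-made fake profiles. It is not the optimized attack from the literature, and the size and even the direction of the shift depend on the data and settings.

```python
import numpy as np

def als(R, k=2, lam=0.5, n_iters=30, seed=0):
    """Alternating least squares on observed entries (0 means unrated)."""
    mask = R > 0
    m, n = R.shape
    U = np.zeros((m, k))
    V = np.random.default_rng(seed).standard_normal((n, k))
    reg = lam * np.eye(k)
    for _ in range(n_iters):
        for i in range(m):                      # refit each user's factors
            Vi = V[mask[i]]
            U[i] = np.linalg.solve(Vi.T @ Vi + reg, Vi.T @ R[i, mask[i]])
        for j in range(n):                      # refit each item's factors
            Uj = U[mask[:, j]]
            V[j] = np.linalg.solve(Uj.T @ Uj + reg, Uj.T @ R[mask[:, j], j])
    return U, V

def forge(R, target_item, n_fake, r_max=5.0):
    """Append fake users who score items near their mean and the target at r_max."""
    mask = R > 0
    counts = np.maximum(mask.sum(axis=0), 1)
    item_mean = np.rint(R.sum(axis=0) / counts)  # stays 0 where nobody rated
    fake = np.tile(item_mean, (n_fake, 1))
    fake[:, target_item] = r_max
    return np.vstack([R, fake])

def mean_target_score(R, target_item, n_real):
    """Average predicted score of the target item over the genuine users."""
    U, V = als(R)
    return float((U[:n_real] @ V[target_item]).mean())

R = np.array([[5, 3, 0, 1, 0],
              [4, 0, 0, 1, 2],
              [1, 1, 0, 5, 0],
              [0, 1, 5, 4, 0],
              [5, 4, 1, 0, 1]], dtype=float)
target = 4
clean = mean_target_score(R, target, len(R))
poisoned = mean_target_score(forge(R, target, n_fake=3), target, len(R))
print(f"mean predicted score of item {target}: clean={clean:.2f}, poisoned={poisoned:.2f}")
```

Comparing `clean` and `poisoned` gives a crude measure of how much the forged rows moved the model's opinion of the target item.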
Poisoning attack against the hidden factor model: hybrid attack
(To be continued)