
Data attack and defense techniques - Data poisoning in recommendation systems


Problem background

Research background

We live in an era of information explosion. From books and newspapers to live streams, microblogs, and short videos, the volume of information keeps growing and it arrives in more and more forms. Easy access to information is satisfying, but information overload can also make us anxious. Recommendation systems are an effective remedy for information overload and are now widely used in e-commerce, film and music, social networking, and other fields.

Research significance

Why poisoning attacks on recommendation systems are worth studying:

  • On the one hand, they test the robustness of recommendation algorithms.
  • On the other hand, they "promote defense through attack" and yield data security guidance for recommendation algorithms.

What is a recommendation system

A recommendation system predicts likely future connections between users and items from the connections that already exist.

Goals and methods of attacks on recommendation systems

Purpose: the attacker promotes or suppresses certain items for some goal.

Method: data poisoning attack (this is a course on data poisoning, so only this kind is covered). The attacker creates fake user profiles in bulk and makes them pose as users who share the interests of the victim user. The large number of these supporting fake profiles then sways the recommendation results.

Data poisoning / shilling attacks in recommendation systems

The attack works on the rating matrix of the recommendation system.

Suppose the attacker has picked these targets: [Figure: target items in the rating matrix]

The attack: [Figure: forged users added to the rating matrix] By forging users who resemble genuine ones, the attacker achieves the goal.

The idea in detail: [Figure: attack idea]
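The figures are missing here, so the following is only a minimal sketch of the "forge similar users" idea, assuming a user-based nearest-neighbor recommender with cosine similarity. The rating values, the helper names (`cosine`, `predict`, `forge_similar_users`), and the choice of k = 2 are all made up for illustration and are not taken from the original article.

```python
import math

# Toy rating matrix: rows are users, columns are items, None = not rated.
# All values are invented for illustration.
ratings = [
    [5, 4, 1, None],   # user 0: the victim whose recommendations we inspect
    [5, 5, 1, 1],      # user 1
    [4, 4, 2, 2],      # user 2
    [1, 2, 5, 1],      # user 3
]
TARGET = 3  # item the attacker wants to promote


def cosine(a, b):
    """Cosine similarity over the items that both users have rated."""
    common = [j for j in range(len(a)) if a[j] is not None and b[j] is not None]
    if not common:
        return 0.0
    dot = sum(a[j] * b[j] for j in common)
    na = math.sqrt(sum(a[j] ** 2 for j in common))
    nb = math.sqrt(sum(b[j] ** 2 for j in common))
    return dot / (na * nb)


def predict(matrix, user, item, k=2):
    """Similarity-weighted average over the k most similar users who rated item."""
    neighbours = [
        (cosine(matrix[user], row), row[item])
        for idx, row in enumerate(matrix)
        if idx != user and row[item] is not None
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    total = sum(sim for sim, _ in top)
    return sum(sim * r for sim, r in top) / total if total else 0.0


def forge_similar_users(matrix, victim, item, count, score=5):
    """Fake profiles: copy the victim's known ratings, then rate the target item."""
    fake = list(matrix[victim])
    fake[item] = score
    return matrix + [list(fake) for _ in range(count)]


before = predict(ratings, 0, TARGET)
poisoned = forge_similar_users(ratings, 0, TARGET, count=2)
after = predict(poisoned, 0, TARGET)
print(f"predicted score before: {before:.2f}, after: {after:.2f}")
```

With these invented numbers the predicted score of the target item for user 0 rises from about 1.50 to 5.00, because the two forged profiles become user 0's nearest neighbors. A real attacker would rarely know a victim's full profile, so this is the most favorable case for the attack.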

Classification of methods

Recommendation algorithms can be divided into collaborative filtering, content-based recommendation, hybrid recommendation, and so on. Each kind can be subdivided further according to its characteristics. For example, collaborative filtering can be divided into:

  • Based on nearest neighbor
  • Based on association rules
  • Based on decision tree
  • Based on Naive Bayes
  • Based on matrix factorization (latent factor model)
  • ……

A corresponding attack model can be formulated according to the characteristics of each recommendation algorithm.

Technical examples

Precision strike

The most basic attacks use hand-written rules to choose filler items and to design the rating vector of each forged user. Three common examples:

  1. Random attack
  2. Average attack
  3. Popular attack (bandwagon attack)
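The three attacks in the list can be sketched as profile generators. This is a hedged illustration based on the usual textbook descriptions, not the article's own code: a random attack rates filler items around the global mean, an average attack rates them around each item's own mean, and a popular (bandwagon) attack additionally gives the most-rated items the top score. The function names, the 1 to 5 rating scale, the noise level, and the toy matrix are assumptions.

```python
import random
from statistics import mean

R_MIN, R_MAX = 1, 5  # assumed rating scale


def clip(x):
    """Round to an integer rating and keep it inside the rating scale."""
    return max(R_MIN, min(R_MAX, round(x)))


def item_stats(matrix):
    """Per-item mean rating, per-item rating count, and the global mean."""
    n_items = len(matrix[0])
    cols = [[row[j] for row in matrix if row[j] is not None] for j in range(n_items)]
    item_means = [mean(c) if c else None for c in cols]
    counts = [len(c) for c in cols]
    global_mean = mean(r for c in cols for r in c)
    return item_means, counts, global_mean


def make_profile(matrix, target, kind, n_fillers, rng, n_popular=1, noise=1.0):
    """Build one fake user profile of kind 'random', 'average', or 'popular'."""
    item_means, counts, global_mean = item_stats(matrix)
    n_items = len(matrix[0])
    profile = [None] * n_items
    profile[target] = R_MAX  # promote the target; use R_MIN to suppress it
    candidates = [j for j in range(n_items) if j != target]
    if kind == "popular":
        # Bandwagon: give the most frequently rated items the top score.
        popular = sorted(candidates, key=lambda j: counts[j], reverse=True)[:n_popular]
        for j in popular:
            profile[j] = R_MAX
        candidates = [j for j in candidates if j not in popular]
    for j in rng.sample(candidates, min(n_fillers, len(candidates))):
        if kind == "average" and item_means[j] is not None:
            centre = item_means[j]   # average attack: per-item mean
        else:
            centre = global_mean     # random / popular fillers: global mean
        profile[j] = clip(rng.gauss(centre, noise))
    return profile


ratings = [
    [5, 4, 1, None, 2],
    [5, 5, 1, 1, None],
    [4, 4, 2, 2, 3],
    [1, 2, 5, 1, None],
]
rng = random.Random(0)
for kind in ("random", "average", "popular"):
    print(kind, make_profile(ratings, target=3, kind=kind, n_fillers=2, rng=rng))
```

The average attack needs per-item statistics, which an attacker may only be able to estimate, while the random attack needs nothing beyond a rough global mean. That trade-off between required knowledge and effect is the reason all three variants exist.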

These methods need almost no knowledge of the specific recommendation algorithm and apply to nearly every recommendation scenario. Such general methods are easy to implement, but they are weak at "precision strikes". One example of a precision strike is a data poisoning attack against the latent factor model.

Latent factor model

The latent factor model is sometimes also translated as the "hidden factor model". It is considered one of the most advanced methods in recommendation systems. It uses well-known dimensionality reduction methods to fill in the missing entries.

  • Dimensionality reduction is often used in other areas of data analysis to obtain a low-dimensional representation of the original data.

Intuition

The rating matrix R is decomposed as R ≈ UV^T, where U is an m × k matrix and V is an n × k matrix; m is the number of users, n is the number of items, and k is the number of latent factors. For example: [Figure: example decomposition of a rating matrix]

Note: in the example above the matrix R is fully known, so the decomposition is not especially helpful. In a real recommendation system R is only partly known, and the method becomes very useful when the latent factors U and V can still be estimated.
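Since the example figure is missing, here is a small stand-in for the fully known case, with hypothetical factors for k = 2. The numbers are invented: the two columns of U can be read as membership in two taste groups, and each row of V as how much each group likes an item.

```python
def matmul_t(U, V):
    """Return U V^T for U (m x k) and V (n x k), both given as lists of rows."""
    return [[sum(u[f] * v[f] for f in range(len(u))) for v in V] for u in U]


# Hypothetical factors: m = 4 users, n = 3 items, k = 2 latent factors.
U = [[1, 0], [1, 0], [0, 1], [0, 1]]
V = [[5, 1], [4, 1], [1, 5]]

R = matmul_t(U, V)
for row in R:
    print(row)
```

Here R comes out as a 4 × 3 matrix whose first two rows are [5, 4, 1] and whose last two rows are [1, 1, 5], so 12 ratings are described by only 8 + 6 = 14 factor values. The saving grows quickly as m and n increase while k stays small.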

The optimization problem to solve

[Figure: optimization objective]

Once U and V have been estimated, the entire rating matrix can be estimated in one step as UV^T, which gives every missing rating, as shown here: [Figure: completed rating matrix]
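The objective in the missing figure cannot be recovered, so the sketch below assumes the common regularized form: minimize the squared error (r_ij - u_i · v_j)^2 over the observed entries only, plus a small penalty on the size of U and V. It uses plain stochastic gradient descent. The function name, the hyperparameters (`k`, `epochs`, `lr`, `reg`), and the toy matrix are all assumptions for illustration.

```python
import math
import random


def factorize(matrix, k=2, epochs=3000, lr=0.01, reg=0.02, seed=0):
    """Estimate U (m x k) and V (n x k) by SGD over the observed entries only."""
    rng = random.Random(seed)
    m, n = len(matrix), len(matrix[0])
    U = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(m)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    observed = [(i, j, matrix[i][j])
                for i in range(m) for j in range(n)
                if matrix[i][j] is not None]
    for _ in range(epochs):
        for i, j, r in observed:
            err = r - sum(U[i][f] * V[j][f] for f in range(k))
            for f in range(k):
                u, v = U[i][f], V[j][f]
                # Gradient step on the squared error plus L2 regularization.
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return U, V


def complete(U, V):
    """Estimate the whole rating matrix as U V^T."""
    return [[sum(a * b for a, b in zip(u, v)) for v in V] for u in U]


# Partly known rating matrix; None marks a missing rating.
ratings = [
    [5, 4, 1, None],
    [5, None, 1, 2],
    [None, 1, 5, 4],
    [1, 1, None, 4],
]
U, V = factorize(ratings)
completed = complete(U, V)

known = [(i, j) for i in range(4) for j in range(4) if ratings[i][j] is not None]
rmse = math.sqrt(sum((ratings[i][j] - completed[i][j]) ** 2 for i, j in known) / len(known))
print(f"RMSE on observed entries: {rmse:.3f}")
for row in completed:
    print([round(x, 1) for x in row])
```

The entries that were None in `ratings` now hold estimates in `completed`. How trustworthy those estimates are depends on how many ratings are observed and on the choice of k and the regularization weight.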

Poisoning attack against the latent factor model: untargeted

To get the target item recommended, the attacker can forge entries of the rating matrix. [Figures: forged rating matrix]

Poisoning attack against the latent factor model: targeted

Here too the attacker forges the rating matrix. [Figure: forged rating matrix]
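The attack formulation in the missing figures is not reproduced here. As a rough illustration of how forged rows can move a latent factor model, the sketch below appends hand-written fake users to a toy rating matrix, retrains, and compares the predicted score of the target item for a genuine user who has not rated it. The fake ratings are fixed by hand rather than optimized, and every name, value, and hyperparameter is an assumption.

```python
import random


def factorize(matrix, k=2, epochs=3000, lr=0.01, reg=0.02, seed=0):
    """Regularized matrix factorization by SGD over the observed entries."""
    rng = random.Random(seed)
    m, n = len(matrix), len(matrix[0])
    U = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(m)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    observed = [(i, j, matrix[i][j])
                for i in range(m) for j in range(n)
                if matrix[i][j] is not None]
    for _ in range(epochs):
        for i, j, r in observed:
            err = r - sum(U[i][f] * V[j][f] for f in range(k))
            for f in range(k):
                u, v = U[i][f], V[j][f]
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return U, V


def predicted(matrix, user, item):
    """Train on matrix and return the model's score for (user, item)."""
    U, V = factorize(matrix)
    return sum(a * b for a, b in zip(U[user], V[item]))


def poison(matrix, target, n_fake, filler_profile, score=5):
    """Append n_fake forged rows: filler ratings plus a top score for target."""
    fake = list(filler_profile)
    fake[target] = score
    return matrix + [list(fake) for _ in range(n_fake)]


# Genuine ratings; user 0 has not rated item 3, the attacker's target.
genuine = [
    [5, 4, 1, None],
    [5, None, 1, 2],
    [None, 1, 5, 4],
    [1, 1, None, 4],
]
TARGET = 3

# Fake users imitate user 0's known tastes and give the target the top score.
poisoned = poison(genuine, TARGET, n_fake=3, filler_profile=[5, 4, 1, None])

before = predicted(genuine, 0, TARGET)
after = predicted(poisoned, 0, TARGET)
print(f"score of item {TARGET} for user 0: before {before:.2f}, after {after:.2f}")
```

One would expect the predicted score to rise after poisoning, since the forged rows pull the target item's latent vector toward users with user 0's tastes. The size of the shift depends on the number of fake users, the regularization weight, and the random initialization.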

Poisoning attack against the latent factor model: mixed attack

[Figure: mixed attack] (To be continued)

Summary

Copyright notice
This article was written by [sec0nd]. Please include a link to the original when reposting. Thank you.
https://cdmana.com/2022/134/202205141312285103.html
