Project 3: CFKG Algorithm
Introduction of the task
• In this task, you need to build a recommender system from the given user-item interaction data and a knowledge graph (KG).
• The key to the recommender system is a score function $f(u, w)$, where $u$ is a user and $w$ is an item.
• For the samples in the given user-item interaction data:
  • Given a positive sample, $f$ should return a high score.
  • Given a negative sample, or a sample not in the given data, $f$ should return a low score.
How to score a sample
• The inputs of $f(u, w)$, $u$ and $w$, are only the indices of the user and the item.
• An index carries no information about the user or item, so it is impossible to give a meaningful score based on the index alone.
• For user $u$ and item $w$, we need to find their feature vectors, denoted $\mathbf{u}$ and $\mathbf{w}$.
• The computation of $f(u, w)$ depends on $\mathbf{u}$ and $\mathbf{w}$.
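As a minimal sketch of the idea above, the snippet below maps user and item indices to feature vectors and computes the score from those vectors only. The table names `user_vecs` and `item_vecs`, the toy values, and the use of a dot product are illustrative assumptions; at this point the slides have not fixed how the score is computed.

```python
# Hypothetical feature tables: row k holds the feature vector of user/item k.
user_vecs = [[0.9, 0.1], [0.2, 0.8]]               # 2 users, 2 features each
item_vecs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # 3 items, 2 features each


def score(u: int, w: int) -> float:
    """Look up the feature vectors by index, then score the pair.

    The dot product is only a placeholder scoring rule (an assumption).
    """
    u_vec = user_vecs[u]
    w_vec = item_vecs[w]
    return sum(a * b for a, b in zip(u_vec, w_vec))


# User 0 is closer to item 0 than to item 1 in this toy feature space.
print(score(0, 0), score(0, 1))
```

The indices themselves never enter the arithmetic; they only select which vectors are used.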
How to find a good feature?
• An intuitive method is to design a statistical metric as the feature, such as the correlation coefficient.
  • Designing such a metric is difficult.
  • It is hard to use the information in the KG.
• Another method is to represent each user/item as a low-dimensional real-valued vector (also called an embedding).
  • How do we get the embeddings?
    • First, we define how $f(u, w)$ is computed.
    • Then we learn the embeddings of all users and items by gradient descent, so that the values of $f(u, w)$ for all user-item combinations agree with the given interaction data.
  • Each element of an embedding can represent some information about the user/item, but we cannot interpret the individual elements.
  • The information in the KG can also be represented in the embeddings.
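The two steps above (define the score, then fit the embeddings by gradient descent) can be sketched as follows. This is only an illustration under stated assumptions: the score is a dot product, the loss is the logistic (log) loss on 0/1 labels, and `train_embeddings`, the toy `samples`, and all hyperparameters are hypothetical rather than taken from the slides.

```python
import math
import random


def train_embeddings(samples, n_users, n_items, dim=4, lr=0.1, epochs=500, seed=0):
    """Fit embeddings so that sigmoid(u . w) agrees with the 0/1 labels."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, w, label in samples:
            s = sum(a * b for a, b in zip(U[u], W[w]))   # assumed score: dot product
            p = 1.0 / (1.0 + math.exp(-s))               # predicted probability
            g = p - label                                # d(log loss) / d(score)
            for k in range(dim):
                uk, wk = U[u][k], W[w][k]
                U[u][k] -= lr * g * wk                   # gradient step on the user
                W[w][k] -= lr * g * uk                   # gradient step on the item
    return U, W


def score(U, W, u, w):
    return sum(a * b for a, b in zip(U[u], W[w]))


# Toy data: (user index, item index, label), 1 = positive, 0 = negative.
samples = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
U, W = train_embeddings(samples, n_users=2, n_items=2)
print(score(U, W, 0, 0), score(U, W, 0, 1))
```

After training, positive pairs should score higher than negative ones, even though no single coordinate of `U` or `W` has an obvious meaning.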
How to mix information of KG and interaction data? The CFKG Algorithm
• The CFKG algorithm models the interaction records as a new type of relation in the KG; their relation type is "Interested-in".
• The interaction data and the original KG are merged into a bigger KG.
• We can also define a score function $f(h, r, t)$ for a relation (head $h$, relation type $r$, tail $t$) in the KG:
  • Input is a relation in the KG: $f$ returns a high score.
  • Input is a relation not in the KG: $f$ returns a low score.
• If we define $r_{\text{int}}$ = Interested-in:
  • For a positive sample, $f(u, r_{\text{int}}, w)$ returns a high score.
  • For a negative sample or a sample not in the given data, $f(u, r_{\text{int}}, w)$ returns a low score.
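The merge described above can be sketched in a few lines. The entity names, the `directed_by` relation, and the helper `merge_into_kg` are hypothetical, and turning only positive records into "Interested-in" edges is an assumption that follows from the requirement that negative samples score as relations not in the KG.

```python
INTERESTED_IN = "Interested-in"


def merge_into_kg(kg_triples, interactions):
    """Return the KG triples plus one (user, Interested-in, item) triple
    for every positive interaction record."""
    merged = set(kg_triples)
    for user, item, label in interactions:
        if label == 1:  # assumption: only positive samples become edges
            merged.add((user, INTERESTED_IN, item))
    return merged


# Toy knowledge graph: (head, relation type, tail).
kg = [
    ("item_1", "directed_by", "person_7"),
    ("item_2", "directed_by", "person_7"),
]
# Toy interaction data: (user, item, label).
interactions = [("user_0", "item_1", 1), ("user_0", "item_2", 0)]

merged = merge_into_kg(kg, interactions)
for triple in sorted(merged):
    print(triple)
```

Once merged, users, items, and other KG entities live in one graph, so a single score function over triples covers both recommendation and KG relations.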
[Figure: Integrating interaction data with knowledge graph data to form a new knowledge graph]
A specific design scheme of $f$: the TransE Algorithm
• The TransE algorithm is one example of $f(h, r, t)$.
• Each entity and relation type is represented as a vector in $d$-dimensional Euclidean space.
• Ideally, when the relational triplet $(h, r, t)$ holds, the corresponding vectors should satisfy:
  $$\mathbf{h} + \mathbf{r} = \mathbf{t}$$
• However, the ideal situation often does not hold exactly. Therefore, the likelihood that a relational triplet $(h, r, t)$ is valid is positively correlated with $-\lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$.
• Therefore, $f(h, r, t) = -\lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$.
[Figure: Schematic diagram of the calculation principle of the TransE relation model]
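The TransE score can be written directly from its definition. The sketch below assumes the Euclidean (L2) norm, which is one common choice and not something the slides specify; the function name `transe_score` and the toy vectors are illustrative.

```python
import math


def transe_score(h, r, t):
    """TransE score: the negative distance between h + r and t.

    The Euclidean (L2) norm is assumed here.
    """
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


h = [1.0, 0.0]
r = [0.0, 1.0]
t_good = [1.0, 1.0]  # h + r equals t exactly, so the distance is 0
t_bad = [4.0, 5.0]   # h + r - t = (-3, -4), so the distance is 5

print(transe_score(h, r, t_good))
print(transe_score(h, r, t_bad))  # -5.0
```

The score is never positive: it reaches its maximum of zero only when the translation `h + r` lands exactly on `t`.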
Getting embeddings from the given data
• For the merged knowledge graph, the embeddings of all entities and relations should satisfy:
  • For a relation in the KG: $f$ returns a high score.
  • For a relation not in the KG: $f$ returns a low score.
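One way to push the embeddings toward these two conditions is a margin ranking loss with randomly corrupted tails, trained by stochastic gradient descent. The slides do not name a loss, so the margin loss, the tail-only corruption, the L2 norm, the toy graph, and every name and hyperparameter below are assumptions made for illustration.

```python
import math
import random


def diff(E, R, h, r, t):
    """Vector h + r - t for entity table E and relation table R."""
    return [a + b - c for a, b, c in zip(E[h], R[r], E[t])]


def score(E, R, h, r, t):
    """TransE score: negative L2 norm of h + r - t."""
    return -math.sqrt(sum(x * x for x in diff(E, R, h, r, t)))


def train(triples, n_ent, n_rel, dim=8, lr=0.05, margin=1.0, epochs=500, seed=0):
    rng = random.Random(seed)
    E = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n_ent)]
    R = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n_rel)]
    known = set(triples)
    for _ in range(epochs):
        for h, r, t in triples:
            t_neg = rng.randrange(n_ent)          # corrupt the tail at random
            if (h, r, t_neg) in known:
                continue                          # not a real negative, skip it
            # Hinge loss: max(0, margin - f(positive) + f(negative)).
            if margin - score(E, R, h, r, t) + score(E, R, h, r, t_neg) <= 0:
                continue                          # margin already satisfied
            x_pos = diff(E, R, h, r, t)
            x_neg = diff(E, R, h, r, t_neg)
            # sign=+1 shrinks the positive distance, sign=-1 grows the negative one.
            for sign, tail, x in ((1.0, t, x_pos), (-1.0, t_neg, x_neg)):
                norm = math.sqrt(sum(v * v for v in x)) or 1.0
                for k in range(dim):
                    g = sign * lr * x[k] / norm
                    E[h][k] -= g
                    R[r][k] -= g
                    E[tail][k] += g
    return E, R


# Toy merged KG. Entities 0-1 are users, 2-3 items, 4-5 attributes.
# Relation 0 is "Interested-in", relation 1 is an ordinary KG relation.
triples = [(0, 0, 2), (1, 0, 3), (2, 1, 4), (3, 1, 5)]
E, R = train(triples, n_ent=6, n_rel=2)
print(score(E, R, 0, 0, 2), score(E, R, 0, 0, 3))
```

In this toy run, the triple that is in the graph should end up scoring higher than its corrupted counterpart. Practical TransE implementations usually also constrain the norm of the entity embeddings, which this sketch omits.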