Event

Seminar Representation Learning for Knowledge Graphs (Master) [WS202513601]

Type
seminar (S)
Online
Term
WS 20/21
SWS
2
Language
English
Appointments
14
Links
ILIAS

Organisation

  • Information Service Engineering

Appointments

  • 04.11.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 11.11.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 18.11.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 25.11.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 02.12.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 09.12.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 16.12.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 23.12.2020 12:00 - 13:30 - Room: 05.20 5A-09
  • 13.01.2021 12:00 - 13:30 - Room: 05.20 5A-09
  • 20.01.2021 12:00 - 13:30 - Room: 05.20 5A-09
  • 27.01.2021 12:00 - 13:30 - Room: 05.20 5A-09
  • 03.02.2021 12:00 - 13:30 - Room: 05.20 5A-09
  • 10.02.2021 12:00 - 13:30 - Room: 05.20 5A-09
  • 17.02.2021 12:00 - 13:30 - Room: 05.20 5A-09

Note

Participation is restricted to a maximum of 10 students.

Student contributions:

Each student will be assigned at most two papers on the topic. The student will give a seminar presentation and write a 15-page seminar report explaining, in their own words, the methods from at least one of the two assigned papers.

Implementation (if applicable):

If the authors have made their code available, students will re-implement it for small-scale experiments in Python using Google Colab.
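
To give a flavour of the intended scale, the sketch below trains TransE-style knowledge graph embeddings on a toy triple set with plain NumPy. It is illustrative only: the triples, hyperparameters, and the squared-distance margin update are assumptions of this sketch, not part of the seminar material.

    import numpy as np

    rng = np.random.default_rng(42)

    # Toy knowledge graph; in practice this would be the dataset of the
    # paper being re-implemented (all names here are made up).
    triples = [
        ("Berlin", "capital_of", "Germany"),
        ("Paris", "capital_of", "France"),
        ("Germany", "located_in", "Europe"),
        ("France", "located_in", "Europe"),
    ]
    entities = sorted({x for h, _, t in triples for x in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: i for i, r in enumerate(relations)}

    dim, margin, lr = 20, 1.0, 0.01
    E = rng.normal(scale=0.1, size=(len(entities), dim))   # entity vectors
    R = rng.normal(scale=0.1, size=(len(relations), dim))  # relation vectors

    def dist(h, r, t):
        # TransE plausibility: h + r should land near t
        return np.linalg.norm(E[h] + R[r] - E[t])

    for epoch in range(500):
        for h, r, t in triples:
            h, r, t = e_idx[h], r_idx[r], e_idx[t]
            t_neg = int(rng.integers(len(entities)))  # corrupt the tail
            if t_neg == t:
                continue
            # margin ranking: update only if the corrupted triple is not
            # yet at least `margin` farther away than the true one
            if dist(h, r, t) + margin > dist(h, r, t_neg):
                g_pos = E[h] + R[r] - E[t]      # gradient of squared distance
                g_neg = E[h] + R[r] - E[t_neg]
                E[h] -= lr * (g_pos - g_neg)
                R[r] -= lr * (g_pos - g_neg)
                E[t] += lr * g_pos
                E[t_neg] -= lr * g_neg

    print("d(Berlin + capital_of, Germany) =",
          round(dist(e_idx["Berlin"], r_idx["capital_of"], e_idx["Germany"]), 3))

After training, the distance between Berlin + capital_of and Germany should be small relative to triples with randomly chosen tails.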

Teaching Team:

  • Dr. Mehwish Alam
  • Dr. Danilo Dessi
  • M. Sc. Russa Biswas

Data representation, or feature representation, plays a key role in the performance of machine learning algorithms. In recent years, Representation Learning (RL) of words and Knowledge Graphs (KGs) into low-dimensional vector spaces, together with its applications to many real-world scenarios, has grown rapidly. Word embeddings are low-dimensional vector representations of words that capture the context of a word in a document, its semantic similarity, and its relations to other words. Similarly, KG embeddings are low-dimensional vector representations of the entities and relations of a KG that preserve its inherent structure and capture the semantic similarity between entities.

Each embedding space exhibits different semantic characteristics depending on the source of information, e.g., text or KGs, as well as on the learning objective of the embedding algorithm. The same algorithm, applied to different representations of the same training data, leads to different results because of the variation in the features encoded in the respective representations. Distributed representations of text, in the form of word and document vectors, and of KGs, in the form of entity and relation vectors, have evolved into key elements of various natural language processing tasks such as Entity Linking, Named Entity Recognition and Disambiguation, etc.

Different embedding spaces are generated for textual documents in different languages, so aligning these embedding spaces has become a stepping stone for machine translation. KGs, on the other hand, differ not only in language and domain-specific information: even different KGs of the same domain have structural differences, which makes aligning KG embeddings more challenging. In order to generate coherent embedding spaces for knowledge-driven applications such as question answering, named entity disambiguation, and knowledge graph completion, it is necessary to align the embedding spaces generated from different sources.
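
To make the notion of aligning embedding spaces concrete, here is a minimal sketch, on synthetic data, of the classic supervised approach via orthogonal Procrustes: given a seed dictionary of anchor pairs between two independently trained spaces, learn an orthogonal map W that rotates one space onto the other. X and Y below are simulated stand-ins for, e.g., word vectors of two languages or entity vectors of two KGs.

    import numpy as np

    rng = np.random.default_rng(0)
    dim, n_anchors = 50, 200

    # Simulated spaces: Y is a hidden rotation of X plus noise, standing
    # in for embeddings trained separately on two corpora or two KGs.
    X = rng.normal(size=(n_anchors, dim))
    hidden_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    Y = X @ hidden_rotation + 0.01 * rng.normal(size=(n_anchors, dim))

    # Orthogonal Procrustes: W = U V^T minimizes ||X W - Y||_F over
    # orthogonal W, where U S V^T is the SVD of X^T Y.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    W = U @ Vt

    err = np.linalg.norm(X @ W - Y) / np.linalg.norm(Y)
    print(f"relative alignment error: {err:.4f}")

With a clean seed dictionary the hidden rotation is recovered almost exactly; with real embeddings the anchor pairs are noisy and the residual error is correspondingly larger, which is part of what makes the alignment problem challenging.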

In this seminar, we will study different state-of-the-art algorithms for aligning embedding spaces. We will focus on two types of alignment algorithms: (1) entity-entity alignment and (2) entity-word alignment.