Abstract
Word embedding techniques have recently gained considerable interest
from natural language processing researchers, and they are a valuable resource for identifying a list of semantically related terms for a
search query. These related terms are a natural addition for query expansion, but they may mismatch when application domains use different
jargon. Using the Skip-Gram algorithm of Word2Vec, terms are selected
only from a specific subset of the corpus, which is extended by documents
linked through co-authorship and citations. We demonstrate that locally trained
word embeddings with this extension provide a valuable augmentation
and can improve retrieval performance. First results suggest that query
expansion and word embeddings could also benefit from other related
information.
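The corpus extension described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Doc` record, its fields, and the `extend_local_corpus` helper are hypothetical names introduced here, and the sketch assumes the local subset starts from a set of seed documents (for example, the top-ranked results for a query) and is widened by any document that shares an author with a seed, is cited by a seed, or cites a seed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical bibliographic record; field names are assumptions."""
    doc_id: str
    authors: frozenset
    cites: frozenset = frozenset()  # ids of the documents this one cites
    text: str = ""


def extend_local_corpus(seed_ids, docs):
    """Return the seed ids plus ids of documents linked to a seed
    by a shared author or by a citation in either direction."""
    by_id = {d.doc_id: d for d in docs}
    seeds = [by_id[i] for i in seed_ids if i in by_id]
    seed_set = {d.doc_id for d in seeds}

    # Authors of the seed documents and everything the seeds cite.
    seed_authors = set().union(*(d.authors for d in seeds))
    cited_by_seeds = set().union(*(d.cites for d in seeds))

    extended = set(seed_set)
    for d in docs:
        if d.doc_id in seed_set:
            continue
        shares_author = bool(d.authors & seed_authors)
        is_cited_by_seed = d.doc_id in cited_by_seeds
        cites_a_seed = bool(d.cites & seed_set)
        if shares_author or is_cited_by_seed or cites_a_seed:
            extended.add(d.doc_id)
    return extended


# Toy collection, invented purely for illustration.
docs = [
    Doc("d1", frozenset({"A"}), frozenset({"d3"})),
    Doc("d2", frozenset({"A", "B"})),
    Doc("d3", frozenset({"C"})),
    Doc("d4", frozenset({"D"}), frozenset({"d1"})),
    Doc("d5", frozenset({"E"})),
]
local_ids = extend_local_corpus(["d1"], docs)
```

In this toy collection, `d2` joins through a shared author, `d3` and `d4` through citations, and `d5` stays out. The tokenized texts of the resulting subset would then presumably serve as training input for a Skip-Gram model, for instance via gensim's `Word2Vec` class with `sg=1`; that training step is left out to keep the sketch free of external dependencies.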
Original language | English |
---|---|
Pages | 46 |
Number of pages | 53 |
Publication status | Published - 26 Mar 2018 |
Event | Bibliometric-enhanced Information Retrieval - Grenoble, France. Duration: 26 Mar 2018 → 26 Mar 2018. Conference number: 7. https://www.gesis.org/en/services/events/events-archive/conferences/ecir-workshops/ecir-workshop-2018/ |
Workshop
Workshop | Bibliometric-enhanced Information Retrieval |
---|---|
Abbreviated title | BIR 2018 |
Country/Territory | France |
City | Grenoble |
Period | 26/03/18 → 26/03/18 |
Internet address |
Fields of Expertise
- Information, Communication & Computing