Local Word Embeddings for Query Expansion based on Co-Authorship and Citations

André Rattinger, Jean-Marie Le Goff, Christian Gütl

Research output: Contribution to conference › Paper

Abstract

Word embedding techniques have recently gained a lot of interest from natural language processing researchers, and they are a valuable resource for identifying a list of semantically related terms for a search query. These related terms are a natural addition for query expansion, but they might mismatch when the application domains use different jargon. Using the Skip-Gram algorithm of Word2Vec, terms are selected only from a specific subset of the corpus, which is extended by documents from co-authorship and citations. We demonstrate that locally trained word embeddings with this extension provide a valuable augmentation and can improve retrieval performance. First results suggest that query expansion and word embeddings could also benefit from other related information.
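
The corpus-extension step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Doc` structure, the `extend_local_corpus` function, and the toy data are hypothetical, and the abstract does not say whether citations are followed in one or both directions, so this sketch assumes both. It only shows how a local subset of documents might be enlarged with documents linked by co-authorship or citation before any embedding is trained.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical document record; field names are illustrative only."""
    doc_id: str
    authors: set
    cites: set = field(default_factory=set)  # ids of documents this one cites


def extend_local_corpus(seed_ids, docs):
    """Return sorted ids of the seed documents plus every document that
    shares an author with a seed, is cited by a seed, or cites a seed.
    Assumption: citation links are followed in both directions."""
    by_id = {d.doc_id: d for d in docs}
    seeds = [by_id[i] for i in seed_ids]
    seed_id_set = set(seed_ids)
    seed_authors = set().union(*(d.authors for d in seeds))
    cited_by_seeds = set().union(*(d.cites for d in seeds))

    selected = set(seed_ids)
    for d in docs:
        shares_author = bool(d.authors & seed_authors)
        is_cited = d.doc_id in cited_by_seeds
        cites_seed = bool(d.cites & seed_id_set)
        if shares_author or is_cited or cites_seed:
            selected.add(d.doc_id)
    return sorted(selected)


# Toy collection: d1 is the only document in the initial local subset.
docs = [
    Doc("d1", {"A"}, {"d3"}),
    Doc("d2", {"A", "B"}),      # shares author A with d1
    Doc("d3", {"C"}),           # cited by d1
    Doc("d4", {"D"}, {"d1"}),   # cites d1
    Doc("d5", {"E"}),           # unrelated
]
local_ids = extend_local_corpus(["d1"], docs)
print(local_ids)
```

In a full pipeline, the texts of the selected documents would then serve as the training input for a Skip-Gram model (for example, gensim's `Word2Vec` with `sg=1`), and the nearest neighbours of each query term in that local model would be the candidate expansion terms.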

Workshop

Workshop: Bibliometric-enhanced Information Retrieval
Abbreviated title: BIR 2018
Country: France
City: Grenoble
Period: 26/03/18 - 26/03/18
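Internet address: https://www.gesis.org/en/services/events/events-archive/conferences/ecir-workshops/ecir-workshop-2018/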

Fields of Expertise

  • Information, Communication & Computing

Cite this

Rattinger, A., Le Goff, J-M., & Gütl, C. (2018). Local Word Embeddings for Query Expansion based on Co-Authorship and Citations. 46. Paper presented at Bibliometric-enhanced Information Retrieval, Grenoble, France.

Local Word Embeddings for Query Expansion based on Co-Authorship and Citations. / Rattinger, André; Le Goff, Jean-Marie; Gütl, Christian.

2018. 46. Paper presented at Bibliometric-enhanced Information Retrieval, Grenoble, France.

Rattinger, A, Le Goff, J-M & Gütl, C 2018, 'Local Word Embeddings for Query Expansion based on Co-Authorship and Citations', Paper presented at Bibliometric-enhanced Information Retrieval, Grenoble, France, 26/03/18 - 26/03/18, pp. 46.
Rattinger A, Le Goff J-M, Gütl C. Local Word Embeddings for Query Expansion based on Co-Authorship and Citations. 2018. Paper presented at Bibliometric-enhanced Information Retrieval, Grenoble, France.
Rattinger, André; Le Goff, Jean-Marie; Gütl, Christian. / Local Word Embeddings for Query Expansion based on Co-Authorship and Citations. Paper presented at Bibliometric-enhanced Information Retrieval, Grenoble, France. 53 p.
@conference{0d7a9e99aec54ff8b6d8a0b9e7a8ec91,
title = "Local Word Embeddings for Query Expansion based on Co-Authorship and Citations",
abstract = "Word embedding techniques have recently gained a lot of interest from natural language processing researchers, and they are a valuable resource for identifying a list of semantically related terms for a search query. These related terms are a natural addition for query expansion, but they might mismatch when the application domains use different jargon. Using the Skip-Gram algorithm of Word2Vec, terms are selected only from a specific subset of the corpus, which is extended by documents from co-authorship and citations. We demonstrate that locally trained word embeddings with this extension provide a valuable augmentation and can improve retrieval performance. First results suggest that query expansion and word embeddings could also benefit from other related information.",
author = "Andr{\'e} Rattinger and {Le Goff}, Jean-Marie and Christian G{\"u}tl",
year = "2018",
month = "3",
day = "26",
language = "English",
pages = "46",
note = "Bibliometric-enhanced Information Retrieval, BIR 2018 ; Conference date: 26-03-2018 Through 26-03-2018",
url = "https://www.gesis.org/en/services/events/events-archive/conferences/ecir-workshops/ecir-workshop-2018/",

}

TY - CONF

T1 - Local Word Embeddings for Query Expansion based on Co-Authorship and Citations

AU - Rattinger,André

AU - Le Goff,Jean-Marie

AU - Gütl,Christian

PY - 2018/3/26

Y1 - 2018/3/26

N2 - Word embedding techniques have recently gained a lot of interest from natural language processing researchers, and they are a valuable resource for identifying a list of semantically related terms for a search query. These related terms are a natural addition for query expansion, but they might mismatch when the application domains use different jargon. Using the Skip-Gram algorithm of Word2Vec, terms are selected only from a specific subset of the corpus, which is extended by documents from co-authorship and citations. We demonstrate that locally trained word embeddings with this extension provide a valuable augmentation and can improve retrieval performance. First results suggest that query expansion and word embeddings could also benefit from other related information.

M3 - Paper

SP - 46

ER -