Invited Keynote Speakers
Davide Buscaldi
Davide Buscaldi is currently working on the automatic building of ontologies and their effective use in Information Retrieval. He is also interested in semantic similarity methods and their applications in IR and the semantic web. Finally, he is interested in text mining problems and knowledge extraction from texts. Some previous talks given by Davide are available here: https://sites.google.com/site/davidebuscaldi/downloads/talks
Improving access to scientific literature: a semantic IR perspective
Nowadays, the flow of data and publications in almost every field of research is continuously growing. This data deluge presents a bottleneck for scientific progress and a challenge for existing search engines. The problems to be solved are old ones: the ambiguity of a concept, especially across different research fields (for instance, "lattice" in computer science vs. physics), and the synonymy (or quasi-synonymy) of concepts that are expressed in different ways (for instance, "opinion mining" and "sentiment analysis"). These issues may affect various tasks: a researcher building a state of the art for a specific topic, an editor finding reviewers for a given paper, or a government official studying a project proposal, among others.
Recently, at LIPN, Davide and his colleagues started working on access to scientific information from a semantic information retrieval perspective, leveraging ontologies and similar semantic resources for this task. The first step was to build a typology of semantic relations that are often used in the state-of-the-art sections of scientific papers. Some of these relations link methods and the problems they solve; others link a resource and a system that used it. This typology can evolve or be integrated into more complex ontologies. The next step was to verify whether it is possible to detect these relations automatically. They focused on unsupervised methods that exploit the information coming from keywords and patterns around the entities connected by the relations, and tested whether these results could be improved using semantic embeddings. They produced a set of annotated documents that were used for Task 7 at SemEval 2018, where various participants showed the effectiveness of Deep Neural Network (DNN) methods for detecting and classifying the relations. The results show that these methods can usually predict the type of a relation with high accuracy (around 90%) when they are given the information about the linked entities, but that much work remains to be done on detecting the relations (around 50% for the best system).
Miguel Martínez Álvarez
Miguel Martínez Álvarez is the Co-founder and Chief Data Scientist at Signal Media, a fast-growing UK company that analyses millions of news articles per day in real time in order to improve the quality of business intelligence and decision making across organisations. During the last five years, Miguel has led the Signal Research team, building an R&D group with strong university collaborations, with the main goal of transforming the best research principles, models and algorithms from academia into real, scalable products. His research interests include, among others, news processing, information retrieval, text analysis, text classification, entity linking, natural language processing and system evaluation.
During his time at Signal, the company has grown from three to 70+ people and raised more than 15M in funding. Miguel was awarded the Business Leader of Tomorrow Award in 2014 by Innovate UK and was included in Bloomberg's list of UK Business Innovators in 2016. The team also won the Best Demonstration award at ECIR 2015, and they are the main organisers of the NewsIR workshop as part of ECIR. Some of Miguel's work is available here: https://miguelmalvarez.com/publications/
Analysing the world's news: Learnings from Industry
The talk will focus on the learnings from the last five years of research at Signal. It will showcase the efforts to transform the best research in IR/NLP into a large-scale text analytics pipeline capable of processing millions of documents daily to power the Signal media monitoring and intelligence platform. The talk will touch not only on technical details but also on the challenges and differences of working in industry, from organisational decisions and collaboration with universities to the prioritisation challenges between product and research.
Sponsors of the Conference