
AT deleted beginitemize__item_C.tex almost 9 years ago

Commit id: 8577e89a1b0d0137bf4c083d26ee561d941efbd1


\begin{itemize}
\item Clear-cut questions: \textbf{Did X Y become President?}, \textbf{Did Iran test-detonate a nuclear bomb?}, \textbf{Did Germany ban the PEGIDA movement?} These should generally be easy for Syphon to parse and to answer unambiguously from news sources.
\item Fuzzy-matching questions: \textbf{Was Saudi Arabia’s monarchy dethroned in a coup d’etat?}, \textbf{Did Germany abandon the euro?}, \textbf{Did Sears declare bankruptcy?} These concern unambiguous facts and should pass Syphon, but simple text matching strategies might fail when answering them: the system needs to match ``revolution'' to ``dethroned in a coup d'etat'', many news titles would use ``leave the eurozone'' instead of ``abandon the euro'', and ``declare bankruptcy'' is the same as e.g.\ ``go bankrupt''. Based on results such as \citep{DefGen,QANTA}, we can predict that many of the simpler paraphrases would be covered by vector embeddings%
\footnote{There is also the WordNet dictionary, which covers many synonymy relationships between individual words, but extending it to verb/noun combinations is less trivial.}
but this is unlikely to solve the ``abandon the euro'' case. We can either rely on a stricter Syphon (we could suggest manually created templates like ``becomes $X$'' and ``stops being $X$'' with regard to a knowledge base attribute), or simply hope that the scanned news sources will vary their phrasing of what happened so much that the particular phrasing used in the question (or something fairly close to it, involving close synonyms) will still appear often enough; the alpha prototype should shed more light on this. Another tricky part is that \textbf{Sears} is the name of many legal entities; do we mean the department store chain, \textbf{Sears Holdings}, or some sister company? However, Syphon can catch ambiguous references like this.
\item Questions requiring inference: \textbf{Did the US dollar collapse by at least 50\%?}, \textbf{Did the euro collapse another 50\%?}, \textbf{Did a real blizzard lead to power cuts for more than 10 million Americans?} Answering these questions with simple text matching is unlikely to produce any results. Syphon should not be able to decompose such questions into simple elements and the relationships between them, and so should not let them pass through. Of course, from the whole-system perspective this means that Syphon will instead prompt the user for more detail and eventually let a sufficiently precise question on the topic through.
\item Subjective questions: \textbf{Was Obamacare dismantled by the Supreme Court?}, \textbf{Did Israel attack Iran?}, \textbf{Did the EU slide into a new recession?}, \textbf{Was a miracle cure for diabetes discovered?} These questions are particularly tricky, as filtering them with Syphon may be problematic. Entity X attacking entity Y seems quite unambiguous and well-defined. On the other hand, the definition of ``attack'' may vary widely,\footnote{Israel is a good example as the country is known for executing military and intelligence operations covertly and avoiding admission of involvement.} from the bombing of a specific site to a full-scale military attack. We don't have a good solution for this class of questions, other than warning users that ``dismantling'' simply means that enough commentators consider the court judgement to amount to it; the same goes for ``recession'' and for what counts as a ``miracle cure'' (we already have several, or none).
\end{itemize}
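
The gap between exact matching and paraphrase-aware matching described for the fuzzy-matching questions can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not Syphon's actual matching logic: the paraphrase table, the function names and the sample headlines are all hypothetical, and a real system would more likely rely on vector embeddings or knowledge-base templates than on a hand-written table.

```python
import re

# Hypothetical, hand-made paraphrase table (an assumption for illustration only):
# canonical question phrase -> headline phrasings treated as equivalent.
PARAPHRASES = {
    "abandon the euro": ["leave the eurozone", "exit the euro"],
    "declare bankruptcy": ["go bankrupt", "file for bankruptcy"],
}


def normalize(text):
    """Lowercase the text and keep only alphanumeric words, space-separated."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def exact_match(phrase, headline):
    """True if the phrase occurs in the headline as a whole-word sequence."""
    return f" {normalize(phrase)} " in f" {normalize(headline)} "


def paraphrase_match(phrase, headline):
    """True if the phrase or any of its listed paraphrases occurs in the headline."""
    variants = [phrase] + PARAPHRASES.get(phrase, [])
    return any(exact_match(variant, headline) for variant in variants)


if __name__ == "__main__":
    headline = "Germany to leave the eurozone"  # made-up headline
    print(exact_match("abandon the euro", headline))
    print(paraphrase_match("abandon the euro", headline))
```

Even this toy version shows the limitation noted above: every verb/noun paraphrase has to be listed by hand, and inflected forms such as ``leaves the eurozone'' would still be missed without stemming or a softer similarity measure.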
