Alexander Kirillov edited bf_Abstract_The_goal_of__.tex  about 8 years ago

Commit id: 06e1981aae331cd3c3bfabd5eae256d95452ce5c

We show how to consider similarity between features when calculating the similarity of objects in the Vector Space Model (VSM) for machine learning algorithms and other classes of methods that involve similarity between objects. Unlike LSA, we assume that the similarity between features is known (say, from a synonym dictionary) and does not need to be learned from the data. We call the proposed similarity measure soft similarity. Similarity between features is common, for example, in natural language processing: words, n-grams, or syntactic n-grams can be somewhat different (which makes them different features) but still have much in common; for example, the words “play” and “game” are different but related. When there is no similarity between features, our soft similarity measure equals the standard similarity. To this end, we generalize the well-known cosine similarity measure in VSM by introducing what we call the “soft cosine measure”. We propose various formulas for the exact or approximate calculation of the soft cosine measure. For example, in one of them we consider for VSM a new feature space consisting of pairs of the original features weighted by their similarity. Again, for features that bear no similarity to each other, our formulas reduce to the standard cosine measure.

\textbf{“\ldots the neural-networks language-modeling crowd, we had to struggle quite a bit to figure out the rationale behind the equations.” “Can we make this intuition more precise? We’d really like to see something more formal.” Goldberg and Levy, 2014 (reference no. 4).}
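For concreteness, here is one way to write the generalization described above; this is a sketch in our own notation, not a formula quoted from the body of the paper. Let $s_{ij}$ denote the known similarity between features $i$ and $j$, and let $a$ and $b$ be the feature vectors of two objects. The soft cosine measure can then be stated as
\[
\mathrm{soft\_cosine}(a, b) \;=\; \frac{\sum_{i,j} s_{ij}\, a_i b_j}{\sqrt{\sum_{i,j} s_{ij}\, a_i a_j}\;\sqrt{\sum_{i,j} s_{ij}\, b_i b_j}}.
\]
When $s_{ij} = \delta_{ij}$, i.e., when distinct features bear no similarity to each other, the double sums collapse to $\sum_i a_i b_i$, $\sum_i a_i^2$, and $\sum_i b_i^2$, and the expression reduces to the standard cosine measure, matching the claim in the abstract.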