Applications Vector space model
Relevance rankings of documents in a keyword search can be calculated, using the assumptions of document similarities theory, by comparing the deviation of angles between each document vector and the original query vector, where the query is represented as the same kind of vector as the documents.
In practice, it is easier to calculate the cosine of the angle between the vectors, instead of the angle itself:
$$\cos\theta = \frac{\mathbf{d_2} \cdot \mathbf{q}}{\left\|\mathbf{d_2}\right\| \left\|\mathbf{q}\right\|}$$
where $\mathbf{d_2} \cdot \mathbf{q}$ is the intersection (i.e. the dot product) of the document (d2 in the figure to the right) and the query (q in the figure) vectors, $\left\|\mathbf{d_2}\right\|$ is the norm of vector d2, and $\left\|\mathbf{q}\right\|$ is the norm of vector q. The norm of a vector is calculated as such:
$$\left\|\mathbf{q}\right\| = \sqrt{\sum_{i=1}^{n} q_i^2}$$
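As a quick worked instance of the norm formula, take a hypothetical three-term query vector q = (1, 2, 2); the values are chosen only for illustration and do not come from the figure:

$$\left\|\mathbf{q}\right\| = \sqrt{1^2 + 2^2 + 2^2} = \sqrt{9} = 3$$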
As all vectors under consideration by this model are elementwise nonnegative, a cosine value of zero means that the query and document vector are orthogonal and have no match (i.e. the query term does not exist in the document being considered). See cosine similarity for further information.
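The ranking procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `norm`, `cosine` and `rank`, the three-word vocabulary, and the term-count vectors are all hypothetical, and returning 0 for an all-zero vector is an assumption of the sketch rather than part of the model.

```python
import math


def norm(v):
    """Euclidean norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


def cosine(d, q):
    """Cosine of the angle between document vector d and query vector q."""
    denom = norm(d) * norm(q)
    # Assumption of this sketch: an all-zero vector is treated as "no match"
    # instead of raising a division-by-zero error.
    if denom == 0:
        return 0.0
    dot = sum(di * qi for di, qi in zip(d, q))
    return dot / denom


def rank(docs, q):
    """Return (name, score) pairs sorted by decreasing cosine."""
    scores = [(name, cosine(vec, q)) for name, vec in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical term-count vectors over the vocabulary ("cat", "dog", "fish").
docs = {
    "d1": [2, 0, 1],
    "d2": [1, 1, 0],
    "d3": [0, 0, 3],
}
query = [1, 1, 0]  # the query mentions "cat" and "dog" only

ranking = rank(docs, query)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data, d2 points in the same direction as the query and ranks first, d1 shares one query term and ranks second, and d3 shares no query term, so its dot product with the query is zero and it ranks last.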