FIX: Relevance search will now consider document length in ranking.

The default ranking option ranks posts by the number of matches, which is
highly problematic when posts are stuffed with a keyword. The rank is now
divided by the document length, which is a much fairer way to rank.
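
To illustrate the effect, here is a minimal sketch of how the two
normalization values behave. It is a simplified model of the flags
documented for PostgreSQL's ts_rank_cd, not its implementation: raw_rank
is a hypothetical stand-in for the unnormalized score, doc_length is
assumed to be a word count, and normalize_rank is an illustrative helper
that does not exist in the codebase.

```ruby
# Simplified model of the ts_rank_cd normalization bit flags.
# Flags are applied in ascending order, as the PostgreSQL docs describe.
NORM_LENGTH = 2   # divide the rank by the document length
NORM_UNIT   = 32  # scale the rank into zero..one via rank / (rank + 1)

# raw_rank and doc_length are hypothetical inputs for illustration only.
def normalize_rank(raw_rank, doc_length, flags)
  rank = raw_rank.to_f
  rank /= doc_length if (flags & NORM_LENGTH) != 0 && doc_length > 0
  rank = rank / (rank + 1) if (flags & NORM_UNIT) != 0
  rank
end

# A long, keyword-stuffed post versus a short, on-topic post (made-up numbers).
stuffed = { raw_rank: 5.0, doc_length: 500 }
concise = { raw_rank: 1.0, doc_length: 20 }

# Old behaviour (0|32): only the zero..one scaling is applied.
old_stuffed = normalize_rank(stuffed[:raw_rank], stuffed[:doc_length], 0 | 32)
old_concise = normalize_rank(concise[:raw_rank], concise[:doc_length], 0 | 32)

# New behaviour (2|32): divide by document length first, then scale.
new_stuffed = normalize_rank(stuffed[:raw_rank], stuffed[:doc_length], 2 | 32)
new_concise = normalize_rank(concise[:raw_rank], concise[:doc_length], 2 | 32)
```

With these made-up inputs, the stuffed post outranks the concise one
under 0|32, and the order flips under 2|32.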
Guo Xiang Tan
2019-04-01 13:40:11 +08:00
parent cadd1d670f
commit e87ca59401
2 changed files with 56 additions and 11 deletions


@@ -838,13 +838,14 @@ class Search
posts = posts.order("posts.like_count DESC")
end
else
-# 0|32 default normalization scaled into the range zero to one
+# 2|32 divides the rank by the document length and scales the range from
+# zero to one
data_ranking = <<~SQL
(
TS_RANK_CD(
post_search_data.search_data,
#{ts_query(weight_filter: weights)},
-0|32
+2|32
) *
(
CASE categories.search_priority