Commit Graph

58 Commits

Author SHA1 Message Date
ecac62dd6f DEV: Make search results blurb non-pg headlines setting dependent (#20939)
Followup to #20915. If we're grouping search results that don't rely on core's search, we won't have access to pg headlines. This is now configurable via the constructor, defaulting to `SiteSetting.use_pg_headlines_for_excerpt`
2023-04-03 11:09:36 -03:00
Sam
cd247d5322 FEATURE: Roll out new search optimisations (#20364)
- Reduce duplication of terms in post index from unlimited to 6. This will
result in reduced index size and reduced weighting for posts containing
a huge amount of duplicate terms. (Eg: a post containing "sam sam sam sam
sam sam sam sam", will index as "sam sam sam sam sam sam", only including
the word up to 6 times.) This corrects a flaw where title weighting could
be ignored.

- Prioritize exact matches of words in titles. Our search always performs
a prefix match. However we want to give special weight to exact title matches
meaning that a search for "sum" will find topics such as "the sum of us" vs
"summer in spring".

- Pick up fixes to our search algorithm which are missing from old indexes.
Specifically pick up the fix that indexes URLs properly. (`https://happy.com`
was stemmed to `happi` in keywords and then was not searchable)

see also:

https://meta.discourse.org/t/refinements-to-search-being-tested-on-meta/254158

Indexing will take a while and work in batches, in the background.
2023-02-20 11:53:35 +11:00
6417173082 DEV: Apply syntax_tree formatting to lib/* 2023-01-09 12:10:19 +00:00
8222810099 FIX: Limits for PM and group header search (#16887)
When searching for PMs or PMs in a group inbox, results in the header search were not being limited to 5 with a "More" link to the full page search. This PR fixes that.

It also simplifies the logic and updates the search API docs to include recently added `in:messages` and `group_messages:groupname` options.
2022-05-24 11:31:24 -04:00
4e5f5b67b0 DEV: Remove monkey patch that is no longer required (#16648) 2022-05-06 10:33:42 +08:00
34b4b53bac FEATURE: Use Postgres unaccent to ignore accents (#16100)
The search_ignore_accents site setting can be used to make the search
indexer remove the accents before indexing the content. The unaccent
function from PostgreSQL is better than Ruby's unicode_normalize(:nfkd).
2022-03-07 23:03:10 +02:00
930f51e175 FEATURE: Split up text segmentation for Chinese and Japanese.
* Chinese segmenetation will continue to rely on cppjieba
* Japanese segmentation will use our port of TinySegmenter
* Korean currently does not rely on segmentation which was dropped in c677877e4fe5381f613279901f36ae255c909573
* SiteSetting.search_tokenize_chinese_japanese_korean has been split
into SiteSetting.search_tokenize_chinese and
SiteSetting.search_tokenize_japanese respectively
2022-02-07 09:21:14 +08:00
Sam
5b342ae505 FIX: remove superfluous spaces from CJK blurbs (#12629)
Previously we used the raw data indexed to generate blurbs even for cases
when Chinese/Korean/Japanese text was used.

This caused superfluous spaces to show up in excerpts.
2021-04-12 12:46:42 +10:00
93f8396b4b FIX: Limit PG headline based search blurb generation to 200 characters.
* Recovers omission characters '...' in blurb as well.
2020-08-12 15:34:27 +08:00
2193d02433 PERF: Use PG headlines for blurb generation and highlighting for search. 2020-08-06 14:56:29 +08:00
255b0e9f14 PERF: Replace video and audio links in search blurb while indexing.
In the near future, we will be swtiching to PG headlines to generate the
search blurb. As such, we need to replace audio and video links in the
raw data used for headline generation. This also means that we avoid
replacing links each time we need to generate the blurb.
2020-08-06 12:25:03 +08:00
06ef87da51 DEV: Make rubocop happy. 2020-08-06 10:11:07 +08:00
ee5d8fba0c PERF: Optimize ActionView::Helpers::TextHelper#excerpt. 2020-08-06 09:59:20 +08:00
4c03a944f6 PERF: Move URI regexp in GroupSearchResults.blurb_for into constant
No need to generate the huge regexp over and over again.
2020-08-04 14:38:43 +08:00
181c4eb760 PERF: Avoid parsing Post#cooked with Nokogiri for every search. 2020-07-24 10:43:09 +08:00
ce39733b1a FIX: Incorrect search blurb when advanced search filters are used take2
Also remove include_blurbs attribute which isn't used.
2020-07-14 11:50:40 +08:00
cb1f891392 Revert "FIX: Incorrect search blurb when advanced search filters are used."
This change was causing advanced search filters to disappear from the search input

This reverts commit 2e1eafae0647f6db439d6337b26d83edbca42865.
2020-07-09 16:19:18 +01:00
2e1eafae06 FIX: Incorrect search blurb when advanced search filters are used. 2020-07-08 11:59:49 +08:00
0dfc594784 FIX: skip invalid URLs when checking for audio/video in search blurbs
Fixes 500 errors on search queries introduced in 580a4a8
2019-11-06 10:32:15 -05:00
15b25547bb DEV: Cleanup misspelled TextHelper param 2019-10-31 09:32:42 -04:00
f8b72d9835 DEV: Refactor excluding audio/video URLs from search result blurbs
Followup to 580a4a82
2019-10-31 09:13:24 -04:00
580a4a827b Exclude audio/video URLs from search result blurbs
Displays translatable "[audio]" or "[video]" placeholders instead of ugly (and often long) URLs.
2019-10-30 13:07:16 -04:00
4dcc5f16f1 FEATURE: when under extreme load disable search
The global setting disable_search_queue_threshold
(DISCOURSE_DISABLE_SEARCH_QUEUE_THRESHOLD) which default to 1 second was
added.

This protection ensures that when the application is unable to keep up with
requests it will simply turn off search till it is not backed up.

To disable this protection set this to 0.
2019-07-02 11:22:01 +10:00
30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
e2bcf55077 DEV: move send => public_send in lib folder
This handles most of the cases in `lib` where we were using send instead
of public_send
2019-05-07 12:25:44 +10:00
451f7842ff DEV: More send -> public_send. 2019-05-07 10:05:58 +08:00
dae0bb4c67 FIX: Post blurb incorrect when search contains a phrase match.
If the blurb generated is not around the search term, we will not be
able to highlight it on the client side.
2019-03-26 17:01:52 +08:00
dc4001370c FEATURE: displays groups in menu search (#7090) 2019-03-04 10:30:09 +01:00
4481836de2 FEATURE: new 'search_ignore_accents' site setting 2018-09-17 10:42:30 +02:00
2c56f8df7c FEATURE: show tags in search results 2017-08-25 11:52:59 -04:00
7c1d7fb423 Merge branch 'master' into fix_limited_search_results 2017-07-31 15:55:31 -04:00
5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
7b40de5ac4 Add attribute to grouped search results for more available posts. 2017-07-20 18:07:13 +02:00
21e02d6969 Include the search_log_id in search results 2017-07-17 12:10:32 -04:00
Sam
0a78ae739d Remove SearchObserver, aim is to remove all observers
rails-observers gem is mostly unmaintained and is a pain to carry forward
new implementation contains significantly less magic as a bonus
2016-12-22 13:13:14 +11:00
Sam
50f7616d04 FIX: include pinned status in search results 2016-03-18 16:26:20 +11:00
Sam
2876725e1b REFACTOR: remove hacky search from discovery 2015-07-27 16:47:06 +10:00
Sam
41ceff8430 UX: move search to its own route
previously search was bundled with discovery, something that makes stuff confusing internally
2015-07-27 16:47:06 +10:00
6422d5efbd Use the same component for similar topics as search results. 2015-06-24 15:08:22 -04:00
Sam
0ade9bafff FIX: highlight in yellow, not blue
FEATURE: highlight in title
2014-09-04 15:01:13 +10:00
Sam
9c29c1c072 FEATURE: highlight search results 2014-09-03 17:09:01 +10:00
Sam
f06ad7ed8e remove old unused files 2014-09-03 12:15:48 +10:00
Sam
4f09d552ed FEATURE: increase search expansion to 50 results
refactor search code to deal with proper objects
use proper serializers, test the controllers
2014-09-03 12:13:25 +10:00
Sam
69e418facf FEATURE: wider search with more context 2014-09-01 17:04:57 +10:00
e8cade40c7 Improve search results by introducing an aggregate post search data
filter. It seems performant despite the extra content being searched.
2014-08-22 16:56:26 -04:00
7ef61144e7 Avoid using to_s when performing String Interpolation 2014-08-14 23:55:27 +05:30
Sam
c5a3bfdfa9 BUGFIX: missing avatars in search 2014-05-29 14:38:52 +10:00
84d100be85 Add blurb of post to search results via API 2014-04-17 07:58:51 -05:00
Sam
abb2de22ab BUGFIX: search could break when expanding 2014-02-17 14:34:14 +11:00
Sam
2b10fdc97f FEATURE: search auto scopes on topic first 2014-02-17 13:54:51 +11:00