0559a4736a
FIX: don't double request when downloading a file
2018-02-24 12:35:57 +01:00
b6277e208b
FIX: Cookies header didn't have the right format
2018-02-19 12:46:57 +01:00
fa5880e04f
PERF: ability to crawl for titles without extra HEAD req
...
Also, introduces a much more aggressive timeout for title crawling
and introduces gzip for the body that is crawled.
2018-01-29 15:40:12 +11:00
1dd2b51059
remove redundant stubs
2017-10-18 12:10:30 +11:00
8185b8cb06
FEATURE: cache https redirects per hostname
...
If a hostname performs an HTTPS redirect, we cache that so the next
lookup does not incur it.
Also, only rate limit per IP once per final destination.
Raise final destination protection to 1000 IP lookups an hour.
2017-10-17 16:22:54 +11:00
70bb2aa426
FEATURE: allow specifying s3 config via globals
...
This refactors handling of S3 so it can be specified via GlobalSetting.
This means that in a multisite environment you can configure S3 uploads
without the individual sites knowing the S3 credentials.
It is a critical setting for situations where assets are mirrored to S3.
2017-10-06 16:20:01 +11:00
5324c01209
FIX: Don't raise an error if reading from a URL times out.
2017-09-27 14:53:22 +08:00
367fb1c524
FIX: Onebox fails on encoded URL.
...
https://meta.discourse.org/t/onebox-breaks-if-theres-chinese-text-in-url/67364
2017-09-26 18:34:54 +08:00
6cd8203686
FIX: allows onebox to force GET for hosts returning wrong headers on HEAD
2017-08-08 11:44:27 +02:00
b059a0f789
extract URL escaping to a dedicated class method and improve tests
2017-07-29 22:16:51 +05:30
1fe553873c
FIX: preserve fragment identifier when escaping url
2017-07-29 17:22:45 +05:30
b534778f46
FIX: Escape URL before attempting to resolve it.
2017-07-18 10:04:24 +09:00
db485ae0da
FIX: Support for skipping redirects on certain domains (like steam)
2017-06-26 15:38:43 -04:00
009f0921dc
FEATURE: Whitelist hosts for internal crawling
2017-06-13 12:59:54 -04:00
a3729b51eb
FIX: Always allow the host the forum is hosted on
2017-06-12 13:22:51 -04:00
53b95f009f
FIX: If HEAD is not supported, try GET. Also set cookies
2017-06-06 13:53:49 -04:00
56f98de7b2
Use webmock to stub external web requests.
2017-05-26 15:19:09 +08:00
f8f1548fd4
Revert "FIX: Use Excon to do its own stubbing"
...
This reverts commit 80af54460a81bf1593cacb1f0b23e0e0a42c8076.
2017-05-26 13:04:25 +08:00
3b0cbf7013
FIX: Always allow downloads from CDN
2017-05-23 16:32:54 -04:00
b81e7be9a1
FEATURE: Rate limit how often we'll crawl a destination IP
2017-05-23 15:03:04 -04:00
36e477750c
FIX: Use same code path for downloading images
2017-05-23 14:51:30 -04:00
e5e7a15a85
SECURITY: Never crawl by IP
2017-05-23 13:07:18 -04:00
93a5fc62bf
FEATURE: A site setting to prevent crawling on private IP blocks
2017-05-23 11:56:06 -04:00
80af54460a
FIX: Use Excon to do its own stubbing
2017-05-22 18:19:20 -04:00
b51126dd5e
FIX: Reset the WebMock after before every test
2017-05-22 17:52:31 -04:00
4c690f7089
Use FinalDestination to ensure public redirects for onebox
2017-05-22 16:42:49 -04:00
b23fc2bf84
Helper to find the final destination for a URL
2017-05-22 15:52:41 -04:00