Commit Graph

1794 Commits

Author SHA1 Message Date
Nicolas 1f779e261a Update rate-limiter.ts 2024-08-22 18:30:45 -03:00
Gergő Móricz 8e3c2b2855 fix(crawler): verify URL 2024-08-22 23:30:19 +02:00
Gergő Móricz e690a6fda7 fix: remove QueueEvents 2024-08-22 22:38:39 +02:00
Gergő Móricz 76c8e9f996 fix 2024-08-22 22:24:24 +02:00
Gergő Móricz ad82175fb8 fix(scrape): poll 2024-08-22 22:12:02 +02:00
rafaelsideguide 5f60a55967 workflow and npm now running v1 tests 2024-08-22 15:28:49 -03:00
rafaelsideguide 30e809966f Merge remote-tracking branch 'origin/v1/python-sdk' into v1-webscraper 2024-08-22 15:18:05 -03:00
rafaelsideguide a37681bdff fix: replace jest, removed map for v0 2024-08-22 15:16:46 -03:00
rafaelsideguide 7473b74021 fix: html and rawlhtmls for pdfs 2024-08-22 15:15:45 -03:00
Gergő Móricz dd737f1235 feat(sentry): add queue instrumentation to 2024-08-22 19:17:51 +02:00
Nicolas d2521612b4 Update .gitignore 2024-08-22 14:15:19 -03:00
Gergő Móricz 7265ab7c67 fix(search): filter docs properly 2024-08-22 18:46:56 +02:00
rafaelsideguide b1d61d8557 Merge remote-tracking branch 'origin/v1-webscraper' into v1/python-sdk 2024-08-22 13:39:09 -03:00
rafaelsideguide ab88a75c70 fixes sdks 2024-08-22 13:38:34 -03:00
Gergő Móricz d036738da0 fix(bullmq): duplicate redis connection for QueueEvents 2024-08-22 18:04:09 +02:00
Gergő Móricz 6d48dbcd38 feat(sentry): add trace continuity for queue 2024-08-22 16:47:38 +02:00
Gergő Móricz 6d92b8524d feat(scrape): record job result in span 2024-08-22 16:00:13 +02:00
Gergő Móricz 5ca36fe9fc feat(api): add more captureExceptions 2024-08-22 15:49:16 +02:00
Gergő Móricz 0e8fd6ce70 fix(scrape): ensure extractionSchema is an object if llm-extraction is specified 2024-08-22 14:50:51 +02:00
Gergő Móricz 4bd2ff26d3 fix(llm-extract): pass stacktrace properly 2024-08-22 14:37:09 +02:00
Gergő Móricz e4adbaa88e fix(llm-extract): handle llm-extract if scrape failed 2024-08-22 14:12:52 +02:00
Gergő Móricz 670d253a8c fix(auth): fix error reporting 2024-08-22 14:08:09 +02:00
Gergő Móricz 7d9f5bf8b1 fix(crawl): don't use sitemap if it's empty (Fixes FIRECRAWL-SCRAPER-JS-11) 2024-08-22 13:41:33 +02:00
Gergő Móricz 1f580deefc fix(crawl): validate includes.excludes regexes 2024-08-22 13:29:11 +02:00
Gergő Móricz fbbc3878f1 fix(crawler): make sure includes/excludes is an array 2024-08-22 13:18:26 +02:00
Gergő Móricz 508568f943 fix(search): handle scrape timeouts on search (Fixes FIRECRAWL-SCRAPER-JS-15) 2024-08-22 13:10:58 +02:00
Gergő Móricz 14fa75cae6 fix(crawl): send error if url is not a string (Fixes FIRECRAWL-SCRAPER-JS-1E and FIRECRAWL-SCRAPER-JS-Z) 2024-08-22 13:09:08 +02:00
Nicolas 8a778278a9 Merge branch 'main' into nsc/job-priority 2024-08-21 22:57:55 -03:00
Gergo Moricz 0cdf41587e feat(sentry): add error handles to try-catch blocks 2024-08-22 03:55:40 +02:00
Nicolas 53ca704620 Update index.ts 2024-08-21 22:55:39 -03:00
Nicolas 477c3257dc Nick: 2024-08-21 22:53:33 -03:00
Nicolas c7bfe4ffe8 Nick: 2024-08-21 22:20:40 -03:00
Nicolas 6bdb1d045d Merge branch 'main' into nsc/job-priority 2024-08-21 21:52:05 -03:00
Nicolas e78d2af1f0 Nick: 2024-08-21 21:51:54 -03:00
Nicolas e64d3815ea Merge branch 'main' into nsc/job-priority 2024-08-21 20:54:57 -03:00
Nicolas 0ea0a5db46 Nick: wip 2024-08-21 20:54:39 -03:00
rafaelsideguide 0b37cbce4a Update .gitignore 2024-08-21 15:58:51 -03:00
rafaelsideguide a4686e3c8c fixing tests 2024-08-21 15:56:48 -03:00
rafaelsideguide fe2e8c0b7a includehtml fix 2024-08-21 15:54:00 -03:00
Gergő Móricz 629da74a5c fix(sentry): decrease tracesSampleRate 2024-08-21 20:51:35 +02:00
Gergő Móricz 55009e51f5 fix: filter out invalid URLs from crawl links 2024-08-21 20:49:25 +02:00
Gergő Móricz dae1408e66 fix(Dockerfile): retain sentry auth token properly 2024-08-21 20:40:42 +02:00
Gergő Móricz ac9783ed2f fix(sentry): adjust profiles sample rate to be even lower 2024-08-21 20:21:16 +02:00
Gergő Móricz 9579f03c4b fix: import resolution 2024-08-21 20:16:06 +02:00
Gergő Móricz 6104d74213 fix(sentry): drop profiling sample rate 2024-08-21 20:12:47 +02:00
Gergő Móricz 3d5dc9d90a feat(sentry): add log + server name 2024-08-21 19:39:10 +02:00
Nicolas 79f5d49d3f Merge pull request #562 from mendableai/nsc/sentry (Added Sentry Monitoring) 2024-08-21 14:22:29 -03:00
Gergő Móricz 85ff0c311e Add worker ID to job attribute 2024-08-21 19:21:29 +02:00
Gergő Móricz 3ad9bf7ac0 Update GH Actions deployment 2024-08-21 19:15:25 +02:00
Gergő Móricz 920702cdde Update builder to handle uploading sourcemaps 2024-08-21 19:08:03 +02:00