Commit Graph

1835 Commits

Author SHA1 Message Date
Eric Ciarla 20b998e66a Delete o1_job_recommender.ipynb 2024-09-26 14:51:07 -04:00
Eric Ciarla 5c4d436f1e Create o1_job_recommender.py 2024-09-26 14:46:48 -04:00
Eric Ciarla 51bc2f25fe remove actions crawler 2024-09-26 11:44:55 -04:00
Eric Ciarla 289af6f89e example 2024-09-25 21:10:09 -04:00
Nicolas a9773a24a3 Nick: increased timeout for chrome-cdp due to smart wait 2024-09-25 19:27:02 -04:00
Eric Ciarla abdc08edea Merge pull request #679 from h4r5h4/fix/folder-name
remove space in the examples/o1_web_crawler folder name
2024-09-25 10:40:09 -04:00
Nicolas 1da026b26e Update single_url.ts 2024-09-24 23:29:48 -04:00
Nicolas b8266cc329 Update website_params.ts 2024-09-24 23:28:58 -04:00
Gergő Móricz f00c0b82f9 fix(v1/scrape): add total wait specified in request to timeout 2024-09-24 21:56:22 +02:00
Nicolas 3f138e559e Update website_params.ts 2024-09-24 15:14:26 -04:00
Gergő Móricz 43730b5db6 feat(WebScraper): always report error of last scraper in order 2024-09-24 20:03:49 +02:00
Gergő Móricz 3e661a2087 fix(v1/crawl-cancel): avoid double authing 2024-09-24 20:01:34 +02:00
Nicolas 86744f6deb Update README.md 2024-09-24 13:22:09 -04:00
Gergő Móricz 4194525640 fix(blocklist): unblock TikTok Business page
This is just a regular business site, not social media.
2024-09-24 16:55:19 +02:00
Gergő Móricz 4a623c084a fix(fly): don't use Depot builders (doesn't work) 2024-09-24 10:50:30 +02:00
Gergő Móricz a59b5836d5 Revert error tallying 2024-09-24 10:27:49 +02:00
Gergő Móricz a4b128e8b7 fix(rust): blocklisted error test 2024-09-23 23:03:00 +02:00
Gergő Móricz 483f97d21b fix(v0/search): don't sent scrape fail errors to Sentry 2024-09-23 18:49:27 +02:00
Gergő Móricz d927cafeea fix(queue-worker): don't send scraping errors to sentry 2024-09-23 18:48:01 +02:00
Gergő Móricz 677faa27f3 fix(WebScraper): explicitly ignore 404s 2024-09-23 18:47:07 +02:00
Gergő Móricz 83d8287c14 fix(v0, sentry): don't send all scraping methods failed errors to Sentry 2024-09-23 18:40:21 +02:00
Gergő Móricz d2f7031069 fix(WebScraper): fatal error handler triggering for 404s 2024-09-23 18:33:10 +02:00
Nicolas 4721aa1687 Merge pull request #690 from mendableai/nsc/search-fix-version-v0
Fix the error message when trying search in v0
2024-09-21 21:12:01 -04:00
Nicolas 848a2b364a Update package.json 2024-09-21 21:11:23 -04:00
Nicolas dfdbae74c6 Update fireEngine.ts 2024-09-21 21:10:05 -04:00
Nicolas fbb5f23016 Update index.ts 2024-09-21 20:53:33 -04:00
Nicolas 607e46267c Update package.json 2024-09-20 19:46:17 -04:00
Nicolas db161ac55a Nick: press + write 2024-09-20 19:45:23 -04:00
Nicolas 380dcc2fd6 Merge pull request #682 from mendableai/feat/actions
feat: Actions
2024-09-20 18:43:19 -04:00
Nicolas 3fc5ce17d2 Nick: fixed error handling for v0 scrape 2024-09-20 18:35:30 -04:00
Nicolas 0690cfeaad Merge branch 'main' into feat/actions 2024-09-20 18:24:13 -04:00
Gergő Móricz 95e4c8920b fix(sdk/rust): license 2024-09-20 21:55:05 +02:00
Gergő Móricz e1a34b0a99 Revert "feat(scrape): scroll down/up with actions if fullpagescreenshot"
This reverts commit 815bfc8f07.
2024-09-20 21:43:22 +02:00
Gergő Móricz 815bfc8f07 feat(scrape): scroll down/up with actions if fullpagescreenshot
revert this if unneeded
2024-09-20 21:42:09 +02:00
Gergő Móricz d663bbf0ca feat(actions): add scroll 2024-09-20 21:41:53 +02:00
Gergő Móricz 3dd912ec91 feat(actions): add typeText, pressKey, fix playwright screenshot/waitFor 2024-09-20 21:02:53 +02:00
Gergő Móricz 719dfbccbb Update docs 2024-09-20 20:30:46 +02:00
Gergő Móricz 939040bf44 Update docs and example 2024-09-20 20:10:11 +02:00
Gergő Móricz 3ec0bbe28d feat(sdk/rust/crawl): paginate through results 2024-09-20 20:10:11 +02:00
Gergő Móricz a078cdbd9d Rust SDK 1.0.0 2024-09-20 20:10:11 +02:00
Gergő Móricz 93a20442e3 feat(sdk/rust): first batch of changes for 1.0.0 2024-09-20 20:10:11 +02:00
Eric Ciarla 6aa468163e Update README.md 2024-09-20 09:10:46 -04:00
Nicolas 74565a9da3 Merge pull request #639 from yekkhan/main
feat: kubernetes example optimization
2024-09-19 18:59:42 -04:00
Nicolas bbb8d41850 Merge pull request #623 from itasli/patch-1
fix wrong link to self host documentation
2024-09-19 18:57:54 -04:00
Nicolas 9233182464 Update README.md 2024-09-19 18:57:38 -04:00
Nicolas 48eb6fc494 Merge branch 'main' into pr/623 2024-09-19 18:56:18 -04:00
Nicolas b2f61da7c6 Nick: clarification on open source vs cloud 2024-09-19 17:56:26 -04:00
Gergő Móricz 506d5c2716 Revert "return links"
This reverts commit 2d597672be.
2024-09-19 20:07:06 +02:00
Anjor Kanekar 2d597672be return links 2024-09-19 20:06:07 +02:00
anjor c45a132cd5 Remove print statement in map 2024-09-19 20:06:07 +02:00