I was wondering if AI bots can scan Gemini sites via the existing HTTP proxies, and if so, wouldn't it be better to get rid of those proxies?
4 months ago
4 Replies
it would take like 2 hours to build a bot to scrape gemini. the way the protocol is built makes it really easy. i don't think taking down the on-ramp for new users would make any appreciable difference · 4 months ago
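To illustrate why the protocol is so easy to script against, here is a rough sketch of a single Gemini fetch in Python. It assumes the usual request shape (open a TLS connection to port 1965, send the URL followed by CRLF, read a `<status> <meta>` header line and then the body). The function names `build_request`, `parse_header`, and `fetch` are made up for this example, and certificate verification is switched off as a simplification, since many capsules use self-signed certificates; a real client would more likely pin certificates on first use.

```python
import socket
import ssl
from urllib.parse import urlparse


def build_request(url: str) -> bytes:
    # A Gemini request is just the absolute URL followed by CRLF.
    return (url + "\r\n").encode("utf-8")


def parse_header(header_line: str) -> tuple[int, str]:
    # The response header is "<two-digit status> <meta>", e.g. "20 text/gemini".
    status, _, meta = header_line.strip().partition(" ")
    return int(status), meta


def fetch(url: str, timeout: float = 10.0) -> tuple[int, str, bytes]:
    parsed = urlparse(url)
    host = parsed.hostname
    port = parsed.port or 1965  # default Gemini port

    # Assumption for this sketch: skip certificate verification entirely.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(url))
            data = b""
            # The server closes the connection when the response is complete.
            while chunk := tls.recv(4096):
                data += chunk

    header, _, body = data.partition(b"\r\n")
    status, meta = parse_header(header.decode("utf-8"))
    return status, meta, body
```

Calling `fetch("gemini://example.org/")` (a placeholder URL) would return the status code, the meta string, and the raw body. Redirects, input prompts, and error statuses are not handled here.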
I asked ChatGPT whether it would be able to parse a Gemini site, but it replied that it can't, since it's only able to parse http/https. · 4 months ago
throwing anubis or go-away on the proxies would most likely fix that issue, if it's even a concern.
honestly, I'm not super concerned anyway; to use a proxy website you'd first need a link to one with a gemini URL pre-filled in, which doesn't seem very common (it seems that people just link to gemini URLs directly).
regardless, AI crawlers are going to crawl everything they can get their grubby hands on; the real solution is to kill these companies (which will probably happen on its own eventually; AI is consistently unprofitable). I think the utility of the HTTP proxies is more important than the threat of AI scrapers, which I'm sure will pass. · 4 months ago
AI bots can scan Gemini just fine without an HTTP proxy. Also, they generally scan a website by following links or the sitemap.xml, just like a traditional crawler. They're unlikely to just sit there plugging random things into a proxy search; that's not an efficient use of their crawling time. · 4 months ago
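The "following links" part works the same way on Gemini itself: gemtext link lines start with `=>`, so a crawler only has to collect those and resolve them against the current page. Below is a minimal sketch of that step, assuming standard gemtext link syntax. `resolve` and `extract_links` are hypothetical helper names, and the scheme swap inside `resolve` is a workaround for Python's `urljoin`, which only resolves relative references for schemes it already knows.

```python
from urllib.parse import urljoin, urlparse


def resolve(base_url: str, ref: str) -> str:
    # Assumes base_url is a gemini:// URL.
    if urlparse(ref).scheme:
        return ref  # already absolute (gemini://, https://, mailto:, ...)
    # Borrow "https" for the join, then put "gemini" back afterwards.
    joined = urljoin("https" + base_url[len("gemini"):], ref)
    return "gemini" + joined[len("https"):]


def extract_links(gemtext: str, base_url: str) -> list[str]:
    links = []
    in_preformatted = False
    for line in gemtext.splitlines():
        # Lines starting with ``` toggle preformatted mode, where "=>" is plain text.
        if line.startswith("```"):
            in_preformatted = not in_preformatted
            continue
        if in_preformatted or not line.startswith("=>"):
            continue
        # A link line is "=>", optional whitespace, the URL, then an optional label.
        parts = line[2:].split(maxsplit=1)
        if parts:
            links.append(resolve(base_url, parts[0]))
    return links
```

A crawler would then queue every returned URL it has not seen yet. Nothing in that loop needs an HTTP proxy.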