6 Comments
Oct 17 (Liked by Paul Iusztin)

What about the context window? I mean, how will you handle 10,000 posts?

author

You have to write a batch system that splits the 10,000 posts into batches of roughly 10-100 items each. APIs such as OpenAI's support batch calls.
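
As a rough illustration of the splitting step, here is a minimal sketch. The `split_into_batches` helper, the batch size of 50, and the placeholder `posts` list are all assumptions made for the example; the actual LLM or batch API call is left out and would run once per batch.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(items: Sequence[T], batch_size: int = 50) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Placeholder data standing in for the 10,000 crawled posts (hypothetical).
posts = [f"post {i}" for i in range(10_000)]

# Each batch is small enough to be sent as one request or one batch job.
batches = list(split_into_batches(posts, batch_size=50))
```

The batch size is best tuned to the model's context window: longer posts call for smaller batches.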

Oct 15 (Liked by Paul Iusztin)

I worked on a similar project about 9 months ago and encountered challenges with the Instaloader package due to Instagram's rate limits. My script got detected after roughly 200 requests. Have you found a workaround for this issue? I'd be interested in hearing about any solutions you've discovered.

author

Have you tried using a proxy? Or sleeping after 100 requests?
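
For the sleeping part, a minimal sketch of what I mean. The `throttled` helper and the pause values are hypothetical, it does not call Instaloader itself, and pausing like this is not guaranteed to avoid detection.

```python
import random
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttled(
    requests: Iterable[T],
    pause_every: int = 100,        # assumed threshold, tune for your case
    base_pause_s: float = 60.0,    # assumed pause length in seconds
    jitter_s: float = 15.0,        # random extra so pauses are not uniform
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items unchanged, pausing after every `pause_every` of them."""
    for count, request in enumerate(requests, start=1):
        yield request
        if count % pause_every == 0:
            sleep(base_pause_s + random.uniform(0, jitter_s))
```

You would wrap the list of profiles or post IDs in `throttled(...)` and make one request per yielded item. The `sleep` parameter is only there so the pause can be swapped out when testing.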


Yes, I've already used those methods, but it still got detected.

author

Hmm... Maybe you can find something useful in this article: https://decodingml.substack.com/p/highly-scalable-data-ingestion-architecture?r=1ttoeh
