4 Comments
Apr 2, 2023 · Liked by Aaron Batilo

I wonder if you would consider putting the Go rewrite of LlamaIndex into an actual library. Getting this stuff faster would be awesome, as we're also struggling with the current sluggishness of it. It would be great if you could share a blog post on this alone; it sounds super promising (although I'm not familiar with Go).

author

I didn't do a complete rewrite, only the piece that I used for this. I don't know if I'll do a whole post on just this, but I'm glad to talk about it! Feel free to email me: AaronBatilo@gmail.com

Maybe I can do a micro post or something too.


Cool, just thinking it might be worth it. So you've basically still been working with text embeddings using GPTSimpleVectorIndex? Right now, if we have a JSON index of about 800MB and want a decent number of refinements (like 5-10), it takes forever for gpt-3.5-turbo to answer (1-2 minutes)... I thought it might be due to bandwidth limitations or similar, but if it's actually the Python code of LlamaIndex, I'd love to understand what could be done about that :)

author

The majority of the implementation is right here:

https://gist.github.com/abatilo/56521166eae5812a116bb1476e1a764f

My "data set" is only 15 files, and the `index.json` is only about 300KB, so it's very small. If you have a large number of documents to run your cosine (or other) similarity search through, I'd suggest the next thing you look at is a vector database like Milvus, so you can do your searches more quickly to build the surrounding context information.

I'm not entirely sure what you're referring to when you say that you have "refinements". Is that just what you call it when you're looking for the right vectors to use for your question?
