
Batch updates lead to high IOWAIT and huge disk write pressure #4260

Open
rohan-uiuc opened this issue May 17, 2024 · 0 comments
Labels
bug Something isn't working

Comments

rohan-uiuc commented May 17, 2024

Current Behavior

When using the batch update operation in a loop to update payload fields via SetPayload operations, all requests start timing out, with a consistently high IOWAIT of about 45%, disk writes as high as 250 MB/s, and disk reads around 150 MB/s.

Steps to Reproduce

# Method on a class that holds self.qdrant_client (a QdrantClient instance).
# Requires: import os; from typing import List; from qdrant_client import models

def add_document_groups_to_documents(self, course_name: str, documents: List[dict], doc_group_name: str):
    """
    Add document groups to documents in the vector database.
    """
    update_operations = []
    for document in documents:
        # Match each document by its URL if present, otherwise by its S3 path.
        key = "url" if document.get("url") else "s3_path"
        value = models.MatchValue(value=document[key])
        search_filter = models.Filter(must=[
            models.FieldCondition(key="course_name", match=models.MatchValue(value=course_name)),
            models.FieldCondition(key=key, match=value),
        ])

        # Append the new group name to the document's existing groups.
        payload = {
            "doc_groups": [group["name"] for group in document["doc_groups"]] + [doc_group_name],
        }

        update_operations.append(models.SetPayloadOperation(
            set_payload=models.SetPayload(payload=payload, filter=search_filter),
        ))

    print(f"update_operations for qdrant: {len(update_operations)}")
    result = self.qdrant_client.batch_update_points(
        collection_name=os.environ['QDRANT_COLLECTION_NAME'],
        update_operations=update_operations,
        wait=False)
    return result

Context (Environment)

The two main payload fields involved are course_name and doc_groups. The collection has 1,675,570 points for a given course_name, and I am updating doc_groups for all of them in a loop with a maximum batch size of 1,500 operations. After a few initial batches, all queries to the database start timing out.
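
For reference, the method above is driven by a paginated loop roughly like the sketch below; fetch_documents_page and vector_db are hypothetical stand-ins for our actual pagination and wrapper code, shown only to illustrate how the batches are produced:

# Hypothetical driver loop, reconstructed for illustration only.
# fetch_documents_page() stands in for our actual pagination logic.
BATCH_SIZE = 1500

page = 1
while True:
    documents = fetch_documents_page(course_name, page, page_size=BATCH_SIZE)
    if not documents:
        break
    try:
        vector_db.add_document_groups_to_documents(course_name, documents, doc_group_name)
    except Exception as e:
        print(f"Failed to fetch/update documents for page {page} and doc_group {doc_group_name} due to: {e}")
    page += 1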

Some logs:

update_operations for qdrant: 4
update_operations for qdrant: 4
update_operations for qdrant: 466
Failed to fetch/update documents for page 1 and doc_group group_1, group_2 due to: timed out
update_operations for qdrant: 1499
Failed to fetch/update documents for page 1 and doc_group group_3, group_4 due to: timed out
update_operations for qdrant: 1499
Failed to fetch/update documents for page 1 and doc_group group_5 due to: timed out
update_operations for qdrant: 1
update_operations for qdrant: 10
Failed to fetch/update documents for page 1 and doc_group group_6 due to: timed out

Is there a better way to implement such an operation?
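
For instance, would a throttled variant along these lines be the recommended approach? This is only a rough sketch on my end: the 100-operation chunk size is an arbitrary guess, and I have not verified that wait=True actually relieves the write pressure.

# Untested sketch: send smaller chunks and block until each is applied,
# hoping to throttle disk write pressure. CHUNK_SIZE is an arbitrary guess.
CHUNK_SIZE = 100

for i in range(0, len(update_operations), CHUNK_SIZE):
    self.qdrant_client.batch_update_points(
        collection_name=os.environ['QDRANT_COLLECTION_NAME'],
        update_operations=update_operations[i:i + CHUNK_SIZE],
        wait=True,  # wait for the chunk to be applied before sending the next
    )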

rohan-uiuc added the bug label May 17, 2024