
For some reason ragpipeline stopped working out of nowhere TypeError: SimpleHybridRetriever._aretrieve() missing 1 required positional argument: 'query' #1222

cranyy opened this issue Apr 24, 2024 · 6 comments


cranyy commented Apr 24, 2024

  File "E:\Project\22222\MetaStocky\ooo.py", line 100, in query
    response = await asyncio.wait_for(self._engine.aquery(f"{OPTIONS_ANALYSIS_PROMPT}\n{question}"), timeout=120)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marin\AppData\Local\Programs\Python\Python311\Lib\asyncio\tasks.py", line 479, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 307, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\base\base_query_engine.py", line 65, in aquery
    query_result = await self._aquery(str_or_query_bundle)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 307, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 204, in _aquery
    nodes = await self.aretrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\22222\MetaStocky\metagpt\rag\engines\simple.py", line 168, in aretrieve
    nodes = await super().aretrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 148, in aretrieve
    nodes = await self._retriever.aretrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 307, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\base\base_retriever.py", line 276, in aretrieve
    nodes = await self._aretrieve(query_bundle=query_bundle)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: SimpleHybridRetriever._aretrieve() missing 1 required positional argument: 'query'



This started happening literally today. I haven't changed anything except updating to the newest version, and it worked fine two or three days ago when I last tested.
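For anyone hitting the same TypeError: it typically arises when a base class calls an overridden method with a keyword argument name the override doesn't declare. A minimal sketch of that mismatch (hypothetical classes, not the actual llama_index or MetaGPT code):

```python
import asyncio

# The base retriever passes the argument by keyword as `query_bundle`,
# so an override whose own required parameter is named `query` never
# receives it and Python reports it as missing.
class BaseRetriever:
    async def aretrieve(self, query_bundle):
        return await self._aretrieve(query_bundle=query_bundle)

class SimpleHybridRetriever(BaseRetriever):
    async def _aretrieve(self, query, **kwargs):
        return []

try:
    asyncio.run(SimpleHybridRetriever().aretrieve("some query"))
except TypeError as e:
    print(e)  # ... missing 1 required positional argument: 'query'
```

This matches the reporter's later finding that mismatched dependency versions were the trigger: a library update changed the calling convention out from under the subclass.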
seehi (Contributor) commented Apr 24, 2024

Is it reproducible? Could you provide more details, such as the code that was executed?

cranyy (Author) commented Apr 24, 2024

Hello, I solved it: the dependencies for the RAG were not updated properly. But now the bigger issue is that it gives this:

2024-04-24 05:40:33.851 | ERROR    | __main__:query:108 - Traceback (most recent call last):
  File "E:\Project\MetaStocky\ooo.py", line 100, in query
    response = await asyncio.wait_for(self._engine.aquery(f"{OPTIONS_ANALYSIS_PROMPT}\n{question}"), timeout=120)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\marin\AppData\Local\Programs\Python\Python311\Lib\asyncio\tasks.py", line 479, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\base\base_query_engine.py", line 46, in aquery
    return await self._aquery(str_or_query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 201, in _aquery
    nodes = await self.aretrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\metagpt\rag\engines\simple.py", line 175, in aretrieve
    nodes = await super().aretrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 147, in aretrieve
    return self._apply_node_postprocessors(nodes, query_bundle=query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 136, in _apply_node_postprocessors
    nodes = node_postprocessor.postprocess_nodes(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\postprocessor\types.py", line 55, in postprocess_nodes
    return self._postprocess_nodes(nodes, query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\postprocessor\llm_rerank.py", line 99, in _postprocess_nodes
    raw_choices, relevances = self._parse_choice_select_answer_fn(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\indices\utils.py", line 104, in default_parse_choice_select_answer_fn
    answer_num = int(line_tokens[0].split(":")[1].strip())
                     ~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

I remember that I had fixed it once before in utils.py from the llama_index core package:

import re
from typing import List, Tuple

def default_parse_choice_select_answer_fn(
    answer: str, num_choices: int, raise_error: bool = False
) -> Tuple[List[int], List[float]]:
    """Default parse choice select answer function."""
    answer_lines = answer.split("\n")
    answer_nums = []
    answer_relevances = []
    for answer_line in answer_lines:
        line_tokens = answer_line.split(",")
        if len(line_tokens) != 2:
            if not raise_error:
                continue
            else:
                raise ValueError(
                    f"Invalid answer line: {answer_line}. "
                    "Answer line must be of the form: "
                    "answer_num: <int>, answer_relevance: <float>"
                )
        answer_num = int(line_tokens[0].split(":")[1].strip())
        if answer_num > num_choices:
            continue
        answer_nums.append(answer_num)
        # extract just the first digits after the colon.
        _answer_relevance = re.findall(r"\d+", line_tokens[1].split(":")[1].strip())[0]
        answer_relevances.append(float(_answer_relevance))
    return answer_nums, answer_relevances

But after updating I can't remember what the issue was or how I had fixed it. I am prompting the engine with quite a long prompt, so that may be what is causing its malformed response.
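One way around malformed rerank lines is to make the parser defensive: extract the choice number and relevance with a regex and silently skip anything that doesn't match, including colon-heavy bullet lines like `- **OI**: 22`. A sketch of that idea (the function name and regex are my own, not the library's):

```python
import re
from typing import List, Tuple

# Hypothetical tolerant replacement for default_parse_choice_select_answer_fn.
# Only lines shaped like "Doc: <int>, Relevance: <number>" are accepted;
# prose, bullets, and stray colons are skipped instead of raising IndexError.
_CHOICE_RE = re.compile(
    r"Doc(?:ument)?\s*:?\s*(\d+)\s*,\s*Relevance\s*:\s*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)

def tolerant_parse_choice_select_answer_fn(
    answer: str, num_choices: int
) -> Tuple[List[int], List[float]]:
    answer_nums: List[int] = []
    answer_relevances: List[float] = []
    for line in answer.split("\n"):
        match = _CHOICE_RE.search(line)
        if match is None:
            continue  # unparseable line: skip rather than crash
        answer_num = int(match.group(1))
        if answer_num > num_choices:
            continue
        answer_nums.append(answer_num)
        answer_relevances.append(float(match.group(2)))
    return answer_nums, answer_relevances

answer = "Doc: 1, Relevance: 8\n- **OI**: 22\nDoc: 3, Relevance: 6.5"
print(tolerant_parse_choice_select_answer_fn(answer, 10))  # ([1, 3], [8.0, 6.5])
```

The trade-off is that silently dropped lines mean fewer reranked candidates, so it can still produce thin or empty responses when the LLM output is badly formatted overall.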

seehi (Contributor) commented Apr 24, 2024

This is because the answer from the LLM is incorrectly formatted.
What model are you using? gpt-4-turbo is preferred.

cranyy (Author) commented Apr 24, 2024

I am using gpt-4-turbo-preview, which I have always used, and it used to work properly; I remember fixing this once, and I remember that this specific code caused the issue, because the answer could sometimes be a malformed response. After some testing I found the issue: when the answer from the query contains a list with colons in it, it breaks the process. So it doesn't matter which LLM you are using; if it returns `Value1: key1, key2` and then `Value2: key, key`, it breaks and leads to an empty response:

INFO | llama_index.core.indices.utils:default_parse_choice_select_answer_fn:97 - Processing answer line: - **OI**: 22
2024-04-24 06:54:39.293 | WARNING | llama_index.core.indices.utils:default_parse_choice_select_answer_fn:100 - Invalid answer line format: - **OI**: 22
2024-04-24 06:54:39.294 | INFO | llama_index.core.indices.utils:default_parse_choice_select_answer_fn:97 - Processing answer line: - **Volume**: 140
2024-04-24 06:54:39.295 | WARNING | llama_index.core.indices.utils:default_parse_choice_select_answer_fn:100 - Invalid answer line format: - **Volume**: 140
2024-04-24 06:54:39.295 | INFO | llama_index.core.indices.utils:default_parse_choice_select_answer_fn:97 - Processing answer line: - **Mark**: 5.365
2024-04-24 06:54:39.296 | WARNING | llama_index.core.indices.utils:default_parse_choice_select_answer_fn:100 - Invalid answer line format: - **Mark**: 5.365

I can't be bothered to fix it in the code now, so I just hardcoded my prompt to never use ":".

cranyy (Author) commented Apr 24, 2024

Actually, my hardcoding the prompt to never use ":" doesn't work at all, it seems, now that I actually test it. The issue is when it creates multiple lines from a single answer, i.e. creates multiple answers for a single query, and that always breaks the process. And I can't for the life of me remember how to fix it now. I keep getting either empty responses or:

  File "E:\Project\MetaStocky\env\Lib\site-packages\llama_index\core\indices\utils.py", line 104, in default_parse_choice_select_answer_fn
    answer_num = int(line_tokens[0].split(":")[1].strip())
                     ~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

regardless of which LLM I use.

seehi (Contributor) commented May 6, 2024

When using an LLM to rerank, it's not always guaranteed that the output will be parseable for reranking.
You can print the result of splitting each answer_line on ",", which shows the difference:

Correct format:
['Doc: 1', ' Relevance: 8']

Wrong format:
['Based on the text'], [' here are some things']
['3. First of all: The ability'], [' as they need to be']
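To see exactly where the IndexError comes from, the default parser's splitting logic can be reproduced on a good and a bad line (a minimal standalone reproduction, not the library code):

```python
good = "Doc: 1, Relevance: 8"
bad = "Based on the text, here are some things"

# The good line splits cleanly into two tokens, each containing a ":".
print(good.split(","))  # ['Doc: 1', ' Relevance: 8']
print(int(good.split(",")[0].split(":")[1].strip()))  # 1

# The bad line's first token has no ":", so split(":") yields a single
# element and indexing [1] raises IndexError: list index out of range.
tokens = bad.split(",")
try:
    int(tokens[0].split(":")[1].strip())
except IndexError as e:
    print("IndexError:", e)
```

This is why the failure is independent of the LLM used: any response line without a colon before the first comma triggers the same index error in the default parser.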
