LangChain (Part 2)

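The previous article gave a brief introduction to the LangChain framework and its building blocks. This article looks more closely at its most important components.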

1. Agent

1.1 How an agent works through a task

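An agent uses tools to help a language model (LLM) work through a task. Typically, the agent helps the model split the main task into several subtasks, which may build on one another in sequence or run independently in parallel, hands each subtask to the tool whose description matches it best, and finally passes the results back to the model to compose the answer. For example: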

> Entering new AgentExecutor chain...
 I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.
Action: Search
Action Input: "Leo DiCaprio girlfriend"
Observation: Camila Morrone
Thought: I need to find out Camila Morrone's age
Action: Search
Action Input: "Camila Morrone age"
Observation: 25 years
Thought: I need to calculate 25 raised to the 0.43 power
Action: Calculator
Action Input: 25^0.43
Observation: Answer: 3.991298452658078
 
Thought: I now know the final answer
Final Answer: Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.991298452658078.
 
> Finished chain.

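As the trace shows, the agent split the overall question, "find Leo DiCaprio's girlfriend and raise her age to the 0.43 power", into three subtasks: "who is Leo DiCaprio's girlfriend", "how old is she", and "raise that age to the 0.43 power".
Each subtask runs through four steps: Thought, Action, Action Input, Observation. In this trace, the Final Answer is emitted once the model's Thought states that it now knows the final answer.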
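The loop behind such a trace can be sketched in plain Python. This is a minimal illustration, not LangChain's implementation: `scripted_llm`, `SCRIPT`, `FACTS`, and `run_agent` are hypothetical names, and the "LLM" simply replays canned responses so the example runs without any model or network access.

```python
import re
from typing import Callable, Dict

# Matches the "Action: ... / Action Input: ..." pair in one model response.
ACTION_RE = re.compile(r"Action:(.*?)\nAction Input:(.*)", re.DOTALL)

# Hypothetical canned responses standing in for a real LLM.
SCRIPT = [
    'Thought: I need to find the girlfriend\nAction: Search\nAction Input: "Leo DiCaprio girlfriend"',
    'Thought: I need to find her age\nAction: Search\nAction Input: "Camila Morrone age"',
    "Thought: I need to calculate 25 raised to the 0.43 power\nAction: Calculator\nAction Input: 25^0.43",
]

# Hypothetical lookup table standing in for a search engine.
FACTS = {"Leo DiCaprio girlfriend": "Camila Morrone", "Camila Morrone age": "25 years"}


def scripted_llm(scratchpad: str) -> str:
    """Pick the next canned response based on how many observations exist so far."""
    step = scratchpad.count("Observation:")
    if step < len(SCRIPT):
        return SCRIPT[step]
    last_observation = scratchpad.rsplit("Observation:", 1)[-1].strip()
    return f"Thought: I now know the final answer\nFinal Answer: {last_observation}"


def calculator(expression: str) -> str:
    """Evaluate a single 'base^exponent' expression."""
    base, exponent = expression.split("^")
    return str(float(base) ** float(exponent))


def run_agent(llm: Callable[[str], str], tools: Dict[str, Callable[[str], str]],
              question: str, max_steps: int = 5) -> str:
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        output = llm(scratchpad)
        scratchpad += output + "\n"
        if "Final Answer:" in output:  # the model decided it is done
            return output.split("Final Answer:")[-1].strip()
        match = ACTION_RE.search(output)
        if match is None:
            raise ValueError(f"Could not parse LLM output: {output!r}")
        tool_name = match.group(1).strip()
        tool_input = match.group(2).strip().strip('"')
        observation = tools[tool_name](tool_input)  # run the chosen tool
        scratchpad += f"Observation: {observation}\n"
    raise RuntimeError("agent did not finish within max_steps")


TOOLS = {"Search": lambda query: FACTS[query], "Calculator": calculator}
answer = run_agent(scripted_llm, TOOLS, "What is her age raised to the 0.43 power?")
print(answer)
```

Each pass through the loop produces one Thought/Action/Action Input block, runs the tool, and appends the Observation to the scratchpad that the next model call sees.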

1.2 Agent types

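LangChain ships four basic agent types. They differ in what the LLM is used to decide and in the order in which the resulting actions are executed.

zero-shot-react-description: chooses which tool to use based solely on each tool's description
react-docstore: interacts with a document store; a Search tool and a Lookup tool must be provided
self-ask-with-search: looks up factual answers to questions
conversational-react-description: intended for conversational settings; uses memory to remember earlier turns of the conversation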
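To make "based solely on each tool's description" concrete, the sketch below shows one way tool names and descriptions could be rendered into the text the model sees. The function `render_tool_section` and the exact wording of the rendered text are assumptions for illustration, not LangChain's actual prompt template.

```python
from typing import List, Tuple


def render_tool_section(tools: List[Tuple[str, str]]) -> str:
    """Render (name, description) pairs into a prompt fragment (hypothetical format)."""
    lines = [f"{name}: {description}" for name, description in tools]
    names = ", ".join(name for name, _ in tools)
    return "\n".join(lines) + f"\n\nAction: the action to take, should be one of [{names}]"


tool_section = render_tool_section([
    ("Search", "useful for when you need to answer questions about current events"),
    ("Calculator", "useful when you need to answer questions about math"),
])
print(tool_section)
```

Because the description is the only information about a tool that reaches the model in this sketch, a vague description leads directly to poor tool choices.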

1.3 Custom agents

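from typing import Any, List, Tuple, Union
from langchain.agents import BaseSingleActionAgent
from langchain.schema import AgentAction, AgentFinish

class FakeAgent(BaseSingleActionAgent):
    """Fake Custom Agent."""

    @property
    def input_keys(self):
        return ["input"]

    def plan(
        self, intermediate_steps: List[Tuple[AgentAction, str]], **kwargs: Any
    ) -> Union[AgentAction, AgentFinish]:
        return AgentAction(tool="Search", tool_input=kwargs["input"], log="")

    async def aplan(
        self, intermediate_steps: List[Tuple[AgentAction, str]], **kwargs: Any
    ) -> Union[AgentAction, AgentFinish]:
        # Async counterpart of plan; the base class declares it abstract as well
        return AgentAction(tool="Search", tool_input=kwargs["input"], log="")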

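The above is a simple custom agent whose only meaningful method is plan. Its input, intermediate_steps, is a list of tuples, each pairing an earlier AgentAction with the observation that action produced. The return type Union[AgentAction, AgentFinish] is a type hint meaning that plan returns exactly one of the two: an AgentAction to request another tool call, or an AgentFinish to end the run. It is not a pair that holds both at once. The plan in this example ignores intermediate_steps entirely; a real agent would inspect the earlier steps to decide whether to switch to a different action or to finish with an AgentFinish.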
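The following sketch shows a plan function that actually uses the earlier steps: it searches once, then finishes with the observation. It uses plain dataclasses (`Action`, `Finish`) as hypothetical stand-ins for AgentAction and AgentFinish so it runs without LangChain installed; the driver loop `run` is likewise an assumption about how an executor might call plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Action:  # stand-in for AgentAction
    tool: str
    tool_input: str


@dataclass
class Finish:  # stand-in for AgentFinish
    output: str


def plan(intermediate_steps: List[Tuple[Action, str]], user_input: str) -> Union[Action, Finish]:
    """Return exactly one of Action or Finish, never both."""
    if not intermediate_steps:
        # Nothing has been tried yet: ask the Search tool.
        return Action(tool="Search", tool_input=user_input)
    # A search already ran: finish with its observation.
    _, last_observation = intermediate_steps[-1]
    return Finish(output=last_observation)


def run(user_input: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Call plan repeatedly, executing tools until it returns a Finish."""
    steps: List[Tuple[Action, str]] = []
    while True:
        decision = plan(steps, user_input)
        if isinstance(decision, Finish):
            return decision.output
        observation = tools[decision.tool](decision.tool_input)
        steps.append((decision, observation))


result = run("langchain", {"Search": lambda query: f"results for {query}"})
print(result)
```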

1.4 Tool retrieval

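A common approach is to store an embedding of each tool's description in a vector store. For an incoming query, we embed the query and run a similarity search to find the relevant tools. For example: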

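from langchain.agents import Tool
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.schema import Document
from langchain.vectorstores import FAISS

def fake_func(inp: str) -> str:
    return "foo"

# Dummy tools: names count up (foo-0 .. foo-9) while the numbers in the descriptions count down (10 .. 1)
fake_tools = [
    Tool(
        name=f"foo-{i}",
        func=fake_func,
        description=f"a silly function that you can use to get more information about the number {10-i}",
    )
    for i in range(10)
]
# search_tool and math_tool are assumed to be defined elsewhere (math_tool is shown in section 1.5)
ALL_TOOLS = [search_tool, math_tool] + fake_tools

# Build one Document per tool from its description
docs = [Document(page_content=t.description, metadata={"index": i}) for i, t in enumerate(ALL_TOOLS)]

embedding = HuggingFaceEmbeddings(model_name='../hkunlp/instructor-xl')
vector_store = FAISS.from_documents(docs, embedding)  # build the vector store
retriever = vector_store.as_retriever()

def get_tools(query):
    docs = retriever.get_relevant_documents(query)  # retrieve the most similar tool descriptions
    return [ALL_TOOLS[d.metadata["index"]] for d in docs]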

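Note that the appropriate tool is selected by comparing tool descriptions, not tool names. The fake_tools in the code above are dummy tools created to show how accurate the selection is: their names are numbered in ascending order, while the numbers in their descriptions run in descending order. The following query illustrates the accuracy: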

get_tools("whats the meaning of number 7")
>>> [Tool(name='foo-3', description='a silly function that you can use to get more information about the number 7', func=<function fake_func at 0x7f1e23b74670>),
 Tool(name='foo-4', description='a silly function that you can use to get more information about the number 6', func=<function fake_func at 0x7f1e23b74670>),
 Tool(name='foo-9', description='a silly function that you can use to get more information about the number 1', func=<function fake_func at 0x7f1e23b74670>),
 Tool(name='foo-2', description='a silly function that you can use to get more information about the number 8', func=<function fake_func at 0x7f1e23b74670>)]
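The retrieval idea can be tried without any embedding model. The sketch below swaps in a much cruder technique, bag-of-words vectors with cosine similarity, in place of the HuggingFace embeddings and FAISS index used above; `embed`, `cosine`, and `get_tool_names` are hypothetical helper names. It only illustrates that matching happens on the description text.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Same naming scheme as the fake tools above: name foo-i, description mentions 10 - i.
TOOL_DESCRIPTIONS = {
    f"foo-{i}": f"a silly function that you can use to get more information about the number {10 - i}"
    for i in range(10)
}


def get_tool_names(query: str, k: int = 1) -> List[str]:
    """Rank tools by similarity between the query and each description."""
    query_vector = embed(query)
    ranked = sorted(
        TOOL_DESCRIPTIONS,
        key=lambda name: cosine(query_vector, embed(TOOL_DESCRIPTIONS[name])),
        reverse=True,
    )
    return ranked[:k]


top = get_tool_names("whats the meaning of number 7")
print(top)
```

The top match is the tool whose description mentions 7, even though its name contains a 3. A real embedding model captures meaning rather than exact word overlap, so its ranking of the remaining tools will differ from this toy version.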

1.5 Custom tools

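Each tool consists of a required name and func, plus an optional description, return_direct, and args_schema.

name: required; must be unique within the set of tools given to the agent
func: required; the function that determines what the tool does
description: optional; the agent uses it to decide when to use the tool
return_direct: optional; if set, the tool's result is returned directly without any further reasoning steps
args_schema: optional; provides more information about the expected arguments or validates them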

math_tool = Tool(
    name = "Calculate",
    func=llm_math.run,
    description = "useful when you need to answer questions about math"
)
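The role of these fields can be sketched with a plain dataclass. `SimpleTool`, `check_unique_names`, `call_tool`, and `add_numbers` are hypothetical names used only for this illustration; they are not LangChain classes, and the real Tool performs more validation than shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SimpleTool:
    name: str                      # required, must be unique among the agent's tools
    func: Callable[[str], str]     # required, does the actual work
    description: str = ""          # what the agent reads to decide when to use the tool
    return_direct: bool = False    # if True, the result is returned without further reasoning


def check_unique_names(tools: List[SimpleTool]) -> None:
    """Reject a tool set in which two tools share a name."""
    seen = set()
    for tool in tools:
        if tool.name in seen:
            raise ValueError(f"duplicate tool name: {tool.name}")
        seen.add(tool.name)


def call_tool(tool: SimpleTool, tool_input: str) -> Tuple[str, bool]:
    """Run the tool and report whether the agent should stop right away."""
    return tool.func(tool_input), tool.return_direct


def add_numbers(expression: str) -> str:
    """Sum integers separated by '+', e.g. '4+4'."""
    return str(sum(int(part) for part in expression.split("+")))


tools = [
    SimpleTool(
        name="Calculate",
        func=add_numbers,
        description="useful when you need to answer questions about math",
        return_direct=True,
    )
]
check_unique_names(tools)
observation, done = call_tool(tools[0], "4+4")
print(observation, done)
```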

1.6 Output parser

import re
from typing import Union

from langchain.agents import AgentOutputParser
from langchain.schema import AgentAction, AgentFinish

class CustomOutputParser(AgentOutputParser):
    # Parse each step of the LLM output and decide the next action
    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:
        # The agent is done once the output contains a final answer
        if "Final Answer:" in llm_output:
            return AgentFinish(
                return_values={"output": llm_output.split("Final Answer:")[-1].strip()},
                log=llm_output,
            )

        # Otherwise extract the tool name and the tool input
        regex = r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)"
        match = re.search(regex, llm_output, re.DOTALL)
        if not match:
            raise ValueError(f"Could not parse LLM output: `{llm_output}`")
        action = match.group(1).strip()
        action_input = match.group(2)
        # Return the action and action input
        return AgentAction(tool=action, tool_input=action_input.strip(" ").strip('"'), log=llm_output)
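To see what the regular expression extracts, the same parsing logic can be run on sample strings without LangChain. In this sketch the function `parse_step` and the dictionaries it returns are hypothetical stand-ins for the parser's AgentAction and AgentFinish results.

```python
import re
from typing import Dict

# Same pattern as in the parser above: optional step numbers are tolerated.
ACTION_RE = re.compile(
    r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)", re.DOTALL
)


def parse_step(llm_output: str) -> Dict[str, str]:
    """Return a 'finish' record or an 'action' record for one model response."""
    if "Final Answer:" in llm_output:
        return {"type": "finish", "output": llm_output.split("Final Answer:")[-1].strip()}
    match = ACTION_RE.search(llm_output)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {llm_output!r}")
    return {
        "type": "action",
        "tool": match.group(1).strip(),
        "tool_input": match.group(2).strip(" ").strip('"'),
    }


step = parse_step('Thought: I need her age\nAction: Search\nAction Input: "Camila Morrone age"')
numbered = parse_step("Action 1: Calculator\nAction Input 1: 25^0.43")
final = parse_step("Thought: I now know the final answer\nFinal Answer: 3.99")
print(step, numbered, final, sep="\n")
```

Checking for "Final Answer:" before applying the regular expression matters: a response that contains both an action and a final answer is treated as finished.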

With all the pieces in place, the agent is assembled as follows:

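from langchain.agents import AgentExecutor, LLMSingleActionAgent
from langchain.chains import LLMChain

# llm and prompt are assumed to be defined elsewhere
llm_chain = LLMChain(llm=llm, prompt=prompt)
output_parser = CustomOutputParser()
tools = get_tools("how to calculate 4 plus 4?")
tool_names = [tool.name for tool in tools]
# An agent that takes a single action per step
agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=tool_names,
)
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)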
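The stop=["\nObservation:"] argument keeps the model from writing an Observation on its own, so that the observation always comes from the tool that was actually run. The sketch below imitates that cut-off on a raw string; `apply_stop` is a hypothetical helper, and in practice the truncation is normally applied on the model side rather than by code like this.

```python
from typing import List


def apply_stop(text: str, stop: List[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for sequence in stop:
        index = text.find(sequence)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


raw = (
    "Thought: I need to add the numbers\n"
    "Action: Calculate\n"
    "Action Input: 4+4\n"
    "Observation: 9\n"          # invented by the model, must be discarded
    "Thought: I now know the final answer"
)
trimmed = apply_stop(raw, ["\nObservation:"])
print(trimmed)
```

After trimming, only the Thought, Action, and Action Input remain, which is exactly the part the output parser needs.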