
How to Optimize Chunking Strategies for Large Language Models (LLMs)


Translator | 李睿

Reviewer | 重樓

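This article examines four approaches to chunking for LLMs: fixed-size chunking, recursive chunking, semantic chunking, and agentic chunking, each with its own strengths.

Large language models (LLMs) have transformed natural language processing (NLP) through their ability to generate human-like text, answer complex questions, and analyze large volumes of information with remarkable accuracy. From customer service to medical research, their ability to handle varied queries and produce detailed responses makes them valuable across many fields. As LLMs scale to handle ever-growing data, however, they face challenges in managing long documents and in efficiently retrieving the most relevant information.

Although LLMs excel at processing and generating human-like text, their context window is relatively limited. They can hold only a limited amount of information in memory at a time, which makes very long documents difficult to handle. LLMs also struggle to quickly locate the most relevant information in large datasets. Moreover, because LLMs are trained on a fixed dataset, they gradually become outdated as new information emerges; staying accurate and useful requires regular data updates.

Retrieval-augmented generation (RAG) addresses these challenges. A RAG workflow has many components, such as querying, embedding, and indexing. This article focuses on chunking strategies.

By splitting documents into smaller, meaningful parts and embedding them in a vector database, a RAG system can search for and retrieve the chunks most relevant to each query. This lets the LLM focus on specific information, which improves the accuracy and efficiency of its responses.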

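This article takes a closer look at the different chunking methods and strategies for LLMs, and at their role in optimizing LLMs for real-world applications.

What Is Chunking?

Chunking splits a large data source into smaller, manageable parts, or "chunks." These chunks are stored in a vector database, which allows fast, similarity-based search. When a user submits a query, the vector database finds the most relevant chunks and sends them to the LLM. The model can then focus only on the most relevant information, making its responses faster and more accurate.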
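The retrieval step described above can be sketched in a few lines. This is a toy illustration only: the `embed` function is a bag-of-words counter standing in for a real embedding model, the in-memory list stands in for a vector database, and the names `embed`, `cosine`, and `retrieve` are assumptions made for this example rather than any library's API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query."""
    # The list of (chunk, vector) pairs plays the role of the vector database.
    index = [(chunk, embed(chunk)) for chunk in chunks]
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


chunks = [
    "Refunds are issued within 14 days of the return request.",
    "Our support team is available on weekdays from 9 to 5.",
    "Shipping to Europe usually takes five business days.",
]
best = retrieve("How long do refunds take?", chunks)
print(best[0])
```

Only the retrieved chunk, not the whole corpus, would then be passed to the LLM together with the query.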

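Chunking helps language models handle large datasets more smoothly and give precise answers by narrowing the range of data they need to examine.

For applications that need fast, precise answers, such as customer support or legal document search, chunking is a basic strategy for improving performance and reliability.

The main chunking strategies used in RAG are:

  • Fixed-size chunking
  • Recursive chunking
  • Semantic chunking
  • Agentic chunking

The sections below look at each strategy in detail.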

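1. Fixed-Size Chunking

Fixed-size chunking splits data into parts of equal size, which makes large documents easier to process.

Developers sometimes add a small overlap between chunks, so that the end of one chunk is repeated at the start of the next. The overlap helps the model retain context across chunk boundaries, so that key information is not lost at the edges. This is especially useful for tasks that need a continuous flow of information, because it lets the model interpret the text more accurately and understand how passages relate, which produces more coherent, context-aware responses.

The figure above illustrates fixed-size chunking, with each chunk shown in its own color. The green sections mark the overlap between chunks, which gives the model access to the relevant context from the previous chunk when it processes the next one.

Overlap improves the model's ability to process and understand the full text, which leads to better performance on tasks such as summarization or translation, where maintaining the flow of information across chunk boundaries is essential.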
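The mechanism itself needs no library. The following is a minimal sketch in plain Python, assuming sizes are measured in characters; the function name `fixed_size_chunks` is chosen for this illustration only. Each chunk starts `chunk_size - overlap` characters after the previous one, so consecutive chunks share exactly `overlap` characters.

```python
def fixed_size_chunks(text, chunk_size, overlap):
    """Split text into chunks of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = fixed_size_chunks(sample, chunk_size=10, overlap=3)
print(pieces)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Because this version cuts at exact character offsets, it can split a word in half; splitters that respect word or sentence boundaries trade strictly equal sizes for more readable chunks.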

Code Example

The following code example reproduces this behavior, using LangChain to implement fixed-size chunking.

Python 
# Requires the langchain-text-splitters package
# (older LangChain releases exposed this class as langchain.text_splitter).
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Function to split text with fixed-size chunks and overlap
def split_text_with_overlap(text, chunk_size, overlap_size):
    # chunk_size is an upper bound in characters; the splitter breaks at
    # whitespace, so individual chunks may be shorter than chunk_size.
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=overlap_size
    )

    # Split the text
    chunks = text_splitter.split_text(text)

    return chunks

# Example text
text = """Artificial Intelligence (AI) simulates human intelligence in machines for tasks like visual perception, speech recognition, and language translation. It has evolved from rule-based systems to data-driven models, enhancing performance through machine learning and deep learning."""

# Define chunk size and overlap size
chunk_size = 80  # at most 80 characters per chunk
overlap_size = 10  # up to 10 characters of overlap between chunks

# Get the chunks with overlap
chunks = split_text_with_overlap(text, chunk_size, overlap_size)

# Print the chunks and overlaps
for i in range(len(chunks)):
    print(f"Chunk {i+1}:")
    print(chunks[i])  # Print the chunk itself

    # If there's a next chunk, print the tail of the current chunk, which is
    # the region the splitter may repeat at the start of the next chunk
    if i < len(chunks) - 1:
        overlap = chunks[i][-overlap_size:]  # Last overlap_size characters
        print(f"Overlap with Chunk {i+2}:")
        print(overlap)

    print("\n" + "="*50 + "\n")
Running the code above produces the following output:
Text
Chunk 1:
Artificial Intelligence (AI) simulates human intelligence in machines for tasks
Overlap with Chunk 2:
 for tasks

==================================================

Chunk 2:
for tasks like visual perception, speech recognition, and language translation.
Overlap with Chunk 3:
anslation.

==================================================

Chunk 3:
It has evolved from rule-based systems to data-driven models, enhancing
Overlap with Chunk 4:
 enhancing

==================================================

Chunk 4:
enhancing performance through machine learning and deep learning.
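Note that the "overlap" printed above is simply the last 10 characters of each chunk, which is not always what the next chunk actually repeats: Chunk 3 starts a new sentence and shares nothing with Chunk 2. A small sketch for measuring the real shared text follows, using the four chunks from the output above as data; the helper name `actual_overlap` is an assumption for this illustration.

```python
def actual_overlap(left, right):
    """Return the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return right[:size]
    return ""


# The four chunks exactly as they appear in the output above
chunks = [
    "Artificial Intelligence (AI) simulates human intelligence in machines for tasks",
    "for tasks like visual perception, speech recognition, and language translation.",
    "It has evolved from rule-based systems to data-driven models, enhancing",
    "enhancing performance through machine learning and deep learning.",
]

overlaps = [actual_overlap(a, b) for a, b in zip(chunks, chunks[1:])]
print(overlaps)
# ['for tasks', '', 'enhancing']
```

The splitter only carries over whole pieces that fit within the overlap budget, which is why the shared text is a few complete words in two cases and empty in the third.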

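2. Recursive Chunking

Recursive chunking systematically breaks a large body of text into manageable parts by repeatedly splitting it into smaller sub-chunks. It is particularly effective for complex or hierarchically structured documents, because each resulting part stays coherent and keeps its context. The process continues until the text has been split into pieces small enough for the model to process effectively.

Consider a lengthy document that must be processed by a language model with a limited context window. Recursive chunking first splits the document into its major sections. If those sections are still too large, it subdivides them into smaller subsections, and it repeats this until every chunk fits within the model's capacity. This hierarchical splitting preserves the logical flow and context of the original document while letting the LLM handle long text more effectively.

In practice, recursive chunking can be implemented in several ways depending on the structure of the document and the needs of the task, for example by splitting on headings, paragraphs, or sentences.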
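The idea can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not LangChain's algorithm: it tries a list of separators from coarse to fine (blank lines, line breaks, spaces), greedily packs adjacent pieces up to `max_chars`, and recurses with finer separators only for pieces that are still too long. The function name and the separator list are chosen for this example; a sentence-level separator could be added between line breaks and spaces.

```python
def recursive_split(text, max_chars, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most max_chars, trying coarse separators first."""
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_chars:
        return [text]
    if not separators:
        # Nothing left to split on: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        part = part.strip()
        if not part:
            continue
        candidate = f"{current}{sep}{part}" if current else part
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= max_chars:
            current = part
        else:
            # This piece alone is too long: recurse with finer separators.
            chunks.extend(recursive_split(part, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


doc = (
    "Intro line.\n\n"
    "First paragraph is short.\n\n"
    "Second paragraph is considerably longer and will not fit "
    "into a single chunk of forty characters."
)
chunks = recursive_split(doc, max_chars=40)
for chunk in chunks:
    print(repr(chunk))
```

Here the two short paragraphs are kept together because they fit in one chunk, while the long paragraph is split further at word boundaries.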

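In the figure above, recursive chunking splits the text into four chunks shown in different colors. Each chunk is a smaller, more manageable part of at most 80 characters, and the chunks do not overlap. The color coding shows how the content is divided into logical parts, which makes long text easier for the model to process and understand without losing important context.

Code Example

The following example shows how to implement recursive chunking.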

Python 
# Requires the langchain-text-splitters package
# (older LangChain releases exposed this class as langchain.text_splitter).
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Function to split text into chunks using recursive chunking
def split_text_recursive(text, chunk_size=80):
    # Initialize the RecursiveCharacterTextSplitter
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,  # Maximum size of each chunk (80 characters)
        chunk_overlap=0         # No overlap between chunks
    )

    # Split the text into chunks
    chunks = text_splitter.split_text(text)

    return chunks

# Example text
text = """Artificial Intelligence (AI) simulates human intelligence in machines for tasks like visual perception, speech recognition, and language translation. It has evolved from rule-based systems to data-driven models, enhancing performance through machine learning and deep learning."""

# Split the text using recursive chunking
chunks = split_text_recursive(text, chunk_size=80)

# Print the resulting chunks
for i, chunk in enumerate(chunks):
    print(f"Chunk {i+1}:")
    print(chunk)
    print("="*50)

The code above produces the following output:

Text
Chunk 1:
Artificial Intelligence (AI) simulates human intelligence in machines for tasks
==================================================
Chunk 2:
like visual perception, speech recognition, and language translation. It has
==================================================
Chunk 3:
evolved from rule-based systems to data-driven models, enhancing performance
==================================================
Chunk 4:
through machine learning and deep learning.

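Having covered these two length-based strategies, it is time to look at one that focuses on the meaning and context of the text.

3. Semantic Chunking

Semantic chunking splits text into chunks according to the meaning or context of the content. It typically uses machine learning or natural language processing (NLP) techniques, such as sentence embeddings, to identify parts of the text that share similar meaning or semantic structure.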
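To make the mechanism concrete, here is a deliberately simplified sketch. It assumes a bag-of-words count vector as a stand-in for real sentence embeddings and a fixed similarity threshold of 0.2; both are assumptions for illustration, and production splitters typically use learned embeddings and data-driven thresholds. A new chunk starts wherever the similarity between adjacent sentences drops below the threshold.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy sentence 'embedding': word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(text, threshold=0.2):
    """Group consecutive sentences; break where adjacent similarity < threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    groups = [[sentences[0]]]
    for previous, current in zip(sentences, sentences[1:]):
        if cosine(embed(previous), embed(current)) < threshold:
            groups.append([current])    # topic shift: start a new chunk
        else:
            groups[-1].append(current)  # same topic: extend the current chunk
    return [" ".join(group) for group in groups]


text = (
    "Neural networks learn patterns from data. "
    "Deep neural networks stack many layers to learn richer patterns. "
    "Sourdough bread needs flour, water, and salt. "
    "Bread dough rises while the sourdough starter ferments."
)
chunks = semantic_chunks(text)
for chunk in chunks:
    print(chunk)
```

With this input, the two sentences about neural networks end up in one chunk and the two about bread in another, because the second and third sentences share no words.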

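In the figure above, each chunk is shown in a different color: blue for artificial intelligence and yellow for prompt engineering. The chunks are separated because they cover different ideas. This lets the model form a clear and accurate understanding of each topic and avoids confusion and interference between topics.

Code Example

The following example implements semantic chunking.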

Python 
import os
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai.embeddings import OpenAIEmbeddings

# Set the OpenAI API key as an environment variable (Replace with your actual API key)
os.environ["OPENAI_API_KEY"] = "replace with your actual OpenAI API key"

# Function to split text into semantic chunks
def split_text_semantically(text, breakpoint_type="percentile"):
    # Initialize the SemanticChunker with OpenAI embeddings
    text_splitter = SemanticChunker(OpenAIEmbeddings(), breakpoint_threshold_type=breakpoint_type)

    # Create documents (chunks)
    docs = text_splitter.create_documents([text])

    # Return the list of chunks
    return [doc.page_content for doc in docs]

def main():
    # Example content (replace with your own text)
    document_content = """
Artificial Intelligence (AI) simulates human intelligence in machines for tasks like visual perception, speech recognition, and language translation. It has evolved from rule-based systems to data-driven models, enhancing performance through machine learning and deep learning.

Prompt Engineering involves designing input prompts to guide AI models in producing accurate and relevant responses, improving tasks such as text generation and summarization.
    """

    # Split text using the chosen threshold type (percentile)
    threshold_type = "percentile"
    print(f"\nChunks using {threshold_type} threshold:")
    chunks = split_text_semantically(document_content, breakpoint_type=threshold_type)

    # Print each chunk's content
    for idx, chunk in enumerate(chunks):
        print(f"Chunk {idx + 1}:\n{chunk}\n")

if __name__ == "__main__":
    main()

The code above produces the following output:

Text
Chunks using percentile threshold:
Chunk 1:
Artificial Intelligence (AI) simulates human intelligence in machines for tasks like visual perception, speech recognition, and language translation. It has evolved from rule-based systems to data-driven models, enhancing performance through machine learning and deep learning.

Chunk 2:
Prompt Engineering involves designing input prompts to guide AI models in producing accurate and relevant responses, improving tasks such as text generation and summarization.

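4. Agentic Chunking

Among these strategies, agentic chunking is a particularly powerful one. It uses an LLM such as GPT as an agent in the chunking process. Instead of relying on hand-written rules to decide how content is split, the LLM uses its understanding of the text to organize or divide the input itself. Based on the context of the task, it decides autonomously how to break the content into manageable parts, with the aim of finding the best split.

The figure above shows a chunking agent splitting a large body of text into smaller, meaningful parts. The agent is AI-driven, which helps it understand the text and divide it into meaningful chunks. This is what "agentic chunking" means: a smarter way of handling text than simply cutting it into equal parts.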
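The core idea, letting a model rather than a fixed rule choose the split points, can be sketched without calling any API. In the sketch below, `propose_breakpoints` is a hypothetical callable that would wrap an LLM call in a real system; here it is replaced by `fake_llm_breakpoints`, a trivial stand-in invented for this example so the code runs offline.

```python
import re


def split_sentences(text):
    """Split text into sentences on terminal punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def agentic_chunks(text, propose_breakpoints):
    """Chunk text at the sentence indices chosen by a decision function.

    propose_breakpoints(sentences) returns the indices of sentences that
    should start a new chunk. In a real system this would be an LLM call
    (hypothetical here).
    """
    sentences = split_sentences(text)
    if not sentences:
        return []
    # Validate the proposal: a model's answer may contain duplicates or
    # out-of-range indices, so keep only sorted, unique, valid ones.
    starts = sorted({i for i in propose_breakpoints(sentences) if 0 < i < len(sentences)})
    bounds = [0] + starts + [len(sentences)]
    return [" ".join(sentences[a:b]) for a, b in zip(bounds, bounds[1:])]


def fake_llm_breakpoints(sentences):
    """Stand-in for an LLM: start a new chunk at sentences opening with 'However'."""
    return [i for i, s in enumerate(sentences) if s.startswith("However")]


text = (
    "AI helps doctors read medical images. It also helps banks detect fraud. "
    "However, AI raises privacy concerns. Regulation can address some of them."
)
chunks = agentic_chunks(text, fake_llm_breakpoints)
for chunk in chunks:
    print(chunk)
```

Keeping the validation step separate from the model's proposal is a design choice that limits the unpredictability of model-driven splitting: whatever the model returns, every sentence still lands in exactly one chunk.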

The following code example shows one way to implement this.

Python 
# Note: LLMChain and initialize_agent are legacy LangChain APIs;
# this example assumes a LangChain release that still ships them.
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.agents import initialize_agent, Tool, AgentType

# Initialize OpenAI chat model (replace with your API key)
llm = ChatOpenAI(model="gpt-3.5-turbo", api_key="replace with your actual OpenAI API key")

# Step 1: Define Chunking and Summarization Prompt Template
chunk_prompt_template = """
You are given a large piece of text. Your job is to break it into smaller parts (chunks) if necessary and summarize each chunk.
Once all parts are summarized, combine them into a final summary. 
If the text is already small enough to process at once, provide a full summary in one step. 
Please summarize the following text:\n{input}
"""
chunk_prompt = PromptTemplate(input_variables=["input"], template=chunk_prompt_template)

# Step 2: Define Chunk Processing Tool
def chunk_processing_tool(query):
    """Processes text chunks and generates summaries using the defined prompt."""
    chunk_chain = LLMChain(llm=llm, prompt=chunk_prompt)
    print(f"Processing chunk:\n{query}\n")  # Show the chunk being processed
    return chunk_chain.run(input=query)

# Step 3: Define External Tool (Optional, can be used to fetch extra information if needed)
def external_tool(query):
    """Simulates an external tool that could fetch additional information."""
    return f"External response based on the query: {query}"

# Step 4: Initialize the agent with tools
tools = [
    Tool(
        name="Chunk Processing",
        func=chunk_processing_tool,
        description="Processes text chunks and generates summaries."
    ),
    Tool(
        name="External Query",
        func=external_tool,
        description="Fetches additional data to enhance chunk processing."
    )
]

# Initialize the agent with defined tools and zero-shot capabilities
# (the keyword for the agent type is `agent`)
agent = initialize_agent(
    tools=tools,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    llm=llm,
    verbose=True
)

# Step 5: Agentic Chunk Processing Function
def agent_process_chunks(text):
    """Uses the agent to process text chunks and generate a final output."""
    # Step 1: Chunking the text into smaller, manageable sections.
    # The initial split is a fixed 500-character slice; the agent then
    # decides how each slice is processed.
    def chunk_text(text, chunk_size=500):
        """Splits large text into smaller chunks."""
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    chunks = chunk_text(text)

    # Step 2: Process each chunk with the agent
    chunk_results = []
    for idx, chunk in enumerate(chunks):
        print(f"Processing chunk {idx + 1}/{len(chunks)}...")
        response = agent.invoke({"input": chunk})  # Process chunk using the agent
        chunk_results.append(response['output'])  # Collect the chunk result

    # Step 3: Combine the chunk results into a final output
    final_output = "\n".join(chunk_results)
    return final_output

# Step 6: Running the agent on an example large text input
if __name__ == "__main__":
    # Example large text content
    text_to_process = """
    Artificial intelligence (AI) is transforming industries by enabling machines to perform tasks that
    previously required human intelligence. From healthcare to finance, AI is driving innovation and improving
    efficiency. For instance, in healthcare, AI algorithms assist doctors in diagnosing diseases, interpreting
    medical images, and predicting patient outcomes. Meanwhile, in finance, AI helps detect fraud, manage
    investments, and automate customer service.

    However, the widespread adoption of AI also raises ethical concerns. Issues like privacy invasion,
    algorithmic bias, and the potential loss of jobs due to automation are significant challenges. Experts
    argue that it's essential to develop AI responsibly to ensure that it benefits society as a whole.
    Proper regulations, transparency, and accountability can help address these issues, ensuring that AI
    technologies are used for the greater good.

    Beyond individual industries, AI is also impacting the global economy. Nations are investing heavily
    in AI research and development to maintain a competitive edge. This technological race could redefine
    global power dynamics, with countries that excel in AI leading the way in economic and military strength.
    Despite the potential for AI to contribute positively to society, its development and application require
    careful consideration of ethical, legal, and societal implications.
    """

    # Process the text and print the final result
    final_result = agent_process_chunks(text_to_process)
    print("\nFinal Output:\n", final_result)

The code above produces the following output:

Text
Processing chunk 1/3...


> Entering new AgentExecutor chain...
I should use Chunk Processing to extract the key information from the text provided.
Action: Chunk Processing
Action Input: Artificial intelligence (AI) is transforming industries by enabling machines to perform tasks that previously required human intelligence. From healthcare to finance, AI is driving innovation and improving efficiency. For instance, in healthcare, AI algorithms assist doctors in diagnosing diseases, interpreting medical images, and predicting patient outcomes. Meanwhile, in finance, AI helps detect fraud, manage investments, and automate customer service.Processing chunk:
Artificial intelligence (AI) is transforming industries by enabling machines to perform tasks that previously required human intelligence. From healthcare to finance, AI is driving innovation and improving efficiency. For instance, in healthcare, AI algorithms assist doctors in diagnosing diseases, interpreting medical images, and predicting patient outcomes. Meanwhile, in finance, AI helps detect fraud, manage investments, and automate customer service.

Observation: Artificial intelligence (AI) is revolutionizing various industries by allowing machines to complete tasks that once needed human intelligence. In healthcare, AI algorithms aid doctors in diagnosing illnesses, analyzing medical images, and forecasting patient results. In finance, AI is used to identify fraud, oversee investments, and streamline customer service. AI is playing a vital role in enhancing efficiency and driving innovation across different sectors.
Thought:I need more specific information about the impact of AI in different industries.
Action: External Query
Action Input: Impact of artificial intelligence in healthcare
Observation: External response based on the query: Impact of artificial intelligence in healthcare
Thought:I should now look for information on the impact of AI in finance.
Action: External Query
Action Input: Impact of artificial intelligence in finance
Observation: External response based on the query: Impact of artificial intelligence in finance
Thought:I now have a better understanding of how AI is impacting healthcare and finance.
Final Answer: Artificial intelligence is revolutionizing industries like healthcare and finance by enhancing efficiency, driving innovation, and enabling machines to perform tasks that previously required human intelligence. In healthcare, AI aids in diagnosing diseases, interpreting medical images, and predicting patient outcomes, while in finance, it helps detect fraud, manage investments, and automate customer service.

> Finished chain.
Processing chunk 2/3...

> Entering new AgentExecutor chain...
This question is discussing ethical concerns related to the widespread adoption of AI and the need to develop AI responsibly.
Action: Chunk Processing
Action Input: The text providedProcessing chunk:
The text provided

Observation: I'm sorry, but you haven't provided any text to be summarized. Could you please provide the text so I can assist you with summarizing it?
Thought:I need to provide the text for chunk processing to summarize.
Action: External Query
Action Input: Retrieve the text related to the ethical concerns of AI adoption and responsible development
Observation: External response based on the query: Retrieve the text related to the ethical concerns of AI adoption and responsible development
Thought:Now that I have the text related to ethical concerns of AI adoption and responsible development, I can move forward with chunk processing.
Action: Chunk Processing
Action Input: The retrieved textProcessing chunk:
The retrieved text

Observation: I'm sorry, but it seems like you have not provided any text for me to summarize. Could you please provide the text you would like me to summarize? Thank you!
Thought:I need to ensure that the text related to ethical concerns of AI adoption and responsible development is provided for chunk processing to generate a summary.
Action: External Query
Action Input: Retrieve the text related to the ethical concerns of AI adoption and responsible development
Observation: External response based on the query: Retrieve the text related to the ethical concerns of AI adoption and responsible development
Thought:Now that I have the text related to ethical concerns of AI adoption and responsible development, I can proceed with chunk processing to generate a summary.
Action: Chunk Processing
Action Input: The retrieved textProcessing chunk:
The retrieved text

Observation: I'm sorry, but you haven't provided any text to be summarized. Can you please provide the text so I can help you with the summarization?
Thought:I need to make sure that the text related to ethical concerns of AI adoption and responsible development is entered for chunk processing to summarize.
Action: Chunk Processing
Action Input: Text related to ethical concerns of AI adoption and responsible developmentProcessing chunk:
Text related to ethical concerns of AI adoption and responsible development

Observation: The text discusses the ethical concerns surrounding the adoption of artificial intelligence (AI) and the importance of responsible development. It highlights issues such as bias in AI algorithms, privacy violations, and the potential for autonomous AI systems to make harmful decisions. The text emphasizes the need for transparency, accountability, and ethical guidelines to ensure that AI technologies are developed and deployed in a responsible manner.
Thought:The text provides information on ethical concerns related to AI adoption and responsible development, emphasizing the need for regulation, transparency, and accountability. 
Final Answer: The text discusses the ethical concerns surrounding the adoption of artificial intelligence (AI) and the importance of responsible development.

> Finished chain.
Processing chunk 3/3...

> Entering new AgentExecutor chain...
This question seems to be about the impact of AI on the global economy and the potential implications.
Action: Chunk Processing
Action Input: The text providedProcessing chunk:
The text provided

Observation: I'm sorry, but you did not provide any text for me to summarize. Please provide the text that you would like me to summarize.
Thought:I need to provide the text for Chunk Processing to summarize.
Action: External Query
Action Input: Fetch the text about the impact of AI on the global economy and its implications.
Observation: External response based on the query: Fetch the text about the impact of AI on the global economy and its implications.
Thought:Now that I have the text about the impact of AI on the global economy and its implications, I can proceed with Chunk Processing.
Action: Chunk Processing
Action Input: The text about the impact of AI on the global economy and its implications.Processing chunk:
The text about the impact of AI on the global economy and its implications.

Observation: The text discusses the significant impact that artificial intelligence (AI) is having on the global economy. It highlights how AI is revolutionizing industries by increasing productivity, reducing costs, and creating new job opportunities. However, there are concerns about job displacement and the need for retraining workers to adapt to the changing landscape. Overall, AI is reshaping the economy and prompting a shift in the way businesses operate.
Thought:Based on the summary generated by Chunk Processing, the impact of AI on the global economy seems to be significant, with both positive and negative implications.
Final Answer: The impact of AI on the global economy is significant, revolutionizing industries, increasing productivity, reducing costs, creating new job opportunities, but also raising concerns about job displacement and the need for worker retraining.

> Finished chain.

Final Output:
 Artificial intelligence is revolutionizing industries like healthcare and finance by enhancing efficiency, driving innovation, and enabling machines to perform tasks that previously required human intelligence. In healthcare, AI aids in diagnosing diseases, interpreting medical images, and predicting patient outcomes, while in finance, it helps detect fraud, manage investments, and automate customer service.
The text discusses the ethical concerns surrounding the adoption of artificial intelligence (AI) and the importance of responsible development.
The impact of AI on the global economy is significant, revolutionizing industries, increasing productivity, reducing costs, creating new job opportunities, but also raising concerns about job displacement and the need for worker retraining.

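Comparison of Chunking Strategies

To make the different approaches easier to compare, the table below summarizes how fixed-size, recursive, semantic, and agentic chunking work, when to use each, and what their limitations are.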



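| Chunking type | Description | Method | Best suited for | Limitations |
|---|---|---|---|---|
| Fixed-size chunking | Splits text into equal-sized chunks regardless of content. | Chunks are created from a fixed word or character limit. | Simple, structured text where context continuity is not important. | May lose context or split sentences and ideas. |
| Recursive chunking | Repeatedly splits text into smaller chunks until they reach a manageable size. | Hierarchical splitting; sections that are too large are split further. | Long, complex, or hierarchical documents (for example, technical manuals). | May still lose context if sections are too broad. |
| Semantic chunking | Splits text into chunks by meaning or related topic. | Uses NLP techniques such as sentence embeddings to group related content. | Context-sensitive tasks where coherence and topic continuity are essential. | Requires NLP techniques; more complex to implement. |
| Agentic chunking | Uses an AI model (such as GPT) to split content autonomously into meaningful parts. | AI-driven splitting based on the model's understanding and the context of the task. | Complex tasks with variable content structure, where AI can optimize the chunking. | Can be unpredictable and may require tuning. |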

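Conclusion

Chunking strategies and retrieval-augmented generation (RAG) are both essential to improving LLM performance. Chunking simplifies complex data into smaller, more manageable parts, which makes processing more efficient, while RAG improves LLMs by bringing real-time data retrieval into the generation workflow. Together, by combining well-organized data with dynamic, real-time information, these methods enable LLMs to give more precise, context-appropriate responses.

Original title: Chunking Strategies for Optimizing Large Language Models (LLMs), by Usama Jamil

Managing editor: 姜華 | Source: 51CTO內容精選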