How to Write a Prompt? Coreference Resolution in a RAG Application

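With the continued development of large language models (LLMs) such as ChatGPT, more and more researchers are turning their attention to applications of language models.

Among these, retrieval-augmented generation (RAG) is a generation method for knowledge-intensive NLP tasks. It retrieves relevant information from an existing knowledge base and combines that information with the LLM's generative ability, which improves the accuracy and reliability of the output. The approach can be used for a range of knowledge-intensive NLP tasks, such as question answering and summarization.

This article starts from a concrete problem in optimizing a RAG system and shows how LLM prompt engineering can be used to solve a traditional NLP problem.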

01.

A First Look

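The open-source project Akcio (https://github.com/zc277584121/akcio) is a RAG question-answering system. Users import their own private, domain-specific knowledge and can then build a question-answering system for that domain.

Figure: Akcio architecture. Domain knowledge consists of various Documents, which are imported into the Store through the DataLoader. Each time a user asks a Question, the LLM combines the recalled knowledge with its own natural-language generation ability to give a corresponding answer.

For example, suppose we import an article titled "2023 Large Model Adoption Progress and Trend Insight Report" into Akcio. We can then ask questions about the article, such as:

In 2023, into which categories can the application scenarios of the large-model industry be divided?

Using its recall strategies, Akcio retrieves from the Store the 3 original passages of the report that are most relevant to the question:

['In 2023, application scenarios in the large-model industry can be divided into generation and decision-making scenarios; decision-making scenarios are expected to deliver higher business value.',

'Generation scenarios in the large-model industry mainly include dialogue interaction, code development, agents, and so on.',

'Application scenarios of NLP include text classification, machine translation, sentiment analysis, automatic summarization, and so on.']

Not every recalled passage is equally relevant to the question, but Akcio passes all 3 to the LLM as context. The actual prompt looks like this: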

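Answer the question based on the following knowledge:



Knowledge:



In 2023, application scenarios in the large-model industry can be divided into generation and decision-making scenarios; decision-making scenarios are expected to deliver higher business value.

Generation scenarios in the large-model industry mainly include dialogue interaction, code development, agents, and so on.

Application scenarios of NLP include text classification, machine translation, sentiment analysis, automatic summarization, and so on.



Question:



In 2023, into which categories can the application scenarios of the large-model industry be divided?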

LLM �Ϳ��Ը��Ļش�:

��ģ����ҵ��Ӧ�ó������Է�Ϊ���ɺ;�������Ӧ�ó�����
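As a minimal sketch of how recalled passages and a question could be assembled into a prompt of that shape: the function name `build_rag_prompt` and the English template wording are illustrative assumptions, not Akcio's actual code.

```python
def build_rag_prompt(fragments, question):
    """Join recalled passages and the user question into one prompt string."""
    knowledge = "\n".join(fragments)
    return (
        "Answer the question based on the following knowledge:\n\n"
        f"Knowledge:\n{knowledge}\n\n"
        f"Question:\n{question}"
    )


# Hypothetical inputs mirroring the example above.
fragments = [
    "In 2023, application scenarios in the large-model industry can be "
    "divided into generation and decision-making scenarios.",
    "Generation scenarios mainly include dialogue interaction, code "
    "development, agents, and so on.",
    "Application scenarios of NLP include text classification, machine "
    "translation, sentiment analysis, automatic summarization, and so on.",
]
question = (
    "In 2023, into which categories can the application scenarios "
    "of the large-model industry be divided?"
)

prompt = build_rag_prompt(fragments, question)
print(prompt)
```

The resulting string is what would be sent to the LLM; how the passages are recalled is left out of the sketch.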

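With that, the basic pipeline works end to end. The architecture does not look complicated, but once it is put into real use, a number of difficulties show up.

For example, multi-turn conversation raises a problem: a follow-up question often contains pronouns that refer back to earlier turns, and using that question directly for recall is likely to retrieve the wrong knowledge. For example: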

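Q1: In 2023, into which categories can the application scenarios of the large-model industry be divided?

A1: Application scenarios in the large-model industry can be divided into generation and decision-making scenarios.

Q2: What are the differences between them? Can you give examples?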

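Here "them" clearly refers to the generation and decision-making scenarios, so the question really means "What are the differences between generation and decision-making scenarios? Can you give examples?" If "What are the differences between them? Can you give examples?" is used directly for recall, the passages recalled may well be about something else:

['BERT and GPT are both important models in the NLP field, but their designs and application scenarios differ greatly.',

'The difference between large models and small models lies in their scale and complexity. Large models usually have more parameters and more complex structures, and need more compute and time for training and inference. Small models are comparatively simple, have fewer parameters, and train and infer faster.',

'Without more information, the two products cannot be told apart, because they look very similar.']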

Clearly, these recalled passages are no help, and with useless knowledge the LLM cannot give the user a good answer either.
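The effect can be illustrated with a deliberately crude toy. The sketch below scores passages by word overlap, which is only a stand-in assumption (a real Store would use vector similarity or another recall strategy), and the passages are shortened English paraphrases of the examples above.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, fragment):
    """Count the distinct words shared by the query and the fragment."""
    return len(tokens(query) & tokens(fragment))


relevant = (
    "In 2023, application scenarios in the large-model industry fall into "
    "generation and decision-making scenarios."
)
unrelated = (
    "BERT and GPT are both important models in NLP, but there are large "
    "differences in their design."
)

raw = "What are the differences between them?"
resolved = (
    "What are the differences between generation and decision-making scenarios?"
)

for query in (raw, resolved):
    print(query)
    print("  relevant: ", overlap_score(query, relevant))
    print("  unrelated:", overlap_score(query, unrelated))
```

With the pronoun left in place, the unrelated passage scores higher; once "them" is replaced by what it refers to, the relevant passage wins. This is the motivation for rewriting the question before recall.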

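So what can be done about it?

The first thing that comes to mind is coreference resolution. Coreference resolution is an important task in natural language processing (NLP) that aims to determine which expressions in a text refer to the same entity. It identifies pronouns, noun phrases, and so on, and links them to entities mentioned earlier. For example, in the sentence "John saw Mary. He waved to her.", coreference resolution links "He" to "John" and "her" to "Mary" as the same entities.

That might seem to solve the problem, but in practice the open-source models currently available through spaCy or Hugging Face are only mediocre at coreference resolution and can handle only fairly simple cases, for example: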

Q1: What is a large model?

Q2: What is it used for?

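Here the model can work out that "it" refers to "large model". With more complex references, however, it fails, for example:

Q1: What is GPT3?

Q2: When was GPT4 released?

Q3: What is the difference between the two? What are the advantages of the latter?

It fails to recognize that "the two" refers to GPT3 and GPT4, and that "the latter" refers to GPT4. Another example: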

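Q1: When was GPT4 released?

A1: GPT4 was released in 2023

Q2: What progress has been made in computer vision this year?

It fails to recognize that "this year" refers to 2023.

In other words, existing small NLP models can only resolve simple pronouns and fail on more complex references.

So what now? After various attempts produced no good results, why not use a large model? ChatGPT has, after all, been called the terminator of the NLP field, so we can try using an LLM for coreference resolution.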

02.

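Coreference Resolution with ChatGPT

The key to using ChatGPT well is how to write the prompt, which is what is usually called prompt engineering. It looks simple, but doing it well is not a simple matter.

We start with a simple prompt that asks ChatGPT to replace the pronouns in the question: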

prompt = f'''Please return a new question with the following requirements:

1. If there are pronouns or conditions are missing in the question, please make a complete question according to the context.

2. If the question is complete, please keep the original question.



{history}

Question: {question}'''

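In this prompt we ask the model either to replace the pronouns or to keep the original question unchanged; the conversation history is inserted into the template and passed to GPT. Let's test how well it works (note: GPT is better at English, so we test directly in English):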

history = ''' What is NLP? 

NLP stands for Natural Language Processing. It is a field of computer science and artificial intelligence that focuses on the interaction between computers and humans using natural language. '''

question = 'What problem does it solve?'

ChatGPT returns:

What problem does Natural Language Processing (NLP) solve?

Not bad: it resolved "it" to NLP and even spelled out the full name, Natural Language Processing (NLP).
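The step just shown, filling the template and handing it to a model, can be sketched roughly as follows. The names `SIMPLE_REWRITE_TEMPLATE`, `rewrite_question`, and `fake_llm` are hypothetical; `llm` is assumed to be any callable that maps prompt text to reply text, so a real chat-model client can be plugged in without this sketch depending on a specific API.

```python
SIMPLE_REWRITE_TEMPLATE = """Please return a new question with the following requirements:
1. If there are pronouns or conditions are missing in the question, please make a complete question according to the context.
2. If the question is complete, please keep the original question.

{history}
Question: {question}"""


def rewrite_question(history, question, llm):
    """Fill the template and pass it to `llm` (prompt text in, reply text out)."""
    prompt = SIMPLE_REWRITE_TEMPLATE.format(history=history, question=question)
    return llm(prompt).strip()


def fake_llm(prompt):
    # Stand-in for a real model call; it simply returns the reply quoted above.
    return "What problem does Natural Language Processing (NLP) solve?"


history = "What is NLP?\nNLP stands for Natural Language Processing."
print(rewrite_question(history, "What problem does it solve?", fake_llm))
```

Keeping the model behind a plain callable makes it easy to swap prompts and replay earlier test cases while iterating on the wording.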

Now a more complex example:

history = '''When was GPT4 released?

GPT4 was released in 2023'''

question = 'What progress has been made in computer vision this year?'

The result:

What progress has been made in computer vision this year?

GPT did not replace "this year", so this attempt failed.

An even more complex example:

history = '''What is GPT3?

GPT-3 (Generative Pre-trained Transformer 3) is a state-of-the-art language processing model developed by OpenAI.

When was GPT4 released?

GPT4 was released in 2023'''

question = 'What is the difference between the two? What are the advantages of the latter?'

The result:

Sorry, the information provided about GPT-4's release date is not accurate as it has not been officially announced by OpenAI. Therefore, it is not possible to answer the question about the difference between GPT-3 and GPT-4 or the advantages of the latter.

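ChatGPT did not rewrite the question as we asked; it tried to answer it directly. (The GPT3-series model used here is an earlier one whose training data contains no information about GPT4, so it replied that it did not know.)

That is the core of the problem: with some probability ChatGPT answers the question directly instead of rewriting it as the prompt asked. We tried adding a requirement to the prompt not to answer the question directly, but the problem still occurred with a small probability. When following instructions, ChatGPT is also easily distracted by other information in the prompt.

With both traditional small NLP models and a large model like ChatGPT failing, what should we do? Perhaps approach it from another angle and optimize our prompt so that ChatGPT can do the job.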

03.

Few-shot prompt + CoT

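Prompt engineering offers some techniques that help an LLM understand and follow instructions. For example, we can show the LLM several reference answers and let it imitate them. This is few-shot in-context learning; it needs no fine-tuning of the LLM and can still achieve good results. Typically, with around 5 shots or fewer, the LLM follows the reference examples well.

However, deciding whether to replace or to keep the original question is a fairly complex NLP problem. Even an LLM such as ChatGPT cannot always give the right answer directly from a few reference examples. Fortunately, prompt engineering has another technique that helps an LLM handle logical problems better: chain of thought (CoT).

The concept of chain of thought (CoT) was first proposed in Google's paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" (https://arxiv.org/abs/2201.11903). CoT is an improved prompting strategy for raising LLM performance on complex reasoning tasks such as arithmetic, commonsense, and symbolic reasoning. The idea is to have the LLM state its intermediate reasoning steps as part of the answer. Answers the LLM reaches through such step-by-step derivation are considerably more accurate than answers it gives directly.

So when giving few-shot examples, we can also give the LLM the chain of thought, that is, the reasoning process, to improve accuracy.

Following this idea, we write our new prompt: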

REWRITE_TEMP = f'''

HISTORY:

[]

NOW QUESTION: Hello, how are you?

NEED COREFERENCE RESOLUTION: No => THOUGHT: So output question is the same as now question. => OUTPUT QUESTION: Hello, how are you?

-------------------

HISTORY:

[Q: Is Milvus a vector database?

A: Yes, Milvus is a vector database.]

NOW QUESTION: How to use it?

NEED COREFERENCE RESOLUTION: Yes => THOUGHT: I need to replace 'it' with 'Milvus' in now question. => OUTPUT QUESTION: How to use Milvus?

-------------------

HISTORY:

[]

NOW QUESTION: What is the features of it?

NEED COREFERENCE RESOLUTION: Yes => THOUGHT: I need to replace 'it' in now question, but I can't find a word in history to replace it, so the output question is the same as now question. => OUTPUT QUESTION: What is the features of it?

-------------------

HISTORY:

[Q: What is PyTorch?

A: PyTorch is an open-source machine learning library for Python. It provides a flexible and efficient framework for building and training deep neural networks. 

Q: What is Tensorflow?

A: TensorFlow is an open-source machine learning framework. It provides a comprehensive set of tools, libraries, and resources for building and deploying machine learning models.]

NOW QUESTION: What is the difference between them?

NEED COREFERENCE RESOLUTION: Yes => THOUGHT: I need replace 'them' with 'PyTorch and Tensorflow' in now question. => OUTPUT QUESTION: What is the different between PyTorch and Tensorflow?

-------------------

HISTORY:

[{history}]

NOW QUESTION: {question}

NEED COREFERENCE RESOLUTION: '''

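We give ChatGPT 4 reference examples. The first has an empty conversation history, the second is a simple example, the third is an example where replacement fails, and the last replaces a reference to more than one entity.

In terms of format, HISTORY: is followed by the conversation history; to keep the turns apart, each question is prefixed with Q: and each answer with A:. NOW QUESTION is the question for the current turn. What follows NEED COREFERENCE RESOLUTION: is the CoT derivation: the LLM first decides whether coreference resolution is needed, then under THOUGHT: reasons about the replacement and whether it can succeed, and finally OUTPUT QUESTION: holds the rewritten question.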
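As a rough sketch of that formatting, the helpers below render a list of (question, answer) pairs into the Q:/A: layout and build the final, unanswered block of the prompt. The names `format_history` and `build_prompt_tail` are hypothetical, and the four few-shot examples would be prepended to the returned string.

```python
def format_history(turns):
    """Render (question, answer) pairs as the 'Q: ...' / 'A: ...' lines
    used inside the HISTORY block."""
    lines = []
    for q, a in turns:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    return "\n".join(lines)


def build_prompt_tail(turns, question):
    """Build the last block of the prompt, ending right where the model
    is expected to continue with 'Yes' or 'No'."""
    return (
        "HISTORY:\n"
        f"[{format_history(turns)}]\n"
        f"NOW QUESTION: {question}\n"
        "NEED COREFERENCE RESOLUTION: "
    )


turns = [("When was GPT4 released?", "GPT4 was released in 2023")]
print(build_prompt_tail(
    turns, "What progress has been made in computer vision this year?"
))
```

An empty `turns` list yields `[]` in the HISTORY block, matching the first and third reference examples.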

When parsing the LLM's reply, we only need to split it according to this format, and take the rewritten question after OUTPUT QUESTION: only when NEED COREFERENCE RESOLUTION: is Yes.
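A minimal sketch of that parsing step might look like this. The function name `parse_rewrite` is hypothetical, and falling back to the original question whenever the reply does not match the expected format is an assumption, not something the article specifies.

```python
def parse_rewrite(reply, original_question):
    """Return the rewritten question if the model answered 'Yes',
    otherwise fall back to the original question."""
    marker = "OUTPUT QUESTION:"
    text = reply.strip()
    # Only trust the rewrite when the model says resolution is needed
    # and the reply actually contains the output marker.
    if not text.lower().startswith("yes") or marker not in text:
        return original_question
    tail = text.split(marker, 1)[1].strip()
    if not tail:
        return original_question
    # Keep the first line only, in case the model keeps generating
    # further example blocks after its answer.
    return tail.splitlines()[0].strip()


reply = (
    "Yes => THOUGHT: I need to replace 'it' with 'Milvus' in now question. "
    "=> OUTPUT QUESTION: How to use Milvus?"
)
print(parse_rewrite(reply, "How to use it?"))
```

The rewritten question, not the user's raw input, is then what gets passed to the recall step.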

Let's use this prompt on the examples that failed earlier; the LLM is still ChatGPT:

history = '''When was GPT4 released?

GPT4 was released in 2023'''

question = 'What progress has been made in computer vision this year?'

It returns:

Yes => THOUGHT: I need to replace "this year" with "2023" in the now question. => OUTPUT QUESTION: What progress has been made in computer vision in 2023?

ChatGPT followed the required format and successfully replaced "this year" with 2023.

Another example:

history = '''What is GPT3?

GPT-3 (Generative Pre-trained Transformer 3) is a state-of-the-art language processing model developed by OpenAI.

When was GPT4 released?

GPT4 was released in 2023'''

question = 'What is the difference between the two? What are the advantages of the latter?'

It returns:

Yes => THOUGHT: I need to replace 'the two' with 'GPT-3 and GPT-4' and 'the latter' with 'GPT-4' in the now question. => OUTPUT QUESTION: What is the difference between GPT-3 and GPT-4? What are the advantages of GPT-4?

For this complex reference as well, it successfully replaced "the two" with "GPT-3 and GPT-4" and "the latter" with "GPT-4".

04.

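Summary

Starting from a problem in optimizing a RAG system, this article used LLM prompt engineering to solve a traditional NLP problem and obtained reasonable results. Writing a prompt is sometimes simple and sometimes genuinely hard. Since LLMs appeared, prompt engineering has gradually taken shape and grown as a branch of its own. The process of writing the prompt here also shows that a prompt can be tuned step by step until it solves the problem, with results considerably better than traditional NLP approaches. It also shows that large models such as LLMs really can help optimize and solve many traditional NLP problems, which is perhaps where their strength lies.

As large models such as GPT-4 grow ever more capable, perhaps many of the hard problems in NLP will be solved, and general artificial intelligence may be drawing nearer.

Source: https://mp.weixin.qq.com/s/QYSdrMO6dGRy9_czCgqcKQ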