The API price for gpt-4o-mini is very low, yet the model is quite capable; it is basically a nearly free quality boost for the input you feed into larger [[LLM]] models.
Here is a prompt I used (a sketch of the API call follows after the prompt):
Enhance the following text to improve its quality for processing by a larger language model:
1. Correct any grammatical or spelling errors.
2. Improve sentence structure and flow.
3. Clarify any ambiguous or vague statements.
4. Ensure logical coherence and progression of ideas.
5. Remove redundant information while preserving all key points.
6. Maintain the original tone and intent of the text.
7. Do not add new information or alter the core meaning.
Provide the enhanced text in a clear, concise format. If any part of the text is unclear or requires subject matter expertise to interpret, flag it with [NEEDS CLARIFICATION] at the end of the relevant sentence.
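A minimal sketch of how this pre-processing step could be wired up, assuming the official `openai` Python SDK and an `OPENAI_API_KEY` in the environment; the `enhance` function name and the system/user message split are my choices, not part of the note:

```python
# Pre-process raw text with gpt-4o-mini before handing it to a larger model.
# Assumes the official openai Python SDK; ENHANCE_PROMPT is the prompt above.
from openai import OpenAI

client = OpenAI()

# The enhancement prompt from above, abbreviated here for brevity.
ENHANCE_PROMPT = (
    "Enhance the following text to improve its quality for processing "
    "by a larger language model: ..."
)

def enhance(raw_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": ENHANCE_PROMPT},
            {"role": "user", "content": raw_text},
        ],
        temperature=0,  # deterministic cleanup rather than creative rewriting
    )
    return resp.choices[0].message.content
```

The enhanced output can then be passed as input to the larger model, so the expensive model spends its tokens on clean, unambiguous text.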
Outdated Information: LLMs are often limited by the data they were trained on, which can become outdated over time. This leads to responses that may no longer be accurate or relevant.
Lack of Fact-Checking: Traditional LLMs do not have a mechanism to fact-check or verify the information they generate, which can lead to inaccuracies in their responses.
why was RAG invented
The RAG technique was invented to enhance the accuracy and relevance of responses generated by LLMs. The key motivations for developing RAG were:
To Provide Up-to-Date Information: By integrating a retrieval system, RAG ensures that the LLM has access to the most current information, thereby improving the accuracy of its responses.
To Improve Fact-Checking Capabilities: RAG allows LLMs to cross-reference information with reliable sources, enhancing their ability to verify facts and provide more trustworthy responses.
To Address the Static Nature of LLMs: Since LLMs are trained on static datasets, they can become outdated. RAG introduces a dynamic element where the model can access and incorporate new and updated information.
how does it work
The RAG framework works by combining the capabilities of a traditional LLM with an external information retrieval system. The process involves the following steps (a code sketch follows the list):
Retrieval of Relevant Information: When a query is made, the RAG system first retrieves relevant information from an external data store.
Three-Part Prompt Processing: The LLM receives a three-part prompt consisting of the instruction, the retrieved content, and the user’s question.
Generation of Informed Responses: The LLM uses both the retrieved information and its pre-existing knowledge base to generate a response. This ensures that the response is not only based on its training data but also on the most current information available.
Continuous Updating: The data store used for retrieval is continuously updated, allowing the LLM to stay current with new information and developments.
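A hedged sketch of that retrieve-then-generate loop in Python. The `vector_store.search` call is a hypothetical stand-in for whatever retrieval backend is used (FAISS, Elasticsearch, a vector database, etc.); only the OpenAI chat call is a real API:

```python
# Retrieve-then-generate loop: fetch relevant documents, build the
# three-part prompt, and let the model answer grounded in that context.
from openai import OpenAI

client = OpenAI()

INSTRUCTION = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know."
)

def answer(question: str, vector_store) -> str:
    # 1. Retrieval: pull the most relevant documents for the query.
    docs = vector_store.search(question, k=3)  # hypothetical retriever API
    context = "\n\n".join(d.text for d in docs)

    # 2. Three-part prompt: instruction + retrieved content + user question.
    prompt = f"{INSTRUCTION}\n\n{context}\n\nQuestion: {question}"

    # 3. Generation: the LLM answers using the retrieved context
    #    in addition to its pre-trained knowledge.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

Keeping the data store updated happens outside this loop: new documents are indexed as they arrive, so each retrieval sees the latest information without retraining the model.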
example of the three-part prompt
Instruction: This is a directive or guideline that tells the language model what kind of response is expected. It sets the context or the goal for the response.
Retrieved Content: This part includes the relevant information retrieved from a large-scale data store. This content is dynamically sourced based on the query, ensuring that the information is up-to-date and pertinent to the user’s question.
User's Question: This is the actual query or question posed by the user. It’s what the user wants to know or the problem they need solved.
Example of a Three-Part Prompt
Here's what the three-part prompt might look like:
Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
<An article or a set of articles from a reputable source, retrieved by the system, discussing the latest developments in AI as of 2023.>
Question: "What are the latest advancements in artificial intelligence?"