Updated max_tokens descriptions (#6751)
### What problem does this PR solve?

#6721

### Type of change

- [x] Documentation Update
This commit is contained in:
parent fc02929946
commit 2471a6e115
@@ -33,7 +33,7 @@ Click the dropdown menu of **Model** to show the model configuration window.
 - **Model**: The chat model to use.
   - Ensure you set the chat model correctly on the **Model providers** page.
   - You can use different models for different components to increase flexibility or improve overall performance.
-- **Preset configurations**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
+- **Freedom**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
   This parameter has three options:
   - **Improvise**: Produces more creative responses.
   - **Precise**: (Default) Produces more conservative responses.
@@ -52,9 +52,6 @@ Click the dropdown menu of **Model** to show the model configuration window.
 - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
   - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens.
   - Defaults to 0.7.
-- **Max tokens**: Sets the maximum length of the model's output, measured in the number of tokens.
-  - Defaults to 512.
-  - If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.

 :::tip NOTE
 - It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
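The doc lines deleted above described a hard cap of 512 tokens on model output. For readers wondering what that cap did mechanically: on any OpenAI-compatible chat API, `max_tokens` stops generation at the limit, and a capped response reports `finish_reason == "length"`. Below is a minimal sketch, not RAGFlow code; the base URL, API key, and model name are placeholders for your own provider.

```python
from openai import OpenAI

# Placeholders: point these at your own OpenAI-compatible provider.
client = OpenAI(base_url="https://api.your-provider.example/v1",
                api_key="YOUR_API_KEY")

resp = client.chat.completions.create(
    model="your-chat-model",  # placeholder
    messages=[{"role": "user", "content": "Summarize RAGFlow in detail."}],
    max_tokens=512,  # the old hard cap; omit it to let the model/provider decide
)

choice = resp.choices[0]
if choice.finish_reason == "length":
    # The completion was cut off by the cap, not because the model finished.
    print("Response was truncated by max_tokens.")
print(choice.message.content)
```

Dropping the setting, as the deletion above reflects, leaves the provider's own limit in charge, which is why truncation troubleshooting now points to the provider.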
@@ -34,7 +34,7 @@ Click the dropdown menu of **Model** to show the model configuration window.
 - **Model**: The chat model to use.
   - Ensure you set the chat model correctly on the **Model providers** page.
   - You can use different models for different components to increase flexibility or improve overall performance.
-- **Preset configurations**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
+- **Freedom**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
   This parameter has three options:
   - **Improvise**: Produces more creative responses.
   - **Precise**: (Default) Produces more conservative responses.
@@ -53,9 +53,6 @@ Click the dropdown menu of **Model** to show the model configuration window.
 - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
   - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens.
   - Defaults to 0.7.
-- **Max tokens**: Sets the maximum length of the model's output, measured in the number of tokens.
-  - Defaults to 512.
-  - If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.

 :::tip NOTE
 - It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
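The frequency-penalty bullets above describe behavior that providers commonly implement as a per-token logit adjustment; the sketch below mirrors the formula OpenAI documents for its own penalties, using the documented defaults (presence 0.4, frequency 0.7) as sample values. This is for intuition only; RAGFlow itself presumably just forwards the two values to the model provider.

```python
# `logits` maps token -> raw score; `counts` maps token -> how many times
# the token already appears in the generated text.
def apply_penalties(logits: dict[str, float],
                    counts: dict[str, int],
                    presence_penalty: float = 0.4,   # doc default
                    frequency_penalty: float = 0.7,  # doc default
                    ) -> dict[str, float]:
    adjusted = {}
    for token, score in logits.items():
        c = counts.get(token, 0)
        # The frequency penalty scales with how often the token was used;
        # the presence penalty is a flat cost once the token has appeared at all.
        adjusted[token] = (score
                           - c * frequency_penalty
                           - (1.0 if c > 0 else 0.0) * presence_penalty)
    return adjusted

# "the" has already been used three times, so it gets pushed down;
# "a" has not appeared yet and is untouched.
print(apply_penalties({"the": 2.0, "a": 1.5}, {"the": 3}))
```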
@@ -32,7 +32,7 @@ Click the dropdown menu of **Model** to show the model configuration window.
 - **Model**: The chat model to use.
   - Ensure you set the chat model correctly on the **Model providers** page.
   - You can use different models for different components to increase flexibility or improve overall performance.
-- **Preset configurations**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
+- **Freedom**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
   This parameter has three options:
   - **Improvise**: Produces more creative responses.
   - **Precise**: (Default) Produces more conservative responses.
@@ -51,9 +51,6 @@ Click the dropdown menu of **Model** to show the model configuration window.
 - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
   - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens.
   - Defaults to 0.7.
-- **Max tokens**: Sets the maximum length of the model's output, measured in the number of tokens.
-  - Defaults to 512.
-  - If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.

 :::tip NOTE
 - It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
@@ -48,10 +48,25 @@ You start an AI conversation by creating an assistant.
 4. Update **Model Setting**:

    - In **Model**: you select the chat model. Though you have selected the default chat model in **System Model Settings**, RAGFlow allows you to choose an alternative chat model for your dialogue.
-   - **Preset configurations** refers to the level that the LLM improvises. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
-   - **Temperature**: Level of the prediction randomness of the LLM. The higher the value, the more creative the LLM is.
-   - **Top P** is also known as "nucleus sampling". See [here](https://en.wikipedia.org/wiki/Top-p_sampling) for more information.
-   - **Max Tokens**: The maximum length of the LLM's responses. Note that the responses may be curtailed if this value is set too low.
+   - **Freedom**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.
+     This parameter has three options:
+     - **Improvise**: Produces more creative responses.
+     - **Precise**: (Default) Produces more conservative responses.
+     - **Balance**: A middle ground between **Improvise** and **Precise**.
+   - **Temperature**: The randomness level of the model's output.
+     Defaults to 0.1.
+     - Lower values lead to more deterministic and predictable outputs.
+     - Higher values lead to more creative and varied outputs.
+     - A temperature of zero results in the same output for the same prompt.
+   - **Top P**: Nucleus sampling.
+     - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*.
+     - Defaults to 0.3.
+   - **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response.
+     - A higher **presence penalty** value results in the model being more likely to generate tokens not yet included in the generated text.
+     - Defaults to 0.4.
+   - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
+     - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens.
+     - Defaults to 0.7.

 5. Now, let's start the show:
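The new **Top P** wording added above defines nucleus sampling. For intuition, here is a toy sketch (illustration only, not how any particular provider implements it): sort tokens by probability, keep the smallest prefix whose cumulative probability reaches the threshold *P* (0.3 is the documented default), renormalize, and sample from that reduced set.

```python
import random

def top_p_sample(probs: dict[str, float], p: float = 0.3) -> str:
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:  # stop once the nucleus covers mass >= P
            break
    # Sample only from the nucleus, weighted by the surviving probabilities.
    tokens, weights = zip(*nucleus)
    return random.choices(tokens, weights=weights, k=1)[0]

# With P=0.3 only "cat" survives, so the output here is deterministic.
print(top_p_sample({"cat": 0.5, "dog": 0.3, "axolotl": 0.2}, p=0.3))
```

A low *P* like the 0.3 default keeps generation tightly focused on the most probable tokens, which pairs naturally with the low default temperature of 0.1.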
@@ -39,4 +39,4 @@ _After accepting the team invite, you should be able to view and update the team
 ## Leave a joined team

 
@@ -11,6 +11,13 @@ Key features, improvements and bug fixes in the latest releases.

 Released on March 13, 2025.

+### Compatibility changes
+
+- Removes the **Max_tokens** setting from **Chat configuration**.
+- Removes the **Max_tokens** setting from the **Generate**, **Rewrite**, **Categorize**, and **Keyword** agent components.
+
+From this release onwards, if you still see RAGFlow's responses being cut short or truncated, check the **Max_tokens** setting of your model provider.
+
 ### Improvements

 - Adds OpenAI-compatible APIs.
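Since this release both removes the in-app **Max_tokens** settings and adds OpenAI-compatible APIs, a quick way to sanity-check provider-side truncation is to call the new endpoint with the standard OpenAI SDK. A hedged sketch follows: the `base_url` pattern is an assumption about the documented chat-completions route, so verify the exact path and IDs against your release's HTTP API reference; host, key, and chat ID are placeholders.

```python
from openai import OpenAI

client = OpenAI(
    api_key="<YOUR_RAGFLOW_API_KEY>",  # placeholder
    # Assumed route pattern; confirm against your release's API reference.
    base_url="http://localhost:9380/api/v1/chats_openai/<chat_id>",
)

completion = client.chat.completions.create(
    model="model",  # RAGFlow routes to the chat assistant's configured model
    messages=[{"role": "user", "content": "What is RAGFlow?"}],
    stream=False,
)
print(completion.choices[0].message.content)
```

If responses coming back through this endpoint are still cut short, the cap is being applied by the model provider behind RAGFlow, exactly as the compatibility note above says.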