feat: added Anthropic Claude3 models to Google Cloud Vertex AI (#4870)

Co-authored-by: pwm <pwm@google.com>

parent 696c5308a9
commit b98a1a3303
@@ -0,0 +1,56 @@
model: claude-3-haiku@20240307
label:
  en_US: Claude 3 Haiku
model_type: llm
features:
  - agent-thought
  - vision
model_properties:
  mode: chat
  context_size: 200000
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    required: true
    type: int
    default: 4096
    min: 1
    max: 4096
    help:
      zh_Hans: 停止前生成的最大令牌数。请注意,Anthropic Claude 模型可能会在达到 max_tokens 的值之前停止生成令牌。不同的 Anthropic Claude 模型对此参数具有不同的最大值。
      en_US: The maximum number of tokens to generate before stopping. Note that Anthropic Claude models might stop generating tokens before reaching the value of max_tokens. Different Anthropic Claude models have different maximum values for this parameter.
  # docs: https://docs.anthropic.com/claude/docs/system-prompts
  - name: temperature
    use_template: temperature
    required: false
    type: float
    default: 1
    min: 0.0
    max: 1.0
    help:
      zh_Hans: 生成内容的随机性。
      en_US: The amount of randomness injected into the response.
  - name: top_p
    required: false
    type: float
    default: 0.999
    min: 0.000
    max: 1.000
    help:
      zh_Hans: 在核采样中,Anthropic Claude 按概率递减顺序计算每个后续标记的所有选项的累积分布,并在达到 top_p 指定的特定概率时将其切断。您应该更改温度或top_p,但不能同时更改两者。
      en_US: In nucleus sampling, Anthropic Claude computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches a particular probability specified by top_p. You should alter either temperature or top_p, but not both.
  - name: top_k
    required: false
    type: int
    default: 0
    min: 0
    # tip docs from aws has error, max value is 500
    max: 500
    help:
      zh_Hans: 对于每个后续标记,仅从前 K 个选项中进行采样。使用 top_k 删除长尾低概率响应。
      en_US: Only sample from the top K options for each subsequent token. Use top_k to remove long tail low probability responses.
pricing:
  input: '0.00025'
  output: '0.00125'
  unit: '0.001'
  currency: USD
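As a quick illustration of the pricing block above, here is a minimal sketch of how the fields appear to combine; it assumes the input/output prices are quoted per "unit" tokens (unit: '0.001', i.e. per 1,000 tokens), which matches how the runtime's get_price / _calc_response_usage accounting is used later in this commit. The token counts are made-up example values.

    # Rough cost estimate for claude-3-haiku@20240307 (illustrative only).
    input_price = 0.00025   # USD, `input` field above
    output_price = 0.00125  # USD, `output` field above
    unit = 0.001            # `unit` field: prices are per 1,000 tokens

    prompt_tokens = 2_000
    completion_tokens = 500

    total_usd = (prompt_tokens * unit * input_price) + (completion_tokens * unit * output_price)
    print(f"estimated cost: ${total_usd:.6f}")  # estimated cost: $0.001125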
@@ -0,0 +1,56 @@
model: claude-3-opus@20240229
label:
  en_US: Claude 3 Opus
model_type: llm
features:
  - agent-thought
  - vision
model_properties:
  mode: chat
  context_size: 200000
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    required: true
    type: int
    default: 4096
    min: 1
    max: 4096
    help:
      zh_Hans: 停止前生成的最大令牌数。请注意,Anthropic Claude 模型可能会在达到 max_tokens 的值之前停止生成令牌。不同的 Anthropic Claude 模型对此参数具有不同的最大值。
      en_US: The maximum number of tokens to generate before stopping. Note that Anthropic Claude models might stop generating tokens before reaching the value of max_tokens. Different Anthropic Claude models have different maximum values for this parameter.
  # docs: https://docs.anthropic.com/claude/docs/system-prompts
  - name: temperature
    use_template: temperature
    required: false
    type: float
    default: 1
    min: 0.0
    max: 1.0
    help:
      zh_Hans: 生成内容的随机性。
      en_US: The amount of randomness injected into the response.
  - name: top_p
    required: false
    type: float
    default: 0.999
    min: 0.000
    max: 1.000
    help:
      zh_Hans: 在核采样中,Anthropic Claude 按概率递减顺序计算每个后续标记的所有选项的累积分布,并在达到 top_p 指定的特定概率时将其切断。您应该更改温度或top_p,但不能同时更改两者。
      en_US: In nucleus sampling, Anthropic Claude computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches a particular probability specified by top_p. You should alter either temperature or top_p, but not both.
  - name: top_k
    required: false
    type: int
    default: 0
    min: 0
    # tip docs from aws has error, max value is 500
    max: 500
    help:
      zh_Hans: 对于每个后续标记,仅从前 K 个选项中进行采样。使用 top_k 删除长尾低概率响应。
      en_US: Only sample from the top K options for each subsequent token. Use top_k to remove long tail low probability responses.
pricing:
  input: '0.015'
  output: '0.075'
  unit: '0.001'
  currency: USD
@@ -0,0 +1,55 @@
model: claude-3-sonnet@20240229
label:
  en_US: Claude 3 Sonnet
model_type: llm
features:
  - agent-thought
  - vision
model_properties:
  mode: chat
  context_size: 200000
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    required: true
    type: int
    default: 4096
    min: 1
    max: 4096
    help:
      zh_Hans: 停止前生成的最大令牌数。请注意,Anthropic Claude 模型可能会在达到 max_tokens 的值之前停止生成令牌。不同的 Anthropic Claude 模型对此参数具有不同的最大值。
      en_US: The maximum number of tokens to generate before stopping. Note that Anthropic Claude models might stop generating tokens before reaching the value of max_tokens. Different Anthropic Claude models have different maximum values for this parameter.
  - name: temperature
    use_template: temperature
    required: false
    type: float
    default: 1
    min: 0.0
    max: 1.0
    help:
      zh_Hans: 生成内容的随机性。
      en_US: The amount of randomness injected into the response.
  - name: top_p
    required: false
    type: float
    default: 0.999
    min: 0.000
    max: 1.000
    help:
      zh_Hans: 在核采样中,Anthropic Claude 按概率递减顺序计算每个后续标记的所有选项的累积分布,并在达到 top_p 指定的特定概率时将其切断。您应该更改温度或top_p,但不能同时更改两者。
      en_US: In nucleus sampling, Anthropic Claude computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches a particular probability specified by top_p. You should alter either temperature or top_p, but not both.
  - name: top_k
    required: false
    type: int
    default: 0
    min: 0
    # tip docs from aws has error, max value is 500
    max: 500
    help:
      zh_Hans: 对于每个后续标记,仅从前 K 个选项中进行采样。使用 top_k 删除长尾低概率响应。
      en_US: Only sample from the top K options for each subsequent token. Use top_k to remove long tail low probability responses.
pricing:
  input: '0.003'
  output: '0.015'
  unit: '0.001'
  currency: USD
@@ -2,21 +2,32 @@ import base64
import json
import logging
from collections.abc import Generator
from typing import Optional, Union, cast

import google.api_core.exceptions as exceptions
import vertexai.generative_models as glm
from anthropic import AnthropicVertex, Stream
from anthropic.types import (
    ContentBlockDeltaEvent,
    Message,
    MessageDeltaEvent,
    MessageStartEvent,
    MessageStopEvent,
    MessageStreamEvent,
)
from google.cloud import aiplatform
from google.oauth2 import service_account
from vertexai.generative_models import HarmBlockThreshold, HarmCategory

from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta, LLMUsage
from core.model_runtime.entities.message_entities import (
    AssistantPromptMessage,
    ImagePromptMessageContent,
    PromptMessage,
    PromptMessageContentType,
    PromptMessageTool,
    SystemPromptMessage,
    TextPromptMessageContent,
    ToolPromptMessage,
    UserPromptMessage,
)
@@ -63,9 +74,287 @@ class VertexAiLargeLanguageModel(LargeLanguageModel):
        :param user: unique user id
        :return: full response or stream response chunk generator result
        """
        # invoke anthropic models via anthropic official SDK
        if "claude" in model:
            return self._generate_anthropic(model, credentials, prompt_messages, model_parameters, stop, stream, user)
        # invoke Gemini model
        return self._generate(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)

    def _generate_anthropic(self, model: str, credentials: dict, prompt_messages: list[PromptMessage], model_parameters: dict,
                            stop: Optional[list[str]] = None, stream: bool = True, user: Optional[str] = None) -> Union[LLMResult, Generator]:
        """
        Invoke Anthropic large language model

        :param model: model name
        :param credentials: model credentials
        :param prompt_messages: prompt messages
        :param model_parameters: model parameters
        :param stop: stop words
        :param stream: is stream response
        :return: full response or stream response chunk generator result
        """
        # use Anthropic official SDK references
        # - https://github.com/anthropics/anthropic-sdk-python
        project_id = credentials["vertex_project_id"]

        if 'opus' in model:
            location = 'us-east5'
        else:
            location = 'us-central1'

        client = AnthropicVertex(
            region=location,
            project_id=project_id
        )

        extra_model_kwargs = {}
        if stop:
            extra_model_kwargs['stop_sequences'] = stop

        system, prompt_message_dicts = self._convert_claude_prompt_messages(prompt_messages)

        if system:
            extra_model_kwargs['system'] = system

        response = client.messages.create(
            model=model,
            messages=prompt_message_dicts,
            stream=stream,
            **model_parameters,
            **extra_model_kwargs
        )

        if stream:
            return self._handle_claude_stream_response(model, credentials, response, prompt_messages)

        return self._handle_claude_response(model, credentials, response, prompt_messages)

    def _handle_claude_response(self, model: str, credentials: dict, response: Message,
                                prompt_messages: list[PromptMessage]) -> LLMResult:
        """
        Handle llm chat response

        :param model: model name
        :param credentials: credentials
        :param response: response
        :param prompt_messages: prompt messages
        :return: full response chunk generator result
        """

        # transform assistant message to prompt message
        assistant_prompt_message = AssistantPromptMessage(
            content=response.content[0].text
        )

        # calculate num tokens
        if response.usage:
            # transform usage
            prompt_tokens = response.usage.input_tokens
            completion_tokens = response.usage.output_tokens
        else:
            # calculate num tokens
            prompt_tokens = self.get_num_tokens(model, credentials, prompt_messages)
            completion_tokens = self.get_num_tokens(model, credentials, [assistant_prompt_message])

        # transform usage
        usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)

        # transform response
        response = LLMResult(
            model=response.model,
            prompt_messages=prompt_messages,
            message=assistant_prompt_message,
            usage=usage
        )

        return response

    def _handle_claude_stream_response(self, model: str, credentials: dict, response: Stream[MessageStreamEvent],
                                       prompt_messages: list[PromptMessage], ) -> Generator:
        """
        Handle llm chat stream response

        :param model: model name
        :param credentials: credentials
        :param response: response
        :param prompt_messages: prompt messages
        :return: full response or stream response chunk generator result
        """

        try:
            full_assistant_content = ''
            return_model = None
            input_tokens = 0
            output_tokens = 0
            finish_reason = None
            index = 0

            for chunk in response:
                if isinstance(chunk, MessageStartEvent):
                    return_model = chunk.message.model
                    input_tokens = chunk.message.usage.input_tokens
                elif isinstance(chunk, MessageDeltaEvent):
                    output_tokens = chunk.usage.output_tokens
                    finish_reason = chunk.delta.stop_reason
                elif isinstance(chunk, MessageStopEvent):
                    usage = self._calc_response_usage(model, credentials, input_tokens, output_tokens)
                    yield LLMResultChunk(
                        model=return_model,
                        prompt_messages=prompt_messages,
                        delta=LLMResultChunkDelta(
                            index=index + 1,
                            message=AssistantPromptMessage(
                                content=''
                            ),
                            finish_reason=finish_reason,
                            usage=usage
                        )
                    )
                elif isinstance(chunk, ContentBlockDeltaEvent):
                    chunk_text = chunk.delta.text if chunk.delta.text else ''
                    full_assistant_content += chunk_text
                    assistant_prompt_message = AssistantPromptMessage(
                        content=chunk_text if chunk_text else '',
                    )
                    index = chunk.index
                    yield LLMResultChunk(
                        model=model,
                        prompt_messages=prompt_messages,
                        delta=LLMResultChunkDelta(
                            index=index,
                            message=assistant_prompt_message,
                        )
                    )
        except Exception as ex:
            raise InvokeError(str(ex))

    def _calc_claude_response_usage(self, model: str, credentials: dict, prompt_tokens: int, completion_tokens: int) -> LLMUsage:
        """
        Calculate response usage

        :param model: model name
        :param credentials: model credentials
        :param prompt_tokens: prompt tokens
        :param completion_tokens: completion tokens
        :return: usage
        """
        # get prompt price info
        prompt_price_info = self.get_price(
            model=model,
            credentials=credentials,
            price_type=PriceType.INPUT,
            tokens=prompt_tokens,
        )

        # get completion price info
        completion_price_info = self.get_price(
            model=model,
            credentials=credentials,
            price_type=PriceType.OUTPUT,
            tokens=completion_tokens
        )

        # transform usage
        usage = LLMUsage(
            prompt_tokens=prompt_tokens,
            prompt_unit_price=prompt_price_info.unit_price,
            prompt_price_unit=prompt_price_info.unit,
            prompt_price=prompt_price_info.total_amount,
            completion_tokens=completion_tokens,
            completion_unit_price=completion_price_info.unit_price,
            completion_price_unit=completion_price_info.unit,
            completion_price=completion_price_info.total_amount,
            total_tokens=prompt_tokens + completion_tokens,
            total_price=prompt_price_info.total_amount + completion_price_info.total_amount,
            currency=prompt_price_info.currency,
            latency=time.perf_counter() - self.started_at
        )

        return usage

    def _convert_claude_prompt_messages(self, prompt_messages: list[PromptMessage]) -> tuple[str, list[dict]]:
        """
        Convert prompt messages to dict list and system
        """

        system = ""
        first_loop = True
        for message in prompt_messages:
            if isinstance(message, SystemPromptMessage):
                message.content = message.content.strip()
                if first_loop:
                    system = message.content
                    first_loop = False
                else:
                    system += "\n"
                    system += message.content

        prompt_message_dicts = []
        for message in prompt_messages:
            if not isinstance(message, SystemPromptMessage):
                prompt_message_dicts.append(self._convert_claude_prompt_message_to_dict(message))

        return system, prompt_message_dicts

    def _convert_claude_prompt_message_to_dict(self, message: PromptMessage) -> dict:
        """
        Convert PromptMessage to dict
        """
        if isinstance(message, UserPromptMessage):
            message = cast(UserPromptMessage, message)
            if isinstance(message.content, str):
                message_dict = {"role": "user", "content": message.content}
            else:
                sub_messages = []
                for message_content in message.content:
                    if message_content.type == PromptMessageContentType.TEXT:
                        message_content = cast(TextPromptMessageContent, message_content)
                        sub_message_dict = {
                            "type": "text",
                            "text": message_content.data
                        }
                        sub_messages.append(sub_message_dict)
                    elif message_content.type == PromptMessageContentType.IMAGE:
                        message_content = cast(ImagePromptMessageContent, message_content)
                        if not message_content.data.startswith("data:"):
                            # fetch image data from url
                            try:
                                image_content = requests.get(message_content.data).content
                                mime_type, _ = mimetypes.guess_type(message_content.data)
                                base64_data = base64.b64encode(image_content).decode('utf-8')
                            except Exception as ex:
                                raise ValueError(f"Failed to fetch image data from url {message_content.data}, {ex}")
                        else:
                            data_split = message_content.data.split(";base64,")
                            mime_type = data_split[0].replace("data:", "")
                            base64_data = data_split[1]

                        if mime_type not in ["image/jpeg", "image/png", "image/gif", "image/webp"]:
                            raise ValueError(f"Unsupported image type {mime_type}, "
                                             f"only support image/jpeg, image/png, image/gif, and image/webp")

                        sub_message_dict = {
                            "type": "image",
                            "source": {
                                "type": "base64",
                                "media_type": mime_type,
                                "data": base64_data
                            }
                        }
                        sub_messages.append(sub_message_dict)

                message_dict = {"role": "user", "content": sub_messages}
        elif isinstance(message, AssistantPromptMessage):
            message = cast(AssistantPromptMessage, message)
            message_dict = {"role": "assistant", "content": message.content}
        elif isinstance(message, SystemPromptMessage):
            message = cast(SystemPromptMessage, message)
            message_dict = {"role": "system", "content": message.content}
        else:
            raise ValueError(f"Got unknown type {message}")

        return message_dict

    def get_num_tokens(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                       tools: Optional[list[PromptMessageTool]] = None) -> int:
        """
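For orientation, the heart of the new code path is the AnthropicVertex client from the official anthropic SDK, as used in _generate_anthropic above. A minimal standalone sketch of the same call pattern follows; the project id and prompt are placeholders (in Dify the project id comes from credentials["vertex_project_id"]), authentication relies on ambient Google Cloud credentials, and the region mirrors the opus → us-east5 / other models → us-central1 split in the diff.

    from anthropic import AnthropicVertex

    # Hypothetical GCP project id; replace with your own.
    client = AnthropicVertex(region="us-central1", project_id="my-gcp-project")

    # Non-streaming request against the Claude 3 Haiku model id declared in the YAML above.
    message = client.messages.create(
        model="claude-3-haiku@20240307",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Say hello from Vertex AI."}],
    )
    print(message.content[0].text)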