Cohere

Properties used to connect to Cohere. These are defined via the cohere object inside the directConnection property.

Service Types

Chat

  • Type: true | {
         model?: string,
         temperature?: number,
         prompt_truncation?: "AUTO" | "OFF",
         connectors?: {id: string}[],
         documents?: {title: string; snippet: string}[]
    }

Connect to Cohere's chat API. You can set this property to true or configure it using an object:
  • model is the name of the model used to generate text.
  • temperature is the degree of randomness in the response.
  • prompt_truncation dictates how the prompt is constructed. The default is "OFF", which uses all resources. "AUTO" drops some chat history and documents to construct a prompt that fits within the model's context length limit.
  • connectors is an array of objects that define custom connectors.
  • documents is an array of objects that define relevant documents which the model can use to enrich its reply. See Document Mode for more info, and the sketch after the example below.

Example

<deep-chat
  directConnection='{
    "cohere": {
      "key": "placeholder key",
      "chat": {"temperature": 1}
    }
  }'
></deep-chat>
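
For reference, a fuller chat configuration might look like the following sketch. It uses the prompt_truncation and documents properties described above; the document contents are illustrative placeholders:

<deep-chat
  directConnection='{
    "cohere": {
      "key": "placeholder key",
      "chat": {
        "temperature": 1,
        "prompt_truncation": "AUTO",
        "documents": [
          {"title": "Shipping policy", "snippet": "Orders ship within 2 business days."}
        ]
      }
    }
  }'
></deep-chat>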

TextGeneration

  • Type: true | {
         model?: "command" | "base" | "base-light",
         max_tokens?: number,
         temperature?: number,
         k?: number,
         p?: number,
         frequency_penalty?: number,
         presence_penalty?: number,
         end_sequences?: string[],
         stop_sequences?: string[],
         logit_bias?: {[string]: number},
         truncate?: "NONE" | "START" | "END",
         preset?: string
    }

  • Default: {max_tokens: 1000}

Connect to Cohere's text generation API. You can set this property to true or configure it using an object:
  • model is the name of the model used to generate text.
  • max_tokens denotes the number of tokens to predict per generation.
  • temperature is a non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations.
  • k ensures only the top k most likely tokens are considered for generation at each step. The maximum value is 500.
  • p is the probability (between 0.0 and 1.0) which ensures that only the most likely tokens, with total probability mass of p, are considered for generation at each step. If both k and p are set, p acts after k.
  • frequency_penalty (between 0.0 and 1.0) can be used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation.
  • presence_penalty (between 0.0 and 1.0) can be used to reduce repetitiveness of generated tokens. Similar to frequency_penalty, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies.
  • end_sequences is used to cut the generated text at the beginning of the earliest occurrence of an end sequence string.
  • stop_sequences is used to cut the generated text at the end of the earliest occurrence of a stop sequence string.
  • logit_bias is used to prevent the model from generating unwanted tokens or to incentivize it to include desired tokens. The format is {token_id: bias} where bias is a float between -10 and 10. Tokens can be obtained from text using Tokenize. E.g. if the value {"11": -10} is provided, the model will be very unlikely to include the token 11 ("\n", the newline character) anywhere in the generated text. In contrast, {"11": 10} will result in generations that nearly only contain that token. A combined sketch follows the example below.
  • truncate specifies how the API will handle inputs longer than the maximum token length. Passing "START" will discard the start of the input, "END" will discard the end of the input, and "NONE" will throw an error when the input exceeds the maximum input token length.
  • preset is a combination of parameters, such as prompt, temperature etc. Presets can be created in the Cohere Playground.

Example

<deep-chat
  directConnection='{
    "cohere": {
      "key": "placeholder key",
      "textGeneration": {"model": "command"}
    }
  }'
></deep-chat>
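
The sampling controls above can be combined. The following sketch biases generations away from newlines using token id 11 from the logit_bias description; all other values are illustrative:

<deep-chat
  directConnection='{
    "cohere": {
      "key": "placeholder key",
      "textGeneration": {
        "model": "command",
        "max_tokens": 500,
        "temperature": 0.5,
        "k": 50,
        "logit_bias": {"11": -10}
      }
    }
  }'
></deep-chat>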

Summarization

  • Type: true | {
         model?: string,
         length?: "auto" | "short" | "medium" | "long",
         format?: "auto" | "paragraph" | "bullets",
         extractiveness?: "auto" | "low" | "medium" | "high",
         temperature?: number,
         additional_command?: string
    }

Connect to Cohere's summarize API. You can set this property to true or configure it using an object:
  • model is the name of the model used to generate a summary.
  • length indicates the approximate length of the summary. "auto" chooses the best option based on the input text.
  • format indicates the style in which the summary will be delivered - in a free form paragraph or in bullet points.
  • extractiveness controls how close to the original text the summary is. "high" extractiveness summaries will lean towards reusing sentences verbatim, while "low" extractiveness summaries will tend to paraphrase more.
  • temperature (from 0 to 5) controls the randomness of the output. Lower values tend to generate more predictable outputs, while higher values tend to generate more creative outputs. The sweet spot is typically between 0 and 1.
  • additional_command is a free-form instruction for modifying how the summaries get generated. It should complete the sentence "Generate a summary _", e.g. "focusing on the next steps" or "written by Yoda". A sketch combining these options follows the example below.

Example

<deep-chat
  directConnection='{
    "cohere": {
      "key": "placeholder key",
      "summarization": {"model": "summarize-xlarge"}
    }
  }'
></deep-chat>
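
A sketch that combines the summarization options above (the values chosen are illustrative):

<deep-chat
  directConnection='{
    "cohere": {
      "key": "placeholder key",
      "summarization": {
        "model": "summarize-xlarge",
        "length": "short",
        "format": "bullets",
        "extractiveness": "low",
        "temperature": 0.3,
        "additional_command": "focusing on the next steps"
      }
    }
  }'
></deep-chat>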