It ranks within the top 10 models of the BigCodeBench Hard set's leaderboard sorted by the Instruct score, meaning that it performs very
well in complex code generation based on NL-oriented
instructions.
It can be consumed through
Amazon Bedrock which allows us to
use a unified API for many different, top-scoring models of the
aforementioned benchmark, and swap them in or out as needed;
add Guardrails to deal with
sensitive information;
Gemini Pro: Cannot be
consumed through the Bedrock API, though it remains a strong contender to
consider in the future.
Athene-V2: Promising fine-tuned model
based on Qwen but has little documentation and
support.
DeepSeek: Has a decent documentation though
it cannot be consumed through the Bedrock API. The base Claude model, at the
time of writing, also performs better in code generation.
Usage
We use Claude 3.5 Sonnet to power
custom and suggested vulnerability fixes, which users can request through the VS Code extension or
within our platform.