Introducing Morph: The Fastest Way to Apply Edits to Files
Learn about Morph, our new API that uses focused LLMs to apply edits to files.

Posted by Tejas Bhakta
2 minute read
Introducing Morph
Code transformation should be fast, reliable, and accurate. Whether you're adding type hints, converting between languages, or implementing new features, manually updating code is time-consuming and error-prone.
What is Morph?
Morph is a specialized Large Language Model (LLM) service designed specifically for code transformation. Our service takes:
- Your original code
- The desired update specification

and returns the transformed code while preserving structure and formatting.
Here's a simple example:
# Original code
def add(a, b):
    return a + b

# After Morph (adding type hints)
def add(a: int, b: int) -> int:
    return a + b
Technical Implementation
Morph leverages several key optimizations to deliver fast and accurate code transformations:
- Speculative Decoding with Input: the original code serves as the draft sequence, so unchanged regions are verified in parallel rather than regenerated token by token (see the sketch after this list)
- Continuous Batching: Adaptive batch formation with dynamic SLAs for optimal throughput
- Optimized Inference: Custom CUDA kernels and inference graph optimization for maximum performance
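To make the speculation idea concrete, here is a minimal sketch of one verification step. The verify_chunk callable is a hypothetical stand-in for a single batched forward pass of the model; this illustrates the general technique, not Morph's actual internals.

def speculative_step(verify_chunk, prefix, draft_chunk):
    """Run one speculation step.

    verify_chunk(prefix, chunk) scores every position of `chunk` in one
    model call and returns the model's predicted token at each position.
    We accept the longest run of draft tokens the model agrees with, then
    take the model's own token at the first disagreement.
    """
    predictions = verify_chunk(prefix, draft_chunk)
    accepted = []
    for draft_tok, pred_tok in zip(draft_chunk, predictions):
        if pred_tok != draft_tok:
            accepted.append(pred_tok)  # edited region: keep the model's token
            break
        accepted.append(draft_tok)     # unchanged region: accepted for free
    return accepted

def rename_model(prefix, chunk):
    # Toy "model" whose only edit is renaming variable b to c.
    return ["c" if tok == "b" else tok for tok in chunk]

print(speculative_step(rename_model, [], ["def", "add", "(", "a", ",", "b", ")"]))
# ['def', 'add', '(', 'a', ',', 'c']: five draft tokens accepted in one call

Because most of an edited file is unchanged, most steps accept long runs of draft tokens for the cost of a single model call, which is where the speedup comes from.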
Getting Started
Morph provides an OpenAI-compatible API that integrates seamlessly with existing tools:
import openai
import os

# Prompt format: the original file goes in <code>, the edit in <update>.
USER_PROMPT = """<code>{original_code}</code>
<update>{update_snippet}</update>"""

# Point the standard OpenAI client at Morph's endpoint.
client = openai.OpenAI(
    api_key=os.getenv("MORPH_API_KEY"),
    base_url="https://api.morphllm.com/v1"
)

def execute_query(original_code, update_snippet):
    response = client.chat.completions.create(
        model="morph-v0",
        messages=[
            {
                "role": "user",
                "content": USER_PROMPT.format(
                    original_code=original_code,
                    update_snippet=update_snippet
                )
            }
        ]
    )
    # The transformed code comes back as the message content.
    return response.choices[0].message.content

# Example usage
original = """def add(a, b):
    return a + b"""
update = """//.. existing code ...
    return 2*a + 2*b"""
result = execute_query(original, update)
print(result)
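Assuming the update merges cleanly, the printed result should be the original function with only the return line rewritten:

def add(a, b):
    return 2*a + 2*b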
Performance Features
Our architecture includes several optimizations:
- Quantization-Aware KV Cache (see the sketch after this list)
  - Mixed precision storage
  - Intelligent cache management
  - Dynamic precision switching
- Inference Optimization
  - Static analysis for redundancy elimination
  - Optimal tensor layout planning
  - Smart memory management
- Request Processing
  - Priority queuing for different client tiers
  - Dynamic batch size adjustment
  - Intelligent request routing
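As a rough illustration of the first item, mixed precision storage, here is a toy sketch of int8 KV cache quantization with a per-tensor scale. This is an assumption-laden illustration of the general technique, not Morph's production kernel.

import numpy as np

def quantize_kv(kv):
    # Store keys/values as int8 plus one float scale per tensor.
    scale = max(float(np.abs(kv).max()) / 127.0, 1e-8)
    q = np.clip(np.round(kv / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_kv(q, scale):
    # Recover an approximate float32 tensor on read.
    return q.astype(np.float32) * scale

kv = np.random.randn(4, 64).astype(np.float32)  # (heads, head_dim) toy shape
q, scale = quantize_kv(kv)
print("max abs error:", float(np.abs(dequantize_kv(q, scale) - kv).max()))

Storing the cache at lower precision cuts memory traffic, which is typically the bottleneck for long-context inference; dynamic precision switching then decides when this approximation is acceptable.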
Next Steps
Inference speed is the long tail of Morph, and we're continually working to improve it. Beyond that, here's what's coming:
- IDE integrations
- Batch processing capabilities
- Custom transformation rules