Analyzing Cursor Composer and Apply
Cursor recently published a fascinating technical deep-dive into their code editing technology, revealing how they handle large-scale code edits at speeds of roughly 1000 tokens per second.

[Figure: Performance comparison showing Cursor's speculative edits achieving significant speed improvements]
The Problem: Large Code Edit Challenges
According to Cursor's research, frontier models like GPT-4o and Claude struggle with large code edits in three key areas:
1. Latency: Traditional token-by-token generation is too slow for real-time code editing
2. Accuracy: Models often make mistakes on complex edits, especially with large files
3. Consistency: Multiple model calls can lead to infinite loops or inconsistent results
"Even small, isolated edits are plagued with bugs [...] SWE-Agent attempts a simple edit seven times before giving up due to a consistent syntactic error."
The hard part of apply is making inference fast while keeping it reliable at scale.
Cursor's Context-Aware Architecture
Cursor's approach to code editing relies on a sophisticated context retrieval system with four key components:
1. Embedding and Retrieval
- Queries embedded using a specialized embedding model
- Fetches the top-k relevant syntax chunks from the codebase
- Provides focused context without overwhelming the model
2. Reranking Process
- Retrieved chunks undergo reranking for relevance
- Ensures the most pertinent context appears first
- Filters out less useful code snippets
3. Prompt Framework
- Specialized prompting framework for context prioritization
- Structures information for maximum model understanding
- Ensures critical context appears in optimal positions
4. Apply Phase
- Uses a specialized fast-apply model (Llama-3-70b-ft)
- Executes planned changes quickly and accurately
- Leverages context gathered in previous steps
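The four components above can be sketched as a toy pipeline. This is only an illustration: `Chunk`, `embed`, `retrieve`, `rerank`, and `build_prompt` are hypothetical stand-ins (the "embedding" here is a toy character hash, not a real model), not Cursor's actual implementation.

```python
# Hedged sketch of a retrieve -> rerank -> prompt pipeline.
# All names and the scoring scheme are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Chunk:
    path: str
    text: str
    score: float = 0.0

def embed(text: str) -> list[float]:
    # Toy stand-in for a specialized embedding model.
    return [float(ord(c) % 7) for c in text[:8]]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb or 1.0)

def retrieve(query: str, chunks: list[Chunk], k: int = 4) -> list[Chunk]:
    # Fetch the top-k chunks by embedding similarity.
    q = embed(query)
    for c in chunks:
        c.score = cosine(q, embed(c.text))
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:k]

def rerank(query: str, chunks: list[Chunk]) -> list[Chunk]:
    # A real system would call a cross-encoder reranker here;
    # this toy version just drops zero-relevance chunks.
    return [c for c in chunks if c.score > 0.0]

def build_prompt(query: str, chunks: list[Chunk], file_text: str) -> str:
    # Most relevant context first, then the edit request and target file.
    context = "\n\n".join(f"# {c.path}\n{c.text}" for c in chunks)
    return f"{context}\n\nEdit request: {query}\n\nFile:\n{file_text}"
```

The apply phase would then hand `build_prompt`'s output to the fast-apply model.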
Technical Implementation Details
Full File Rewriting vs. Diffs
Cursor chose full file rewriting over diffs for three strategic reasons:
Token Context
More output tokens give the model more forward passes to determine the correct solution
Training Distribution
Models have seen more complete files than diffs during training
Line Number Challenges
Models struggle with accurate line number counting in diffs
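The line-number problem is easy to see with a toy patcher. `apply_hunk` below is a hypothetical, simplified stand-in (not a real diff tool): an off-by-one line number makes the hunk's context mismatch and the patch fail, a failure mode that full-file rewrites sidestep entirely.

```python
# Toy illustration of the line-number fragility of diff-based edits;
# apply_hunk is a hypothetical, simplified patcher, not a real tool.

def apply_hunk(source: list[str], start: int, old: list[str], new: list[str]) -> list[str]:
    """Replace `old` at 0-based line `start` with `new`; fail on context mismatch."""
    if source[start:start + len(old)] != old:
        raise ValueError(f"hunk context mismatch at line {start}")
    return source[:start] + new + source[start + len(old):]

source = ["def greet(name):", "    print('hi', name)", "", "greet('world')"]
old = ["    print('hi', name)"]
new = ["    print('hello', name)"]

# With the correct line number the hunk applies cleanly...
patched = apply_hunk(source, 1, old, new)

# ...but an off-by-one line number fails, which is exactly the kind of
# counting mistake models make when emitting diffs.
try:
    apply_hunk(source, 2, old, new)
except ValueError as err:
    print("patch failed:", err)
```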
Model Architecture
According to their technical blog post:
- Base Model: Llama-3-70b with fine-tuning
- Performance: ~13x speedup over vanilla inference
- Comparison: ~9x speedup over previous GPT-4 speculative edits deployment
Speculative Edits Innovation
One of Cursor's most interesting innovations is "speculative edits", which they describe as:
"With code edits, we have a strong prior on the draft tokens at any point in time, so we can speculate on future tokens using a deterministic algorithm rather than a draft model."
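The quoted idea can be sketched in a few lines. This is a minimal illustration, not Cursor's actual algorithm: the "model" is simulated by reading from a known target, and edits are assumed to be 1:1 token substitutions. The point it demonstrates is real, though: because most of an edited file is unchanged, the original file itself acts as a deterministic draft, so many tokens are accepted per verification pass.

```python
# Hedged sketch of deterministic speculative edits. verify() simulates
# one batched forward pass of a verifier model; in reality that is an LM
# call, not a lookup into a known target.

def verify(prefix: list[str], draft: list[str], target: list[str]):
    """Count how many draft tokens the 'model' accepts after `prefix`,
    and return its next token after the accepted ones."""
    n = 0
    for tok in draft:
        pos = len(prefix) + n
        if pos < len(target) and target[pos] == tok:
            n += 1
        else:
            break
    pos = len(prefix) + n
    return n, target[pos] if pos < len(target) else "<eos>"

def speculative_rewrite(original: list[str], target: list[str], window: int = 4):
    out: list[str] = []
    passes = 0   # each verify() call stands in for one forward pass
    src = 0      # cursor into the original file (the draft source)
    while True:
        draft = original[src:src + window]   # speculate straight from the original
        n, nxt = verify(out, draft, target)
        passes += 1
        out.extend(draft[:n])
        src += n
        if nxt == "<eos>":
            return out, passes
        if n < len(draft) or not draft:
            out.append(nxt)  # accept the model's correction token
            src += 1         # skip the rejected original token (1:1 edit assumption)

original = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
target   = ["def", "add", "(", "x", ",", "y", ")", ":", "return", "x", "+", "y"]
rewritten, passes = speculative_rewrite(original, target)
print(f"{passes} forward passes for {len(target)} output tokens")
```

Even in this tiny example the rewrite finishes in fewer verification passes than output tokens; on real files, where unchanged spans are much longer, the ratio is far better.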
This approach yields remarkable improvements:
Performance Metrics Comparison
Cursor's evaluation methodology is particularly rigorous, using the formula:
speed = Num_Rewritten_Chars / Latency_for_Rewrite_in_seconds
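As code, the metric is a one-liner; the example numbers below are illustrative and chosen to reproduce the ~3500 chars/s figure cited later in this post.

```python
# The evaluation formula from the post, expressed as a helper.
def rewrite_speed(num_rewritten_chars: int, latency_seconds: float) -> float:
    return num_rewritten_chars / latency_seconds

# Illustrative: a 7,000-character rewrite completed in 2 seconds.
print(rewrite_speed(7000, 2.0))  # → 3500.0 chars/sec
```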
Cursor's Results: ~1000 tokens/second (~3500 characters/second) with Llama-3-70b-ft
Morph's Results: 1600+ tokens/second
Model Training Insights
Cursor's blog reveals interesting training decisions that contributed to their success:
Data Preparation
- Downsampled files under 100 LOC for balanced training
- Balanced the number of training examples per filename
- Filtered out no-op transformations
- Curated high-quality code transformation examples
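The curation steps above can be sketched as a filter. Everything here is an assumption for illustration: the field names (`path`, `before`, `after`), the keep fraction, and the per-file cap are invented, not Cursor's actual pipeline.

```python
# Hedged sketch of the data-curation steps: drop no-ops, cap examples
# per filename, and downsample short files. Thresholds are invented.
import random
from collections import Counter

def curate(examples: list[dict], keep_small_frac: float = 0.3,
           max_per_file: int = 2, seed: int = 0) -> list[dict]:
    rng = random.Random(seed)
    per_file = Counter()
    kept = []
    for ex in examples:
        before, after = ex["before"], ex["after"]
        # Filter out no-op transformations.
        if before == after:
            continue
        # Balance the number of training examples per filename.
        if per_file[ex["path"]] >= max_per_file:
            continue
        # Downsample files under 100 LOC to balance the length distribution.
        if len(before.splitlines()) < 100 and rng.random() > keep_small_frac:
            continue
        per_file[ex["path"]] += 1
        kept.append(ex)
    return kept
```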
Model Selection
- Tested Deepseek Coder Instruct vs. Llama 3
- Found Llama-3-70b-ft performed best overall
- Outperformed GPT-4 Turbo in evaluations
- Optimized for code-specific tasks
Future Challenges and Improvements
Cursor identifies several areas for continued improvement:
Long Context Handling
Working on handling files up to 2500 lines while maintaining performance
Model Distillation
Exploring distillation to llama-3-8b for improved efficiency
Accuracy Improvements
Investigating on-policy RL for better performance
Implications for Morph
Cursor's research validates several key principles we've been exploring at Morph:
- The importance of specialized models for code editing
- The benefits of full-file context over diff-based approaches
- The potential of speculative decoding in code transformation
- The critical role of context retrieval and reranking
Their work demonstrates the viability of high-speed code transformation while highlighting the challenges and trade-offs involved in building such systems.
Comparison with Morph v0
We've been working on a similar approach to Cursor's Composer and Apply at Morph, with a focus on both speed and accuracy: one specialized model plans the changes, and a second specialized model applies them.
Morph's Achievements
Technical Observations
Based on our research and development, several technical insights emerge:
RoPE Scaling Challenges
Linear scaling of RoPE position ids does not extend well to long files on its own. The task needs to be trained on a large dataset of code edits with long input and output sequence lengths.
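The trade-off is visible in the angles themselves. The sketch below, with illustrative dimensions rather than any real model's config, shows that linear interpolation keeps the maximum rotation angle inside the trained range, but compresses the per-position step by the same factor, blurring the fine-grained positional resolution that line-level edits depend on.

```python
# Toy illustration of linear RoPE position-id interpolation.
# dim/base/lengths are illustrative, not a real model's configuration.

def rope_angles(positions, dim: int = 8, base: float = 10000.0):
    """Rotation angles (position x frequency) used by rotary embeddings."""
    inv_freq = [base ** (-(i / dim)) for i in range(0, dim, 2)]
    return [[p * f for f in inv_freq] for p in positions]

def max_angle(table):
    return max(a for row in table for a in row)

trained_len, target_len = 2048, 8192
scale = target_len / trained_len  # 4x context extension

raw = rope_angles(range(target_len))
interp = rope_angles([p / scale for p in range(target_len)])

# Interpolation keeps the max angle inside the trained range...
assert max_angle(interp) <= max_angle(rope_angles(range(trained_len))) + 1
# ...but adjacent positions now differ by only 1/scale of a trained step,
# compressing positional resolution -- hence the need to actually train
# on long code-edit sequences rather than rely on scaling alone.
print(max_angle(raw), max_angle(interp))  # → 8191.0 2047.75
```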
Memory and Compute Requirements
For large files this is incredibly memory intensive and requires significant compute resources. Optimization is crucial for practical deployment.
Process Reward Modeling
Recent process reward modeling shows promise, but it would slow applies down. The reward model would need to output diffs and prove more useful than the original code context.
Conclusion and Key Takeaways
Key Insights
- 📊 Rigorous Analysis: Cursor's deep-dive is highly technical, detailing performance metrics (~1000 tokens/s, ~3500 chars/s) and comparing different model architectures.
- 🏗️ Clear Structure: By dividing code editing into planning and apply phases, they simplify a complex problem into manageable components.
- ⚡ Innovative Approach: Their use of speculative edits as a deterministic mechanism to forecast future tokens is a notable innovation yielding significant speedups.
- 🔮 Transparency and Future Focus: Discussing challenges like long-context training and model distillation offers insight into ongoing research directions.
Overall, Cursor's technical blog sets a high standard for detailed and insightful analysis in the AI-driven code editing space. However, Morph's approach demonstrates that even better performance is achievable with focused optimization and innovative architectural choices.
Experience the Fastest Code Apply
Try Morph's lightning-fast code transformation with 1600+ tokens/second and see the difference for yourself.
Get Started with Morph →