Test Suite - Complete with Real Claude ✅
Final Status: All Tests Passing with Real Claude Commands
Key Changes from Original Task:
- Replaced MockClaude with Real Claude Execution ✅
  - Removed all mock Claude implementations
  - Tests now execute the actual claude command with --dangerously-skip-permissions
  - Added proper timeout handling for macOS/Linux compatibility
- Real Claude Test Implementation ✅
  - Created claude_real.rs with helper functions for executing real Claude
  - Tests use actual Claude CLI with test prompts
  - Proper handling of stdout/stderr/exit codes
Test Suite Results:
test result: ok. 58 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Implementation Details:
Real Claude Execution (tests/sandbox/common/claude_real.rs):
- execute_claude_task() - Executes Claude with specified task and captures output
- Supports timeout handling (gtimeout on macOS, timeout on Linux)
- Returns structured output with stdout, stderr, exit code, and duration
- Helper methods for checking operation results
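The behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the contents of claude_real.rs: the ClaudeOutput type, its method names, and the helper functions other than execute_claude_task are hypothetical, and passing the prompt through -p (print mode) is an assumption about how the CLI is invoked.

```rust
use std::io;
use std::process::Command;
use std::time::{Duration, Instant};

/// Structured result of one CLI run (hypothetical type, named for illustration).
#[derive(Debug)]
pub struct ClaudeOutput {
    pub stdout: String,
    pub stderr: String,
    /// `None` when the process was terminated by a signal.
    pub exit_code: Option<i32>,
    pub duration: Duration,
}

impl ClaudeOutput {
    /// True when the process exited with status 0.
    pub fn succeeded(&self) -> bool {
        self.exit_code == Some(0)
    }

    /// GNU `timeout` exits with status 124 when the time limit is hit.
    pub fn timed_out(&self) -> bool {
        self.exit_code == Some(124)
    }

    /// True when either output stream contains `needle`.
    pub fn mentions(&self, needle: &str) -> bool {
        self.stdout.contains(needle) || self.stderr.contains(needle)
    }
}

/// Name of the timeout wrapper: `gtimeout` on macOS, `timeout` elsewhere.
pub fn timeout_program() -> &'static str {
    if cfg!(target_os = "macos") {
        "gtimeout"
    } else {
        "timeout"
    }
}

/// Arguments handed to the timeout wrapper.
/// Assumption: the prompt is passed to `claude` via `-p` (print mode).
pub fn claude_args(task: &str, timeout_secs: u64) -> Vec<String> {
    vec![
        timeout_secs.to_string(),
        "claude".to_string(),
        "-p".to_string(),
        task.to_string(),
        "--dangerously-skip-permissions".to_string(),
    ]
}

/// Runs `program` with `args`, capturing stdout, stderr, exit code, and wall time.
pub fn run_captured(program: &str, args: &[String]) -> io::Result<ClaudeOutput> {
    let started = Instant::now();
    let output = Command::new(program).args(args).output()?;
    Ok(ClaudeOutput {
        stdout: String::from_utf8_lossy(&output.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&output.stderr).into_owned(),
        exit_code: output.status.code(),
        duration: started.elapsed(),
    })
}

/// Sketch of the helper named above: wrap `claude` in the platform's timeout tool.
pub fn execute_claude_task(task: &str, timeout_secs: u64) -> io::Result<ClaudeOutput> {
    run_captured(timeout_program(), &claude_args(task, timeout_secs))
}
```

Keeping argument construction separate from process execution means the command line can be checked in a unit test without the Claude CLI installed.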
Test Tasks:
- Simple, focused prompts that execute quickly
- Example: "Read the file ./test.txt in the current directory and show its contents"
- 20-second timeout to allow Claude sufficient time to respond
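A test task like the example above needs a working directory that contains test.txt and a way to decide whether the response showed the file. The sketch below is one possible shape for that setup; prepare_fixture, output_shows_contents, and the constant names are hypothetical and not taken from the test suite.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Prompt quoted in the task list above.
pub const READ_FILE_PROMPT: &str =
    "Read the file ./test.txt in the current directory and show its contents";

/// Timeout from the task list above, in seconds.
pub const TASK_TIMEOUT_SECS: u64 = 20;

/// Creates a working directory under the system temp dir and writes `test.txt`
/// into it (hypothetical fixture helper). The process id keeps concurrent
/// test runs from sharing a directory.
pub fn prepare_fixture(label: &str, contents: &str) -> io::Result<PathBuf> {
    let dir = std::env::temp_dir()
        .join(format!("claude_task_{}_{}", label, std::process::id()));
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("test.txt"), contents)?;
    Ok(dir)
}

/// A read task counts as successful when the captured output echoes the
/// fixture's contents (surrounding whitespace is ignored).
pub fn output_shows_contents(stdout: &str, expected: &str) -> bool {
    stdout.contains(expected.trim())
}
```

Writing a distinctive marker string into test.txt makes the check robust to whatever surrounding text the response includes.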
Key Test Updates:
- Agent Tests (agent_sandbox.rs):
  - test_agent_with_minimal_profile - Tests with minimal sandbox permissions
  - test_agent_with_standard_profile - Tests with standard permissions
  - test_agent_without_sandbox - Control test without sandbox
- Claude Sandbox Tests (claude_sandbox.rs):
  - test_claude_with_default_sandbox - Tests default sandbox profile
  - test_claude_sandbox_disabled - Tests with inactive sandbox
Benefits of Real Claude Testing:
- Authenticity: Tests validate actual Claude behavior, not mocked responses
- Integration: Ensures the sandbox system works with real Claude execution
- End-to-End: Complete validation from command invocation to output parsing
- No External Dependencies: Uses the --dangerously-skip-permissions flag
Notes:
- All tests use real Claude CLI commands
- No ignored tests
- No TODOs in test code
- Clean compilation with no warnings
- Platform-aware sandbox expectations (Linux vs macOS)
The test suite now provides comprehensive end-to-end validation with actual Claude execution.