# DataTools Pro API Design Document

## 1. API Overview

### 1.1 Design Principles

- **RESTful design**: Follows the REST architectural style
- **Uniform format**: Standardized request and response formats
- **Versioning**: Supports API version management
- **Error handling**: Complete error codes and error messages
- **Security**: Input validation and access control

### 1.2 Basic Information

- **Base URL**: `http://localhost:5000`
- **Content-Type**: `application/json`
- **Character encoding**: `UTF-8`
- **API version**: `v1.0`

### 1.3 Response Format Specification

```json
{
  "success": true,
  "data": {},
  "message": "Operation successful",
  "timestamp": "2024-08-05T10:30:00Z",
  "request_id": "uuid-string"
}
```

## 2. Core API Endpoints

### 2.1 Cassandra Data Comparison API

#### 2.1.1 Run a Single-Table Query Comparison

**Endpoint**: `POST /api/query`

**Purpose**: Queries a single Cassandra table in both environments and compares the results

**Request parameters**:

```json
{
  "pro_config": {
    "cluster_name": "production-cluster",
    "datacenter": "datacenter1",
    "hosts": ["10.0.1.100", "10.0.1.101"],
    "port": 9042,
    "username": "cassandra",
    "password": "password",
    "keyspace": "production_ks",
    "table": "user_data"
  },
  "test_config": {
    "cluster_name": "test-cluster",
    "datacenter": "datacenter1",
    "hosts": ["10.0.2.100"],
    "port": 9042,
    "username": "cassandra",
    "password": "password",
    "keyspace": "test_ks",
    "table": "user_data"
  },
  "keys": ["user_id"],
  "values": ["1001", "1002", "1003"],
  "fields_to_compare": ["name", "email", "status"],
  "exclude_fields": ["created_at", "updated_at"]
}
```

**Response data**:

```json
{
  "success": true,
  "data": {
    "total_keys": 3,
    "pro_count": 3,
    "test_count": 2,
    "differences": [
      {
        "key": {"user_id": "1001"},
        "field": "email",
        "pro_value": "user1@prod.com",
        "test_value": "user1@test.com",
        "message": "Field values do not match"
      }
    ],
    "identical_results": [
      {
        "key": {"user_id": "1002"},
        "pro_fields": {"name": "User2", "email": "user2@example.com"},
        "test_fields": {"name": "User2", "email": "user2@example.com"}
      }
    ],
    "field_diff_count": {
      "email": 1
    },
    "raw_pro_data": [...],
    "raw_test_data": [...],
    "summary": {
      "overview": "Queried 3 keys, found 1 difference",
      "percentages": {
        "match_rate": 66.67,
        "diff_rate": 33.33
      },
      "field_analysis": {
        "email": {"diff_count": 1, "diff_rate": 33.33}
      },
      "recommendations": ["Check data synchronization for the email field"]
    }
  },
  "message": "Query comparison completed",
  "execution_time": 1.25,
  "timestamp": "2024-08-05T10:30:00Z"
}
```

#### 2.1.2 Run a Sharded-Table Query Comparison

**Endpoint**: `POST /api/sharding-query`

**Purpose**: Queries sharded Cassandra tables and compares the results

**Request parameters**:

```json
{
  "pro_config": { /* same as the single-table query config */ },
  "test_config": { /* same as the single-table query config */ },
  "keys": ["doc_id"],
  "values": ["wmid_1609459200", "wmid_1609545600"],
  "fields_to_compare": ["content", "status"],
  "exclude_fields": [],
  "sharding_config": {
    "use_sharding_for_pro": true,
    "use_sharding_for_test": false,
    "interval_seconds": 604800,
    "table_count": 14
  }
}
```

**Response data**:

```json
{
  "success": true,
  "data": {
    /* base comparison result, same as the single-table query */
    "sharding_info": {
      "pro_shard_mapping": {
        "wmid_1609459200": "user_data_0",
        "wmid_1609545600": "user_data_1"
      },
      "test_shard_mapping": {
        "wmid_1609459200": "user_data",
        "wmid_1609545600": "user_data"
      },
      "failed_keys": [],
      "shard_stats": {
        "pro_tables_used": ["user_data_0", "user_data_1"],
        "test_tables_used": ["user_data"],
        "timestamp_extraction_success_rate": 100.0
      }
    }
  },
  "message": "Sharded query comparison completed",
  "execution_time": 2.15
}
```

### 2.2 Redis Cluster Comparison API

#### 2.2.1 Run a Redis Cluster Comparison

**Endpoint**: `POST /api/redis/compare`

**Purpose**: Compares data between two Redis clusters

**Request parameters**:

```json
{
  "cluster1_config": {
    "name": "Production cluster",
    "nodes": [
      {"host": "10.0.1.100", "port": 6379},
      {"host": "10.0.1.101", "port": 6380}
    ],
    "password": "redis_password",
    "socket_timeout": 3,
    "socket_connect_timeout": 3,
    "max_connections_per_node": 16
  },
  "cluster2_config": {
    "name": "Test cluster",
    "nodes": [{"host": "10.0.2.100", "port": 6379}],
    "password": null
  },
  "query_mode": "specified",
  "keys": ["user:1001", "user:1002", "session:abc123"],
  "sample_config": {
    "count": 100,
    "pattern": "*",
    "source_cluster": "cluster2"
  }
}
```

**Response data**:

```json
{
  "success": true,
  "data": {
    "total_keys": 3,
    "cluster1_found": 2,
    "cluster2_found": 3,
    "differences": [
      {
        "key": "user:1001",
        "cluster1_value": "{\"name\":\"John\",\"age\":25}",
        "cluster2_value": "{\"name\":\"John\",\"age\":26}",
        "value_type": "string",
        "difference_type": "value_mismatch"
      }
    ],
    "identical": [
      {
        "key": "user:1002",
        "value": "{\"name\":\"Jane\",\"age\":30}",
        "value_type": "string"
      }
    ],
    "missing_in_cluster1": ["session:abc123"],
    "missing_in_cluster2": [],
    "cluster_stats": {
      "cluster1": {
        "connection_status": "connected",
        "response_time_avg": 0.15,
        "nodes_status": [
          {"host": "10.0.1.100", "port": 6379, "status": "connected"},
          {"host": "10.0.1.101", "port": 6380, "status": "connected"}
        ]
      },
      "cluster2": {
        "connection_status": "connected",
        "response_time_avg": 0.12,
        "nodes_status": [
          {"host": "10.0.2.100", "port": 6379, "status": "connected"}
        ]
      }
    },
    "performance_summary": {
      "total_execution_time": 0.85,
      "keys_per_second": 3.53,
      "data_transferred_kb": 2.1
    }
  },
  "message": "Redis cluster comparison completed"
}
```

### 2.3 Configuration Management API

#### 2.3.1 Get the Default Configuration

**Endpoint**: `GET /api/default-config`

**Purpose**: Returns the system's default database configuration

**Response data**:

```json
{
  "success": true,
  "data": {
    "pro_config": {
      "cluster_name": "production-cluster",
      "datacenter": "datacenter1",
      "hosts": ["127.0.0.1"],
      "port": 9042,
      "username": "",
      "password": "",
      "keyspace": "production_ks",
      "table": "table_name"
    },
    "test_config": {
      "cluster_name": "test-cluster",
      "datacenter": "datacenter1",
      "hosts": ["127.0.0.1"],
      "port": 9042,
      "username": "",
      "password": "",
      "keyspace": "test_ks",
      "table": "table_name"
    }
  }
}
```

#### 2.3.2 Create a Configuration Group

**Endpoint**: `POST /api/config-groups`

**Request parameters**:

```json
{
  "name": "Production environment config",
  "description": "Cassandra config group for the production environment",
  "pro_config": { /* Cassandra config */ },
  "test_config": { /* Cassandra config */ },
  "query_config": {
    "keys": ["user_id"],
    "fields_to_compare": [],
    "exclude_fields": []
  },
  "sharding_config": {
    "use_sharding_for_pro": false,
    "use_sharding_for_test": false,
    "interval_seconds": 604800,
    "table_count": 14
  }
}
```

**Response data**:

```json
{
  "success": true,
  "data": {
    "id": 1,
    "name": "Production environment config",
    "created_at": "2024-08-05T10:30:00Z"
  },
  "message": "Configuration group created"
}
```

#### 2.3.3 List Configuration Groups

**Endpoint**: `GET /api/config-groups`

**Response data**:

```json
{
  "success": true,
  "data": [
    {
      "id": 1,
      "name": "Production environment config",
      "description": "Cassandra config group for the production environment",
      "created_at": "2024-08-05T10:30:00Z",
      "updated_at": "2024-08-05T10:30:00Z"
    }
  ]
}
```

#### 2.3.4 Get a Specific Configuration Group

**Endpoint**: `GET /api/config-groups/{id}`

**Response data**:

```json
{
  "success": true,
  "data": {
    "id": 1,
    "name": "Production environment config",
    "description": "Cassandra config group for the production environment",
    "pro_config": { /* full config */ },
    "test_config": { /* full config */ },
    "query_config": { /* query config */ },
    "sharding_config": { /* sharding config */ },
    "created_at": "2024-08-05T10:30:00Z",
    "updated_at": "2024-08-05T10:30:00Z"
  }
}
```

#### 2.3.5 Delete a Configuration Group

**Endpoint**: `DELETE /api/config-groups/{id}`

**Response data**:

```json
{
  "success": true,
  "data": null,
  "message": "Configuration group deleted"
}
```

### 2.4 Query History Management API

#### 2.4.1 List Query History

**Endpoint**: `GET /api/query-history`

**Query parameters**:

- `limit`: Maximum number of records to return (default 50)
- `offset`: Offset of the first record (default 0)
- `query_type`: Query type (`single`/`sharding`)

**Response data**:

```json
{
  "success": true,
  "data": {
    "items": [
      {
        "id": 1,
        "name": "User data comparison-20240805",
        "description": "Production user data comparison",
        "query_type": "single",
        "total_keys": 100,
        "differences_count": 5,
        "identical_count": 95,
        "execution_time": 2.5,
        "created_at": "2024-08-05T10:30:00Z"
      }
    ],
    "total": 1,
    "has_more": false
  }
}
```

#### 2.4.2 Save a Query History Record

**Endpoint**: `POST /api/query-history`

**Request parameters**:

```json
{
  "name": "User data comparison-20240805",
  "description": "Production user data comparison",
  "pro_config": { /* production config */ },
  "test_config": { /* test config */ },
  "query_config": { /* query config */ },
  "query_keys": ["1001", "1002", "1003"],
  "results_summary": {
    "total_keys": 3,
    "differences_count": 1,
    "identical_count": 2
  },
  "execution_time": 1.25,
  "query_type": "single",
  "sharding_config": null,
  "raw_results": { /* full query results */ }
}
```

#### 2.4.3 Get History Record Details

**Endpoint**: `GET /api/query-history/{id}`

**Response data**:

```json
{
  "success": true,
  "data": {
    "id": 1,
    "name": "User data comparison-20240805",
    "description": "Production user data comparison",
    "pro_config": { /* full config */ },
    "test_config": { /* full config */ },
    "query_config": { /* query config */ },
    "query_keys": ["1001", "1002", "1003"],
    "results_summary": { /* results summary */ },
    "execution_time": 1.25,
    "query_type": "single",
    "created_at": "2024-08-05T10:30:00Z"
  }
}
```

#### 2.4.4 Get the Full Results of a History Record

**Endpoint**: `GET /api/query-history/{id}/results`

**Response data**:

```json
{
  "success": true,
  "data": {
    "differences": [ /* full difference data */ ],
    "identical_results": [ /* full identical data */ ],
    "raw_pro_data": [ /* raw production data */ ],
    "raw_test_data": [ /* raw test data */ ],
    "field_diff_count": { /* per-field difference counts */ },
    "summary": { /* detailed analysis report */ }
  }
}
```

### 2.5 Log Management API

#### 2.5.1 Get Query Logs

**Endpoint**: `GET /api/query-logs`

**Query parameters**:

- `limit`: Number of records to return (default 100)
- `level`: Log level (`INFO`/`WARNING`/`ERROR`)
- `history_id`: ID of the associated history record

**Response data**:

```json
{
  "success": true,
  "data": {
    "logs": [
      {
        "id": 1,
        "batch_id": "batch-uuid-123",
        "history_id": 1,
        "timestamp": "2024-08-05T10:30:01.123Z",
        "level": "INFO",
        "message": "Starting Cassandra query",
        "query_type": "cassandra_single",
        "created_at": "2024-08-05T10:30:01Z"
      },
      {
        "id": 2,
        "batch_id": "batch-uuid-123",
        "history_id": 1,
        "timestamp": "2024-08-05T10:30:02.456Z",
        "level": "INFO",
        "message": "Production query completed, 3 records returned",
        "query_type": "cassandra_single",
        "created_at": "2024-08-05T10:30:02Z"
      }
    ],
    "total": 2
  }
}
```

#### 2.5.2 Get Logs for a Specific History Record

**Endpoint**: `GET /api/query-logs/history/{id}`

**Response data**:

```json
{
  "success": true,
  "data": {
    "history_id": 1,
    "logs": [ /* all logs related to this history record */ ],
    "log_summary": {
      "total_logs": 10,
      "info_count": 8,
      "warning_count": 1,
      "error_count": 1,
      "start_time": "2024-08-05T10:30:00Z",
      "end_time": "2024-08-05T10:30:05Z"
    }
  }
}
```

#### 2.5.3 Clear Query Logs

**Endpoint**: `DELETE /api/query-logs`

**Response data**:

```json
{
  "success": true,
  "data": {
    "deleted_count": 150
  },
  "message": "Query logs cleared"
}
```

### 2.6 System Management API

#### 2.6.1 Initialize the Database

**Endpoint**: `POST /api/init-db`

**Purpose**: Initializes the SQLite database table structure

**Response data**:

```json
{
  "success": true,
  "data": {
    "tables_created": [
      "config_groups",
      "query_history",
      "query_logs"
    ]
  },
  "message": "Database initialized"
}
```

#### 2.6.2 System Health Check

**Endpoint**: `GET /api/health`

**Response data**:

```json
{
  "success": true,
  "data": {
    "status": "healthy",
    "version": "2.0.0",
    "uptime": "2 days, 3 hours, 45 minutes",
    "database": {
      "sqlite": {
        "status": "connected",
        "file_size_mb": 15.2
      }
    },
    "memory_usage": {
      "used_mb": 128.5,
      "available_mb": 3967.5
    },
    "last_check": "2024-08-05T10:30:00Z"
  }
}
```

## 3. Error Handling

### 3.1 Error Response Format

```json
{
  "success": false,
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request parameter validation failed",
    "details": {
      "field": "pro_config.hosts",
      "issue": "The hosts field must not be empty"
    }
  },
  "timestamp": "2024-08-05T10:30:00Z",
  "request_id": "uuid-string"
}
```

### 3.2 Error Code Definitions

| Error code | HTTP status code | Description |
|--------|-----------|------|
| `VALIDATION_ERROR` | 400 | Request parameter validation failed |
| `CONNECTION_ERROR` | 500 | Database connection failed |
| `QUERY_ERROR` | 500 | Query execution failed |
| `TIMEOUT_ERROR` | 408 | Request timed out |
| `NOT_FOUND` | 404 | Resource does not exist |
| `CONFLICT` | 409 | Resource conflict |
| `SYSTEM_ERROR` | 500 | Internal system error |
| `AUTH_ERROR` | 401 | Authentication failed |
| `PERMISSION_DENIED` | 403 | Insufficient permissions |

### 3.3 Detailed Error Scenarios

#### 3.3.1 Connection Error

```json
{
  "success": false,
  "error": {
    "code": "CONNECTION_ERROR",
    "message": "Unable to connect to the Cassandra cluster",
    "details": {
      "cluster": "production-cluster",
      "hosts": ["10.0.1.100", "10.0.1.101"],
      "error_detail": "Connection refused",
      "suggestions": [
        "Check network connectivity",
        "Verify the host addresses and port",
        "Confirm that the Cassandra service is running"
      ]
    }
  }
}
```

#### 3.3.2 Query Error

```json
{
  "success": false,
  "error": {
    "code": "QUERY_ERROR",
    "message": "CQL query execution failed",
    "details": {
      "query": "SELECT * FROM user_data WHERE user_id IN (?)",
      "error_detail": "Invalid keyspace name 'invalid_ks'",
      "suggestions": [
        "Check that the keyspace name is correct",
        "Confirm that the table name is spelled correctly",
        "Verify that the field names exist"
      ]
    }
  }
}
```

## 4. Authentication and Authorization

### 4.1 Authentication Mechanism

The current version does not implement authentication; all API endpoints are openly accessible. For production environments, one of the following authentication methods is recommended:

- **API key authentication**: Simple authentication based on an API key
- **JWT token**: JSON Web Token authentication
- **OAuth 2.0**: The standard OAuth authorization flow
- **LDAP integration**: Enterprise LDAP authentication

### 4.2 Access Control

Role-based access control (RBAC) is recommended:

```json
{
  "roles": [
    {
      "name": "admin",
      "permissions": ["read", "write", "delete", "config"]
    },
    {
      "name": "operator",
      "permissions": ["read", "write"]
    },
    {
      "name": "viewer",
      "permissions": ["read"]
    }
  ]
}
```

## 5. API Version Management

### 5.1 Versioning Strategy

- **URL versioning**: `/api/v1/query`, `/api/v2/query`
- **Header versioning**: `Accept: application/vnd.datatools.v1+json`
- **Backward compatibility**: Older API versions remain compatible
- **Deprecation policy**: API deprecations are announced in advance

### 5.2 Version Change Log

| API version | Release date | Main changes | Compatibility |
|---------|----------|----------|--------|
| v1.0 | 2024-08-05 | Initial release | N/A |

## 6. Performance and Limits

### 6.1 API Limits

- **Request rate**: At most 100 requests per minute
- **Concurrent connections**: At most 10 concurrent connections
- **Response size**: At most 50 MB per response
- **Query timeout**: 120 seconds by default

### 6.2 Performance Optimization

- **Connection pooling**: Reuse database connections
- **Caching**: Cache configuration data
- **Asynchronous processing**: Run long queries asynchronously
- **Pagination**: Return large data sets in pages

## 7. Monitoring and Logging

### 7.1 API Monitoring Metrics

- **Response time**: Average response time and 95th percentile
- **Success rate**: Share of API calls that succeed
- **Error rate**: Rate of occurrence for each error type
- **Throughput**: Requests processed per second

### 7.2 Logging

- **Access log**: Records every API access
- **Error log**: Detailed error messages and stack traces
- **Performance log**: Slow queries and performance bottlenecks
- **Audit log**: Audit records for important operations

---

**Version**: v1.0
**Last updated**: 2024-08-05
**Maintainer**: DataTools Pro Team