# OpenProxy

A lightweight, production-ready OpenAI-compatible proxy server that seamlessly forwards LLM API requests to any endpoint, with comprehensive logging, cost tracking, and PostgreSQL integration. Perfect for monitoring API usage, calculating costs, and maintaining audit trails for your AI applications.

## ⚙️ Configuration

| Environment Variable | Description | Default Value |
|----------------------|-------------|---------------|
| `PORT` | Server port | `3007` |
| `UPSTREAM_URL` | Your LLM endpoint URL | **Required** |
| `DATABASE_URL` | PostgreSQL connection string for logging | **Required** |
| `DATABASE_TABLE` | Name of the table to store the logs | `"llm_proxy"` |

## 💰 Cost Calculation

Cost is calculated from the model and token usage, using configurable per-model pricing. You'll need to add the pricing for your models (in cost per million tokens) to the `cost.ts` file.

The default cost configuration in the project (with sample values from `z.ai` models) is:

```typescript
export const MODEL_COSTS: Record<string, { input: number; cached: number; output: number }> = {
  "glm-4.5-air": { input: 0.2, cached: 0.03, output: 1.1 },
  "glm-4.6": { input: 0.6, cached: 0.11, output: 2.2 },
  "default": { input: 0, cached: 0, output: 0 },
};
```

You can add more models to the `MODEL_COSTS` object to support your specific LLM providers.

## 📊 PostgreSQL Table Schema

The table is named after your `DATABASE_TABLE` setting; the schema below uses the default `llm_proxy`:

```sql
CREATE TABLE IF NOT EXISTS llm_proxy (
  timestamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  request_method VARCHAR(10) NOT NULL,
  request_path VARCHAR(255) NOT NULL,
  model VARCHAR(20) NOT NULL,
  completion_tokens INTEGER,
  prompt_tokens INTEGER,
  total_tokens INTEGER,
  cached_tokens INTEGER,
  total_cost NUMERIC,
  response_time INTEGER,
  request_body JSONB,
  response_body JSONB,
  response_status INTEGER,
  provider_url VARCHAR(500),
  client_ip INET,
  user_agent TEXT,
  request_size INTEGER,
  response_size INTEGER,
  stream BOOLEAN,
  temperature REAL,
  max_tokens INTEGER,
  request_id UUID
);

CREATE INDEX IF NOT EXISTS idx_llm_proxy_timestamp ON llm_proxy (timestamp);
CREATE INDEX IF NOT EXISTS idx_llm_proxy_request_id ON llm_proxy (request_id);
```

## 🚀 Quick Start

### Installation

```bash
npm install
```

### Configuration

Set your environment variables:

```bash
export PORT=3007
export UPSTREAM_URL="https://api.example.com/v1"
export DATABASE_URL="postgresql://user:password@localhost:5432/llm_logs"
export DATABASE_TABLE="llm_proxy"
```

### Running

```bash
# Development mode with auto-reload
npm run dev

# Production build
npm run build
npm start
```

## 💻 Usage

The proxy works with any OpenAI-compatible endpoint. Just point your client at the proxy:

```bash
curl -X POST http://localhost:3007/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

### Example Response with Cost Tracking

All responses are logged to PostgreSQL with detailed usage and cost information:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "glm-4.5-air",
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 30,
    "total_tokens": 50,
    "prompt_tokens_details": {
      "cached_tokens": 5
    }
  },
  "choices": [...]
}
```

The corresponding database entry will include:

- Token usage breakdown
- Calculated cost based on your model pricing
- Response time metrics
- Complete request/response bodies for audit purposes

## 🛡️ Security

- Bearer token authentication required
- CORS headers configured for cross-origin requests
- No sensitive data stored in logs (authentication headers are not logged)
- Input validation and error handling

## 📈 Monitoring

Monitor your API usage through the PostgreSQL logs:

- Track costs across different models (see the sample query below)
- Analyze response times and performance
- Identify usage patterns and optimize costs
- Maintain compliance with audit requirements
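As a sketch of the kind of analysis the logs support (assuming the default `llm_proxy` table name; substitute your `DATABASE_TABLE` value), the following query summarizes daily request volume, token usage, and cost per model:

```sql
-- Daily requests, tokens, cost, and average response time per model.
-- Assumes the default table name from DATABASE_TABLE ("llm_proxy").
SELECT
  model,
  date_trunc('day', "timestamp") AS day,
  COUNT(*) AS requests,
  SUM(total_tokens) AS tokens,
  SUM(total_cost) AS cost,
  ROUND(AVG(response_time)) AS avg_response_time
FROM llm_proxy
GROUP BY model, day
ORDER BY day DESC, cost DESC;
```

Adding a time filter such as `WHERE "timestamp" > now() - interval '7 days'` lets the `idx_llm_proxy_timestamp` index keep queries like this fast as the log table grows.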
### Metrics Dashboard

OpenProxy includes a lightweight Next.js dashboard for real-time metrics visualization:

```bash
cd dashboard
npm install
cp .env.example .env
# Configure DATABASE_URL in .env
npm run dev
```

The dashboard (available at `http://localhost:3008`) provides:

- **Real-time Overview**: Total requests, tokens, costs, and response times
- **Model Breakdown**: Usage statistics grouped by LLM model
- **Hourly Trends**: Visual charts showing request patterns over time
- **Recent Requests**: Detailed table of recent API calls
- **Auto-refresh**: Automatic updates every 30 seconds

See [dashboard/README.md](./dashboard/README.md) for detailed setup instructions.

## 🤝 Contributing

Feel free to submit issues and enhancement requests!

## 📄 License

This project is open source and available under the MIT License.