NorthTec AI Backend

Technical documentation for developers working on the NorthTec AI backend: an AI chat system with RAG and embedded widgets.

Overview

The backend consists of two main services:

☁️ Cloud Run (northtec-ai)

Main API that handles chat, widgets, RAG, and key management. Listens on port 8080.

https://prod.northtec.io

🔥 Firebase Functions (northtec-fb)

Firebase Auth triggers (onUserCreate) and security rules.

Tech Stack

Component	Technology
Runtime	Node.js 20 + TypeScript
Framework	Fastify 5.x
Database	Firebase Firestore
Vector DB	Pinecone
AI Model	OpenAI GPT-4.1 (Responses API)
Storage	Google Cloud Storage
Hosting	Google Cloud Run

Architecture

Request Flow

Client Request → Auth Guard → Load Manifest → Handler → OpenAI/Tools → SSE Response

Data Flow

// Data flow in a chat conversation

1. Client sends a message with the ntx-api-key header
2. apiKeyAuth validates the key and loads the manifest from Firestore
3. buildSystemPrompt() builds the prompt from the manifest rules
4. If RAG is enabled: retrieveSimilarChunks() looks up context in Pinecone
5. openai.responses.stream() generates the response
6. On a tool call: runMcpTool() executes it and returns the result
7. Response is streamed via SSE
8. Message is saved to Firestore

Quick Start

Local Setup

# Clone the repository
git clone https://github.com/Northtec-Devs/northec-ai-backend.git
cd northec-ai-backend/northtec-ai

# Install dependencies
npm install

# Configure environment variables
cp .env.example .env
# Edit .env with your keys

# Development
npm run dev

# Build
npm run build

# Type-check
npx tsc --noEmit

Deploy to Cloud Run

# Build and deploy
gcloud builds submit --tag gcr.io/northtec107/northtec-ia

gcloud run deploy northtec-ia \
  --image gcr.io/northtec107/northtec-ia \
  --region us-central1

Chat Endpoints

POST /chat/stream

Main chat endpoint. Returns a streaming response via SSE. Supports text and images.

Authentication

ntx-api-key header (required)

Request Body

{
  "message": "Hi, I need information about insurance",
  "conversationId": "conv_abc123",  // optional, created if it does not exist
  "imageUrl": "https://...",        // optional, URL or data URI
  "uid": "user_xyz"                 // optional, end-user ID
}

SSE Events

// Start
{ "type": "start", "conversationId": "conv_abc123", "clientId": "..." }

// Text deltas (streaming)
{ "type": "delta", "conversationId": "...", "text": "Hi, " }
{ "type": "delta", "conversationId": "...", "text": "how can " }

// Tool result (if a tool was executed)
{ "type": "tool_result", "name": "validate_identity", "call_id": "...", "output": {...} }

// Final response
{ "type": "final", "conversationId": "...", "assistant": "Hi, how can I help you?" }

// End
{ "type": "done", "conversationId": "..." }

// Error (if one occurs)
{ "type": "error", "conversationId": "...", "message": "internal_error" }
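On the client side, the events above could be modeled as a discriminated union and folded into a running state. This is only a sketch: it assumes each event arrives as a standard SSE `data:` line carrying one JSON object, and the names `ChatStreamEvent`, `StreamState`, `parseSseLine`, and `applyEvent` are hypothetical, not part of the backend.

```typescript
// Hypothetical client-side types mirroring the event shapes listed above.
type ChatStreamEvent =
  | { type: "start"; conversationId: string; clientId: string }
  | { type: "delta"; conversationId: string; text: string }
  | { type: "tool_result"; name: string; call_id: string; output: unknown }
  | { type: "final"; conversationId: string; assistant: string }
  | { type: "done"; conversationId: string }
  | { type: "error"; conversationId: string; message: string };

interface StreamState {
  conversationId: string | null;
  text: string;
  done: boolean;
  error: string | null;
}

const initialState: StreamState = { conversationId: null, text: "", done: false, error: null };

// Parse one "data: {...}" line; anything else (blank lines, comments) yields null.
// Assumption: the payload is a single JSON object per data line.
function parseSseLine(line: string): ChatStreamEvent | null {
  if (!line.startsWith("data:")) return null;
  return JSON.parse(line.slice(5).trim()) as ChatStreamEvent;
}

// Fold one event into the state without mutating the previous state.
function applyEvent(state: StreamState, ev: ChatStreamEvent): StreamState {
  switch (ev.type) {
    case "start":
      return { ...state, conversationId: ev.conversationId };
    case "delta":
      return { ...state, text: state.text + ev.text };
    case "final":
      // The final event carries the full assistant text, so it replaces the deltas.
      return { ...state, text: ev.assistant };
    case "done":
      return { ...state, done: true };
    case "error":
      return { ...state, error: ev.message, done: true };
    default:
      // tool_result does not change the visible text in this sketch.
      return state;
  }
}
```

A UI would typically re-render after every `delta` and treat `final` as the authoritative text.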

Handler File

src/agents/assistant/stream.ts

POST /chat/auth

Upgrades a conversation to verification level 2. Used after identity has been validated.

Request Body

{
  "conversationId": "conv_abc123",
  "level": 2
}

Handler File

src/routes/level2.route.ts

GET /health

Service health check. No authentication required.

{ "status": "ok" }

Widget Endpoints

Endpoints for the chat widget embedded in customer sites.

POST /chat/widget/init

Initializes a widget session. Returns a session token and a conversationId.

Authentication

ntx-embed-key header (required)

Request Body

{
  "visitorId": "visitor_123",  // optional, visitor identifier
  "metadata": { ... }          // optional, additional data
}

Response

{
  "ok": true,
  "sessionToken": "sess_abc123xyz",
  "conversationId": "conv_xyz789",
  "assistantName": "Sofia",
  "greeting": "Hi! How can I help you?"
}

Handler File

src/agents/widget/handleWidgetInit.ts

POST /chat/widget/stream

Streaming chat for the widget. Same as /chat/stream but with widget authentication.

Authentication

ntx-embed-key + ntx-session-token headers

GET /chat/widget/messages

Message polling. Useful for clients that do not support SSE. Includes operator messages sent in manual mode.

Authentication

ntx-session-token header

Query Params

?conversationId=conv_xyz&after=1706123456789&limit=50

Response

{
  "ok": true,
  "conversationId": "conv_xyz",
  "manualMode": false,
  "status": "active",  // "active" | "closed" | "resolved"
  "messages": [
    { "role": "user", "text": "Hi", "createdAt": "..." },
    { "role": "assistant", "text": "Hi! How can I help you?", "createdAt": "..." }
  ]
}

Handler File

src/agents/widget/handleWidgetMessages.ts

POST /chat/widget/end

Ends the widget session. Saves feedback and rating.

Request Body

{
  "conversationId": "conv_xyz",
  "reason": "resolved",
  "rating": 5,
  "feedback": "Very good service"
}

Handler File

src/agents/widget/handleWidgetEnd.ts

WhatsApp Endpoints

Endpoints for the WhatsApp Business API integration.

Webhook (Meta)

GET /webhook/whatsapp

Meta webhook verification. Called once, when the webhook is configured in Meta.

Query Parameters

hub.mode=subscribe
hub.verify_token=YOUR_VERIFY_TOKEN
hub.challenge=CHALLENGE_STRING

Response

Returns the challenge if the verify_token matches.
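The verification handshake described above can be sketched as a pure function over the query parameters. The function name `verifyWebhookChallenge` and the way the expected token is passed in are assumptions for illustration; the real handler may differ.

```typescript
// Hypothetical helper: returns the challenge to echo back to Meta,
// or null when the request should be rejected.
function verifyWebhookChallenge(
  query: Record<string, string | undefined>,
  expectedVerifyToken: string,
): string | null {
  const mode = query["hub.mode"];
  const token = query["hub.verify_token"];
  const challenge = query["hub.challenge"];

  // All three parameters must be present and the token must match exactly.
  if (mode === "subscribe" && token === expectedVerifyToken && challenge !== undefined) {
    return challenge;
  }
  return null;
}
```

A route handler would reply with the returned string as the response body, and with an error status when the result is null.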

POST /webhook/whatsapp

Receives incoming WhatsApp messages; Meta delivers them here. Always returns 200 to avoid retries.

Authentication

X-Hub-Signature-256 header (verified with appSecret)

Handler File

src/agents/whatsapp/handleWhatsAppWebhook.ts
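Meta signs each webhook delivery with an HMAC-SHA256 of the raw request body, keyed with the app secret, and sends it as `sha256=<hex>` in X-Hub-Signature-256. A minimal sketch of that check with Node's built-in crypto module follows; the function name `isValidMetaSignature` is hypothetical, and the guard in whatsappWebhook.guard.ts may be structured differently.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: verifies the X-Hub-Signature-256 header against the raw body.
// rawBody must be the exact bytes received, before any JSON parsing.
function isValidMetaSignature(
  rawBody: string | Buffer,
  signatureHeader: string | undefined,
  appSecret: string,
): boolean {
  const prefix = "sha256=";
  if (!signatureHeader || !signatureHeader.startsWith(prefix)) return false;

  const expected = createHmac("sha256", appSecret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader.slice(prefix.length), "hex");

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The constant-time comparison avoids leaking how many leading bytes of the signature matched.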

Admin Endpoints

WhatsApp key management from the admin panel.

POST /api/whatsapp-keys/create

Creates a new WhatsApp key. The full key is returned only once.

Authentication

Authorization: Bearer {firebaseIdToken}

Request Body

{
  "name": "WhatsApp Principal",
  "phoneNumberId": "123456789012345",
  "businessAccountId": "987654321098765",
  "accessToken": "EAAxxxxxxx...",
  "appSecret": "abc123...",
  "persona": {
    "role": "support",
    "tone": "friendly",
    "assistantName": "Sofia"
  },
  "rateLimit": {
    "messagesPerMinute": 30,
    "messagesPerDay": 5000,
    "messagesPerUserPerMinute": 5,
    "messagesPerUserPerHour": 50
  }
}

Response

{
  "whatsappKey": "ntx_wa_abc123...",  // Shown only once
  "keyId": "wa_xyz789",
  "last7": "abc123",
  "webhookVerifyToken": "verify_token_here",
  "webhookUrl": "https://prod.northtec.io/webhook/whatsapp"
}

GET /api/whatsapp-keys/list

Lists all WhatsApp keys for the user.

POST /api/whatsapp-keys/update

Updates an existing WhatsApp key.

POST /api/whatsapp-keys/revoke

Revokes (deactivates) a WhatsApp key.

POST /api/whatsapp-keys/test

Sends a test message to verify the configuration.

Request Body

{
  "keyId": "wa_xyz789",
  "testNumber": "50612345678"
}

Documents/RAG Endpoints

POST /indexGeneral

Indexes a document in Pinecone for RAG. Supports PDF, DOCX, TXT, etc.

Authentication

Authorization: Bearer {firebaseToken} + ntx-api-key

Request (multipart/form-data)

file: [file]
type: "documento"
productId: "prod_123"  // optional

Handler File

src/rag/docs-indexer.ts → indexGeneralHandler()

DELETE /documents

Deletes an indexed document from Pinecone and Storage.

Request Body

{
  "productId": "prod_123",
  "type": "documento"
}

Admin Endpoints

Endpoints for management from the admin panel. They require a Firebase ID Token.

API Keys

POST /api/keys/create

Creates a new API key for the authenticated user.

Authentication

Authorization: Bearer {firebaseIdToken}

Request Body

{
  "name": "Production Key",
  "allowedDomains": ["example.com", "*.example.com"],
  "persona": {
    "role": "advisor",
    "tone": "friendly",
    "assistantName": "Sofia"
  }
}

Response

{
  "ok": true,
  "apiKey": "ntx_live_abc123...",  // Shown only once
  "keyId": "key_xyz"
}

GET /api/keys/list

Lists all API keys for the user.

POST /api/keys/update

Updates an existing API key.

POST /api/keys/revoke

Revokes (deactivates) an API key.

Embed Keys

Same structure as API Keys, but under /api/embed-keys/*

Client Utilities

POST /api/client/cedula

Validates a Costa Rican national ID (cédula) using the gometa.org API.

Request Body

{ "cedula": "113030227" }

Handler File

src/routes/client.route.ts

Authentication

Authentication Types

Type	Header	Use	Guard
API Key	`ntx-api-key`	Main chat API	`apiKey.guard.ts`
Embed Key	`ntx-embed-key`	Embedded widget	`widgetSession.guard.ts`
Session Token	`ntx-session-token`	Widget session	`widgetSession.guard.ts`
WhatsApp Signature	`X-Hub-Signature-256`	WhatsApp webhook	`whatsappWebhook.guard.ts`
Firebase Token	`Authorization: Bearer`	Admin panel	`firebaseAuth.guard.ts`

API Key Flow

// apiKey.guard.ts - Validation flow

1. Read the "ntx-api-key" header
2. Hash it with SHA256
3. Look it up in Firestore: `api_keys/{hash}`
4. Check that status === "active"
5. Load the manifest: `clients/{userId}/private/manifest`
6. Validate the domain against allowedDomains
7. Attach to the request:
   - req.client = { id, tId, email, fullName }
   - req.user = { mcpId, scopes, tools, rules, rag, persona }
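Steps 2, 3, and 6 of the flow above can be illustrated with two small helpers. This is a sketch under assumptions: the hash is taken to be hex-encoded, the wildcard semantics for entries such as `*.example.com` are a guess (subdomains only, not the bare domain), and the names `apiKeyDocPath` and `matchesAllowedDomain` are hypothetical rather than the project's own functions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive the Firestore document path for an API key.
// Assumption: the document ID is the hex-encoded SHA-256 of the raw key.
function apiKeyDocPath(apiKey: string): string {
  const hash = createHash("sha256").update(apiKey).digest("hex");
  return `api_keys/${hash}`;
}

// Hypothetical helper: check a hostname against an allowedDomains list.
// Assumption: "*.example.com" matches any subdomain but not "example.com" itself.
function matchesAllowedDomain(hostname: string, allowedDomains: string[]): boolean {
  const host = hostname.toLowerCase();
  return allowedDomains.some((pattern) => {
    const p = pattern.toLowerCase();
    if (p.startsWith("*.")) {
      // Keep the leading dot so "evilexample.com" does not match "*.example.com".
      return host.endsWith(p.slice(1));
    }
    return host === p;
  });
}
```

Storing only the hash means a leaked Firestore export does not reveal usable keys, which is consistent with the key being shown only once at creation time.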

Verification Levels

Level	Description	How it is reached
0	No verification	Initial state
1	Basic verification	After the validate_identity tool
2	Identity validated	POST /chat/auth
3	Additional verification	Special processes

MCP/Tools System

Architecture

The tools system extends the assistant's capabilities with custom functions.

// Tool execution flow

1. The manifest defines the tools available to the client
2. getToolsForUser() formats the tools for OpenAI
3. OpenAI decides to call a tool
4. runMcpTool(name, args, emitProgress, ctx) executes it
5. The handler processes the call and returns a result
6. The result is sent back to OpenAI to continue
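The dispatch step (4 and 5) can be pictured as a lookup in a handler registry, gated by the tools the manifest allows. The sketch below is not the project's runMcpTool(): the registry, the `echo` tool, the error codes, and the simplified context type are all hypothetical.

```typescript
// Simplified, hypothetical context; the real McpContext carries much more.
interface ToolContext {
  conversationId: string;
}

type ToolHandler = (args: Record<string, unknown>, ctx: ToolContext) => Promise<unknown>;

type ToolResult = { ok: true; output: unknown } | { ok: false; error: string };

// Hypothetical registry with a single illustrative tool.
const toolHandlers = new Map<string, ToolHandler>([
  ["echo", async (args) => ({ echoed: args })],
]);

async function runToolSketch(
  name: string,
  args: Record<string, unknown>,
  allowedTools: string[],
  ctx: ToolContext,
): Promise<ToolResult> {
  // Only tools listed for this client may run, even if a handler exists.
  if (!allowedTools.includes(name)) return { ok: false, error: "tool_not_allowed" };

  const handler = toolHandlers.get(name);
  if (!handler) return { ok: false, error: "unknown_tool" };

  try {
    return { ok: true, output: await handler(args, ctx) };
  } catch (err) {
    // Return the failure as data so the model can react to it.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning errors as values rather than throwing keeps the conversation going: the result can still be sent back to the model as the tool output.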

Available Tools

Tool	Description	Handler
`get_context_for_query`	Looks up RAG context in Pinecone	`getContextForQuery()`
`validate_identity`	Validates a national ID (cédula) against an external API	`validateIdentity()`
`emit_policy_*`	Issues insurance policies	`emitPolicyExternal()`
`get_policies_by_personal_id`	Gets policies by national ID	`getPoliciesByPersonalId()`
`get_receipts_by_policy_number`	Gets receipts for a policy	`getReceiptsByPolicyNumber()`

McpContext

interface McpContext {
  firestore: Firestore;
  conversationId: string;
  conversation: {
    id: string;
    exists: boolean;
    verificationLevel: 0 | 1 | 2 | 3;
    identityValidated: boolean;
    manualMode: boolean;
    operatorUid?: string;
    status: string;
  };
  user: {
    mcpId: string;
    scopes: string[];
    tools: Tool[];
    rules: { must: string[]; mustNot: string[] };
    rag: { enabled: boolean; namespace: string; ... };
    persona?: { role: string; tone: string; assistantName: string };
  };
  manifest: object;
}

Main Files

src/mcp/index.ts - getToolsForUser(), runMcpTool()
src/mcp/tools.handlers.ts - Handler implementations
src/mcp/types.ts - Types and interfaces

RAG System

Indexing Flow

Upload File → Extract Text → Chunk → Embed → Pinecone
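To make the Chunk step concrete, here is a minimal fixed-size splitter with overlap. It is a stand-in for illustration only: the actual algorithm in textSplitter.ts is not described in this document, and the function name and parameters below are assumptions.

```typescript
// Hypothetical character-based splitter; real splitters often cut on
// sentence or paragraph boundaries instead.
function splitIntoChunks(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlap;

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a chunk reaches the end, so no tail-only duplicate is emitted.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap repeats the end of one chunk at the start of the next, which helps retrieval when a relevant sentence straddles a chunk boundary.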

Query Flow

User Query → Embed Query → Pinecone Search → Top K Chunks → Context

RAG Configuration in the Manifest

{
  "rag": {
    "enabled": true,
    "namespace": "client_abc123",
    "pineconeIndex": "production",
    "topK": 5,
    "minScore": 0.3
  }
}
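One plausible reading of `topK` and `minScore` is: drop matches below the score threshold, keep the best `topK`, and join their text into the context. The sketch below implements that reading; the `ScoredChunk` shape, the function name `selectContext`, and the blank-line separator are assumptions, and the real retrieval code may pass `topK` to Pinecone directly.

```typescript
interface RagConfig {
  enabled: boolean;
  namespace: string;
  pineconeIndex: string;
  topK: number;
  minScore: number;
}

// Hypothetical shape of a search match after the vector query.
interface ScoredChunk {
  id: string;
  score: number;
  text: string;
}

// Build the context string from raw matches according to the manifest's RAG settings.
function selectContext(matches: ScoredChunk[], rag: RagConfig): string {
  if (!rag.enabled) return "";
  return [...matches] // copy so the caller's array is not reordered
    .filter((m) => m.score >= rag.minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, rag.topK)
    .map((m) => m.text)
    .join("\n\n");
}
```

With this reading, a `minScore` that is too high yields an empty context even when Pinecone returns matches.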

Main Files

src/rag/docs-indexer.ts - Indexing handlers
src/rag/utils/extractAndChunk.ts - Extraction and chunking
src/rag/utils/textSplitter.ts - Splitting algorithm
src/integrations/pinecone.ts - Pinecone client
src/integrations/embeddingService.ts - Embedding generation

Rate Limiting

Overview

The rate limiting system protects against abuse and helps keep the service available. It is implemented in src/security/widgetRateLimit.ts.

Per-Embed-Key Configuration

// Firestore structure: embed_keys/{hash}
{
  "rateLimit": {
    "requestsPerMinute": 20,   // Default: 20
    "requestsPerDay": 1000     // Default: 1000
  }
}

Rate Limit Store (In-Memory)

interface RateLimitEntry {
  minuteCount: number;           // Requests in the current window
  minuteWindowStart: number;     // Window start timestamp
  dayCount: number;              // Requests for the day
  dayWindowStart: number;
  lastRequest: number;
  blocked: boolean;              // Whether the key is blocked
  blockUntil: number;            // Blocked until this timestamp
  // Abuse detection
  newConversationsInWindow: number;  // New conversations in 5 min
  conversationWindowStart: number;
  abuseStrikes: number;              // Accumulated strikes
}

Constants

Constant	Value	Description
`DEFAULT_LIMIT_PER_MINUTE`	20	Default requests per minute
`DEFAULT_LIMIT_PER_DAY`	1000	Default requests per day
`BLOCK_DURATION_MS`	5 min	Duration of the temporary block
`ABUSE_WINDOW_MS`	5 min	Window used to detect abuse
`MAX_NEW_CONVERSATIONS_PER_WINDOW`	50	Maximum new conversations per window
`MAX_ABUSE_STRIKES`	3	Strikes before the key is paused

Rate Limiting Flow

// widgetRateLimit() - Middleware for /chat/widget/*

1. Extract ntx-embed-key from the header
2. Hash it with SHA256
3. Fetch the limits from Firestore (cached for 5 min)
4. Check whether the key is blocked:
   - If blocked && now < blockUntil → 429
5. Check the per-minute limit:
   - If minuteCount > limit → block for 5 min → 429
6. Check the daily limit:
   - If dayCount > limit → block until reset → 429
7. Increment the counters and continue
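Steps 4 to 7 amount to a fixed-window counter per key. The sketch below shows one way to write it; it is not the code in widgetRateLimit.ts. It checks the counters before incrementing (so `>=` here admits the same number of requests as an increment-then-compare with `>`), takes `now` as a parameter for testability, and uses hypothetical reason strings.

```typescript
const MINUTE_MS = 60_000;
const DAY_MS = 24 * 60 * MINUTE_MS;
const BLOCK_DURATION_MS = 5 * MINUTE_MS;

interface Limits {
  requestsPerMinute: number;
  requestsPerDay: number;
}

// Subset of the RateLimitEntry fields needed for this sketch.
interface WindowEntry {
  minuteCount: number;
  minuteWindowStart: number;
  dayCount: number;
  dayWindowStart: number;
  blocked: boolean;
  blockUntil: number;
}

const rateLimitStore = new Map<string, WindowEntry>();

function checkRateLimit(keyHash: string, limits: Limits, now: number): { allowed: boolean; reason?: string } {
  let e = rateLimitStore.get(keyHash);
  if (!e) {
    e = { minuteCount: 0, minuteWindowStart: now, dayCount: 0, dayWindowStart: now, blocked: false, blockUntil: 0 };
    rateLimitStore.set(keyHash, e);
  }

  // Step 4: an active block short-circuits everything else.
  if (e.blocked && now < e.blockUntil) return { allowed: false, reason: "blocked" };
  e.blocked = false;

  // Roll the windows over when they have expired.
  if (now - e.minuteWindowStart >= MINUTE_MS) { e.minuteCount = 0; e.minuteWindowStart = now; }
  if (now - e.dayWindowStart >= DAY_MS) { e.dayCount = 0; e.dayWindowStart = now; }

  // Step 5: per-minute limit, temporary block.
  if (e.minuteCount >= limits.requestsPerMinute) {
    e.blocked = true;
    e.blockUntil = now + BLOCK_DURATION_MS;
    return { allowed: false, reason: "minute_limit" };
  }
  // Step 6: daily limit, blocked until the day window resets.
  if (e.dayCount >= limits.requestsPerDay) {
    e.blocked = true;
    e.blockUntil = e.dayWindowStart + DAY_MS;
    return { allowed: false, reason: "day_limit" };
  }

  // Step 7: count the request and let it through.
  e.minuteCount += 1;
  e.dayCount += 1;
  return { allowed: true };
}
```

Because the store lives in process memory, each Cloud Run instance would keep its own counters; limits are therefore approximate when several instances serve the same key.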

Abuse Detection

// recordNewConversation() - Called from handleWidgetInit

1. Check the 5-minute window
2. Increment newConversationsInWindow
3. If > 50 new conversations:
   a. Increment abuseStrikes
   b. If abuseStrikes >= 3:
      - Pause the embed key in Firestore
      - Create an alert for the admin
      - Return { blocked: true, reason: "embed_key_paused_abuse" }
   c. Otherwise:
      - Block temporarily for 5 min
      - Return { blocked: true, reason: "too_many_new_conversations" }
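The strike logic above can be sketched as a small state machine. This version only models the counters and the returned reason: the Firestore update, the admin alert, and the 5-minute temporary block are left to the caller, and the function name `recordNewConversationSketch` and the explicit `now` parameter are assumptions made for illustration.

```typescript
const ABUSE_WINDOW_MS = 5 * 60_000;
const MAX_NEW_CONVERSATIONS_PER_WINDOW = 50;
const MAX_ABUSE_STRIKES = 3;

// Subset of RateLimitEntry used for abuse detection.
interface AbuseState {
  newConversationsInWindow: number;
  conversationWindowStart: number;
  abuseStrikes: number;
}

type AbuseResult =
  | { blocked: false }
  | { blocked: true; reason: "embed_key_paused_abuse" | "too_many_new_conversations" };

function recordNewConversationSketch(state: AbuseState, now: number): AbuseResult {
  // Step 1: start a fresh window when the previous one has expired.
  if (now - state.conversationWindowStart >= ABUSE_WINDOW_MS) {
    state.conversationWindowStart = now;
    state.newConversationsInWindow = 0;
  }

  // Step 2: count this conversation.
  state.newConversationsInWindow += 1;
  if (state.newConversationsInWindow <= MAX_NEW_CONVERSATIONS_PER_WINDOW) {
    return { blocked: false };
  }

  // Step 3: over the threshold, so record a strike.
  state.abuseStrikes += 1;
  if (state.abuseStrikes >= MAX_ABUSE_STRIKES) {
    // The real flow would pause the key in Firestore and create an admin alert here.
    return { blocked: true, reason: "embed_key_paused_abuse" };
  }
  // The real flow would also apply the 5-minute temporary block here.
  return { blocked: true, reason: "too_many_new_conversations" };
}
```

Strikes are never reset in this sketch; whether the real implementation decays or resets them is not stated in this document.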

Pausing a Key for Abuse

// pauseEmbedKeyForAbuse() - Automatic actions

1. Update embed_keys/{hash}:
   - status: "paused"
   - pausedAt: Date
   - pauseReason: "Demasiadas conversaciones..."

2. Update clients/{userId}/private/embedInfo/embedkeys/{keyId}:
   - Same fields

3. Create an alert in clients/{userId}/alerts:
   {
     type: "embed_key_abuse",
     severity: "high",
     title: "Embed Key pausada por abuso",
     embedKeyId: keyId,
     stats: { newConversationsIn5Min, abuseStrikes },
     read: false
   }

4. Structured log: slog.event("widget.rate_limit.exceeded", {...})

Embed Key States

State	Description	How to resolve
`active`	Working normally	-
`paused`	Paused automatically due to abuse	Set status back to "active" from the admin panel
`revoked`	Permanently revoked	Create a new key

Rate Limiting Logs

// Logging events - Widget
widget.rate_limit.blocked     // Request blocked
widget.rate_limit.exceeded    // Limit exceeded (includes pause)
widget.rate_limit.suspicious  // Suspicious pattern detected

// Logging events - WhatsApp
whatsapp.rate_limit.blocked     // Request blocked
whatsapp.rate_limit.exceeded    // Global limit exceeded
whatsapp.rate_limit.user_exceeded // Per-user limit exceeded

// Search the Cloud Run logs:
gcloud run services logs read northtec-ia --region us-central1 | grep "rate_limit"

Main Files

src/security/widgetRateLimit.ts - Widget rate limiting and abuse detection
src/security/whatsappRateLimit.ts - WhatsApp rate limiting (global + per user)
src/agents/widget/handleWidgetInit.ts - Calls recordNewConversation()
src/routes/widget.route.ts - preHandler: [widgetRateLimit]

WhatsApp Rate Limiting

WhatsApp rate limiting works at two levels: globally per key, and per user (phoneNumber).

Per-WhatsApp-Key Configuration

// Firestore structure: whatsapp_keys/{hash}
{
  "rateLimit": {
    "messagesPerMinute": 30,           // Global per key
    "messagesPerDay": 5000,            // Global per key
    "messagesPerUserPerMinute": 5,     // Per phoneNumber
    "messagesPerUserPerHour": 50       // Per phoneNumber
  }
}

WhatsApp Rate Limiting Flow

// checkWhatsAppRateLimit() - In handleWhatsAppWebhook

1. Check whether the key is blocked globally
2. Check the global per-minute limit (30/min)
3. Check the global daily limit (5000/day)
4. Check the per-user per-minute limit (5/min)
5. Check the per-user hourly limit (50/hour)
6. If any limit is exceeded:
   - Return { allowed: false, reason: "..." }
7. If abuse is detected (3 strikes):
   - Pause the WhatsApp key automatically
   - Create an alert for the admin

WhatsApp Constants

Constant	Value	Description
`DEFAULT_MESSAGES_PER_MINUTE`	30	Global messages per minute
`DEFAULT_MESSAGES_PER_DAY`	5000	Global messages per day
`DEFAULT_USER_MESSAGES_PER_MINUTE`	5	Messages per user per minute
`DEFAULT_USER_MESSAGES_PER_HOUR`	50	Messages per user per hour
`MAX_ABUSE_STRIKES`	3	Strikes before the key is paused

Logging System

Structured Logger

The system uses structured logging with categories and typed events.

import { slog } from "./config/logger.js";

// Chat request log
slog.chatRequest({
  clientId: "client_123",
  conversationId: "conv_abc",
  source: "widget",
  platform: "web",
  role: "advisor",
  ragEnabled: true,
});

// Identity validation log
slog.identityValidation({
  conversationId: "conv_abc",
  idType: "cedula",
  idNumber: "113030227",
  endpoint: "primary",
  success: true,
  httpStatus: 200,
});

// RAG query log
slog.ragQuery({
  conversationId: "conv_abc",
  namespace: "client_ns",
  chunksCount: 5,
  contextLength: 2500,
  success: true,
  durationMs: 234,
});

Categories

Category	Events
CHAT	chat.request, chat.response, chat.error, chat.manual_mode
IDENTITY	identity.validate.primary_ok, identity.validate.primary_fail, etc.
RAG	rag.query.start, rag.query.success, rag.query.error
TOOL	tool.invoke.start, tool.invoke.success, tool.invoke.error
AUTH	auth.token.valid, auth.token.invalid, etc.
WIDGET	widget.init.success, widget.init.error, widget.rate_limit.*

💡 Full Reference

See logs-guide.html for the complete guide to searching logs.

Folder Structure

northtec-ai/src/
├── agents/
│   ├── assistant/
│   │   ├── stream.ts              # Main chat handler
│   │   └── buildSystemPrompt.ts   # System prompt builder
│   ├── widget/
│   │   ├── handleWidgetInit.ts    # Start widget session
│   │   ├── handleWidgetEnd.ts     # End session
│   │   └── handleWidgetMessages.ts # Message polling
│   ├── whatsapp/
│   │   └── handleWhatsAppWebhook.ts # Process WhatsApp messages
│   └── utils/
│       └── locks.ts               # Mutex for race conditions
│
├── routes/
│   ├── chat.route.ts             # POST /chat/stream
│   ├── widget.route.ts           # /chat/widget/*
│   ├── whatsapp.route.ts         # /webhook/whatsapp
│   ├── whatsappKeys.route.ts     # /api/whatsapp-keys/*
│   ├── docs.route.ts             # /indexGeneral, /documents
│   ├── level2.route.ts           # POST /chat/auth
│   ├── apiKeys.route.ts          # /api/keys/*
│   ├── embedKeys.route.ts        # /api/embed-keys/*
│   └── client.route.ts           # /api/client/cedula
│
├── security/
│   ├── apiKey.guard.ts           # Validates ntx-api-key
│   ├── widgetSession.guard.ts    # Validates widget sessions
│   ├── widgetRateLimit.ts        # Widget rate limiting
│   ├── whatsappWebhook.guard.ts  # Verifies the Meta signature
│   ├── whatsappRateLimit.ts      # WhatsApp rate limiting
│   ├── firebaseAuth.guard.ts     # Validates Firebase tokens
│   ├── docsAuth.guard.ts         # Auth for docs
│   └── planLimits.guard.ts       # Plan limits
│
├── mcp/
│   ├── index.ts                  # getToolsForUser(), runMcpTool()
│   ├── tools.handlers.ts         # Tool implementations
│   └── types.ts                  # McpContext, interfaces
│
├── rag/
│   ├── docs-indexer.ts           # Document indexing
│   ├── services/
│   │   └── gemini.ts             # Gemini integration
│   └── utils/
│       ├── extractAndChunk.ts    # Text extraction
│       └── textSplitter.ts       # Chunking
│
├── integrations/
│   ├── openai.client.ts          # OpenAI client
│   ├── pinecone.ts               # Pinecone client
│   ├── embeddingService.ts       # Embedding service
│   └── whatsapp.client.ts        # WhatsApp API client
│
├── config/
│   ├── firebase.ts               # Firebase initialization
│   ├── logger.ts                 # Pino + StructuredLogger
│   └── toolLogger.ts             # Tool logging
│
├── domain/
│   ├── domainAllowed.ts          # Domain validation
│   └── loadConversationContext.ts # Loads conversation context
│
├── app.ts                        # Fastify setup, routes, CORS
└── main.ts                       # Entry point (port 8080)

Security Guards

apiKey.guard.ts

Validates API keys for the main chat endpoint.

// Fields added to the request
req.client = {
  id: string,      // Client ID
  tId: string,     // Tenant ID
  email: string,
  fullName: string
};

req.user = {
  mcpId: string,
  scopes: string[],  // ["ntx.chat", "ntx.docs", ...]
  tools: Tool[],
  rules: { must: string[]; mustNot: string[] },
  rag: RagConfig,
  persona: PersonaConfig
};

widgetSession.guard.ts

Chained validation for widget sessions.

// Validation flow
1. Validate ntx-embed-key against embed_keys/{hash}
2. Validate ntx-session-token against the conversation
3. Check session expiration
4. Check that the embed key and the session match
5. Validate the origin domain

firebaseAuth.guard.ts

Validates Firebase ID Tokens for admin endpoints.

// Fields added to the request
req.firebaseUser = {
  uid: string,
  email: string,
  name: string
};

planLimits.guard.ts

Validates plan limits before resources are created.

// Available functions
canCreateDocument(uid, fileSizeBytes, fileExtension)
canCreateProduct(uid)
canInviteUser(uid)
getUsageStats(uid)
getTenantPlan(uid)

Main Handlers

stream.ts - handleStream()

Main chat handler with SSE streaming.

// Main flow
async function handleStream(req, reply) {
  // 1. Normalize the request
  const { message, imageUrl, conversationId } = normalizeRequest(req.body);

  // 2. Lock to avoid race conditions
  await withConversationLock(conversationId, async () => {

    // 3. Build the system prompt
    const systemPrompt = buildSystemPrompt(user, user.persona);

    // 4. Load the conversation context
    const conversation = await loadConversationContext({...});

    // 5. Run the chat with tools
    const { text } = await runChatWithToolsStream({
      openai,
      messages: userInput,
      sendJSON,
      system: systemPrompt,
      toolsForUser,
      ctx: mcpCtx,
    });

    // 6. Save the message
    await conversationRef.collection("messages").add({...});
  });
}
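The locking in step 2 serializes work per conversation so that two concurrent requests cannot interleave their reads and writes. One common way to build such a lock in a single Node.js process is a per-key promise chain, sketched below. This is an illustration of the technique only; the implementation in locks.ts is not shown in this document, and `withKeyLock` is a hypothetical name.

```typescript
// Tail of the pending-work chain for each key. An entry exists only while
// something holds or waits for that key's lock.
const lockTails = new Map<string, Promise<void>>();

async function withKeyLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const previous = lockTails.get(key) ?? Promise.resolve();

  // `current` resolves when this holder releases the lock.
  let release!: () => void;
  const current = new Promise<void>((resolve) => {
    release = resolve;
  });

  // The next caller waits for everything before it, including this holder.
  const tail = previous.then(() => current);
  lockTails.set(key, tail);

  // Wait for our turn. `previous` never rejects because holders only ever resolve.
  await previous;
  try {
    return await fn();
  } finally {
    release();
    // Clean up when nobody queued behind us, so the map does not grow unbounded.
    if (lockTails.get(key) === tail) lockTails.delete(key);
  }
}
```

An in-memory lock like this only coordinates requests handled by the same process; requests for one conversation that land on different Cloud Run instances would need a shared mechanism such as a Firestore transaction.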

handleWidgetInit.ts

// Widget initialization flow
async function handleWidgetInit(req, reply) {
  // 1. Validate the embed key
  const embedKeyDoc = await db.collection("embed_keys").doc(hash).get();

  // 2. Validate the domain
  if (!isDomainAllowed(origin, embedKey.allowedDomains)) { ... }

  // 3. Generate the session token
  const sessionToken = `sess_${crypto.randomUUID()}`;

  // 4. Create the conversation
  await conversationRef.set({
    session: { token: sessionToken, expiresAt, embedKeyId },
    ...
  });

  // 5. Return the credentials
  return { sessionToken, conversationId, assistantName, greeting };
}

Deployment

Cloud Run Deploy

# Build image
gcloud builds submit --tag gcr.io/northtec107/northtec-ia

# Deploy
gcloud run deploy northtec-ia \
  --image gcr.io/northtec107/northtec-ia \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 1Gi \
  --cpu 1 \
  --timeout 300 \
  --concurrency 80

Firebase Functions Deploy

# Deploy functions only
cd northtec-fb
firebase deploy --only functions

# Deploy a specific function
firebase deploy --only functions:onUserCreate

⚠️ Important

Always verify that the build compiles before deploying:

npm run build

Environment Variables

Cloud Run

Variable	Description	Required
`OPENAI_API_KEY`	OpenAI API key	✅
`PINECONE_API_KEY`	Pinecone API key	✅
`GOOGLE_CLOUD_PROJECT`	GCP project ID	✅
`CLIENTS_API_KEY`	Key for the external clients API	✅
`LOG_LEVEL`	Logging level (info, debug)	❌
`NODE_ENV`	production / development	❌

Configuring Secrets in Cloud Run

# Create the secret
gcloud secrets create OPENAI_API_KEY --data-file=- <<< "sk-..."

# Grant access to the service account
gcloud secrets add-iam-policy-binding OPENAI_API_KEY \
  --member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"

# Deploy with the secret
gcloud run deploy northtec-ia \
  --set-secrets="OPENAI_API_KEY=OPENAI_API_KEY:latest"

Troubleshooting

Common Errors

❌ Container failed to start on PORT 8080

Cause: An error in the code prevents the server from starting.

Solution:

# Check the startup logs
gcloud run services logs read northtec-ia --region us-central1 --limit 50

# Test locally
npm run build && npm start

❌ 401 Unauthorized - Invalid API key

Cause: API key not found or revoked.

Check:

# In Firestore, verify that this document exists:
api_keys/{SHA256_HASH_OF_KEY}
  status: "active"
  userId: "..."

❌ RAG returns no context

Check:

The manifest has rag.enabled: true
The namespace in Pinecone is correct
minScore is not too high (default: 0.3)

# Search the RAG logs
gcloud run services logs read northtec-ia --region us-central1 | grep "rag.query"

❌ Identity validation fails

Check the logs:

gcloud run services logs read northtec-ia --region us-central1 | grep "IDENTITY"

The logs will show:

identity.validate.primary_ok/fail - Result from the primary endpoint
identity.validate.failover_ok/fail - Result from the failover endpoint
rawResponse - Full response from the API

Useful Commands

# Tail logs in real time
gcloud beta run services logs tail northtec-ia --region us-central1

# Show recent errors
gcloud run services logs read northtec-ia --region us-central1 --limit 100 | grep -i error

# Show service status
gcloud run services describe northtec-ia --region us-central1

# Roll back to a previous revision
gcloud run services update-traffic northtec-ia \
  --region us-central1 \
  --to-revisions=northtec-ia-00042-abc=100