Monitoring Proxy Performance: Latency, Success Rate, and Alerts

Learn how to instrument, monitor, and alert on proxy performance: track latency percentiles, success rates, error patterns, and bandwidth. Code examples in Python, Node.js, and Go.


Why Monitor Proxy Performance

Proxy infrastructure fails silently. Your scraper could run for hours at a 40% success rate before anyone notices. Response times creep up, block rates climb, and data quality degrades, all without triggering obvious errors. Monitoring turns these invisible problems into actionable alerts.

This guide shows you how to instrument your proxy requests, collect meaningful metrics, build dashboards, and set up alerting that catches degradation before it affects your data pipeline. All examples use ProxyHat residential proxies and are production-ready.

If you are not measuring your proxy performance, you are guessing. Guessing at scale costs money and produces unreliable data.

Key Metrics to Track

| Metric | What it tells you | Alert threshold |
|---|---|---|
| Success rate | Percentage of requests returning a 2xx status | Below 90% |
| Response latency (p50/p95/p99) | How quickly proxied requests complete | p95 above 10s |
| Error rate by type | Which errors dominate (timeout, 403, 429, connection) | Any single type above 15% |
| Requests per second | Throughput of your scraping pipeline | Below the expected baseline |
| Bandwidth usage | Data transferred through the proxy | Approaching the plan limit |
| Block rate by target | Which targets block you most | Above 20% for any target |
| Retry rate | How many requests need a retry | Above 30% |
| Session reuse efficiency | How long sticky sessions survive | Fewer than 5 requests on average |
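
As an illustration of the per-target block rate in the table, here is a minimal sketch. It assumes that HTTP 403 and 429 count as "blocked", and the helper names `block_rate_by_target` and `targets_over_threshold` are hypothetical, chosen for this example only.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumption: these status codes are treated as a block by the target.
BLOCK_STATUSES = {403, 429}


def block_rate_by_target(results):
    """Return {domain: block rate in percent} from (url, status_code) pairs."""
    totals, blocked = Counter(), Counter()
    for url, status in results:
        domain = urlparse(url).netloc
        totals[domain] += 1
        if status in BLOCK_STATUSES:
            blocked[domain] += 1
    return {domain: blocked[domain] / totals[domain] * 100 for domain in totals}


def targets_over_threshold(rates, threshold=20.0):
    """Return the domains whose block rate exceeds the threshold, sorted by name."""
    return sorted(domain for domain, rate in rates.items() if rate > threshold)


sample = [
    ("https://shop.example/a", 200),
    ("https://shop.example/b", 403),
    ("https://shop.example/c", 200),
    ("https://shop.example/d", 200),
    ("https://news.example/x", 200),
    ("https://news.example/y", 200),
]
rates = block_rate_by_target(sample)
print(rates)
print(targets_over_threshold(rates))
```

Grouping by domain keeps one heavily blocked target from hiding behind a healthy overall success rate.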

Python: Instrumented Proxy Client

This client wraps every request with timing, status tracking, and structured logging.

import time
import uuid
import logging
import statistics
from dataclasses import dataclass, field
from collections import defaultdict
from typing import Optional
import requests
logger = logging.getLogger("proxy_monitor")
@dataclass
class ProxyMetrics:
    """Collects and aggregates proxy performance metrics."""
    total_requests: int = 0
    successful: int = 0
    failed: int = 0
    retries: int = 0
    latencies: list = field(default_factory=list)
    status_codes: dict = field(default_factory=lambda: defaultdict(int))
    errors_by_type: dict = field(default_factory=lambda: defaultdict(int))
    bytes_transferred: int = 0
    requests_by_target: dict = field(default_factory=lambda: defaultdict(lambda: {"success": 0, "failed": 0}))
    @property
    def success_rate(self) -> float:
        return (self.successful / self.total_requests * 100) if self.total_requests > 0 else 0.0
    @property
    def p50_latency(self) -> float:
        return statistics.median(self.latencies) if self.latencies else 0.0
    @property
    def p95_latency(self) -> float:
        if not self.latencies:
            return 0.0
        sorted_lat = sorted(self.latencies)
        idx = int(len(sorted_lat) * 0.95)
        return sorted_lat[min(idx, len(sorted_lat) - 1)]
    @property
    def p99_latency(self) -> float:
        if not self.latencies:
            return 0.0
        sorted_lat = sorted(self.latencies)
        idx = int(len(sorted_lat) * 0.99)
        return sorted_lat[min(idx, len(sorted_lat) - 1)]
    def summary(self) -> dict:
        return {
            "total_requests": self.total_requests,
            "success_rate": f"{self.success_rate:.1f}%",
            "p50_latency": f"{self.p50_latency:.3f}s",
            "p95_latency": f"{self.p95_latency:.3f}s",
            "p99_latency": f"{self.p99_latency:.3f}s",
            "retries": self.retries,
            "bytes_transferred": self.bytes_transferred,
            "top_errors": dict(sorted(
                self.errors_by_type.items(),
                key=lambda x: x[1], reverse=True
            )[:5]),
            "status_distribution": dict(self.status_codes),
        }
class MonitoredProxyClient:
    """HTTP client with built-in proxy monitoring."""
    def __init__(self, max_retries: int = 3):
        self.metrics = ProxyMetrics()
        self.max_retries = max_retries
        self._alert_callbacks = []
    def on_alert(self, callback):
        """Register a callback for metric alerts."""
        self._alert_callbacks.append(callback)
    def _check_alerts(self):
        if self.metrics.total_requests < 10:
            return
        alerts = []
        if self.metrics.success_rate < 90:
            alerts.append(f"Low success rate: {self.metrics.success_rate:.1f}%")
        if self.metrics.p95_latency > 10:
            alerts.append(f"High p95 latency: {self.metrics.p95_latency:.1f}s")
        if self.metrics.retries / max(self.metrics.total_requests, 1) > 0.3:
            alerts.append(f"High retry rate: {self.metrics.retries}/{self.metrics.total_requests}")
        for alert in alerts:
            logger.warning(f"ALERT: {alert}")
            for cb in self._alert_callbacks:
                cb(alert)
    def fetch(self, url: str, country: Optional[str] = None) -> Optional[requests.Response]:
        from urllib.parse import urlparse
        target_domain = urlparse(url).netloc
        for attempt in range(self.max_retries + 1):
            session_id = uuid.uuid4().hex[:8]
            username = f"USERNAME-session-{session_id}"
            if country:
                username += f"-country-{country}"
            proxy = f"http://{username}:PASSWORD@gate.proxyhat.com:8080"
            self.metrics.total_requests += 1
            if attempt > 0:
                self.metrics.retries += 1
            start = time.time()
            try:
                response = requests.get(
                    url,
                    proxies={"http": proxy, "https": proxy},
                    timeout=30,
                )
                latency = time.time() - start
                self.metrics.latencies.append(latency)
                self.metrics.status_codes[response.status_code] += 1
                if response.status_code >= 400:
                    self.metrics.errors_by_type[f"HTTP_{response.status_code}"] += 1
                    self.metrics.requests_by_target[target_domain]["failed"] += 1
                    if response.status_code in (403, 429, 503) and attempt < self.max_retries:
                        time.sleep(2 ** attempt)
                        continue
                    self.metrics.failed += 1
                else:
                    self.metrics.successful += 1
                    self.metrics.bytes_transferred += len(response.content)
                    self.metrics.requests_by_target[target_domain]["success"] += 1
                self._check_alerts()
                return response
            except requests.exceptions.Timeout:
                self.metrics.errors_by_type["timeout"] += 1
                self.metrics.latencies.append(time.time() - start)
                self.metrics.requests_by_target[target_domain]["failed"] += 1
            except requests.exceptions.ConnectionError:
                self.metrics.errors_by_type["connection_error"] += 1
                self.metrics.latencies.append(time.time() - start)
                self.metrics.requests_by_target[target_domain]["failed"] += 1
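            except Exception as e:
                # Count unexpected errors against the target too, like the cases above.
                self.metrics.errors_by_type[type(e).__name__] += 1
                self.metrics.latencies.append(time.time() - start)
                self.metrics.requests_by_target[target_domain]["failed"] += 1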
            if attempt < self.max_retries:
                time.sleep(2 ** attempt)
        self.metrics.failed += 1
        self._check_alerts()
        return None
# Usage
client = MonitoredProxyClient(max_retries=3)
client.on_alert(lambda msg: print(f"[ALERT] {msg}"))
urls = [f"https://example.com/product/{i}" for i in range(100)]
for url in urls:
    response = client.fetch(url)
print(client.metrics.summary())

Node.js: Instrumented Proxy Client

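const crypto = require('crypto');
// Assumes node-fetch v2: Node's built-in fetch ignores the `agent` option.
const fetch = require('node-fetch');
const { HttpsProxyAgent } = require('https-proxy-agent');
const { EventEmitter } = require('events');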
class ProxyMetrics {
  constructor() {
    this.totalRequests = 0;
    this.successful = 0;
    this.failed = 0;
    this.retries = 0;
    this.latencies = [];
    this.statusCodes = {};
    this.errorsByType = {};
    this.bytesTransferred = 0;
    this.requestsByTarget = {};
  }
  get successRate() {
    return this.totalRequests > 0
      ? ((this.successful / this.totalRequests) * 100).toFixed(1)
      : '0.0';
  }
  percentile(p) {
    if (this.latencies.length === 0) return 0;
    const sorted = [...this.latencies].sort((a, b) => a - b);
    const idx = Math.min(
      Math.floor(sorted.length * (p / 100)),
      sorted.length - 1
    );
    return sorted[idx];
  }
  summary() {
    return {
      totalRequests: this.totalRequests,
      successRate: `${this.successRate}%`,
      p50Latency: `${this.percentile(50).toFixed(3)}s`,
      p95Latency: `${this.percentile(95).toFixed(3)}s`,
      p99Latency: `${this.percentile(99).toFixed(3)}s`,
      retries: this.retries,
      bytesTransferred: this.bytesTransferred,
      statusDistribution: { ...this.statusCodes },
      topErrors: Object.entries(this.errorsByType)
        .sort(([, a], [, b]) => b - a)
        .slice(0, 5)
        .reduce((obj, [k, v]) => ({ ...obj, [k]: v }), {}),
    };
  }
}
class MonitoredProxyClient extends EventEmitter {
  constructor({ maxRetries = 3 } = {}) {
    super();
    this.metrics = new ProxyMetrics();
    this.maxRetries = maxRetries;
  }
  _checkAlerts() {
    if (this.metrics.totalRequests < 10) return;
    if (parseFloat(this.metrics.successRate) < 90) {
      this.emit('alert', `Low success rate: ${this.metrics.successRate}%`);
    }
    if (this.metrics.percentile(95) > 10) {
      this.emit('alert', `High p95 latency: ${this.metrics.percentile(95).toFixed(1)}s`);
    }
  }
  async fetch(url, { country } = {}) {
    const targetDomain = new URL(url).hostname;
    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
      const sessionId = crypto.randomBytes(4).toString('hex');
      let username = `USERNAME-session-${sessionId}`;
      if (country) username += `-country-${country}`;
      const agent = new HttpsProxyAgent(
        `http://${username}:PASSWORD@gate.proxyhat.com:8080`
      );
      this.metrics.totalRequests++;
      if (attempt > 0) this.metrics.retries++;
      const startTime = Date.now();
      try {
        const response = await fetch(url, {
          agent,
          signal: AbortSignal.timeout(30000),
        });
        const latency = (Date.now() - startTime) / 1000;
        this.metrics.latencies.push(latency);
        this.metrics.statusCodes[response.status] =
          (this.metrics.statusCodes[response.status] || 0) + 1;
        if (response.status >= 400) {
          this.metrics.errorsByType[`HTTP_${response.status}`] =
            (this.metrics.errorsByType[`HTTP_${response.status}`] || 0) + 1;
          if ([403, 429, 503].includes(response.status) && attempt < this.maxRetries) {
            await new Promise(r => setTimeout(r, 1000 * Math.pow(2, attempt)));
            continue;
          }
          this.metrics.failed++;
        } else {
          this.metrics.successful++;
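          // Reading the body consumes the stream, so keep the text for callers.
          // `bodyText` is an ad hoc property, not part of the fetch API.
          const body = await response.text();
          this.metrics.bytesTransferred += Buffer.byteLength(body);
          response.bodyText = body;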
        }
        this._checkAlerts();
        return response;
      } catch (err) {
        const latency = (Date.now() - startTime) / 1000;
        this.metrics.latencies.push(latency);
        this.metrics.errorsByType[err.name] =
          (this.metrics.errorsByType[err.name] || 0) + 1;
        if (attempt < this.maxRetries) {
          await new Promise(r => setTimeout(r, 1000 * Math.pow(2, attempt)));
          continue;
        }
        this.metrics.failed++;
      }
    }
    this._checkAlerts();
    return null;
  }
}
// Usage
const client = new MonitoredProxyClient({ maxRetries: 3 });
client.on('alert', msg => console.warn(`[ALERT] ${msg}`));
const urls = Array.from({ length: 100 }, (_, i) =>
  `https://example.com/product/${i + 1}`
);
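// Top-level await is unavailable in CommonJS modules, so run inside an async function.
(async () => {
  for (const url of urls) {
    await client.fetch(url);
  }
  console.log(client.metrics.summary());
})();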

Go: Instrumented Proxy Client

package main
import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"io"
	"math"
	"net/http"
	"net/url"
	"sort"
	"sync"
	"time"
)
type Metrics struct {
	mu               sync.Mutex
	TotalRequests    int
	Successful       int
	Failed           int
	Retries          int
	Latencies        []float64
	StatusCodes      map[int]int
	ErrorsByType     map[string]int
	BytesTransferred int64
}
func NewMetrics() *Metrics {
	return &Metrics{
		StatusCodes:  make(map[int]int),
		ErrorsByType: make(map[string]int),
	}
}
func (m *Metrics) RecordSuccess(latency float64, status int, bytes int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.TotalRequests++
	m.Successful++
	m.Latencies = append(m.Latencies, latency)
	m.StatusCodes[status]++
	m.BytesTransferred += int64(bytes)
}
func (m *Metrics) RecordFailure(latency float64, errType string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.TotalRequests++
	m.Failed++
	m.Latencies = append(m.Latencies, latency)
	m.ErrorsByType[errType]++
}
func (m *Metrics) Percentile(p float64) float64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	if len(m.Latencies) == 0 {
		return 0
	}
	sorted := make([]float64, len(m.Latencies))
	copy(sorted, m.Latencies)
	sort.Float64s(sorted)
	idx := int(math.Min(float64(len(sorted)-1), float64(len(sorted))*p/100))
	return sorted[idx]
}
func (m *Metrics) SuccessRate() float64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.TotalRequests == 0 {
		return 0
	}
	return float64(m.Successful) / float64(m.TotalRequests) * 100
}
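func (m *Metrics) Summary() string {
	// Copy the counters under the lock; SuccessRate and Percentile lock on their own.
	m.mu.Lock()
	total, retries := m.TotalRequests, m.Retries
	m.mu.Unlock()
	return fmt.Sprintf(
		"Requests: %d | Success: %.1f%% | p50: %.3fs | p95: %.3fs | p99: %.3fs | Retries: %d",
		total, m.SuccessRate(),
		m.Percentile(50), m.Percentile(95), m.Percentile(99),
		retries,
	)
}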
type MonitoredClient struct {
	metrics    *Metrics
	maxRetries int
}
func NewMonitoredClient(maxRetries int) *MonitoredClient {
	return &MonitoredClient{
		metrics:    NewMetrics(),
		maxRetries: maxRetries,
	}
}
func (c *MonitoredClient) Fetch(target string) (*http.Response, error) {
	for attempt := 0; attempt <= c.maxRetries; attempt++ {
		b := make([]byte, 4)
		rand.Read(b)
		sessionID := hex.EncodeToString(b)
		proxyStr := fmt.Sprintf(
			"http://USERNAME-session-%s:PASSWORD@gate.proxyhat.com:8080",
			sessionID,
		)
		proxyURL, _ := url.Parse(proxyStr)
		client := &http.Client{
			Transport: &http.Transport{Proxy: http.ProxyURL(proxyURL)},
			Timeout:   30 * time.Second,
		}
		if attempt > 0 {
			c.metrics.mu.Lock()
			c.metrics.Retries++
			c.metrics.mu.Unlock()
		}
		start := time.Now()
		resp, err := client.Get(target)
		latency := time.Since(start).Seconds()
		if err != nil {
			c.metrics.RecordFailure(latency, "connection_error")
			if attempt < c.maxRetries {
				time.Sleep(time.Duration(math.Pow(2, float64(attempt))) * time.Second)
				continue
			}
			return nil, err
		}
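		// Drain the body to count bytes. It is closed here, so callers get
		// the status and headers only, not the payload.
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()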
		if resp.StatusCode >= 400 {
			c.metrics.RecordFailure(latency, fmt.Sprintf("HTTP_%d", resp.StatusCode))
			if attempt < c.maxRetries {
				time.Sleep(time.Duration(math.Pow(2, float64(attempt))) * time.Second)
				continue
			}
		} else {
			c.metrics.RecordSuccess(latency, resp.StatusCode, len(body))
		}
		return resp, nil
	}
	return nil, fmt.Errorf("all retries exhausted for %s", target)
}
func main() {
	client := NewMonitoredClient(3)
	for i := 0; i < 50; i++ {
		// Named target rather than url to avoid shadowing the net/url package.
		target := fmt.Sprintf("https://example.com/product/%d", i+1)
		client.Fetch(target)
	}
	fmt.Println(client.metrics.Summary())
}

Structured Logging for Proxy Requests

Structured JSON logs make it easy to aggregate and analyze proxy performance across distributed scrapers.

import json
import logging
import time
import uuid
import requests
class JSONProxyLogger:
    """Logs every proxy request as structured JSON."""
    def __init__(self, log_file: str = "proxy_requests.jsonl"):
        self.logger = logging.getLogger("proxy_json")
        handler = logging.FileHandler(log_file)
        handler.setFormatter(logging.Formatter("%(message)s"))
        self.logger.addHandler(handler)
        self.logger.setLevel(logging.INFO)
    def log_request(self, entry: dict):
        self.logger.info(json.dumps(entry))
    def fetch(self, url: str, country: str = None) -> requests.Response:
        session_id = uuid.uuid4().hex[:8]
        username = f"USERNAME-session-{session_id}"
        if country:
            username += f"-country-{country}"
        proxy = f"http://{username}:PASSWORD@gate.proxyhat.com:8080"
        start = time.time()
        try:
            response = requests.get(
                url,
                proxies={"http": proxy, "https": proxy},
                timeout=30,
            )
            latency = time.time() - start
            self.log_request({
                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
                "url": url,
                "status": response.status_code,
                "latency_ms": round(latency * 1000),
                "bytes": len(response.content),
                "session_id": session_id,
                "country": country,
                "success": response.status_code < 400,
            })
            return response
        except Exception as e:
            latency = time.time() - start
            self.log_request({
                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
                "url": url,
                "error": str(e),
                "error_type": type(e).__name__,
                "latency_ms": round(latency * 1000),
                "session_id": session_id,
                "country": country,
                "success": False,
            })
            raise
# Usage — logs produce JSONL like:
# {"timestamp":"2026-02-26T10:30:00Z","url":"https://...","status":200,"latency_ms":1234,...}
proxy_logger = JSONProxyLogger("proxy_requests.jsonl")
response = proxy_logger.fetch("https://example.com/data", country="us")
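
To show what aggregating these logs might look like, here is a minimal sketch that groups JSONL entries by country and computes a success rate and a nearest-rank p95 latency per group. The function name `summarize_jsonl` and the sample lines are hypothetical; the field names (`latency_ms`, `country`, `success`) follow the logger above.

```python
import json
import math
from collections import defaultdict


def summarize_jsonl(lines):
    """Aggregate per-country success rate and p95 latency from JSONL log lines."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Entries logged without a country are grouped under "any".
        groups[entry.get("country") or "any"].append(entry)

    summary = {}
    for country, entries in groups.items():
        latencies = sorted(e["latency_ms"] for e in entries)
        # Nearest-rank p95: smallest value with at least 95% of samples at or below it.
        rank = max(math.ceil(0.95 * len(latencies)), 1)
        ok = sum(1 for e in entries if e["success"])
        summary[country] = {
            "requests": len(entries),
            "success_rate": round(ok / len(entries) * 100, 1),
            "p95_latency_ms": latencies[rank - 1],
        }
    return summary


sample_lines = [
    '{"url": "https://example.com/a", "status": 200, "latency_ms": 800, "country": "us", "success": true}',
    '{"url": "https://example.com/b", "status": 403, "latency_ms": 1200, "country": "us", "success": false}',
    '{"url": "https://example.com/c", "error_type": "Timeout", "latency_ms": 30000, "country": "de", "success": false}',
    '{"url": "https://example.com/d", "status": 200, "latency_ms": 950, "country": "us", "success": true}',
]
report = summarize_jsonl(sample_lines)
print(report)
```

Because the function accepts any iterable of lines, an open file handle on `proxy_requests.jsonl` can be passed in directly.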

Periodic Health Reports

For long-running scrapers, generate periodic health reports that summarize performance over fixed windows.

import time
import threading
from datetime import datetime
class PeriodicReporter:
    """Generates periodic performance reports from proxy metrics."""
    def __init__(self, metrics: ProxyMetrics, interval_seconds: int = 60):
        self.metrics = metrics
        self.interval = interval_seconds
        self._running = False
        self._thread = None
        self._last_snapshot = None
    def start(self):
        self._running = True
        self._last_snapshot = self._snapshot()
        self._thread = threading.Thread(target=self._report_loop, daemon=True)
        self._thread.start()
    def stop(self):
        self._running = False
    def _snapshot(self) -> dict:
        return {
            "total": self.metrics.total_requests,
            "success": self.metrics.successful,
            "failed": self.metrics.failed,
            "retries": self.metrics.retries,
            "time": time.time(),
        }
    def _report_loop(self):
        while self._running:
            time.sleep(self.interval)
            current = self._snapshot()
            prev = self._last_snapshot
            elapsed = current["time"] - prev["time"]
            requests_delta = current["total"] - prev["total"]
            success_delta = current["success"] - prev["success"]
            failed_delta = current["failed"] - prev["failed"]
            rps = requests_delta / elapsed if elapsed > 0 else 0
            window_success_rate = (
                (success_delta / requests_delta * 100)
                if requests_delta > 0 else 0
            )
            report = {
                "window": f"{self.interval}s",
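                # datetime.utcnow() is deprecated since Python 3.12; format UTC time directly.
                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),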
                "requests": requests_delta,
                "rps": round(rps, 1),
                "success_rate": f"{window_success_rate:.1f}%",
                "failed": failed_delta,
                "cumulative_success_rate": f"{self.metrics.success_rate:.1f}%",
                "p95_latency": f"{self.metrics.p95_latency:.3f}s",
            }
            print(f"[REPORT] {report}")
            self._last_snapshot = current
# Usage with MonitoredProxyClient (reuses ProxyMetrics, MonitoredProxyClient,
# and urls from the Python client example above)
client = MonitoredProxyClient(max_retries=3)
reporter = PeriodicReporter(client.metrics, interval_seconds=30)
reporter.start()
# Scrape away — reports print every 30 seconds
for url in urls:
    client.fetch(url)
reporter.stop()
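
The reporter above prints a cumulative p95, which reacts slowly once many samples have accumulated. One possible complement is a time-based sliding window; the sketch below is an assumption-laden illustration, and the `SlidingWindow` class with its `now` parameter (added so the example is deterministic) is hypothetical.

```python
import math
import time
from collections import deque


class SlidingWindow:
    """Keeps (timestamp, latency, success) samples from the last window_seconds."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        self._samples = deque()

    def _now(self, now):
        return time.monotonic() if now is None else now

    def _evict(self, now):
        # Drop samples that are older than the window.
        cutoff = now - self.window_seconds
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def record(self, latency, success, now=None):
        now = self._now(now)
        self._samples.append((now, latency, success))
        self._evict(now)

    def success_rate(self, now=None):
        self._evict(self._now(now))
        if not self._samples:
            return 0.0
        ok = sum(1 for _, _, success in self._samples if success)
        return ok / len(self._samples) * 100

    def p95_latency(self, now=None):
        self._evict(self._now(now))
        if not self._samples:
            return 0.0
        latencies = sorted(latency for _, latency, _ in self._samples)
        # Nearest-rank percentile over the samples still in the window.
        return latencies[max(math.ceil(0.95 * len(latencies)), 1) - 1]


window = SlidingWindow(window_seconds=60)
window.record(1.0, True, now=0)    # falls out of the window by t=70
window.record(12.0, False, now=30)
window.record(2.0, True, now=70)
print(window.success_rate(now=70), window.p95_latency(now=70))
```

In real use the `now` argument would be omitted so the window follows `time.monotonic()`.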

Alert Rules and Thresholds

Set up intelligent alerting that avoids false positives during warm-up periods and transient blips.

| Alert | Condition | Cooldown | Action |
|---|---|---|---|
| Low success rate | Below 90% over 5 minutes | 10 min | Investigate target blocks, check the proxy pool |
| High latency | p95 above 10s over a 2-minute window | 5 min | Reduce concurrency, check target health |
| Error spike | A single error type exceeds 20% of requests | 5 min | Check whether the target changed, rotate geo-location |
| Bandwidth spike | Transfer rate doubles from the baseline | 15 min | Verify expected behavior, check for redirect loops |
| Zero throughput | No requests completed in 2 minutes | 2 min | Check proxy connectivity, verify credentials |

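
The cooldown column can be enforced with a small amount of state. The following is a minimal sketch, not a definitive implementation: the `AlertManager` class, the alert names, and the `now` parameter (used to make the example deterministic) are all hypothetical.

```python
import time


class AlertManager:
    """Suppresses repeat alerts of the same name until its cooldown has elapsed."""

    def __init__(self, cooldowns, default_cooldown=300.0):
        self.cooldowns = cooldowns              # {alert name: cooldown in seconds}
        self.default_cooldown = default_cooldown
        self._last_fired = {}                   # {alert name: time it last fired}

    def fire(self, name, message, now=None):
        """Emit the alert and return True, or return False while it is cooling down."""
        now = time.monotonic() if now is None else now
        last = self._last_fired.get(name)
        cooldown = self.cooldowns.get(name, self.default_cooldown)
        if last is not None and now - last < cooldown:
            return False
        self._last_fired[name] = now
        print(f"[ALERT] {name}: {message}")
        return True


# Cooldowns taken from the table: 10 min for success rate, 5 min for latency.
alerts = AlertManager({"low_success_rate": 600, "high_latency": 300})
first = alerts.fire("low_success_rate", "84.2% over 5 min", now=0)
second = alerts.fire("low_success_rate", "83.9% over 5 min", now=120)
third = alerts.fire("low_success_rate", "81.0% over 5 min", now=600)
other = alerts.fire("high_latency", "p95 11.4s", now=120)
```

Tracking the cooldown per alert name means a suppressed success-rate alert does not hide an unrelated latency alert.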
Good monitoring is the difference between a scraping pipeline that runs reliably for months and one that silently produces garbage data. Invest in instrumentation up front; it pays for itself with the first production incident you catch early.

To build the middleware that feeds these metrics, see Building a Proxy Middleware Layer. To optimize throughput alongside monitoring, read Scaling Proxy Requests with Concurrency Control. For the complete system design, see Designing a Reliable Scraping Architecture.

Explore the Python SDK, Node SDK, and Go SDK for proxy integration, or check ProxyHat pricing and the documentation to get started.
