Ever debugged a production issue by running grep "ERROR" across 50GB of text logs only to find thousands of matches with zero context? JSON logging transforms those unstructured walls of text into queryable data structures that monitoring tools can actually filter, index, and search in seconds. This guide shows you how to configure JSON logging in Python using python-json-logger and structlog, with examples for adding trace IDs, custom fields, and request context that make debugging distributed systems way less painful.
Implementing Structured Python Logging with JSON Output

JSON logging turns your log messages into key-value pairs that monitoring tools can parse and index automatically. Instead of “User john@example.com logged in successfully” as plain text, you get {"timestamp": "2024-01-15T10:30:45Z", "level": "INFO", "message": "User logged in", "email": "john@example.com", "action": "login"}.
Here’s the simplest way to get started with python-json-logger:
import logging
from pythonjsonlogger import jsonlogger
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # the root logger defaults to WARNING, which would drop info()
logHandler = logging.StreamHandler()
formatter = jsonlogger.JsonFormatter()
logHandler.setFormatter(formatter)
logger.addHandler(logHandler)
logger.info('test message', extra={'user': 'john', 'ip': '192.168.1.1'})
This imports what you need, creates a logger, sets its level to INFO, attaches a StreamHandler that writes to the console (stderr by default), applies the JsonFormatter, and logs a message with extra context. With no format string, the output contains the message plus the extra fields: {"message": "test message", "user": "john", "ip": "192.168.1.1"}. To include standard attributes such as levelname and name, list them in the formatter's format string, for example '%(levelname)s %(name)s %(message)s'.
Key benefits:
- Machine-readable format that log platforms can parse without custom regex patterns or grok filters
- Easy parsing by tools like AWS CloudWatch, Datadog, and Grafana Loki without preprocessing
- Better monitoring integration through structured field queries instead of full-text search
- Simplified analysis with queries like “show all ERROR logs where user_id equals 12345” without complex text parsing
Use JSON logging when you’re running apps in production where logs feed into centralized monitoring. Stick with text logging for quick local debugging scripts, or when you’re the only person reading logs in a terminal and human readability matters more.
Configuration Approaches and Libraries for Python JSON Logging

Python offers multiple ways to set up JSON logging, each suited for different app architectures and complexity levels.
Programmatic Configuration with python-json-logger
The most straightforward approach configures loggers directly in your code:
import logging
from pythonjsonlogger import jsonlogger
logger = logging.getLogger(__name__)
logHandler = logging.StreamHandler()
# Field names must be LogRecord attributes (asctime, levelname), not arbitrary keys
formatter = jsonlogger.JsonFormatter('%(asctime)s %(levelname)s %(name)s %(message)s')
logHandler.setFormatter(formatter)
logger.addHandler(logHandler)
logger.setLevel(logging.INFO)
logger.info('Application started', extra={'version': '1.2.3'})
This works best for small apps or microservices where you want a drop-in replacement without complex config files. The python-json-logger library wraps Python’s built-in logging module, making migration from text to JSON a matter of changing the formatter.
Dictionary-Based Configuration with dictConfig
For complex apps needing different handlers, formatters, and loggers per module, dictConfig provides structure:
import logging.config
from pythonjsonlogger import jsonlogger
LOGGING_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'json': {
            '()': jsonlogger.JsonFormatter,
            # Field names must be LogRecord attributes (asctime, levelname)
            'format': '%(asctime)s %(levelname)s %(name)s %(message)s'
        }
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'json',
            'level': 'INFO'
        }
    },
    'loggers': {
        '': {  # the root logger
            'handlers': ['console'],
            'level': 'INFO'
        }
    }
}
logging.config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger(__name__)
logger.info('Service initialized')
This scales well when you need different log levels per module, multiple outputs, or environment-specific settings from config files.
Basic Configuration with basicConfig
For quick setup in small scripts or testing:
import logging
from pythonjsonlogger import jsonlogger
logging.basicConfig(level=logging.INFO)
logging.root.handlers[0].setFormatter(jsonlogger.JsonFormatter())
logging.info('Quick test message')
While python-json-logger handles most cases, structlog offers a different approach through processor pipelines. Structlog builds log entries by passing them through a chain of processors that add fields, format timestamps, and convert to JSON:
import structlog
structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.JSONRenderer()
    ]
)
logger = structlog.get_logger()
logger.info('user_action', user_id=42, action='login')
Structlog’s processor architecture gives you fine-grained control over formatting and supports advanced features like log filtering, field renaming, and custom serialization. Processors execute in order, and JSONRenderer must come last since it converts the dictionary to a JSON string.
If you need complete control over JSON structure or want to add computed fields to every entry, build a custom JSONFormatter:
import logging
import json
import datetime
class CustomJSONFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            # Use the record's own creation time, rendered as timezone-aware UTC
            'timestamp': datetime.datetime.fromtimestamp(
                record.created, tz=datetime.timezone.utc
            ).isoformat(),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module,
            'app_name': 'my-service'
        }
        return json.dumps(log_data)
Choose python-json-logger for straightforward JSON conversion with minimal overhead. Pick structlog when you need flexible processing pipelines or complex custom processors. Build a custom formatter only when you have specific requirements around field names, computed values, or JSON structure that existing libraries don’t support.
Customizing JSON Log Structure with Fields and Contextual Information

Consistent log structure across your microservices enables powerful queries and faster debugging when tracking requests through distributed systems.
Adding custom fields to individual messages happens through the extra parameter:
import logging
from pythonjsonlogger import jsonlogger
logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
handler.setFormatter(jsonlogger.JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)  # without this, the inherited WARNING level drops info()
logger.info(
    'User performed action',
    extra={
        'user_id': 12345,
        'request_id': 'req-abc-123',
        'endpoint': '/api/users/profile',
        'duration_ms': 145
    }
)
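Python’s LogRecord object provides standard attributes that a formatter can include in each entry: levelname (INFO, ERROR, etc.), message (the log text), asctime (a formatted timestamp; created holds the raw epoch seconds), module (which Python module generated it), funcName (the function name), lineno (line number), and pathname (full file path). With python-json-logger, name the attributes you want in the format string; keys passed through extra are added automatically.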
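To see where these attributes live, here is a small stdlib-only sketch that captures a LogRecord and inspects it. RecordCollector and the record_demo logger name are made up for illustration; they are not part of any library.

```python
import logging

class RecordCollector(logging.Handler):
    """Hypothetical helper: keeps LogRecords in a list for inspection."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("record_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo out of the root logger's handlers
collector = RecordCollector()
logger.addHandler(collector)

logger.info("User %s performed action", "john", extra={"user_id": 12345})

record = collector.records[0]
print(record.levelname, record.module, record.funcName, record.lineno)
print(record.getMessage())         # message with its arguments merged in
print(record.user_id)              # extra keys become plain attributes
print(hasattr(record, "asctime"))  # False: asctime is only set once a formatter asks for it
```

Because extra keys sit next to the built-in attributes, a JSON formatter can tell them apart only by knowing the standard attribute names.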
Common fields to add for production microservices:
- request_id for tracking individual API requests from start to finish
- correlation_id for tracing requests across multiple services
- user_id to identify which user triggered the logged action
- environment to distinguish logs from dev, staging, and production
- service_name when aggregating logs from multiple microservices
- version to correlate logs with specific releases
- hostname to identify which server or container generated the log
- trace_id for distributed tracing integration with tools like Jaeger
Creating a custom formatter that adds static fields to every log automatically:
import logging
import json
import socket
class StaticFieldFormatter(logging.Formatter):
    def __init__(self, static_fields=None):
        super().__init__()
        self.static_fields = static_fields or {}
    def format(self, record):
        log_data = {
            'message': record.getMessage(),
            'level': record.levelname,
            'timestamp': self.formatTime(record),
        }
        log_data.update(self.static_fields)
        return json.dumps(log_data)
formatter = StaticFieldFormatter({
    'service': 'user-api',
    'environment': 'production',
    'hostname': socket.gethostname()
})
Correlation IDs and request tracking become critical in microservice architectures. When a single API request triggers calls across five different services, debugging which service caused an error requires checking logs in multiple places and guessing which entries relate to the same user action. Without request tracing, you’re stuck.
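Request tracing using threading.local for storing trace IDs:
import threading
import uuid
import logging
from pythonjsonlogger import jsonlogger
class RequestContext:
    _context = threading.local()  # each thread sees its own request_id
    @classmethod
    def set_request_id(cls, request_id=None):
        cls._context.request_id = request_id or str(uuid.uuid4())
    @classmethod
    def get_request_id(cls):
        return getattr(cls._context, 'request_id', None)
    @classmethod
    def clear_request_id(cls):
        # Reset explicitly; set_request_id(None) would generate a fresh ID instead
        cls._context.request_id = None
class ContextFilter(logging.Filter):
    def filter(self, record):
        record.request_id = RequestContext.get_request_id()
        return True
logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
handler.setFormatter(jsonlogger.JsonFormatter())
# Attach the filter to the handler so it also sees records propagated from child loggers
handler.addFilter(ContextFilter())
logger.addHandler(handler)
Generate unique trace IDs using Python’s uuid module: trace_id = str(uuid.uuid4()) produces random values like “f47ac10b-58cc-4372-a567-0e02b2c3d479” that are unique for practical purposes. They are not sortable, so order entries by timestamp rather than by ID.
Middleware that extracts trace IDs from incoming HTTP requests and injects them into logging context. The process_request/process_response hooks below resemble Falcon's middleware interface; adapt them to your framework:
import uuid
class TracingMiddleware:
    def process_request(self, req, resp):
        trace_id = req.get_header('X-Trace-ID')
        if not trace_id:
            trace_id = str(uuid.uuid4())
        RequestContext.set_request_id(trace_id)
        resp.set_header('X-Trace-ID', trace_id)
    def process_response(self, req, resp, resource, req_succeeded):
        # Clear the ID so a reused worker thread doesn't leak it into the next request
        RequestContext.clear_request_id()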
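threading.local keeps one value per thread, which stops working once many requests share a thread, as they do under asyncio. A possible alternative is the standard library's contextvars module. The sketch below is stdlib-only, and request_id_var, ContextVarFilter, and RecordCollector are hypothetical names chosen for the example.

```python
import asyncio
import contextvars
import logging

# Hypothetical variable; the default applies outside any request.
request_id_var = contextvars.ContextVar("request_id", default=None)

class ContextVarFilter(logging.Filter):
    """Copies the current context's request ID onto every record."""
    def filter(self, record):
        record.request_id = request_id_var.get()
        return True

class RecordCollector(logging.Handler):
    """Hypothetical helper: keeps LogRecords in a list for inspection."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("contextvar_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
collector = RecordCollector()
collector.addFilter(ContextVarFilter())
logger.addHandler(collector)

async def handle_request(request_id):
    request_id_var.set(request_id)  # visible only inside this task's context
    await asyncio.sleep(0)          # yield so the two requests interleave
    logger.info("handled")

async def main():
    # gather() wraps each coroutine in a task with its own copy of the context
    await asyncio.gather(handle_request("req-a"), handle_request("req-b"))

asyncio.run(main())
seen = sorted(r.request_id for r in collector.records)
print(seen)
```

Each task sees only the ID it set, even though both run on the same thread, and nothing leaks back into the calling context once the event loop finishes.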
The LoggerAdapter class offers another way to add contextual fields without custom formatters. Create an adapter with a dictionary of extra fields, and those fields appear in all log calls through that adapter:
logger = logging.getLogger(__name__)
contextual_logger = logging.LoggerAdapter(logger, {'user_id': 12345, 'session_id': 'abc123'})
contextual_logger.info('Action performed')
Maintain consistent field naming across all microservices. If one service logs user_id and another logs userId, queries that filter by user become unreliable. Standardize on snake_case or camelCase and document required fields in a shared spec.
Query logs by correlation ID in platforms like Grafana Loki using {service="user-api"} | json | request_id="f47ac10b-58cc-4372-a567-0e02b2c3d479" to see every log entry related to a single request as it flows through your entire system. This turns debugging distributed systems from guesswork into a straightforward trace of exactly what happened.
Configuring Handlers and Managing Serialization in JSON Logs

Handlers determine where your JSON logs end up (console, file, remote service), while proper serialization ensures complex Python objects convert cleanly to JSON without errors.
Configure multiple handlers to output logs to different destinations simultaneously:
import logging
from logging.handlers import RotatingFileHandler
from pythonjsonlogger import jsonlogger
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)
console_handler.setFormatter(jsonlogger.JsonFormatter())
file_handler = RotatingFileHandler(
    '/var/log/app.json',
    maxBytes=10485760,  # 10 MB (10 * 1024 * 1024)
    backupCount=5
)
file_handler.setLevel(logging.WARNING)
file_handler.setFormatter(jsonlogger.JsonFormatter())
logger.addHandler(console_handler)
logger.addHandler(file_handler)
File rotation prevents individual log files from growing indefinitely and filling disk space. RotatingFileHandler rotates based on file size (when app.json reaches 10MB, it renames to app.json.1 and starts a new app.json). TimedRotatingFileHandler rotates based on time intervals (daily, hourly, weekly). In production, logs written to /var/log/ get picked up by collection agents like Promtail or Fluentd.
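Time-based rotation is configured much like size-based rotation. The following is a rough sketch, assuming daily rollover at midnight UTC and a week of retention. It writes to a temporary directory instead of /var/log and uses a tiny stdlib-only formatter (MinimalJSONFormatter, a made-up name) so it runs without extra dependencies.

```python
import json
import logging
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler

class MinimalJSONFormatter(logging.Formatter):
    """Stdlib-only stand-in for a JSON formatter, for this sketch only."""
    def format(self, record):
        return json.dumps({"level": record.levelname, "message": record.getMessage()})

log_dir = tempfile.mkdtemp()  # stand-in for /var/log
log_path = os.path.join(log_dir, "app.json")

timed_handler = TimedRotatingFileHandler(
    log_path,
    when="midnight",  # roll over once a day
    backupCount=7,    # keep a week of rotated files
    utc=True,         # compute the rollover time in UTC
)
timed_handler.setFormatter(MinimalJSONFormatter())

logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(timed_handler)
logger.info("Service initialized")

# Detach and close the handler so the file is flushed before reading it back
logger.removeHandler(timed_handler)
timed_handler.close()

with open(log_path) as f:
    first_line = json.loads(f.readline())
print(first_line)
```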
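QueueHandler enables asynchronous logging for apps that handle thousands of requests per second where synchronous file writes would slow down processing. It only enqueues records; a QueueListener has to drain the queue and pass records to the real handlers:
from logging.handlers import QueueHandler
import queue
log_queue = queue.Queue(-1)  # -1 means no size limit
queue_handler = QueueHandler(log_queue)
logger.addHandler(queue_handler)
# Nothing is written until a QueueListener is started on log_queue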
Python’s default JSON serializer chokes on datetime objects, bytes, and custom classes. Trying to log {"timestamp": datetime.now()} raises “TypeError: Object of type datetime is not JSON serializable.”
Custom JSON encoder integrated with handlers to handle datetime objects and timestamps:
import logging
import json
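from datetime import datetime, timezone
class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)
# Attribute names every LogRecord has; anything else on a record came from extra=
_STANDARD_ATTRS = set(vars(logging.LogRecord('', 0, '', 0, '', (), None))) | {'message', 'asctime'}
class CustomJSONFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            'timestamp': datetime.fromtimestamp(record.created, tz=timezone.utc),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module
        }
        # extra= fields are set directly on the record; there is no record.extra attribute
        log_data.update({k: v for k, v in vars(record).items() if k not in _STANDARD_ATTRS})
        return json.dumps(log_data, cls=DateTimeEncoder)
handler = logging.StreamHandler()
handler.setFormatter(CustomJSONFormatter())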
Always store timestamps in UTC rather than local time zones to avoid ambiguity when aggregating logs from servers in different regions. Convert datetimes to ISO 8601 format like “2024-01-15T14:30:00Z” which most log platforms parse automatically.
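As a quick illustration of the UTC rule, this sketch converts a Unix timestamp (the kind stored in LogRecord.created) into an ISO 8601 string with a Z suffix. The helper name utc_iso8601 is made up for the example.

```python
from datetime import datetime, timezone

def utc_iso8601(epoch_seconds):
    """Render a Unix timestamp as UTC ISO 8601 with a trailing Z."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    # isoformat() writes the UTC offset as +00:00; swap it for the shorter Z
    return dt.isoformat(timespec="seconds").replace("+00:00", "Z")

example = utc_iso8601(1705329000)
print(example)  # 2024-01-15T14:30:00Z
```

Passing tz=timezone.utc matters: without it, fromtimestamp() returns a naive datetime in the server's local zone.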
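UTF-8 encoding handles special characters and international text in log messages. Python 3 strings are already Unicode, and json.dumps escapes non-ASCII characters as \uXXXX sequences by default (pass ensure_ascii=False to emit raw UTF-8 instead). In structlog, the UnicodeDecoder processor can decode byte strings into text before rendering, but processor order matters. JSONRenderer must come last in the pipeline since it converts dictionaries to strings, and processors after it receive strings instead of dictionaries.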
Handler and serialization best practices for production:
- Always set formatters on handlers, not loggers (handlers control output format)
- Configure rotation policies for file handlers to prevent disk space issues
- Use appropriate log levels per handler (DEBUG to console in dev, WARNING to files in prod)
- Ensure proper datetime serialization with custom encoders or ISO 8601 strings
- Handle encoding for international characters using UTF-8 consistently
Don’t log objects with circular references (objects that reference themselves or create reference cycles). JSON serialization traverses object properties, and when the standard json module hits a cycle it raises ValueError instead of producing output. If you must log complex objects, extract only the fields you need: logger.info('Object saved', extra={'user_id': user.id, 'email': user.email}) instead of logger.info('Object saved', extra={'user': user}).
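A short stdlib-only sketch of both halves of that advice: what the json module does with a cycle, and a field-extraction helper. The User class and loggable_fields function are hypothetical stand-ins for your own model and helper.

```python
import json

class User:
    """Hypothetical model object; not JSON-serializable as a whole."""
    def __init__(self, user_id, email):
        self.id = user_id
        self.email = email

# A dict that contains itself triggers the json module's cycle check.
cyclic = {"name": "loop"}
cyclic["self"] = cyclic
try:
    json.dumps(cyclic)
    cycle_error = None
except ValueError as exc:
    cycle_error = str(exc)
print(cycle_error)

def loggable_fields(user):
    """Pick only flat, JSON-safe values to pass as extra=..."""
    return {"user_id": user.id, "email": user.email}

user = User(12345, "john@example.com")
payload = json.dumps(loggable_fields(user))
print(payload)
```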
Exception Handling and Stack Traces in JSON Logging

Structured exception logging captures errors with full stack traces in searchable JSON format instead of unstructured text blocks scattered across stderr.
The logger.exception() method automatically includes exception information and stack traces in your JSON output:
import logging
from pythonjsonlogger import jsonlogger
logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
handler.setFormatter(jsonlogger.JsonFormatter())
logger.addHandler(handler)
try:
    result = 10 / 0
except ZeroDivisionError:
    logger.exception('Division error occurred')
With python-json-logger, the formatted traceback (which ends with the exception type and message) lands in a single exc_info string field that you can search. The exc_info parameter controls whether exception details get added to log records: logger.error('Failed', exc_info=True) includes them, while logger.error('Failed', exc_info=False) skips them. logger.exception() is shorthand for logger.error() with exc_info=True.
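If you want the exception type and message as their own queryable fields, one option is a small custom formatter. This is a stdlib-only sketch, not how python-json-logger does it; ExceptionFieldsFormatter and the exc_type, exc_message, and traceback field names are assumptions.

```python
import io
import json
import logging

class ExceptionFieldsFormatter(logging.Formatter):
    """Hypothetical formatter that splits exception details into separate fields."""
    def format(self, record):
        log_data = {"level": record.levelname, "message": record.getMessage()}
        if record.exc_info:
            exc_type, exc_value, _ = record.exc_info
            log_data["exc_type"] = exc_type.__name__
            log_data["exc_message"] = str(exc_value)
            log_data["traceback"] = self.formatException(record.exc_info)
        return json.dumps(log_data)

stream = io.StringIO()  # capture output in memory for the demo
handler = logging.StreamHandler(stream)
handler.setFormatter(ExceptionFieldsFormatter())
logger = logging.getLogger("exception_demo")
logger.propagate = False
logger.addHandler(handler)

try:
    result = 10 / 0
except ZeroDivisionError:
    logger.exception("Division error occurred")

entry = json.loads(stream.getvalue())
print(entry["exc_type"], "|", entry["exc_message"])
```

With fields like these, a query can filter on exc_type directly instead of searching inside the traceback string.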
Uncaught exceptions normally write to stderr in plain text, bypassing your JSON logging configuration. Override sys.excepthook to capture them:
import sys
import logging
from pythonjsonlogger import jsonlogger
logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
handler.setFormatter(jsonlogger.JsonFormatter())
logger.addHandler(handler)
def handle_exception(exc_type, exc_value, exc_traceback):
    if issubclass(exc_type, KeyboardInterrupt):
        # Let Ctrl+C exit normally
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    logger.critical("Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback))
sys.excepthook = handle_exception
Edge cases for exception handling in different execution contexts:
- Threading exceptions use threading.excepthook (Python 3.8+) to catch exceptions in threading.Thread.run() that would otherwise print to stderr
- Garbage collection exceptions require sys.unraisablehook (Python 3.8+) to handle exceptions raised in __del__ methods that can’t be caught with try-except
- Multiprocessing workers print uncaught exceptions directly to stderr instead of calling sys.excepthook (observed as of Python 3.8.1), so catch and log exceptions inside the worker function
- Asyncio task exceptions need a custom handler set with loop.set_exception_handler() to capture unhandled exceptions in coroutines
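The threading case from the list above could look roughly like this. It is a stdlib-only sketch; RecordCollector, log_thread_exception, and the thread_demo logger name are made up, and in a real app you would attach your JSON handler instead of the collector.

```python
import logging
import threading

class RecordCollector(logging.Handler):
    """Hypothetical helper: keeps LogRecords in a list for inspection."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("thread_demo")
logger.propagate = False
collector = RecordCollector()
logger.addHandler(collector)

def log_thread_exception(args):
    # args carries exc_type, exc_value, exc_traceback and the failing thread
    logger.critical(
        "Uncaught exception in thread %s",
        args.thread.name if args.thread else "unknown",
        exc_info=(args.exc_type, args.exc_value, args.exc_traceback),
    )

threading.excepthook = log_thread_exception  # requires Python 3.8+

def worker():
    raise RuntimeError("worker failed")

t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()

print(collector.records[0].getMessage())
```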
Stack traces can be formatted as single strings or as structured arrays of frame objects. String formatting keeps traces readable when viewing individual entries: "traceback": "Traceback (most recent call last):\n File \"app.py\", line 42...". Structured formatting as JSON arrays enables queries like “find all errors in module auth.py” but makes individual traces harder to read. Most teams prefer string formatting since grep and log viewers handle multi-line strings fine, and structured frame data rarely provides enough value to justify the complexity.
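To make the two options concrete, this sketch builds both forms from the same exception using the standard traceback module. The structured_frames helper and the file, line, and function keys are illustrative choices, not a standard schema.

```python
import json
import traceback

def structured_frames(exc):
    """Turn an exception's traceback into a list of JSON-friendly frame dicts."""
    return [
        {"file": frame.filename, "line": frame.lineno, "function": frame.name}
        for frame in traceback.extract_tb(exc.__traceback__)
    ]

def parse_port(value):
    return int(value)

try:
    parse_port("not-a-number")
except ValueError as exc:
    frames = structured_frames(exc)  # structured form: one dict per frame
    as_string = "".join(             # string form: the familiar multi-line trace
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )

print(json.dumps(frames, indent=2))
print(as_string)
```

The last frame in the list is the one where the exception was raised, which is usually the most useful field to query.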
Setting Log Levels and Filtering in Python JSON Logging

Log levels control which messages actually get written, preventing DEBUG spam in production while ensuring critical errors always get captured.
Python defines five standard log levels with increasing severity:
| Level | Numeric Value | When to Use |
|---|---|---|
| DEBUG | 10 | Detailed diagnostic info for developers troubleshooting issues locally |
| INFO | 20 | General informational messages about normal application flow |
| WARNING | 30 | Unexpected situations that don’t prevent operation but should be reviewed |
| ERROR | 40 | Errors that caused a specific operation to fail but app keeps running |
| CRITICAL | 50 | Severe errors that may cause the entire application to stop functioning |
Setting different levels for different loggers or handlers provides granular control:
import logging
app_logger = logging.getLogger('myapp')
app_logger.setLevel(logging.INFO)
db_logger = logging.getLogger('myapp.database')
db_logger.setLevel(logging.WARNING)
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)
This logs INFO and above from the main app, but only WARNING and above from database operations. Best practice uses logging.getLogger(__name__) in each module, which creates a logger hierarchy matching your package structure and allows configuring entire packages at once.
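The hierarchy behavior is easy to check directly. In this small sketch, the child logger names myapp.api and myapp.database.pool are made up; neither sets a level of its own, so each inherits from its nearest configured ancestor.

```python
import logging

logging.getLogger("myapp").setLevel(logging.INFO)
logging.getLogger("myapp.database").setLevel(logging.WARNING)

api_logger = logging.getLogger("myapp.api")             # inherits from myapp
pool_logger = logging.getLogger("myapp.database.pool")  # inherits from myapp.database

effective = {
    "myapp.api": logging.getLevelName(api_logger.getEffectiveLevel()),
    "myapp.database.pool": logging.getLevelName(pool_logger.getEffectiveLevel()),
}
print(effective)
```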
Environment-based configuration switches log levels based on where code runs. Set DEBUG in development for maximum visibility, and INFO or WARNING in production to reduce volume and costs:
import os
import logging
log_level = os.getenv('LOG_LEVEL', 'INFO').upper()
# Fall back to INFO if LOG_LEVEL is not a recognized level name
logging.basicConfig(level=getattr(logging, log_level, logging.INFO))
Deploy with LOG_LEVEL=DEBUG python app.py locally and LOG_LEVEL=WARNING python app.py in production containers.
Custom filters exclude specific messages based on content or attributes. This filter blocks logs from the noisy health-check endpoint:
import logging
logger = logging.getLogger(__name__)
class HealthCheckFilter(logging.Filter):
    def filter(self, record):
        # Returning False drops the record
        return '/health' not in record.getMessage()
logger.addFilter(HealthCheckFilter())
Logging everything at DEBUG level in production creates performance problems and cost issues. Each log entry requires serialization to JSON, I/O to write the file or send over network, and storage in your log aggregation platform. Apps handling 1000 requests/second with 10 DEBUG logs per request generate 10,000 log entries per second, which quickly becomes expensive when you’re paying per GB ingested. Keep production at INFO or WARNING, and enable DEBUG temporarily when investigating specific issues.
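To turn that entry rate into a storage figure, here is a back-of-the-envelope sketch. The 300-byte average entry size is an assumption for illustration, not a measurement; plug in your own numbers.

```python
def daily_log_volume_gb(requests_per_second, logs_per_request, bytes_per_entry):
    """Estimate daily log volume in decimal gigabytes."""
    entries_per_second = requests_per_second * logs_per_request
    bytes_per_day = entries_per_second * bytes_per_entry * 86_400  # seconds per day
    return bytes_per_day / 1e9

# 1000 requests/s and 10 logs per request come from the text above;
# 300 bytes per entry is an assumed average.
estimate = daily_log_volume_gb(1000, 10, 300)
print(f"{estimate:.1f} GB/day")
```

Under those assumptions the app would produce about 259 GB of logs per day.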
Integrating Python JSON Logs with Cloud and Monitoring Platforms

JSON-formatted logs integrate with log aggregation platforms without requiring custom parsing rules or regex patterns.
Major platforms that natively parse JSON logs:
- AWS CloudWatch ingests JSON from CloudWatch Logs agents and Lambda functions without preprocessing
- Datadog automatically extracts fields from JSON log entries for filtering and analytics
- Grafana Loki indexes only labels such as service or environment, and extracts JSON fields at query time with LogQL's json parser
- Logstash/ELK stack parses JSON logs directly without grok patterns
- Google Cloud Logging structures JSON fields as searchable attributes
Shipping logs to these platforms typically uses agents that read log files or capture stdout. Promtail daemon tails log files in /var/log/ and forwards them to Loki. CloudWatch agents on EC2 instances send logs to CloudWatch Logs. Container environments like Kubernetes capture stdout and stderr from containers automatically.
Configure handlers to output JSON to stdout for container environments where orchestrators collect logs:
import logging
import sys
from pythonjsonlogger import jsonlogger
logger = logging.getLogger()
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(jsonlogger.JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info('Application started', extra={'version': '2.1.0'})
In Docker containers, logs written to stdout appear in docker logs <container> and get forwarded to whatever logging driver you’ve configured (json-file, syslog, fluentd).
JSON structure enables powerful queries using platform-specific query languages. LogQL for Grafana Loki looks like {service="api"} | json | user_id="12345" and level="ERROR" to find all errors for a specific user. AWS CloudWatch Insights uses fields @timestamp, message, user_id | filter level = "ERROR" and user_id = 12345 for the same query. The JSON structure makes these queries possible because platforms parse field names and values automatically instead of treating logs as unstructured text.
For more on integrating with cloud platforms, check the AWS CloudWatch Integration documentation.
Logstash integration becomes simpler with JSON input since you skip complex grok patterns. A basic Logstash config for JSON logs:
input {
  file {
    path => "/var/log/app.json"
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
The codec => "json" line tells Logstash to parse each line as JSON instead of plain text, eliminating the pattern matching that usually breaks when log formats change.
Learn more about log aggregation with Loki in the Grafana Loki Documentation.
Centralized logging collects logs from all your services into one queryable location. JSON formatting makes this practical because you can query service="auth" AND user_id=12345 across logs from auth-service, user-service, and payment-service simultaneously. Without structured fields, you’d need three different regex patterns to extract user_id from each service’s unique format.
Production Best Practices for Python JSON Logging

Running JSON logging at scale requires attention to performance, security, and operational consistency across your infrastructure.
Logging introduces overhead through JSON serialization, file I/O, and handler processing. This becomes a bottleneck when your app processes thousands of requests per second and logs multiple entries per request. Profile your application under load to measure logging impact. If logs consume more than 5-10% of request processing time, consider asynchronous handlers or reduced volume.
Production logging best practices:
- Use appropriate log levels (INFO or WARNING in prod, DEBUG only for troubleshooting)
- Use asynchronous handlers for high-throughput apps
- Standardize JSON schema across all services (same field names and types)
- Include essential fields only (don’t log entire request/response objects)
- Don’t log sensitive data (passwords, tokens, credit cards, SSNs)
- Use log sampling for high-volume events (log 1% of successful requests, 100% of errors)
- Use structured exceptions instead of full stack traces for expected errors
Asynchronous logging prevents I/O operations from blocking request threads:
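import logging
import queue
from logging.handlers import QueueHandler, QueueListener
from pythonjsonlogger import jsonlogger
log_queue = queue.Queue(-1)  # -1 means no size limit
queue_handler = QueueHandler(log_queue)
root_logger = logging.getLogger()
root_logger.addHandler(queue_handler)
root_logger.setLevel(logging.INFO)
file_handler = logging.FileHandler('/var/log/app.json')
file_handler.setFormatter(jsonlogger.JsonFormatter())  # without this the file gets plain text
listener = QueueListener(log_queue, file_handler, respect_handler_level=True)
listener.start()
# Call listener.stop() at shutdown so queued records are flushed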
The QueueHandler adds log records to a queue without blocking, and QueueListener processes them in a background thread. This keeps request handling fast while logs get written asynchronously.
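Thread safety matters when multiple threads log simultaneously. Python’s logging module is thread-safe, so threads can share loggers and handlers without extra locking. Multiple processes are a different story: the standard library does not support several processes writing to the same log file, and multiprocessing workers print uncaught exceptions directly to stderr instead of going through sys.excepthook (observed as of Python 3.8.1). If you’re using multiprocessing, configure handlers at process startup, give each process its own file or send records to a single listener process through a queue, and catch exceptions inside the worker function rather than relying on exception hooks.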
Log volume directly impacts cloud costs. AWS CloudWatch charges per GB ingested and stored. Logging 100 GB/month at DEBUG level costs way more than 10 GB/month at INFO level. Monitor your ingestion rates and set up alerts when volume spikes unexpectedly.
Log sampling reduces volume for high-frequency events while maintaining visibility into errors:
import random
import logging
logger = logging.getLogger(__name__)
def should_sample(level, sample_rate=0.01):
    # Always keep WARNING and above; sample everything below
    if level >= logging.WARNING:
        return True
    return random.random() < sample_rate
if should_sample(logging.INFO):
    logger.info('Request processed', extra={'endpoint': '/api/users'})
This logs all warnings and errors but only 1% of info-level messages, cutting volume by 99% for routine operations while preserving error visibility.
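Wrapping every call site in an if statement gets tedious. One alternative is to move the same rule into a logging.Filter attached to a handler. This is a stdlib-only sketch; SamplingFilter and RecordCollector are hypothetical names, and the demo uses a sample rate of 0 so its result is deterministic.

```python
import logging
import random

class SamplingFilter(logging.Filter):
    """Keeps WARNING and above; passes lower levels with probability sample_rate."""
    def __init__(self, sample_rate=0.01, rng=None):
        super().__init__()
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True
        return self.rng.random() < self.sample_rate

class RecordCollector(logging.Handler):
    """Hypothetical helper: keeps LogRecords in a list for inspection."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("sampling_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
collector = RecordCollector()
collector.addFilter(SamplingFilter(sample_rate=0.0))  # rate 0 drops every INFO record
logger.addHandler(collector)

logger.info("Request processed")
logger.warning("Slow request")
levels = [r.levelname for r in collector.records]
print(levels)
```

In production you would pass something like sample_rate=0.01 and attach the filter to the handler that ships logs to your aggregation platform.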
Security considerations for production logging:
Never log passwords, API tokens, session cookies, credit card numbers, social security numbers, or other personally identifiable information (PII). Attackers who gain access to log files or log aggregation platforms shouldn’t find credentials they can reuse. If you must log sensitive data for debugging, implement log scrubbing that redacts specific fields before writing to disk.
A simple scrubbing filter:
import logging
import re
class SensitiveDataFilter(logging.Filter):
    def filter(self, record):
        message = record.getMessage()
        record.msg = re.sub(r'password=\S+', 'password=***REDACTED***', message)
        record.args = None  # msg is already fully formatted; don't apply args twice
        return True
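The regex above only cleans the message text. With JSON logging, secrets can also arrive as structured fields passed through extra, so a field-level filter may be worth adding. This is a stdlib-only sketch; the key names in SENSITIVE_KEYS and the FieldRedactionFilter class are assumptions to adapt to your own schema.

```python
import logging

SENSITIVE_KEYS = {"password", "token", "api_key"}  # assumed field names

class FieldRedactionFilter(logging.Filter):
    """Overwrites sensitive extra fields before any handler formats the record."""
    def filter(self, record):
        for key in SENSITIVE_KEYS:
            if hasattr(record, key):
                setattr(record, key, "***REDACTED***")
        return True

class RecordCollector(logging.Handler):
    """Hypothetical helper: keeps LogRecords in a list for inspection."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("redaction_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
collector = RecordCollector()
collector.addFilter(FieldRedactionFilter())
logger.addHandler(collector)

logger.info("Login attempt", extra={"user_id": 42, "password": "hunter2"})
record = collector.records[0]
print(record.user_id, record.password)
```

Redaction by key name only catches fields you anticipated, so it complements the rule of not logging secrets in the first place rather than replacing it.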
Monitor logging system health by setting up alerts for logging failures. If your app can’t write logs, you lose visibility into production issues. Track metrics like log handler errors, queue depth for asynchronous handlers, and disk space on servers writing log files. Set up alerts when log ingestion to your aggregation platform stops or when error rates spike beyond normal thresholds.
Final Words
Python’s logging module combined with JSON formatting gives you machine-readable logs that work smoothly with modern monitoring platforms.
Start with python-json-logger for quick setup, or reach for structlog when you need advanced features like processor pipelines and bound context.
Add correlation IDs and custom fields to track requests across microservices. Configure handlers for rotation and asynchronous output. Set appropriate log levels per environment.
JSON logging in Python pays off fast when you’re debugging production issues or building queries in CloudWatch, Loki, or the ELK stack.
Your logs become data you can actually use instead of walls of text you have to grep through.
FAQ
What is JSON logging in Python and why should I use it?
JSON logging in Python formats log messages as key-value pairs in JSON structure, making logs machine-readable and automatically parseable by log management tools. This approach enables faster analysis, better integration with monitoring systems, and simplified debugging compared to plain text logs.
How do I implement basic JSON logging in Python?
JSON logging in Python is implemented by installing the python-json-logger library and configuring a logger with JsonFormatter attached to a handler. This typically requires 5-7 lines of code including imports, logger setup, handler configuration, and formatter attachment.
What’s the difference between python-json-logger and structlog?
Python-json-logger serves as a drop-in replacement for standard logging formatters, while structlog uses a processor pipeline approach offering more flexibility for advanced use cases. Python-json-logger works better for simple implementations, whereas structlog excels in complex applications requiring extensive customization.
When should I use dictConfig versus basicConfig for JSON logging?
DictConfig is best suited for complex applications requiring detailed handler, formatter, and logger configurations, while basicConfig works well for quick setup in small scripts. DictConfig provides more granular control through dictionary-based configuration, making it ideal for production environments.
How do I add custom fields to JSON log messages?
Custom fields are added to JSON log messages using the extra parameter in logging calls or by creating a custom formatter class that injects static fields. This allows you to include metadata like request_id, user_id, service_name, and environment consistently across all log entries.
What are correlation IDs and why do they matter in logging?
Correlation IDs uniquely identify requests across distributed systems, enabling you to trace a single request through multiple microservices. They’re typically generated using UUID and passed via request headers, then stored in threading.local context for automatic inclusion in all related log entries.
How do I properly log exceptions with stack traces in JSON format?
Exceptions with stack traces are logged in JSON format using logger.exception() within try-except blocks, which automatically includes exception details in the log output. For uncaught exceptions, override sys.excepthook to capture and format them as JSON before the program terminates.
What handlers should I use for production JSON logging?
Production JSON logging typically uses StreamHandler for console output and RotatingFileHandler for file-based logs with automatic rotation. For high-throughput applications, QueueHandler enables asynchronous logging to prevent blocking, while multiple handlers can target different destinations simultaneously.
How do I handle datetime serialization in JSON logs?
Datetime serialization in JSON logs requires a custom JSON encoder that converts datetime objects to ISO 8601 format strings. Configure handlers with this custom encoder and standardize on UTC timezone to ensure consistent timestamps across distributed systems.
What log levels should I use in different environments?
Development environments typically use DEBUG level for maximum visibility, while production environments should use WARNING or INFO to reduce volume and performance overhead. Set log levels per handler or logger based on environment variables to easily adjust verbosity without code changes.
How do JSON logs integrate with AWS CloudWatch and other platforms?
JSON logs integrate natively with AWS CloudWatch, Datadog, Grafana Loki, and ELK stack without requiring complex transformations or grok patterns. Configure handlers to output JSON to stdout in container environments where log collectors automatically scrape and forward logs to these platforms.
What’s the difference between LogQL and JMESPath for querying JSON logs?
LogQL is Grafana Loki’s query language for selecting log streams and filtering on parsed fields, while JMESPath is a general-purpose query language for JSON documents. Both leverage the structured nature of JSON to perform powerful queries like counting actions by user_id, which requires reliably structured field names.
How do I implement asynchronous JSON logging for high-throughput applications?
Asynchronous JSON logging is implemented using QueueHandler and QueueListener to prevent logging from blocking application threads. This setup requires 10-12 lines of code configuring a queue, attaching handlers to the listener, and connecting the queue handler to your loggers.
Should I log to files or stdout in containerized environments?
Containerized environments should log to stdout, allowing container orchestration platforms and log collectors to automatically capture and forward logs. File-based logging complicates container management and requires mounting volumes, while stdout logging integrates seamlessly with tools like Promtail and CloudWatch agents.
How do I prevent sensitive data from appearing in JSON logs?
Sensitive data is prevented from appearing in JSON logs by explicitly excluding passwords, tokens, and PII from log messages and implementing log scrubbing functions. Never log authentication credentials, implement field-level filtering for sensitive objects, and use placeholder values when logging user-related data.
What’s the performance impact of JSON logging in production?
JSON logging adds minimal performance overhead for normal applications but can become a bottleneck in high-throughput scenarios with excessive DEBUG-level logging. Use appropriate log levels, implement asynchronous handlers, and consider log sampling for high-frequency events to minimize production impact.
How do I maintain consistent JSON log structure across microservices?
Consistent JSON log structure across microservices is maintained by standardizing field names, using the same logging library configuration, and sharing formatter classes. Document required fields like request_id, service_name, and environment, then enforce this schema through shared configuration modules.
Why does processor order matter in structlog?
Processor order matters in structlog because JSONRenderer must always be last, as it converts dictionaries to strings that later processors can’t modify. Place processors like TimeStamper, format_exc_info, and UnicodeDecoder before JSONRenderer to ensure proper log formatting.
How do I handle exceptions in threads and multiprocessing with JSON logging?
Thread exceptions are handled using threading.excepthook (Python 3.8+), while garbage collection exceptions use sys.unraisablehook for JSON formatting. Multiprocessing in Python 3.8.1 has limitations as it bypasses sys.excepthook by writing directly to stderr.
What file rotation strategy should I use for JSON logs?
File rotation strategies for JSON logs typically use RotatingFileHandler for size-based rotation or TimedRotatingFileHandler for time-based rotation. Configure rotation policies based on log volume and retention requirements, typically rotating daily or when files reach 10-50 MB.
How do I query JSON logs by correlation ID in log management systems?
JSON logs are queried by correlation_id in log management systems using the platform’s native query language like LogQL or CloudWatch Insights. The structured JSON format enables filtering by specific field values without complex regex patterns or grok filters.
What’s the advantage of LoggerAdapter for adding contextual fields?
LoggerAdapter provides an alternative approach for adding contextual fields by wrapping loggers and automatically injecting extra parameters into all log calls. This eliminates the need to pass extra dictionaries manually, ensuring consistent context inclusion throughout a request lifecycle.
