Python has a great built-in logging library. It allows to avoid a lot of boilerplate code and provides great features. Here are some examples. Logging allows to set up a unified formatting for logs:
2023-10-03 12:57:22,377|classifier.py| 62|522130|MainThread|INFO|Classifying messages from ./messages.ndjson (images ./save_path)...
2023-10-03 12:57:22,378|classifier.py| 44|522130|MainThread|INFO|Start reading input messages from ./messages.ndjson
2023-10-03 12:57:22,378|classifier.py| 48|522130|MainThread|INFO|Loaded event {'id': 104162, 'date': '2023-09-27T07:12:08+00:00', 'entities': []}
2023-10-03 12:57:22,378|classifier.py| 48|522130|MainThread|INFO|Loaded event {'id': 58934, 'date': '2023-09-27T07:12:09+00:00', 'entities': [{'_': 'MessageEntityBold', 'offset': 92, 'length': 8}]}
2023-10-03 12:57:22,378|classifier.py| 48|522130|MainThread|INFO|Loaded event {'id': 82100, 'date': '2023-09-27T07:12:13+00:00', 'entities': [{'_': 'MessageEntityMentionName', 'offset': 8, 'length': 7, 'user_id': 5678410505}]}
2023-10-03 12:57:22,378|classifier.py| 54|522130|MainThread|INFO|Reading complete
Here you can easily see the time, module name and the line that produced the log message, also log level.
It provides a convenient way to log exceptions.
With
|
|
you get a nice stacktrace
2023-10-03 13:00:32,089|classifier.py| 49|522434|MainThread|ERRO|Decoding failed
Traceback (most recent call last):
File "/home/classifier.py", line 47, in classify
event = json.loads("{")
File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
Traceback (most recent call last):
File "/home/classifier.py", line 66, in <module>
classify()
File "/home/classifier.py", line 50, in classify
tmp = {k: event[k] for k in ['id', 'date', 'entities']}
File "/home/classifier.py", line 50, in <dictcomp>
tmp = {k: event[k] for k in ['id', 'date', 'entities']}
NameError: free variable 'event' referenced before assignment in enclosing scope
Also you can set up logging to a file, as well as to stderr.
Simple logger setup for standalone scripts
I use logging even for small standalone scripts.
I change the default format so that service information (time, module, error level, etc.)
has a fixed length. It helps me visually parse important information (log messages) from the stream of logs.
It can be achieved by using fixed width modifiers in the format string:
logging.basicConfig(
level=logging.INFO,
format="{asctime}|{module:12.12s}|{lineno:4d}|{process}|{threadName}|{levelname:4.4s}|{message}",
style='{'
)
Here I use str.format()
style.
Output line looks like this:
2023-10-03 13:37:04,700|log_ex | 12|526911|MainThread|INFO|running f()
log_ex
is the name of the module. You won’t get a meaningful name for logging from __init__.py
. If that is critical
you can use custom logger names:
logger = logging.getLogger("my-custom-logger-name")
and then modify the format string: {name:12.12s}
instead of {module:12.12s}
.
More on logging format templates here.
Global configuration of the logging can be set with logging.basicConfig
.
Usually I don’t call logging.basicConfig
at the top level of a module. That would mess up configs in case
another module imports it. I usually put basicConfig
under if __name__ == '__main__'
or in the main function of the
script:
|
|
Output:
|
|
Also I don’t want 3rdparty libraries to pollute my logs. Let’s say I want to see debug messages from my code,
but I don’t need to see debug messages from requests
library.
How can I disable log messages from the Requests library? Here is an example:
|
|
Best practices
Good logging practice in python
Logging exceptions logging.exception
Setup logging in Flask
In this example we replace default Flask logging with manually configured one. It includes logging to stdio and a file. Also it has extended formatting.
logging_config.py:
logging_config = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"simple": {
"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
}
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"level": "INFO",
"formatter": "simple",
"stream": "ext://sys.stdout"
},
"info_file_handler": {
"class": "logging.handlers.RotatingFileHandler",
"level": "INFO",
"formatter": "simple",
"filename": "log.txt",
"maxBytes": 10485760,
"backupCount": 20,
"encoding": "utf8"
}
},
"loggers": { # here you can add specific configuration for other libraries
"werkzeug": {
"level": "INFO", # change to "ERROR" if you want to see less logs from flask
"handlers": ["console", "info_file_handler"],
"propagate": False # required to avoid double logging
}
},
"root": {
"level": "INFO",
"handlers": ["console", "info_file_handler"]
}
}
app.py:
|
|
Logging in ipython notebooks (jupyter)
Let’s say I want log messages to be written in the jupyter and saved to a file notebook.log
.
|
|
I recall a few years ago there was a bug, and jupyter didn’t work correctly with the logging. I had to find some workarounds.
In the newest versions it seems working fine.