How to improve the monitoring of your Lambda functions with Telegram
Send CloudWatch logs to Telegram using bots. Code examples from a Python web-scraping process in AWS Lambda.
**Final architecture diagram at the end**
CloudWatch is a very powerful AWS tool that spares us from worrying about storing or cataloguing our logs for later use, as it automatically captures all the messages generated by our executions (even plain prints to the screen).
Although it is a very nice tool, it sometimes adds a lot of friction when we just want to quickly check the status or results of our executions. For example, these are the steps needed to access the CloudWatch logs for the use case that I’m going to describe in the next section:
- Having access to a laptop.
- Log in to AWS (two-factor authentication).
- Go to CloudWatch and search for the latest execution of our process that has logs.
- Open or download the file.
- Search for the exception or error message in a file of nearly 4 MB.
Of course, for enterprise or production environments there are tools that can be used as log aggregators with direct connectors to CloudWatch, such as Datadog or Splunk.
But in this article I want to focus on more humble use cases like the type of projects we can build with the AWS Free tier:
How to Deploy and Automate Projects with AWS Free Tier — Part 1
AWS Lambda, S3, AWS Cloudwatch, SQLite, Telegram
aws.plainenglish.io
Description of the use case
A Lambda function periodically scrapes data from different web pages. That data is then processed, structured and inserted into AWS Aurora. As we depend on external systems, our logging is a key component for identifying possible errors, data quality problems or changes in the websites that we are scraping.
As this is a web-scraping process, a failure could be due to the state of a website at a particular point in time and might not be reproducible. That is why the logs that this process generates are quite extensive: they contain enough information to know what was happening at any given time.
As we have seen in the previous section, accessing the CloudWatch logs involves a lot of friction (and even more if we want to do it every day). In the next sections we will define a logging system that reduces that access to the following steps:
- Having a smartphone/laptop.
- Accessing Telegram app/web.
- Viewing the summary of results/metrics of the last run.
- If an error is detected in that summary, the complete log file can be inspected without leaving the Telegram app.
Requirements:
- A Telegram bot (created by talking to the BotFather) that will be used to send messages to Telegram.
Bots: An introduction for developers
Bots are third-party applications that run inside Telegram. Users can interact with bots by sending them messages…
core.telegram.org
- A Telegram group to which the bot will be added (this is where the log messages are going to be written).
- The ID of that Telegram group. The ‘GetIDs Bot’ can be added to the Telegram group, and it will answer with a message like this:
After noting the ID, the ‘GetIDs Bot’ can be removed from the group.
- Python Telegram Bot package, a Python SDK for working with the Telegram API:
GitHub – python-telegram-bot/python-telegram-bot: We have made you a wrapper you can't refuse
We have made you a wrapper you can't refuse We have a vibrant community of developers helping each other in our…
github.com
- AWS SDK for Python. It does not need to be packaged for Lambda, as it is already part of the Lambda Python runtime:
GitHub – boto/boto3: AWS SDK for Python
Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to…
github.com
Write messages to a Telegram group
All logs are generated automatically in AWS CloudWatch, and we have already seen why logging in to AWS to read them involves a lot of friction. So why not send each log message to Telegram every time it is written to CloudWatch (1–1 mapping)? Because sending fewer messages allows us to:
- Avoid network bottlenecks with lots of requests to Telegram.
- Reduce the execution time (as we are using Lambda, we are paying for every second of execution). With many calls to an external server (Telegram), the odds of having to deal with network errors increase. Some examples of exceptions that I have encountered when trying to send many messages to Telegram:
- telegram.error.RetryAfter: Flood control exceeded. Retry in 44.0 seconds
- telegram.error.TimedOut: Timed out
- Reduce noise when checking the logs (the log file is very big, as it contains a lot of intermediate results to facilitate debugging), so less time is needed to detect errors or data quality problems.
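One way to avoid the 1–1 mapping is to collect metrics and errors in memory during the run and send a single summary message at the end. The following is only a minimal sketch of that idea: the `RunSummary` class and its field names (`pages_scraped`, `rows_inserted`) are hypothetical and are not taken from the original project.

```python
from dataclasses import dataclass, field


@dataclass
class RunSummary:
    """Collects metrics and errors in memory during one run (hypothetical helper)."""

    pages_scraped: int = 0
    rows_inserted: int = 0
    errors: list = field(default_factory=list)

    def record_error(self, source: str, exc: Exception) -> None:
        # Store a one-line description; the full traceback stays in CloudWatch.
        self.errors.append(f"{source}: {type(exc).__name__}: {exc}")

    def to_message(self) -> str:
        # Build the single text message that would be sent to the Telegram group.
        status = "FAILED" if self.errors else "OK"
        lines = [
            f"Run status: {status}",
            f"Pages scraped: {self.pages_scraped}",
            f"Rows inserted: {self.rows_inserted}",
            f"Errors: {len(self.errors)}",
        ]
        lines.extend(f"- {error}" for error in self.errors)
        return "\n".join(lines)
```

With this approach the Lambda makes one call to Telegram per run (plus one per exception, if desired) instead of one call per log line.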
Code to send messages (and exceptions properly formatted) to Telegram:
So, every time an exception is raised in the code, besides the usual handling of that exception, a message will be sent to the Telegram group (by calling the method ‘send_exception_to_telegram’ in the exception handler).
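A minimal sketch of what such helpers might look like is shown below. It is not the author's code: it calls the Bot API `sendMessage` method directly with the standard library instead of using the Python Telegram Bot package, and everything except the name `send_exception_to_telegram` (the helper names, parameters and the 3500-character cut-off) is an assumption made for illustration.

```python
import json
import traceback
import urllib.parse
import urllib.request

API_URL = "https://api.telegram.org/bot{token}/sendMessage"


def build_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build the POST request for the Bot API sendMessage method."""
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(API_URL.format(token=token), data=data, method="POST")


def format_exception(exc: BaseException, context: str = "", max_chars: int = 3500) -> str:
    """Render an exception with its traceback, keeping only the last max_chars characters."""
    tb = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    header = f"Exception in {context}\n" if context else "Exception\n"
    return header + tb[-max_chars:]


def send_message_to_telegram(token: str, chat_id: str, text: str, timeout: int = 10) -> bool:
    """Send one message; return False instead of raising so logging never breaks the run."""
    try:
        with urllib.request.urlopen(build_request(token, chat_id, text), timeout=timeout) as resp:
            return bool(json.load(resp).get("ok", False))
    except (OSError, ValueError):
        # URLError, HTTPError and socket timeouts are OSError subclasses;
        # ValueError covers a response body that is not valid JSON.
        return False


def send_exception_to_telegram(token: str, chat_id: str, exc: BaseException, context: str = "") -> bool:
    return send_message_to_telegram(token, chat_id, format_exception(exc, context))
```

In an exception handler this would be called as `send_exception_to_telegram(token, chat_id, e, context="scraper")`, where the token and the group ID are the values obtained in the requirements section.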
Cloudwatch to Telegram
In the previous section we have seen that I encountered problems when sending each logging message to Telegram (1–1 CloudWatch message to Telegram). To solve this, I thought about including several logging messages in the same Telegram call, but the number of characters that can be sent in a single call is limited:
telegram.error.BadRequest: Entities_too_long
It is a hard limit in the Telegram API (not an issue with the SDK). This particular limit does not appear to be documented, but according to this Stack Overflow comment:
“9500 characters is the string length limit for sending markdown messages via send_message, found out using a brute force test”
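If several log lines are grouped into one call, they have to be split so that no message exceeds the limit. The sketch below shows one possible way to do that; the function name `chunk_lines` is hypothetical, and the default of 4096 characters is the limit the Bot API documentation gives for `sendMessage` text after entities parsing, chosen here as a conservative value rather than the 9500 reported above.

```python
def chunk_lines(lines, max_chars=4096):
    """Group log lines into newline-joined chunks of at most max_chars characters."""
    chunks, current = [], ""
    for line in lines:
        # A single line longer than the limit is hard-split into pieces.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) > max_chars:
            # The line does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as one Telegram message, which reduces the number of calls but still does not remove the need for the complete log file.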
As there are cases in which the logging system reports an error whose origin has to be investigated in depth, from time to time it is still necessary to access the complete CloudWatch Logs file.
To facilitate that process, I have created a GitHub repo with the methods needed to:
- Filter logs from Cloudwatch.
- Download those logs.
- Create a TXT file with them and finally send that file to a Telegram group.
I have included the ‘lambda_handler’ method so that it can easily be deployed as a Lambda function (as in my use case). The README contains the details on how to configure it on Lambda:
GitHub – ivgomezarnedo/cloudwatch_logs_to_telegram
This repository contains the methods needed to send logs from AWS Cloudwatch to Telegram groups.
github.com
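To give an idea of the filter, download and create-a-TXT-file steps listed above, here is a minimal sketch. It is not the repository's code: the function names and parameters are hypothetical, and the CloudWatch Logs client is passed in as an argument (in Lambda it would be `boto3.client("logs")`), which also makes the functions easy to try out with a fake client.

```python
import tempfile
from datetime import datetime, timezone


def fetch_log_messages(logs_client, log_group, start_ms, end_ms, filter_pattern=""):
    """Collect log messages from a log group, following nextToken pagination."""
    kwargs = {"logGroupName": log_group, "startTime": start_ms, "endTime": end_ms}
    if filter_pattern:
        kwargs["filterPattern"] = filter_pattern
    messages = []
    while True:
        page = logs_client.filter_log_events(**kwargs)
        for event in page.get("events", []):
            # Event timestamps are milliseconds since the Unix epoch.
            ts = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
            messages.append(f"{ts.isoformat()} {event['message'].rstrip()}")
        token = page.get("nextToken")
        if not token:
            return messages
        kwargs["nextToken"] = token


def write_log_file(messages, directory=None):
    """Write the messages to a .txt file in the temp directory and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", dir=directory, delete=False, encoding="utf-8"
    ) as fh:
        fh.write("\n".join(messages))
        return fh.name
```

The resulting file could then be sent to the group with the Bot API `sendDocument` method (exposed as `send_document` in the Python Telegram Bot package).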
Conclusion
In this article we have seen how to monitor the execution of web-scraping processes in AWS Lambda without having to access AWS CloudWatch after each execution. This allows us to react quickly to possible changes in the websites (our data sources) and avoid data quality problems in subsequent processes.
We have described possible implementations and their problems before settling on the approach to follow, whose example architecture is described in the following diagram:
We have added more steps to the original diagram because, when the web-scraping Lambda (to which we have added methods that send the metrics/summaries of the execution directly to the Telegram group) finishes, another Lambda is executed that extracts the CloudWatch logs of the last execution, generates a file and sends it to the Telegram group.
This double approach makes it easier to review the execution status (summary/metrics) and the debugging process when a failure is detected.
“The success formula: solve your own problems and freely share the solutions.”
― Naval Ravikant
Want to Connect?
@data_cyborg
https://www.linkedin.com/in/ivan-gomez-arnedo/