    How To Build A Serverless Alert Email Notification System Using Machine Learning

    April 9, 2018

    As an AWS partner, we receive a lot of email messages from AWS including many that need prompt attention. But with so many emails, it is very hard to detect which emails need immediate attention and which ones can be handled at a later time. To deal with the volume, I’d like to present a serverless system that sends an alert to a Slack channel whenever an urgent AWS email arrives.

    Before we get started, here is the overall architecture of the system, described step by step in the next few paragraphs.

    First, I trained an LDA (Latent Dirichlet Allocation) model on all previous emails and stored it in an S3 bucket. During training, the model groups the previous emails into as many topic groups as you specify, and you can review the generated topic groups to decide which group(s) should trigger an alert.

    Next, a Lambda function regularly checks the mailbox for new email messages. If it finds any, it sends the content of each new email to another Lambda function, which feeds the content into the trained model. The model generates a proportion value for each group that shows how likely the email is to belong to that group. Based on those values, the Lambda function sends an alert to Slack if the proportions of the groups selected for alerting are high enough.
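The alerting decision described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `ALERT_TOPICS`, `THRESHOLD`, `should_alert`, `build_slack_payload`, and `post_to_slack` are hypothetical, the rule "any alert group reaches the threshold" is one possible reading of "high enough", and the message is sent through a Slack incoming webhook whose URL you would supply yourself.

```python
import json
import urllib.request

# Hypothetical settings: choose these after reviewing the generated topic groups.
ALERT_TOPICS = {4}
THRESHOLD = 0.5


def should_alert(proportions, alert_topics=ALERT_TOPICS, threshold=THRESHOLD):
    """Return True when any alert topic's proportion reaches the threshold (assumed rule)."""
    return any(proportions[topic] >= threshold for topic in alert_topics)


def build_slack_payload(subject, proportions, alert_topics=ALERT_TOPICS):
    """Build the JSON body for a Slack incoming webhook message."""
    top = max(alert_topics, key=lambda topic: proportions[topic])
    text = (
        f"Urgent AWS email: {subject} "
        f"(topic {top}, possibility {proportions[top]:.2f})"
    )
    return {"text": text}


def post_to_slack(webhook_url, payload):
    """POST the payload to the given Slack incoming webhook URL."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In a Lambda function, you would call `should_alert` on the proportions returned by the model and call `post_to_slack` only when it returns True.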

    This system also provides a RESTful interface through API Gateway, which invokes the same Lambda function to run the trained model against the given email content and send an alert to Slack if that content is considered urgent.
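As a rough sketch of what the API Gateway entry point could look like, the handler below assumes the Lambda proxy integration, where the request body arrives as a string in `event["body"]` and the response is a dict with `statusCode` and `body`. The `classify` stub, the `content` field name, and the `ALERT_TOPIC` and `THRESHOLD` values are hypothetical placeholders, not the original implementation.

```python
import json

# Hypothetical settings for illustration only.
ALERT_TOPIC = 1
THRESHOLD = 0.5


def classify(content):
    """Stand-in for the trained model: returns one proportion per topic group."""
    # A real function would load the model from S3 and run it on the content.
    return [0.1, 0.9] if "urgent" in content.lower() else [0.9, 0.1]


def respond(status, payload):
    """Format a response the way the Lambda proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    """Classify the email content posted in the JSON request body."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return respond(400, {"error": "request body must be JSON"})
    if not isinstance(body, dict) or not body.get("content"):
        return respond(400, {"error": "missing 'content' field"})

    proportions = classify(body["content"])
    alert = proportions[ALERT_TOPIC] >= THRESHOLD
    # This is the point where the Slack alert would be sent when alert is True.
    return respond(200, {"proportions": proportions, "alert": alert})
```

Keeping the classification logic in one function lets the scheduled mailbox check and the REST endpoint share the same code path.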

    Next, let’s walk through the Jupyter notebook to see how to train the model with the scikit-learn library.

    1. Create a DataFrame from all previous emails.
    2. Transform the email contents in that DataFrame into a “document-term matrix” that represents how frequently each word is used in each document.
    3. Create an LDA model and train it with the “document-term matrix”. Here, we choose 10 for “num_topics” and set a fixed “random_state” seed so that repeated training produces the same results.
    4. Check the topic groups generated by the model to see which words are used most in each group.
    5. Run the trained model against all emails to generate a proportion value for each group that shows how likely each email is to belong to that group.
    6. Store the trained model in an S3 bucket for future use against new emails.
    7. Create another DataFrame from all emails with their generated group proportion values.
    8. Merge the two DataFrames from steps 1 and 7. Each email now has topic groups (t00~t09) with their proportion values (p00~p09).
    9. Use the trained model to get the topic group proportions for a new email.
    10. The function “predict” takes the trained model object and a new email message as input and returns a list of topic groups along with their proportion values. In the example run, the given email message most likely belongs to topic group 4.
    11. Finally, send the email as an alert to Slack if a topic group selected for alerting has the highest proportion value. In the alert, “Possibility” is the proportion value of that topic group.
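The core of these steps can be sketched with scikit-learn as follows. This is a minimal illustration, not the original notebook: the four sample emails are made up, the topic count is reduced to 2 because the corpus is tiny (the post uses 10), and bundling the vectorizer with the model is an assumption made so that new emails are encoded with the same vocabulary. Note that scikit-learn's `LatentDirichletAllocation` takes the topic count as `n_components`; `num_topics` is kept here only as a variable name. The signature of `predict` is likewise assumed.

```python
import pickle

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Made-up stand-in corpus; the real input is the text of all previous emails.
emails = [
    "Your EC2 instance is scheduled for retirement, action required",
    "Action required: RDS instance maintenance scheduled this week",
    "Join our webinar on serverless best practices",
    "New webinar: machine learning on AWS, register today",
]

# Step 2: build the document-term matrix of word counts.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(emails)

# Step 3: train the LDA model with a fixed seed for repeatable results.
num_topics = 2  # the post uses 10; 2 suits this tiny corpus
lda = LatentDirichletAllocation(n_components=num_topics, random_state=0)
lda.fit(dtm)

# Step 4: list the highest-weighted words of each topic group.
terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
top_words = [
    [terms[i] for i in weights.argsort()[::-1][:5]]
    for weights in lda.components_
]

# Step 5: topic proportions for every training email (one row per email).
doc_topics = lda.transform(dtm)

# Step 6: serialize the model; the bytes could then be uploaded to S3.
model_blob = pickle.dumps((vectorizer, lda))


def predict(model, message):
    """Return (topic label, proportion) pairs for a new email message."""
    fitted_vectorizer, fitted_lda = model
    proportions = fitted_lda.transform(fitted_vectorizer.transform([message]))[0]
    return [(f"t{i:02d}", float(p)) for i, p in enumerate(proportions)]


# Steps 9 and 10: reload the stored model and score a new message.
result = predict(pickle.loads(model_blob), "Action required: instance retirement")
```

Each row of `doc_topics` and the proportions in `result` sum to 1, so the largest value identifies the most likely topic group for that email.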

    In summary, this is how to set up a serverless notification system that sends alerts to a designated Slack channel whenever an urgent email arrives in your mailbox. I trained an LDA model with scikit-learn and, after reviewing the classification results, picked the specific topic group(s) to use for alert notification. I stored the trained model in an S3 bucket so that a Lambda function can apply it to new emails, which keeps the whole system serverless.
