Sunday 23 October 2016

How to set up e-mail notifications for Cron

Summary

Last week, I found myself setting up some jobs in a Unix environment (Centos 7) for which I wanted to get error email notifications and I ran into some troubles setting up the proper configuration, so I would like to take the chance and share the actions I took in case anyone finds him/herself in the same situation.

Please note that we are going to use Postfix as mail service here!

Initial Crontab setup

This was the setup I was facing, some jobs scheduled that might fail sometimes and one that will always fail as it was badly written:
$crontab -l

30 15 * * * java -jar /path_to_your_jar_app/job.jar
45 13 * * * /path_to_some_script/script.sh
45 19 * * * obviously_wrong_command
I wanted to be notified by email of any failure (and as I was using Cron, the expected behavior was to get an email  notification when either process returns a non-zero exit code).

This is an example of an email that should be generated for the above setup:
Message  1:
From cron_user@homepc  Sun Oct 23 19:45:01 2016
Delivered-To: john.doe.the.sysadmin@mailservice.com
From: "(Cron Daemon)" 
Subject: Cron  obviously_wrong_command
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
Precedence: bulk
X-Cron-Env: XDG_SESSION_ID=3289
X-Cron-Env: XDG_RUNTIME_DIR=/run/user/3001
X-Cron-Env: LANG=en_US.UTF-8
X-Cron-Env: MAIL_TO=admin_user_alias
X-Cron-Env: SHELL=/bin/sh
X-Cron-Env: HOME=/home/cron_user
X-Cron-Env: PATH=/usr/bin:/bin
X-Cron-Env: LOGNAME=cron_user
X-Cron-Env: USER=cron_user
Date: Sun, 23 Oct 2016 19:45:01 +0200 (CEST)
Status: RO

/bin/sh: obviously_wrong_command: command not found
Keep reading if you want to get the same setup! 

Sunday 16 October 2016

Introduction to Spring Cloud Dataflow (I)

Introduction

This is the first of a series of posts on how to develop data-driven micro-services with Spring Cloud Dataflow (SCDF from now on).  For now, we will see what is the approach proposed by this framework and how to build locally the basic components: Source, Sink and Processor.

Also, if you are familiar with Spring, we will take a look to the already-made components that are available for you to use so you don´t have to reinvent the wheel.

Contents
  • What is Spring Cloud DataFlow?
  • Introduction to the API: A simple Producer (Source), Consumer (sink) setup.
  • Prerequisites: Kafka and Zookeeper.
  • Coding a simple Source: HTTP Poller that retrieves live stock prices.
  • Coding a simple Sink that store the results in a file.
  • Writing an additional Batch Source.
  • Summary & Resources.

Sunday 28 August 2016

How to clean up ElasticSearch with Curator 4.0.6

Summary

Today, I would like to share with you a quick introduction to a tool that cleans and maintains your ElasticSearch cluster clean and free from old data: Curator (thanks flopezlasanta for the tip on this!)

This tool, along with ElasticSearch itself, evolves very quickly, so when I was looking for information on the subject and I found this blog entry from 2014, I noticed how much the way of working with the tool has changed. Anyway, kudos to @ragingcomputer!

Installation instructions

Before you install Curator, you need to get the Python package Installer (Pip), so Python is another requirement, Note that if you are running a version of Python newer than 3.4 (version 3) or 2,7 (version 2), Pip comes already installed with Python. More info here.

Note: You need to be super user to perform the installation.

And you are ready to go! Now let´s check the other two files needed to make it work.

Tuesday 5 July 2016

Docker images creation in a development environment

Summary


Sorry, It´s been a while since my last post, a mixture of not having anything worth writing, holidays and some online training that drains most of my time.
Here is something I would like to share that might reduce the pain for those who want to dockerize their Spring-Boot applications (or any Java application) and then test and tune them while running inside a Docker container.

Former setup

  • A commit was pushed to Jenkins, who compiled and unit-tested it.
  • After some code analysis, a Docker image was created and pushed to a repository
    With set-up, testing was highly inefficient.
  • To test it in a server, I had to pull it and execute it and, while I was focused in the performance and memory consumption of my application, tuning the JVM parameters was essential... Unfortunately, they were given at compilation time (a Maven plugin) and could not be changed (or I did not know how to).
The drawbacks I see of this approach:
  • It was slow to test new changes, as the whole CI flow needed to be executed.
  • Unnecessary docker-related commits to the branch generating the Docker images.
The solution I found? Building the image myself before testing it.

Saturday 2 April 2016

Testing & error handling with Spring + Reactor

Introduction

As we saw in the previous post, the whole idea of reactive programming floats around two concepts: Data comes in asynchronous streams and those streams are immutable, so any modification results in new streams being created.
These facts makes the code not very intuitive and it is even worse when it comes to testing (even more if we are interacting with external resources, like a Web Service).
To help in the task, Spring and Reactor come with some handy tools that can be applied to simplify the development and testing.

First, let´s take a look at the scenario being tested:

Sequence diagram of the "production" code

Saturday 12 March 2016

Intro to Reactive Programming with Reactor and Spring

Introduction

What did I know last week about reactive programming? Nothing. That´s a fact. However, I started reading about and I find the topic really really interesting.

Unfortunately, there is not much documentation around (or at least, I had a hard time finding it) so I would like to share some resources with you along with an example that I have adapted to perform like our module Stokker (the module that used to fetch stock quotations from Yahoo Finance for us using Spring Integration).

What is Reactive Programming?

The best definition I could find was this amazing GitHub gist written by @andrestaltz that summarizes all the theoretical definitions to this line:

Reactive programming is programming with asynchronous data streams.

I really recommend to you its reading, but we can summarize it in:

  • You will work with immutable event streams to which you will have to subscribe.
  • Some operations will be provided to you, so you can transform these streams into new ones by filtering them, merging them o create new events from the previous ones.

The best way to understand this is, is to represent these operations graphically. For that purpose, we have this really cool web application implemented using JavaScript and RxObservables: RxMarbles (a must see!). For instance:

Two event streams: One with numbers, the other with each element multiplied by 10

The first line, represents an initial event stream. In the middle you can see the mapping function that is applied to every event in the stream and finally, the second stream with the results:

Note: Pay attention to the immutability here: Any transformation you apply to a stream creates a new one!

Let´s check out another transformation:

Two event streams: One with numbers, other with the average of the elements received

Look who is here! Our old friend, the moving average. Whenever a new event arrives in the original stream, a new value for the average will be calculated in the resulting stream.

Monday 22 February 2016

Building a basic stock trading system with Drools (I)

Introduction

In my previous post, I explained how can we calculate a number of indicators based on Stock prices and that, the value on those indicators is better exploited in algorithms rather that just plotted in charts. What kind of algorithm might that be?

I mentioned that one possibility is that each indicator has a "voice" or "opinion" and that the output of the algorithm is the sum of all those opinions, some sort of consensus. Then we can translate that consensus in actual actions: Buy and Sell orders.

How can we model those opinions? Basically, they are rules responding to certain mathematical expressions. One option to evaluate is a Java-based rule engine: Drools.

But first, take a look at the result of our calculations, so you can understand better. This graph shows a "score" given by our system on how bullish or bearish (that is, whether it has an upward/downward chance) a stock is. This score is calculated using simple indicators (Simple Moving Average, New MAX, New Min, etc.)
The price and indicators above, the SCORE indicator below

In this case, the rules said:

  • + 1 when the price is over the SMA(200).
  • + 1 when the price is making new MAX taking into account the last 200 days.
  • -1 when the price is under the SMA(200).
  • - 1 when the price is making new MIN taking into account the last 200 days.

So we can extrapolate that:

  • If the score was negative, and it has become positive; it´s likely for the price to go up.
  • If the score was +1 and becomes +2, it gets even better.
On the other hand:

  • If the score becomes negative or makes new lows, it´s time to close our position and avoid more losses (see right part of the graph).

Note: Please keep in mind that the main purpose here is learning the technology and have fun, don´t expect Warren-Buffet-like returns!

Sunday 14 February 2016

Calculating and showing stock indicators with Spark & AngularJS

Summary

In a previous post, I showed how simple is to calculate a SMA (Simple Moving Average) indicator from your list of stock quotations. Today, let´s see how can we modify the code to calculate multiple indicators at once (for the moment, I could not implement it using several different sliding windows at once).

Also, let´s introduce a quick prototype coded in AngularJS in order to show the different indicators in a chart with the possibility of switching stocks and windows sizes live. This is just a debug feature as our aim is to use the indicators in our future home-made analysis algorithm.

Note: If you don´t know what SMAs are, you should read this before.

Live example?

This is much easier to understand if you see an actual example running, but unfortunately I'm having problems injecting an iframe here that shows the content properly. Instead, you can clone the repository and run the application with the test files included in it:
git clone https://github.com/victor-ferrer/sparkker
Note: The application requires a Java 8 JRE installed in your machine. Also, the JAR file is rather heavy as Spring Boot adds an embedded Tomcat Server and all the Spark dependencies...

Here is a screenshot of the application that will be accessible at http://localhost:8988/#/indicators

Chart showing the upward trend followed by one stock.
The different lines are the indicators calculated on a 200 days window.

Thursday 4 February 2016

Video: Lambda architecture with Spring XD and Spark

Summary


Today I do not have any piece of code worth showing you. Instead, I would like to quickly comment on this YouTube video, which I have seen recently, that is coming from one the SpringOne2GX 2015 event (and later released to the public).



Note: The fact the these talks are published more and more often (like what happens with the DockerCon talks) kinda puts me off from paying the high premium involved in physically assisting to these events...

Sunday 31 January 2016

Calculating Moving Averages with Spark MLLib

Update on Friday. 13rd of January, 2017

This post is about the implementation using the Java API. If you want to see the same example of calculating Moving Averages with Spark MLLib but using Scala, please check out this post.

Introduction

Since I began learning Spark, one of the things that given me the most pain is the Java API that wraps the Scala API used in Spark. So, I would like to share this problem I had using the Spark Machine Learning library in case is useful for anybody learning Spark. Also, it´s nice to take a look to the possibilities that the sliding windows give us when working on timed data.

The problem

As I wrote in my previous post, I created a new independent module to make studies on Stock data using Spark. One of the most interesting (and basic) studies one can make over Stock data, is to calculate the simple moving averages.

What is a simple moving average (SMA)?

If we take a window of time of length N, the SMA is:
Source Wikipedia: https://en.wikipedia.org/wiki/Moving_average
Wait, what? In plain English: For every element, we take the N precedent elements and calculate the average. The aim is to smooth the data in order to easily detect underlying patterns in the data.

One possible solution: Spark MLLib

Searching around in internet, I found that the Spark Machine Learning Libray has a sliding window operation over an RDD. Using this, it would be very easy to calculate the SMA as described in this StackOverflow answer.


Note: The RDD must be sorted before appliying the sliding function!

The caveat is that the previous code is written in Scala. How would it look like in Java?

Saturday 30 January 2016

Orchestrating your microservices with Eureka and Feign


Introduction


This is a well known scenario: It´s time to add some new functionality to your application and you decide to add a new member to the family of micro-services.
In this example, we are going to use a service called "Sparkker" that will consume stock quotations from anothed service called "Stokker" and will publish the results to another service called "Portfolio-Manager"

This is a diagram showing the whole picture:



It´s obvious that the number of lines connecting the services increase and increase, making the maintenance and configuration of the system very tedious. How can we solve it? We can´t spend the whole day coding integration code, we need to deliver added value!

One possible solution to all this mess is to use some of the components from the Netflix Stack that have been incorporated to the Spring Cloud project and that are very simple to use within Spring Boot. Interested?