Monitoring and Restarting Apache webserver Automatically using AWS CloudWatch

When your webserver goes down, you need to find out why it went down and fix the issue.

But even before you troubleshoot and fix the issue, you need to make sure your webserver is available to your users so they can continue using your website.

There are many sophisticated solutions out there, such as failing over to a redundant web server or creating a new instance using AWS Auto Scaling. However, these solutions come with their own cost and complexity, and if you are a small shop like us you may not be able to afford them. Sometimes the simplest solution is to restart the webserver, if you don't have a failover server or Auto Scaling configured for your server.

If your infrastructure is on AWS, you can use a couple of AWS services to restart the webserver automatically, without any manual intervention.

You need to use AWS CloudWatch Logs, CloudWatch Metrics, CloudWatch Alarms and AWS Systems Manager Run Command. Note that if you are using metrics published directly by AWS services (such as EC2 CPU Utilization), then CloudWatch Logs is not necessary and you can skip directly to step 2 below.

Here are the steps:

1. Make sure that your web application logs are published to CloudWatch Logs. For this you need to install CloudWatch Logs Agent on the EC2 Instance and configure it as described here.

2. Configure a metric filter based on the log entries published in step 1, as described here.

3. Configure a CloudWatch Alarm based on the metric filter created above, as described here. When it asks you to select a metric, select the metric created above.

4. During the creation of the Alarm, select AWS Run Command as the action. Provide the following shell command as the command to execute:

service apache2 restart

Or for that matter, any command you want to execute.
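The metric filter and alarm from steps 2 and 3 can also be created programmatically. Below is a rough boto3 sketch; the log group name, filter pattern, namespace and alarm name are illustrative assumptions, not values from this setup, and the API calls themselves are only shown in comments:

```python
# Illustrative parameter sets -- substitute your own log group, pattern, etc.
# (AH00052 is an example Apache error-log token; use whatever marks a failure
# in your logs.)
filter_params = {
    "logGroupName": "/var/log/apache2/error.log",   # assumed log group name
    "filterName": "apache-down-filter",
    "filterPattern": '"AH00052"',
    "metricTransformations": [{
        "metricName": "ApacheErrorCount",
        "metricNamespace": "Custom/Apache",
        "metricValue": "1",
    }],
}

alarm_params = {
    "AlarmName": "apache-down-alarm",
    "MetricName": "ApacheErrorCount",
    "Namespace": "Custom/Apache",
    "Statistic": "Sum",
    "Period": 60,
    "EvaluationPeriods": 1,
    "Threshold": 1,
    "ComparisonOperator": "GreaterThanOrEqualToThreshold",
}

# With valid AWS credentials, the two calls would be:
# import boto3
# boto3.client("logs").put_metric_filter(**filter_params)
# boto3.client("cloudwatch").put_metric_alarm(**alarm_params)
```

The alarm's action (the Run Command restart) is then attached in the console as described in step 4.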

 


Summary of Paper: RSync Algorithm

 Rsync Algorithm: TR-CS-96-05.dvi (cmu.edu)

Rsync is a utility used to transfer files or folders from one computer to another on Unix-based systems. The key advantage of rsync is that once the full data has been transferred to the destination, only the changed bytes are transferred from then on, saving time and network bandwidth.

The above-mentioned paper describes the algorithm used to find the changed bytes, and how the file data is transferred from the source and recreated at the destination.

Here is the summary.

Instead of transferring the complete file, divide the file into chunks of bytes and calculate their hashes. The first time, all the bytes are transferred to the destination. From then on, for each chunk of bytes at the destination, a weak rolling checksum (inspired by the Adler-32 checksum) and a strong checksum are calculated and shared with the source.

At the source, the combination of the rolling hash and the strong hash is used to find the indexes of chunks that are the same as some block of bytes at the destination. This way, the bytes that differ at the source can also be found. The source then sends to the destination the bytes that are different, the index of the previous block of bytes that matched at both source and destination, and the index of the chunk before which the new data has to be inserted. At the destination, the file is recreated using the received bytes.

Only the chunks which are changed are transferred in this scheme.
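The weak rolling checksum is what makes the search cheap: when the source slides its window one byte forward, the checksum can be updated in constant time instead of being recomputed from scratch. Here is a small Python sketch of an Adler-32-inspired rolling checksum (the function names and the 16-bit modulus are my own illustrative choices, not the paper's exact code):

```python
M = 1 << 16

def weak_checksum(block):
    # a: plain sum of the bytes; b: position-weighted sum, both mod 2^16
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return (b << 16) | a

def roll(checksum, old_byte, new_byte, block_len):
    # Slide the window one byte forward in O(1):
    # drop old_byte from the front, append new_byte at the back.
    a = checksum & 0xFFFF
    b = checksum >> 16
    a = (a - old_byte + new_byte) % M
    b = (b - block_len * old_byte + a) % M
    return (b << 16) | a

data = b"the quick brown fox jumps over the lazy dog"
n = 8
c = weak_checksum(data[:n])
for i in range(1, len(data) - n + 1):
    c = roll(c, data[i - 1], data[i + n - 1], n)
    assert c == weak_checksum(data[i:i + n])  # O(1) update matches full recompute
```

The source computes this checksum at every byte offset of its file and looks the value up in a table of the destination's block checksums; only on a weak match does it compare the expensive strong checksum (MD4 in the paper).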

Required courses for Software Developer

These are the courses that I have recommended to my team members. I think these courses are the minimum that you should go through to become a complete software developer.

Due to the current coronavirus crisis, if somebody is looking for courses to learn about programming and to become a software developer, then they can certainly look at these courses.

However, this list is not complete. I will continue to add new courses which will greatly enhance the skill of a software developer. Let me know if something is missing here.


Fundamentals

Mathematics For Computer Science: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/index.htm
Fundamentals of Computing: https://www.coursera.org/specializations/computer-fundamentals
Introduction to Computer Science and Programming using Python: https://www.edx.org/course/introduction-to-computer-science-and-programming-using-python
Algorithms: https://www.coursera.org/specializations/algorithms
Data structure and Algorithms: https://www.coursera.org/specializations/data-structures-algorithms


Web Application Development

Web Applications(PHP): https://www.coursera.org/specializations/web-applications
Web Design(HTML5, CSS3): https://www.coursera.org/specializations/web-design
Progressive Web Apps: https://developers.google.com/web/progressive-web-apps/


Software Engineering

Software Development Lifecycle: https://www.coursera.org/specializations/software-development-lifecycle
Software Design and Architecture: https://www.coursera.org/specializations/software-design-architecture
Secure Software Design: https://www.coursera.org/specializations/secure-software-design
Agile: https://www.coursera.org/specializations/agile-development
Git: https://www.coursera.org/learn/git-distributed-development
Open source Software development: https://www.coursera.org/specializations/oss-development-linux-git


Others

Learning How to Learn: https://www.coursera.org/learn/learning-how-to-learn
Conversational Design: https://designguidelines.withgoogle.com/conversation/conversation-design/welcome.html
AWS: https://www.coursera.org/learn/aws-fundamentals-going-cloud-native
Requirements Engineering: https://www.coursera.org/specializations/requirements-engineering-secure-software

In addition to this, there are some very good podcasts, which I have listed here, that every software engineer should listen to.

Mental Models

Decision making is one of the most difficult tasks one faces in life. And almost every day we have to make one decision or another, some simple, some complex.
For many problems in the world, there are a few ready-made, templated solutions that we can apply. For example, in software development we have Design Patterns, where a pattern is a "reusable solution to a commonly occurring problem within a given context in software design".
Mental models are a similar concept when it comes to decision making. Actually, they are more than that. You can apply mental models not just in decision making, but in every aspect of life: to understand the world, and to reason about how and why certain events happen the way they happened.
Mental models are general thinking principles which help us think in a structured manner. By streamlining our thoughts while making decisions, we are less likely to miss the information that would help us make a better decision. In the same way, we can make better use of the available information by applying mental models.
You can learn more about mental models, and a few great ones, from the books The Great Mental Models - Volume 1 and The Great Mental Models - Volume 2 by Shane Parrish.

    

Continuous Integration and Delivery pipeline using Bitbucket, AWS CodeBuild, AWS CodePipeline and AWS CodeDeploy - Part 3

This is the third article in the series where I explain how to set up a Continuous Integration and Delivery pipeline using Bitbucket, AWS CodeBuild, AWS CodePipeline and AWS CodeDeploy. You can read the previous two articles here and here.

According to Amazon Web Services website:
AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline automates the build, test, and deploy phases of your release process every time there is a code change, based on the release model you define.
So now, let's see how we can set up AWS CodePipeline.

Once you log in to your AWS account, go to AWS CodePipeline and click on "Create pipeline".
Here, provide the name of the pipeline and select the "New Service Role" option.

Choose the default settings under Advanced Settings and click "Next".

On the next step, select the source provider where you have stored the input artifacts. Here I have selected S3 as the source provider, because that is where I am storing the output of CodeBuild, which I created in the first article.


Click "Next" to go to the next step, which is to add a build stage. We will skip this step as we have already built our artifacts using CodeBuild. But you can add this stage here if your artifacts are not built yet. Currently AWS supports AWS CodeBuild and Jenkins as build providers.

After skipping the build stage, you will get to the "Deploy" stage, where you will need to select a deployment provider - which essentially means which tool you want to use to deploy your application. Here AWS CodePipeline gives us a lot of flexibility, as we can select one of many deployment tools such as CodeDeploy, CloudFormation, Elastic Beanstalk, ECS, Amazon S3, etc. We will select AWS CodeDeploy, as we have already created a deployment application and deployment group as described in the second part of this series.

Click "Next", review your changes and click the "Create pipeline" button.

Your continuous delivery pipeline is ready and it will start automatically. If any artifact is already available in the S3 bucket we selected as the source provider, it will get deployed automatically.

One thing to note here: in CodePipeline, so far we have created only two stages: source and deploy. But you can create as many stages as you need, such as a unit test stage, an integration test stage, a staging test stage, etc. This way you can create a full-fledged continuous integration, continuous delivery and continuous deployment pipeline using AWS CodePipeline.
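The console steps above can also be expressed as a pipeline definition. Below is a rough sketch of the same two-stage pipeline (S3 source, CodeDeploy deploy) as the structure that boto3's `create_pipeline` accepts; the role ARN, bucket names and deployment group name are illustrative placeholders, while the artifact key and application name follow the values used elsewhere in this series:

```python
# Two-stage pipeline definition; could be passed to
# boto3.client("codepipeline").create_pipeline(pipeline=...).
pipeline = {
    "name": "my-first-pipeline",
    "roleArn": "arn:aws:iam::123456789012:role/MyPipelineServiceRole",  # placeholder
    "artifactStore": {"type": "S3", "location": "my-pipeline-artifact-bucket"},
    "stages": [
        {
            "name": "Source",
            "actions": [{
                "name": "S3Source",
                "actionTypeId": {"category": "Source", "owner": "AWS",
                                 "provider": "S3", "version": "1"},
                "configuration": {"S3Bucket": "my-codebuild-artifact-1",
                                  "S3ObjectKey": "dev/my-api-dev.zip"},
                "outputArtifacts": [{"name": "BuildOutput"}],
            }],
        },
        {
            "name": "Deploy",
            "actions": [{
                "name": "CodeDeploy",
                "actionTypeId": {"category": "Deploy", "owner": "AWS",
                                 "provider": "CodeDeploy", "version": "1"},
                "configuration": {"ApplicationName": "MyFirstDeployment",
                                  "DeploymentGroupName": "my-deployment-group"},
                "inputArtifacts": [{"name": "BuildOutput"}],
            }],
        },
    ],
}
```

Extra stages (unit tests, integration tests, staging) are just more entries in the "stages" list.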

Steve Jobs

Here are some of the quotes from the audiobook about Steve Jobs by Walter Isaacson:

The goal should be to build great products and a lasting company, rather than just to make a profit.
A good user experience can only be delivered by controlling the end-to-end user experience.
It is good to have some taste and class. Engage more with artists.
Always hire A players, as A players will only work with A players. If you hire a B player, then eventually you will be surrounded by C players.
 
 

Continuous Integration and Delivery pipeline using Bitbucket, AWS CodeBuild, AWS CodePipeline and AWS CodeDeploy - Part 2

In the previous post we looked at how to configure AWS Codebuild for CI/CD pipeline.
In this post we will look at configuring AWS CodeDeploy.
First, let's see what CodeDeploy is.
According to Amazon Web Services website, 
AWS CodeDeploy is a fully managed deployment service that automates software deployments to compute services such as Amazon EC2, AWS Lambda, and your on-premises servers.
So now let's see how we can set up AWS CodeDeploy.
Once you log in to your AWS account, go to AWS CodeDeploy and click on "Create Application".
You need to first provide an application name (such as "MyFirstDeployment") and choose your compute platform (EC2, AWS Lambda, Amazon ECS). Here we will choose EC2 as our compute platform.



Then click "Create Application".

Once the application is created, you must create a deployment group. You can have multiple deployment groups per application. Deployment groups allow you to have different deployment settings per environment, such as Production, Staging or Development, or different deployment settings per type of application in the same environment, such as a frontend web application vs. backend microservices.
Click  on the "Create Deployment Group" button.
Here, first provide the deployment group name and choose an existing service role which has the required deployment permissions (such as EC2 access, CloudWatch log creation, and S3 access if you are going to download artifacts from S3).



Now, in the "Deployment Type" section, choose how you want the deployment to happen. "In-place" means that every time you deploy the application, the previous version of the application is removed and the new one is deployed. This means that your application will not be available for the duration of the deployment.
In a "Blue/Green" deployment, each revision of the application is deployed to a different instance (existing or new), which is then brought online after the deployment. The existing instance keeps running during the course of the deployment and is taken offline afterwards. This way your application is always available, even during a deployment.



Here we will stick to In-Place deployment for now.
After this you need to provide the tags that select the environment in which you want to deploy.



Next you need to provide the deployment settings and specify whether you use a load balancer.



We can skip the Trigger and Alarm sections and create the deployment group.
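On the instance side, CodeDeploy reads its instructions from an appspec.yml file at the root of the deployed artifact. A minimal sketch for an EC2 in-place deployment would look like the following; the destination path and script names are illustrative assumptions, not values from this setup:

```yaml
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/my-api        # assumed install path
hooks:
  ApplicationStop:
    - location: scripts/stop.sh         # assumed scripts shipped in the artifact
      timeout: 60
  AfterInstall:
    - location: scripts/install_deps.sh
      timeout: 300
  ApplicationStart:
    - location: scripts/start.sh
      timeout: 60
```

The appspec.yml and these scripts are the files we packaged into the build artifact in part 1 of this series.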

In the next post we will look at how to set up AWS CodePipeline and connect the AWS CodeBuild project we set up previously with the AWS CodeDeploy application we set up in this post.






Read Part 3: Configuring AWS CodePipeline

Enjoy!!

Continuous Integration and Delivery pipeline using Bitbucket, AWS CodeBuild, AWS CodePipeline and AWS CodeDeploy - Part 1

In this post I am going to show you how to develop a continuous integration and continuous delivery pipeline using AWS CodeBuild, AWS CodePipeline and AWS CodeDeploy. I will use Bitbucket as the source repository, but any other repository, such as GitHub or GitLab, can also be used. Of course, certain steps may differ, as AWS CodePipeline can pull the code directly from GitHub, but not from Bitbucket.

So here are the steps:

1. Configure AWS CodeBuild to build the code by directly pulling from BitBucket and upload the build artifacts in S3.
2. Configure AWS CodeDeploy to pull the build package from S3 (configured in the above step) and deploy the application.
3. Configure AWS CodePipeline to get the build artifacts from S3 and deploy them using the AWS CodeDeploy application configured in step 2.


This will be a three-part series, and in this part we will see how to configure AWS CodeBuild.

Part 2: Configuring AWS Code Deploy
Part 3: Configuring AWS CodePipeline

Step 1: Configure AWS CodeBuild

Login to your AWS console and go to AWS CodeBuild.
Most of the steps to create a CodeBuild project are self-explanatory, so I will mention only the critical steps that you may want to get right.
Under the Source section, choose Bitbucket and select "Repository in my Bitbucket account".
Now choose your "BitBucket Repository" in which you have your code to build.
Now under the "Primary source webhook Events", check the box named "Rebuild every time a code change is pushed to this repository". Additional options will be available where you can configure the events on which code will be built.
For example, you can select whether the code should be built on every push, or on every pull request created or every pull request updated.
As depicted in the screenshot below, I have configured my build project to build on every push to the dev branch, but not to build when any tag is created or updated.



Now it is time to configure the environment in which the code is built. Here you have to select whether you want to use an AWS-provided image ("AWS managed docker images") or a custom image ("custom docker image").
AWS provides images for most of the common programming language runtimes and environments (such as dotnet, php, nodejs, java, golang), so choose "Managed Image", and then choose the operating system. Here, I have selected Ubuntu, Standard and aws/codebuild/standard:2.0 as the operating system, runtime and image respectively.



Now, as with any AWS service, you have to select a service role so that CodeBuild can build your project and upload the artifacts to S3, or use any other AWS services required.




Under the "Additional Configuration" section you can select the timeout and specify whether your build requires certificates, a connection to a VPC, or particular compute requirements. You can also specify environment variables here. Environment variables are helpful if you want to include a custom build step depending on the environment, or to name the build artifact based on the environment.




Now comes the most important step, which is the buildspec. You write the commands to build your project in the buildspec file. In this section you can choose whether the build commands are included in a file (named buildspec.yml) in your project, or you can specify the build commands directly in the editor provided by the AWS console.




Below is a sample buildspec.yml, which includes the commands to build a nodejs project and package it as a zip file.


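A minimal sketch of such a buildspec.yml, based on the description that follows (nodejs 10 runtime, npm build, node_modules copied into dist, and the artifact name used later in this post), might look like this:

```yaml
version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 10
  build:
    commands:
      - npm install
      - npm run build
      - cp -r node_modules dist/
artifacts:
  files:
    - '**/*'
  base-directory: dist
  name: my-api-dev.zip
```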
As you can see above, in the first few lines I provide the runtime environment (here, nodejs version 10) in the "runtime-versions" section.

In the build section, I have provided the commands to build the project. For this node project, after running npm install and npm run build, I copy the "node_modules" folder into the "dist" folder, which contains all the other project files except node_modules.
In the artifacts section, I have specified the name of the zip file which contains all the files I want to include in the package. The artifacts section also allows me to specify the files I want to include in or exclude from the package. Here I am specifying all the contents of the dist folder, which is the output of the build commands mentioned above. I have also included appspec.yml and the deployment scripts, which will be useful for deploying the application. We will see them later in the section about AWS CodeDeploy.

Next comes the Artifacts section, where you can provide the details about the S3 location where your artifacts will be saved.




As shown above, you have to select the S3 bucket (which should already exist) where you want to save the artifacts, and the name of the artifact zip file. You will notice that the artifact zip file name matches the name I specified in the artifacts section of the buildspec.yml file.

The Path option is the folder name in the S3 bucket where your artifact will be saved. So in this case, the artifact will be saved as "my-codebuild-artifact-1/dev/my-api-dev.zip".

Once this configuration is done, you can click the "Create build project" button and your CodeBuild project will be created. Now start the build, and once it is finished you will see your artifact saved in the above-mentioned path.


Enjoy!!



Bugs, errors and software quality

Some notes:
  • Bugs are experienced failures; failures come from faults within the software.
  • Faults are introduced into software when some process, such as code review, is skipped during the SDLC.
  • Software cannot be tested 100%.
  • Test automation just makes software testing faster; it does not improve software quality.
  • Test automation should be context-driven and adaptive.
  • People and managers should put time and money into improving the skills of the people involved in software development, to improve software quality.
  • Early feedback from actual users is important.
  • Tools and technology do not improve quality; it is how we use them that affects the quality.

How to generate Random numbers

If you want to generate a random number for some business logic you are implementing, what would you do?

You would use the Random class if you use Java or C#. Most programming languages have some library function or class to give you a random number.

But suppose you need to produce random numbers on your own, without using any library function. What would you do?

There are many algorithms for producing random numbers, and here I will demonstrate a very basic one which uses the modulo operator (%). The goal is not to come up with a foolproof algorithm to generate random numbers; the goal is just to use a simple math trick to understand how random numbers can be generated. If you really need to generate random numbers in your programs, you should use the built-in library functions provided by the language or framework you are using.

You know what the modulo (%) operator is, right? It gives you the remainder when you divide the left-hand number by the right-hand number.

So doing 20 % 20 will give you 0, 20 % 19 will give you 1, 20 % 18 will give you 2, and so on.

20 % 20 = 0
20 % 19 = 1
20 % 18 = 2
20 % 17 = 3
20 % 16 = 4
20 % 15 = 5
20 % 14 = 6
20 % 13 = 7
20 % 12 = 8
20 % 11 = 9
20 % 10 = 0
20 % 9 = 2
20 % 8 = 4
20 % 7 = 6
20 % 6 = 2
20 % 5 = 0
20 % 4 = 0
20 % 3 = 2
20 % 2 = 0
20 % 1 = 0

You can see that when you divide 20 by the numbers from 1 to 20, you get the numbers 0 to 9 as remainders. The trick is: the larger the dividend, the larger the range of numbers you get as remainders.

So let’s set the dividend d to some large number; for the purpose of this post I will choose 10000. This gives us the range 0 to 4999 as remainders.

But as you can see, this method produces sequential numbers, not random numbers. Well, if on every iteration you use the remainder to produce a new dividend, you will see that instead of sequential numbers you get random-looking numbers.

So, let’s use the equation below to produce a new dividend on every iteration, using the current remainder:

remainder = (a * current remainder + b) % divisor

Using the above equation with a = 100, b = 100, the divisor set to 19, and the current remainder initialized to 20 (so the first dividend is 2100), we get the result below:

2100 % 19 = 10
1100 % 19 = 17
1800 % 19 = 14
1500 % 19 = 18
1900 % 19 = 0
100 % 19 = 5
600 % 19 = 11
1200 % 19 = 3
400 % 19 = 1
200 % 19 = 10
1100 % 19 = 17
1800 % 19 = 14
1500 % 19 = 18
1900 % 19 = 0
100 % 19 = 5
600 % 19 = 11
1200 % 19 = 3
400 % 19 = 1
200 % 19 = 10

Notice that the above equation produces a random-looking number on every step, but it repeats after a few iterations. This is because of the chosen values of a, b and the divisor.

Setting the divisor to a really large value can give us random numbers which do not repeat too soon, such as below:

100 % 12345 = 100
10100 % 12345 = 10100
1010100 % 12345 = 10155
1015600 % 12345 = 3310
331100 % 12345 = 10130
1013100 % 12345 = 810
81100 % 12345 = 7030
703100 % 12345 = 11780
1178100 % 12345 = 5325
532600 % 12345 = 1765
176600 % 12345 = 3770
377100 % 12345 = 6750
675100 % 12345 = 8470
847100 % 12345 = 7640
764100 % 12345 = 11055
1105600 % 12345 = 6895 
689600 % 12345 = 10625
1062600 % 12345 = 930
93100 % 12345 = 6685
668600 % 12345 = 1970

Here, a and b are set to 100 and the divisor is set to 12345, while the current remainder is initialized to 0 at the start. Note that setting the divisor to 12345 did not produce repeated numbers in the first 20 iterations, but it can still produce repeated numbers after a few hundred iterations. But you get the idea, right?

The equation (a * currentRemainder + b) % divisor is called a Linear Congruential Generator. You can read more about it here.
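The whole generator fits in a few lines. Here is a Python sketch that reproduces the divisor-19 sequence shown above (the function name and the generator form are my own choices):

```python
def lcg(seed, a=100, b=100, divisor=19):
    """Linear Congruential Generator: r -> (a*r + b) % divisor."""
    remainder = seed
    while True:
        remainder = (a * remainder + b) % divisor
        yield remainder

# Reproduce the first few values of the divisor-19 example above,
# which starts from a remainder of 20 (so the first dividend is 2100).
gen = lcg(20)
first_nine = [next(gen) for _ in range(9)]
print(first_nine)  # [10, 17, 14, 18, 0, 5, 11, 3, 1]
```

Production generators use the same recurrence but pick a, b and a very large modulus carefully so that the period is long; small values like the ones here repeat quickly, as the tables above show.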