Posted in API Development, Software Engineering

Master cURL: A Detailed Guide to API Testing

In this blog post, we will take a deep dive into cURL, which stands for “Client URL”. It is a command-line tool for transferring data between a client and a server. cURL supports many protocols, such as HTTP, HTTPS and FTP, and is built on the libcurl transfer library.

This blog post contains the following sections.

  1. Introduction to API Testing
  2. cURL Installation
  3. Commonly Used cURL Options
  4. cURL Methods
  5. Practical Example – Google Books API

Introduction to API Testing

API stands for Application Programming Interface: a set of rules and protocols that allows a client to interact with a piece of software. An API typically involves the following.

  • Endpoints – specific URLs that a client uses to access different methods and resources. Ex: /users, /texts
  • Requests and Responses – Clients send requests to a server endpoint, which in turn sends back responses.
  • Data format – APIs can use various formats, such as JSON or XML, for the data transferred between the client and the server.
  • Authentication – APIs are more secure when they verify the identity of a user and require authorization before exchanging data. This can be achieved using API keys, OAuth, tokens etc.
  • Methods – Includes methods such as GET, DELETE, PUT, POST etc which perform various actions.

There are different API styles and protocols, and REST and SOAP are among the most popular. In this blog post, we will use REST APIs to understand cURL commands.

There are various popular API Testing Tools such as Postman, SoapUI, REST Assured, Katalon Studio etc.

cURL Installation

I am on macOS, so I use brew. Please follow the instructions here to install cURL on your operating system.

Commonly Used cURL Options

-X: Specify request method (GET, POST etc.)

-d: Send data in POST request

The “&” separates the key=value pairs in the data. You can also use the -F option to send multipart form data as name=value pairs.

-H: Request headers

The -H option can also carry a bearer token for authorization. A bearer token is a credential string issued to the client, and it is not necessarily encrypted. Whoever presents it in the Authorization header gets access to the protected resources, so it must be kept secret.

-I: Fetch only the HTTP Headers

The -I option sends an HTTP HEAD request, which returns only a resource’s HTTP headers.

-o: Save output to a file (files with same name will be overwritten)

-L: Redirect

This tells curl to follow redirects. By default, curl does not follow 3xx redirect responses.

-u: Username and password specification

-c: Save cookies to a file

This will save the cookies returned by the server in the cookies.txt file.

-b: Read cookies from a file

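The -b option activates the cookie engine and sends cookies with the request. If the argument contains an “=”, curl treats it as literal name=value cookie data. Otherwise, curl treats it as the name of a file to read cookies from.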

-x: Use proxies

cURL Methods

The four most commonly used HTTP methods with cURL are GET, POST, PUT and DELETE.

GET Request

This is the default method when making HTTP calls with curl. The following example fetches all the users from the URL.

POST Request

POST is used to send data to a receiving service. We can do this using the --data or -d option.

PUT Request

PUT is used to update an existing resource. In this case, it updates the user with ID 1.

DELETE Request

This example deletes the user id #1.

Practical Example – Google Books API

Now let’s walk through a practical example using the Google Books API.

Requests need to be authenticated using the API key or OAuth. Follow the instructions in the link above to get an API key that you can use for the request below.

#!/bin/bash

# Replace with your actual API key
API_KEY="{API_KEY}"

# Define the base URL for the Google Books API
BASE_URL="https://www.googleapis.com/books/v1"

# URL-encode the query term. "Harry Potter" becomes "Harry%20Potter".
# printf is used instead of echo so that no trailing newline gets encoded as %0A.
QUERY="Harry Potter"
ENCODED_QUERY=$(printf '%s' "$QUERY" | jq -sRr @uri)

# Function to search for books by a query
search_books() {
  echo "Searching for books with query: $QUERY"
  curl -s "${BASE_URL}/volumes?q=${ENCODED_QUERY}&key=${API_KEY}" | jq '.items[] | {title: .volumeInfo.title, authors: .volumeInfo.authors, publisher: .volumeInfo.publisher, publishedDate: .volumeInfo.publishedDate}'
}

# After URL encoding, "J.K. Rowling" becomes "J.K.%20Rowling".
# printf is used instead of echo so that no trailing newline gets encoded as %0A.
AUTHOR="J.K. Rowling"
ENCODED_AUTHOR=$(printf '%s' "$AUTHOR" | jq -sRr @uri)

# Function to list books by a specific author
list_books_by_author() {
  echo "Listing books by author: $AUTHOR"
  curl -s "${BASE_URL}/volumes?q=inauthor:${ENCODED_AUTHOR}&key=${API_KEY}" | jq '.items[] | {title: .volumeInfo.title, publisher: .volumeInfo.publisher, publishedDate: .volumeInfo.publishedDate}' | head -n 20
}

# Main script execution
search_books

# List books by author
list_books_by_author

The output of the above bash script will be as follows.

Searching for books with query: Harry Potter
{
  "title": "Harry Potter and the Sorcerer's Stone",
  "authors": [
    "J.K. Rowling"
  ],
  "publisher": "Pottermore Publishing",
  "publishedDate": "2015-12-08"
}
{
  "title": "Harry Potter and the Chamber of Secrets",
  "authors": [
    "J.K. Rowling"
  ],
  "publisher": "Pottermore Publishing",
  "publishedDate": "2015-12-08"
}
{
  "title": "Harry Potter and the Prisoner of Azkaban",
  "authors": [
    "J.K. Rowling"
  ],
  "publisher": "Pottermore Publishing",
  "publishedDate": "2015-12-08"
}
{
  "title": "Harry Potter and the Cursed Child",
  "authors": [
    "J. K. Rowling",
    "Jack Thorne",
    "John Tiffany"
  ],
  "publisher": null,
  "publishedDate": "2017"
}
{
  "title": "The Irresistible Rise of Harry Potter",
  "authors": [
    "Andrew Blake"
  ],
  "publisher": "Verso",
  "publishedDate": "2002-12-17"
}
{
  "title": "The Psychology of Harry Potter",
  "authors": [
    "Neil Mulholland"
  ],
  "publisher": "BenBella Books, Inc.",
  "publishedDate": "2007-04-10"
}
{
  "title": "Harry Potter and the Half-Blood Prince",
  "authors": [
    "J.K. Rowling"
  ],
  "publisher": "Pottermore Publishing",
  "publishedDate": "2015-12-08"
}
{
  "title": "Fantastic Beasts and Where to Find Them: The Illustrated Edition",
  "authors": [
    "J. K. Rowling",
    "Newt Scamander"
  ],
  "publisher": "Arthur A. Levine Books",
  "publishedDate": "2017-11-07"
}
{
  "title": "JK Rowling's Harry Potter Novels",
  "authors": [
    "Philip Nel"
  ],
  "publisher": "A&C Black",
  "publishedDate": "2001-09-26"
}
{
  "title": "The Magical Worlds of Harry Potter",
  "authors": [
    "David Colbert"
  ],
  "publisher": "Penguin",
  "publishedDate": "2008"
}


Listing books by author: J.K. Rowling
{
  "title": "Conversations with J. K. Rowling",
  "publisher": null,
  "publishedDate": "2002-01-01"
}
{
  "title": "Harry Potter and the Walls of America",
  "publisher": "Createspace Independent Publishing Platform",
  "publishedDate": "2017-01-01"
}
{
  "title": "Harry Potter and the Philosopher's Stone - Ravenclaw Edition",
  "publisher": "Bloomsbury Children's Books",
  "publishedDate": "2017-06"
}
{
  "title": "Harry Potter and the Philosopher's Stone",
  "publisher": "Bloomsbury Harry Potter",
  "publishedDate": "2001"
}

Hope you are now confident with making cURL requests! In the next blog post, let us look into Postman in detail – an API testing tool that makes things way easier for API development.

Posted in AI/ Machine Learning, Search and Ranking, Software Engineering

K-Nearest Neighbors (KNN) Algorithm

We learnt about Semantic Search in my previous post. Here is the link if you missed it. K-Nearest Neighbors is one of the most popular Machine Learning algorithms used in semantic search to find documents or data that are semantically similar to a user’s query.

To recap, documents are represented as vector embeddings in a vector space model that captures semantic similarities based on context. A user query is converted to a vector embedding as well, and algorithms such as KNN or ANN (Approximate Nearest Neighbor) search are used to find the nearest neighbors to the query. In K-Nearest Neighbors, “K” refers to the number of nearest neighbors we consider before making a prediction.

Nearest Neighbor Search

How does KNN work?

KNN works by finding the “K” nearest neighbors in a vector space model. The value of K needs to be chosen carefully: a small K is sensitive to noise, while a large K over-smooths the result. One way to choose it is cross-validation, where the data is divided into training and validation sets and the algorithm is evaluated on the validation set for different values of K. A plot of the K values against the corresponding error rates helps determine the K with the minimum error rate.
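The procedure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation shown later in this post: the toy data and the helper names predict() and loo_error() are made up for the example, and it uses leave-one-out validation (each point is held out once) as a simple form of cross-validation.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8


def predict(train, k, query):
    """Return the majority class among the k training points closest to query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_error(data, k):
    """Leave-one-out error rate: hold each point out once and try to predict it."""
    mistakes = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if predict(rest, k, point) != label:
            mistakes += 1
    return mistakes / len(data)


# Hypothetical toy data: two clusters plus one noisy "b" point inside cluster "a"
DATA = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.8, 1.1), "a"), ((1.1, 1.3), "a"),
    ((1.0, 1.2), "b"),  # noise
    ((4.0, 4.0), "b"), ((4.2, 3.9), "b"), ((3.8, 4.1), "b"), ((4.1, 4.3), "b"),
]

# Evaluate a few candidate values of K and keep the one with the lowest error
errors = {k: loo_error(DATA, k) for k in (1, 3, 5)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

On real data you would plot the error rate for each K and also look at how stable the curve is, rather than relying on a single minimum.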

KNN finds the nearest neighbors using distance metrics. This can be anything such as Euclidean distance, Manhattan distance etc. We will look at the various distance metrics in detail in the next section.

KNN can be used for both Classification and Regression problems.

  • For Classification problems, the majority class is chosen among the K nearest neighbors.
  • For Regression problems, the average (or the weighted average) of the K nearest neighbors is used.

Distance Metrics

Various distance metrics are used for KNN, Euclidean distance being one of the most common ones.

Euclidean Distance

$$d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_n - q_n)^2}$$
Source: Wikipedia

Euclidean Distance gives the straight-line distance between two points in a vector space. This works well for numerical features in the input data.

Manhattan Distance

$$\sum_{i=1}^{n} |p_i - q_i|$$
Source: Wikipedia

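Manhattan Distance calculates the sum of the absolute differences between the coordinates of two vectors. It works with numerical features and with categorical features that have been encoded as numbers (for example, one-hot vectors). Because the differences are not squared, it is not as sensitive to outliers as Euclidean Distance.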

Minkowski Distance

$$D(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{\frac{1}{p}}$$
Source: Wikipedia

Minkowski Distance is a generalization of the above distances.

  • If p = 1, we get Manhattan Distance
  • If p = 2, we get Euclidean Distance
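The two special cases can be checked with a few lines of Python. This is a minimal sketch. The function name minkowski() and the sample points are assumptions made for the example, not part of the code shown later in this post.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


x, y = (0, 0), (3, 4)
print(minkowski(x, y, 1))  # p = 1, Manhattan: |0 - 3| + |0 - 4| = 7
print(minkowski(x, y, 2))  # p = 2, Euclidean: sqrt(3^2 + 4^2) = 5
```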

Hamming Distance

Hamming Distance between two vectors of same length is the number of positions where the corresponding vector values are different. This is suitable for categorical and even binary data.
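As a minimal sketch of the definition above, the count of differing positions can be written as follows. The function name hamming() and the sample inputs are assumptions made for the example.

```python
def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a != b for a, b in zip(x, y))


print(hamming("karolin", "kathrin"))        # differs at 3 positions
print(hamming([1, 0, 1, 1], [1, 1, 0, 1]))  # differs at 2 positions
```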

Cosine Similarity

$$\text{cosine similarity} = S_C(A, B) := \cos(\theta) = \frac{\mathbf{A} \cdot \mathbf{B}}{\|\mathbf{A}\| \|\mathbf{B}\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \cdot \sqrt{\sum_{i=1}^{n} B_i^2}}$$
Source: Wikipedia

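This is commonly used for high-dimensional vector data or text data. It measures similarity using the cosine of the angle between two vectors, so it depends on their direction rather than their magnitude. To use it as a distance, take 1 minus the similarity.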
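The formula can be sketched in plain Python as follows. The function names cosine_similarity() and cosine_distance() are assumptions made for the example, and the distance is defined here as 1 minus the similarity, which is one common convention.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Turn the similarity into a distance: 0 means same direction."""
    return 1 - cosine_similarity(a, b)


print(cosine_similarity((1, 2), (2, 4)))  # same direction, close to 1
print(cosine_similarity((1, 0), (0, 1)))  # perpendicular, 0
print(cosine_distance((1, 0), (-1, 0)))   # opposite direction, close to 2
```

Note that (1, 2) and (2, 4) have different lengths as vectors but point the same way, so their similarity is 1 even though their Euclidean distance is not 0.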

KNN Algorithm Example for Classification

We have a good understanding of KNN. Now let us look at an actual code example of KNN for a Classification problem.

As a coding exercise, try to implement the averaging functionality for a regression KNN algorithm on your own.

from algorithms.knn import distances

METRICS = {
    "euclidean": distances.EuclideanDistance(),
    "manhattan": distances.ManhattanDistance(),
    "mikowski": distances.MikowskiDistance(),
    "hamming": distances.HammingDistance(),
    "jaccard": distances.JaccardDistance(),
    "cosine": distances.CosineDistance()
}


def get_majority_element(computed_distances):
    """
    Takes an iterable of tuples (distance, class_type)
    and finds the majority class_type
    :param computed_distances: iterable of tuples
    :return: majority element: string
    """
    freq = {}
    for _, class_type in computed_distances:
        if class_type not in freq:
            freq[class_type] = 1
        else:
            freq[class_type] += 1
    return max(freq, key=freq.get)


def knn(training_data, k, distance_metric, test_value):
    """
    Find the k nearest neighbors in training_data for the
    given test value to determine the class type
    :param training_data: Dictionary of class type keys and list of vectors
    as values
    :param k: Integer
    :param distance_metric: string
    :param test_value: query vector
    :return: majority class type among the k nearest neighbors
    """

    if distance_metric not in METRICS:
        raise ValueError(distance_metric + " is not a valid input")

    distance_calculator = METRICS[distance_metric]
    computed_distances = []

    for class_type, points in training_data.items():
        for point in points:
            distance = distance_calculator.calculate(point, test_value)
            computed_distances.append((distance, class_type))

    # sort the tuples by the computed distance only
    computed_distances.sort(key=lambda pair: pair[0])
    return get_majority_element(computed_distances[:k])


# distances module (imported above as algorithms.knn.distances)
from abc import abstractmethod, ABC
from math import pow, sqrt

ROUND_TO_DECIMAL_DIGITS = 4

class Distances(ABC):
    @abstractmethod
    def calculate(self, x1, x2):
        pass

class EuclideanDistance(Distances):
    def calculate(self, x1, x2):
        """
        Compute the Euclidean distance between two points.

        Parameters:
        x1 (iterable): First set of coordinates.
        x2 (iterable): Second set of coordinates.

        Returns:
        float: Euclidean Distance
        """
        if len(x1) != len(x2):
            raise TypeError("The dimensions of two iterables x1 and x2 should match")
        return round(sqrt(sum(pow(p1 - p2, 2) for p1, p2 in zip(x1, x2))), ROUND_TO_DECIMAL_DIGITS)

Real World Applications of KNN

KNN is used in a variety of applications such as

  • Recommendation Systems – to recommend the most popular choices among users for collaborative filtering by identifying users with similar behavior.
  • Disease Classification – it can predict the likelihood of certain conditions using majority voting.
  • Semantic Search – to find semantically similar documents or items for a user’s query.
  • Text Classification – spam detection is a good example.
  • Anomaly Detection – it can flag data points that differ from the rest of the data in a system.

I hope you enjoyed reading about K-Nearest Neighbors algorithm. We will continue to explore various topics related to Semantic Search as well as Search and Ranking in general over the next few weeks.

If you like learning about these types of concepts, don’t forget to subscribe to my blog below to get notified right away when there is a new post 🙂 Have an amazing week ahead!

Posted in Software Engineering

HTML and CSS basics for Backend Engineers

HTML (HyperText Markup Language) and CSS (Cascading Style Sheets) are both important technologies in web development, but when I started in backend development I did not know much about them. I still haven’t needed to work with them much, apart from a newsletter that I put together every month for a year, but that made me want to learn more. This article goes over some of the basics that are useful to know as a backend developer, along with resources for further study.

HTML

A basic HTML document is a text file made up of elements and tags. The home page of a site is conventionally named index.html. There are many tags, but the following make up the overall outline of a document.

  • <!DOCTYPE html> is the first line of an HTML document so that browsers can render the page correctly.
  • <html></html> contains all the document content.
  • <head></head> contains machine-readable information such as the title and metadata.
  • <body></body> contains all the parts visible in a browser.

<!DOCTYPE html>
<html>
<head>
    <title>Title</title>
</head>
<body>
    <h1>This is the first heading</h1>
    <p>This is the first paragraph</p>
</body>
</html>

HTML Tags

Here are a few other important html tags.

| Tag | Description |
| --- | --- |
| <h1></h1> to <h6></h6> | Heading tags |
| <p></p> | Paragraph tag |
| <img src="image.jpeg" alt="Name of the image" width="100" height="100"> | Image tag |
| <a href="https://www.wordpress.com">Text for this link</a> | HTML links |
| <table></table> | Table |
| <th></th> | Table header |
| <tr></tr> | Table row |
| <td></td> | Table data |
| <div></div> | Container for HTML block level elements and styling |
| <span></span> | Used to organize inline elements |
| <link> | Link a CSS stylesheet |

Here is a simple webpage using some of the above elements. Try it in your own browser as well!

<!DOCTYPE html>
<html>
<head>
    <title>HTML AND CSS</title>
</head>
<body>
<h1>HTML and CSS basics</h1>
<p>Differences between HTML and CSS</p>
<br>
<table border="1">
    <tr>
        <th>HTML</th>
        <th>CSS</th>
    </tr>
    <tr>
        <td>Stands for Hyper Text Markup Language</td>
        <td>Stands for Cascading Style Sheets</td>
    </tr>
    <tr>
        <td>Used to build static web pages and applications</td>
        <td>Used to enhance the presentation of an HTML document</td>
    </tr>
</table>
<a href="https://thatgirlcoder.com">Learn more here!</a>
</body>
</html>

This is the web page as rendered by the browser.

You can find a list of all HTML elements here.

Block vs Inline Elements

| Inline | Block |
| --- | --- |
| Does not start on a new line | Always starts on a new line |
| Takes up only as much width as needed | Takes up the full width available |
| Examples: <span>, <a>, <br>, <img> etc. | Examples: <p>, <div>, <table>, <form> etc. |

CSS

CSS stands for Cascading Style Sheets. Styles can be defined in a separate file, such as styles.css, and linked to an HTML document to change its styling.

CSS can be applied to an HTML page either internally or externally. To use it internally, the <style> tag is used as follows.

<!DOCTYPE html>
<html>
<head>
    <title>Title</title>
    <style>
        body {
            background-color: #e1b382;
        }
    </style>
</head>
</html>

To apply an external stylesheet, the <link> tag is used as shown below.

<head>
    <link rel="stylesheet" type="text/css" href="<CSS_FILE_NAME>">
</head>

Selectors

In CSS, different types of selectors are used to select the HTML elements that styling rules apply to. Some of the important types of selectors are as follows.

Element Selectors

This is used to select HTML elements based on type such as <p>, <h1> etc.

<h1>This is a heading</h1>
h1 {
    color: #xyz;
}

ID Selectors

This uses the ID attribute of an HTML element.

<div id="section"></div>
#section{
    color: #xyz;
}

Class Selectors

Selects using the class attribute of HTML elements.

<p class="paragraph">Start reading!</p>
.paragraph {
    color: #xyz;
}

Descendant Selectors

This is used to select HTML elements contained within another element. In the example below, the color is applied to all the <h1> elements that are descendants of the element with the ID “span”.

<span id="span">
    <h1>Heading 1</h1>
    <h1>Heading 2</h1>
</span>
#span h1{
    color: #xyz;
}

Child Selectors

This is more specific than the descendant selector. It selects only the immediate descendants (i.e., children) of a parent element. In the example below, the color is applied only to the immediate children of the element with the ID “section”, which are “Heading 1” and “Heading 2”.

<div id="section">
    <h1>Heading 1</h1>
    <h1>Heading 2</h1>
    <div>
        <h1>This is a nested heading</h1>
    </div>
</div>
#section > h1 {
    color: #xyz;
}

CSS Box Model

The box model consists of the following properties to represent an HTML element as a box.

  • Content – text, images etc
  • Padding – area around the content
  • Border – non-transparent area around the content and padding
  • Margin – area around the border

I like to remember this as MBPC, from the outermost Margin through Border and Padding to the innermost Content.

div {
  width: 200px;
  border: 10px solid blue;
  padding: 20px;
  margin: 20px;
}
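
As a quick arithmetic check of the box model, the horizontal space taken by the div above can be worked out as follows. This is a small sketch that assumes the default box-sizing: content-box and a border that is actually drawn (it has a style such as solid). The function name rendered_size() is made up for the example.

```python
def rendered_size(content, padding, border, margin):
    """Return (box size, box size including margins) along one axis, in px.

    Assumes box-sizing: content-box, where width covers the content only.
    """
    box = content + 2 * (padding + border)  # content + padding + border
    return box, box + 2 * margin            # margins sit outside the border


box_width, total_width = rendered_size(content=200, padding=20, border=10, margin=20)
print(box_width, total_width)  # 260 300
```

So a div declared with width: 200px occupies 260px from border edge to border edge, and 300px once the margins are counted.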

Now let’s design a super simple food menu as shown below. I have included the css code in the same file so that you can use it in any online code editor.

<!DOCTYPE html>
<html>
<head>
    <title>Veg Paradise</title>
    <style>
        body {
            background-color: #e1b382;
        }
        h1 {
            color: #12343b;
        }
        h2 {
            color: #c89666;
        }
        .center-text {
            margin-left: auto;
            margin-right: auto;
            text-align: center;
            padding-top: 12px;
            padding-bottom: 12px;
            background-color: #2d545e;
        }
        p {
            color: #12343b;
        }
        h2 > span {
            color: #FA9F42;
            font-size: 0.75em;
        }
        #copyright {
            font-size: 0.75em;
        }
    </style>
</head>
<body>
    <div class="center-text">
        <h1>Food Menu</h1>
        <h2>Moong dal cheela <span>New!</span></h2>
        <p>Yellow split lentils, ginger, green chillies.</p>
        <h2>Pesarattu</h2>
        <p>Whole green gram, ginger, green chillies, cumin seeds.</p>
        <h2>Idly</h2>
        <p>Fermented rice batter.</p>
        <h2>Idiyappam</h2>
        <p>Rice noodles, shredded coconut served with peanut curry.</p>
        <h2>Dosa</h2>
        <p>Fermented rice batter.</p>
        <h2>Millet upma</h2>
        <p>Pearl millet, mixed vegetables.</p>
        <h2>Vegetable poha</h2>
        <p>Flattened rice, chickpeas, mixed vegetables flavored with lemon juice.</p>
    </div>
    <div class="center-text">
        <p id="copyright">
            Copyright Veg Paradise
        </p>
    </div>
</body>
</html>

Hope you learned the basics of HTML and CSS to navigate web development as a Backend Engineer. I recommend the following resources when you are programming.

Resources

Posted in Better Programming, Java Programming, Software Engineering

Basics of Java Threads

What is a thread?

A thread in Java is a unit of execution within a process. Every Java program has at least one thread, the main thread. If we do not create a thread explicitly, our program runs on the main thread.

A process can therefore contain multiple threads. Because threads share the resources of their process, creating a thread is more lightweight than creating a process. Threads are also quicker to terminate than processes.

Why use multithreading?

  • To execute two or more threads at the same time and take advantage of multicore architectures
  • To run async background tasks such as logging, IO tasks etc
  • Run isolated code in parallel to increase computation speed for CPU bound processes
  • To create watchers for configuration changes

How to create threads?

Let’s look at some common ways of creating threads in Java. The two basic approaches are:

  • Implement the java.lang.Runnable interface and override the run() method
  • Extend the java.lang.Thread class and override the run() method


Method 1: Extend the java.lang.Thread class and override the run() method

public class Main {

    public static void main(String[] args) {
        System.out.println("Running in main thread.");
        Thread myThread = new MyThread();
        myThread.setName("--- MyThread ---");
        myThread.start(); // start() runs run() on a new thread
    }
}

public class MyThread extends Thread {
    @Override
    public void run() {
        System.out.println("Hello from " + currentThread().getName());

        try {
            Thread.sleep(2000); 
        } catch (InterruptedException e) {
            System.out.println("MyThread was interrupted.");
            return;
        }
    }       
}
Running in main thread.
Hello from --- MyThread ---

Method 2: Implement the java.lang.Runnable interface and override the run() method

In this method, we create an instance of the class implementing the Runnable interface and pass it to the Thread() constructor.

public class Main {

    public static void main(String[] args) {
        System.out.println("Running in main thread.");
        Thread myRunnableThread = new Thread(new MyRunnable());
        myRunnableThread.start();
    }
}

public class MyRunnable implements Runnable {

    @Override
    public void run() {
        System.out.println("Hello from MyRunnable's run() method");
    }
}
Running in main thread.
Hello from MyRunnable's run() method

Method 3: Anonymous class overriding the run() method

public class Main {

    public static void main(String[] args) {
        System.out.println("Running in main thread.");

        new Thread() {
            @Override
            public void run() {
                System.out.println("Hello from the anonymous class run() method.");
            }
        }.start();
    }
}
Running in main thread.
Hello from the anonymous class run() method.

Method 4: Anonymous implementation of Runnable interface

public class Main {

    public static void main(String[] args) {
        System.out.println("Running in main thread.");

        new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println("Hello from the anonymous Runnable implementation of run() method");
            }
        }).start();
    }
}
Running in main thread.
Hello from the anonymous Runnable implementation of run() method

Gotchas

  • Every thread created in a process shares the process memory and files which can lead to concurrency problems if not handled correctly. Each Thread has its own Thread stack that only that particular thread can access.
  • A thread does not have to complete before another one starts, unless we use something such as join() to make one thread wait until another completes execution. The JVM and the operating system’s scheduler decide when each thread runs.

Hope you learnt the basics of threads and some of the simplest ways of creating them.

In the four methods that we saw above, threads are instantiated and managed manually by the developer. Java 5 introduced the Executor framework (java.util.concurrent), which abstracts thread management so that we can focus on submitting asynchronous tasks rather than managing threads. This is a topic on its own, so let’s explore it in more detail in the next article, along with thread synchronization.

Posted in Better Programming, Java Programming, Software Engineering

6 Confusing Java Concepts Simplified!


In this post, we will be learning about the following concepts.

  1. Instance Vs Static Methods
  2. Interfaces Vs Abstract Classes
  3. Inner Vs Anonymous Classes

1. Instance Vs Static Methods

What are static methods?

These methods are declared using the static modifier. They are mainly used when we don’t need to access any data through an instance of the class. For the same reason, static methods cannot use the this keyword, which refers to the current instance of a class.

Syntax of Static methods

class MyClass {
    public static void staticMethodName() {
        System.out.println("This is a static method");
    }
}

// Accessing a static method
MyClass.staticMethodName();

When to use static methods?

  • Declare a method as static when it doesn’t use any class instance variables.
  • When every instance of a class should share the same copy of variables and methods, declare them as static.

What are instance methods?

Instance methods are those that are accessible through an instance of a class created using the new keyword. These methods can access the current instance of the class using the this keyword.

Syntax of Instance methods

class MyClass {
    public void methodName() {
        System.out.println("This is an instance method");
    }
}

// Accessing an instance method requires an object to be created
MyClass object = new MyClass();
object.methodName();

When to use instance methods?

  • If the method uses or modifies instance variables, declare it as an instance method.
  • When each instance of a class should have its own copy of variables, use instance variables and methods.

2. Interfaces Vs Abstract Classes

What are interfaces?

Interfaces declare the methods of a type but not their implementation, although from Java 8 onwards they can also contain default and static methods with bodies. They define the operations that an object can perform, while the details of those operations are left to the classes that implement the interface. Interfaces can also be extended by other interfaces.

Interfaces cannot be instantiated because they represent a contract: a class that implements an interface promises to provide all the operations declared in it. You can also think of it this way: if the methods contain no implementation at all, would there be any use in instantiating an interface?

Syntax of interfaces

public interface MyInterface {
    
    // Fields are implicitly public, static and final, i.e. constants
    final int number = 1;
    
    // Notice that the methods are abstract by default
    public void method1();
    public void method2();
    public void method3(int number);
}

An interface can be implemented using the implements keyword

class MyClass implements MyInterface {
    @Override
    public void method1() {
        System.out.println("Implementation of method 1");
    }

    @Override
    public void method2() {
        System.out.println("Implementation of method 2");
    }

    @Override
    public void method3(int number) {
        System.out.println("Implementation of method 3");
    }
}

An interface cannot be instantiated, but it is possible to declare a variable of an interface type and assign to it an instance of a class that implements the interface.

public class Main {
    public static void main(String[] args) {

        // A class instance declared using interface type
        MyInterface myClassInstance;
        myClassInstance = new MyClass();
    }
}

// This is not valid
MyInterface sampleInterface = new MyInterface();

When to use interfaces?

  • When completely unrelated classes should be able to implement the interface. Example: Comparable and Cloneable.
  • To specify the overall behavior without worrying about who implements them or how.

An excellent example of an interface is the Java collections API. Notice how various classes such as ArrayList, LinkedList and Stack all have the same APIs such as add(), isEmpty(), remove(), size() etc with different implementation details.

What are abstract classes?

Abstract classes cannot be instantiated either. They can contain methods with and without implementation details.

An abstract class can extend only one parent class, although it can implement multiple interfaces. This single-inheritance restriction applies to all Java classes, not only abstract ones. You can read more about the diamond problem here.

Syntax of abstract classes

public abstract class myAbstractClass {

    private String myVariable;
 
    // A constructor must have the same name as its class
    public myAbstractClass(String myVariable) {
        this.myVariable = myVariable;
    }

    // Note how abstract classes can contain regular and abstract methods
    public abstract void myAbstractMethod1();
    public abstract void myAbstractMethod2();

    public String getMyVariable() {
        return myVariable;
    }
}

// Abstract classes can be extended this way
public class myExtendedClass extends myAbstractClass {

    public myExtendedClass(String myVariable) {
        super(myVariable);
    }

    @Override
    public void myAbstractMethod1() {
        System.out.println("Implementation of myAbstractMethod1");
    }

    @Override
    public void myAbstractMethod2() {
        System.out.println("Implementation of myAbstractMethod2");
    }

    public void printMyVariable() {
       System.out.println("My variable is " + getMyVariable());
    }

}

When to use abstract classes?

  • If a class contains abstract methods, i.e., methods without implementation details, then the class should be declared as abstract.
  • To share code with related classes.
  • To do things interfaces (Java < 9) don’t do – to enable classes to extend abstract classes and define fields or methods with access modifiers such as private, protected and to declare non-static and non-final fields.

To summarize, use abstract classes when you need a base class with certain definitions that different derived classes can share.

Java has several examples of abstract classes such as InputStream, OutputStream, Reader etc. Note how each of them extends only one class but can implement multiple interfaces.

3. Inner Vs Anonymous Classes

What are inner classes?

Inner classes are those that are declared inside another class or interface without a static modifier aka non-static nested classes. There are three main types.

  • Member inner class – lives inside a class
  • Anonymous inner class – creates an instance of an object with some extra functionality, typically by overriding existing methods of a class or interface
  • Local inner class – lives inside a method

In most cases, they are declared as private so that they aren’t exposed to other classes.

The inner class can access all the member variables and methods of the outer class, including private ones.

Syntax of inner classes

// This example shows a member inner class
class MyOuterClass {
    class MyInnerClass {
    }
}

// Example code to create an object of inner class
MyOuterClass outerObject = new MyOuterClass();
MyOuterClass.MyInnerClass innerObject = outerObject.new MyInnerClass();


// This example shows an anonymous inner class
// This is similar to a constructor invocation with class definition inside a block
MyClass object = new MyClass() {
                     @Override
                     public void method() {
                         System.out.println("This is an anonymous inner class with a method overriding the method of MyClass that implemented from another interface");
                     }
                 };

// To call the method
object.method();
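
The syntax above does not show a local inner class or the access to private members of the outer class. Here is a small sketch of both, using hypothetical names (Counter, Incrementer, Labeller) chosen only for this example.

```java
// Hypothetical example classes, invented for illustration.
class Counter {
    private int count = 0; // private state of the outer class

    // Member inner class: every Incrementer is tied to one Counter object.
    private class Incrementer {
        void increment() {
            count++; // direct access to the outer object's private field
        }
    }

    public int incrementTwice() {
        Incrementer incrementer = new Incrementer();
        incrementer.increment();
        incrementer.increment();
        return count;
    }

    public String label() {
        // Local inner class: visible only inside this method.
        class Labeller {
            String format() {
                return "count=" + count;
            }
        }
        return new Labeller().format();
    }
}

class CounterDemo {
    public static void main(String[] args) {
        Counter counter = new Counter();
        counter.incrementTwice();
        System.out.println(counter.label());
    }
}
```

Because Incrementer is private, no class other than Counter can create or even name it.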

When to use inner classes?

  • If a class is only useful inside the scope of another class and is coupled to it, then create an inner class. An inner class object cannot exist without an enclosing outer class object.
  • Create an anonymous inner class to provide an additional functionality to an object.
  • To program a class that no other class can access except an outer class.
  • Declare an inner class as private if no other classes should be able to create an object of that inner class except the outer class.

What are anonymous classes?

We already learnt about anonymous classes in the previous section, but let’s get into a little more detail here. Anonymous classes are inner classes in Java that do not have a name and are declared and instantiated in the same statement.

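Since they do not have a name, we can’t create further instances of an anonymous class elsewhere in the code or define a constructor inside the class body.

An anonymous class either extends a class (which can be abstract) or implements a single interface. Unlike a named class, it cannot do both or implement several interfaces.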

Syntax of anonymous classes

Runnable action = new Runnable() { // Runnable is a Java interface
    @Override
    public void run() {
        System.out.println("This is an anonymous class method");
    }
}; // Semicolon is important since anonymous classes are expressions

When to use anonymous classes?

  • To use a local class only once.
  • To quickly override a small amount of functionality instead of the overhead of creating a separate class.
  • To use variables or constants declared in the code right away in an anonymous class instead of passing it through the constructor of a class.
  • To implement an interface or extend an abstract class on the spot without declaring a separate named class. Note that the anonymous class must still implement every abstract method.
  • Do not use them to override a lot of functionality since this can make the code unreadable.
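
Two of the points above can be sketched in a few lines: a one-off Comparator that overrides a single method, and a Runnable that uses a local variable directly instead of receiving it through a constructor. The class and method names (AnonymousDemo, sortByLength, greeter) are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical example class, invented for illustration.
class AnonymousDemo {

    // Returns a copy of the words sorted by length; ties are ordered alphabetically.
    static List<String> sortByLength(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                int byLength = Integer.compare(a.length(), b.length());
                return byLength != 0 ? byLength : a.compareTo(b);
            }
        });
        return sorted;
    }

    // The anonymous Runnable captures the effectively final local variable 'greeting'.
    static Runnable greeter(String name, StringBuilder out) {
        String greeting = "Hello, " + name;
        return new Runnable() {
            @Override
            public void run() {
                out.append(greeting);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(sortByLength(List.of("pear", "fig", "apple", "kiwi")));
        StringBuilder out = new StringBuilder();
        greeter("reader", out).run();
        System.out.println(out);
    }
}
```

Both anonymous classes implement a single-method interface, so they could also be written as lambdas. The anonymous form is kept here because it is the topic of this section.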

I hope this article clarified some of the confusing concepts in Java so that you can use them in the right places. Please feel free to give feedback if any part of the article can be improved. Happy coding!

Posted in Better Programming, Software Engineering

Learn the basics of Web Caching

Caching is a mechanism by which responses from the web server such as pages, images etc are stored so that when a client requests the resource again, the response is served from the cache instead of sending a request to the web server.

Why is it important to use caching?

  • Reduce the number of requests sent to the server
  • Reduce the latency of responses for the client by serving content from nearby cache instead of a remote server
  • Reduce network bandwidth by minimizing the number of times a resource is sent over the network from the web server

There are two main types of web caches. Let’s take a look at them.

Browser Cache

If you are a Mac and Chrome user, you can find the contents of the browser cache in the following path.

/Library/Caches/Google/Chrome/

The browser cache stores parts of pages, files, images etc to help them open faster during a user’s next visit. When a user clicks the back or forward button in the browser, the contents are served from the cache directly. The contents of the cache are refreshed regularly after a certain amount of time or during every browser session.

How is the browser cache controlled?

There are several caching headers to define cache policy. Let’s look at the Cache-Control header in the following example which is set to private. This means that a private browser cache can store the response.

Source: redbot.com

There are different caching directives that can be used to set this header.

Caching Header | Description
Cache-Control: no-store | Nothing should be cached about the request or response
Cache-Control: no-cache | The cache sends a validation request to the server before serving from cache
Cache-Control: private | The response is only applicable to a single user and must not be stored by a shared cache
Cache-Control: public | The response can be stored by any cache
Expires | This header contains the date/time after which the response is considered stale. Ex: Expires: Wed, 22 Sep 2021 12:00:00 GMT
ETag | The entity-tag given in an ETag header field is used for cache validation. One or more entity-tags, indicating one or more stored responses, can be used in an If-None-Match header by the client for response validation
Last-Modified | The timestamp given in a Last-Modified header can be used by the client in an If-Modified-Since header field for response validation
Caching Headers
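
To make the ETag row more concrete, here is a minimal sketch of how a server might decide between sending the full response (200) and a body-less 304 Not Modified. The class and method names (ConditionalGet, statusFor) are hypothetical, and splitting the header on commas is a simplification that assumes the entity-tags themselves contain no commas.

```java
// Hypothetical helper, invented for illustration; not a real server API.
class ConditionalGet {

    // Returns 304 when any entity-tag in If-None-Match matches the current ETag, otherwise 200.
    static int statusFor(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return 200; // unconditional request: send the full response
        }
        for (String candidate : ifNoneMatch.split(",")) {
            String tag = candidate.trim();
            // If-None-Match uses weak comparison, so a leading "W/" is ignored on both sides.
            if (tag.equals("*") || stripWeakPrefix(tag).equals(stripWeakPrefix(currentEtag))) {
                return 304;
            }
        }
        return 200;
    }

    private static String stripWeakPrefix(String tag) {
        return tag.startsWith("W/") ? tag.substring(2) : tag;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("\"abc\"", "\"abc\""));
        System.out.println(statusFor("\"old\"", "\"abc\""));
    }
}
```

A 304 response lets the cache keep using its stored copy, so the resource is not downloaded again.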

For further reading, please refer to this detailed article on HTTP Caching.

Proxy Cache

Most web services these days use a proxy server as a gateway to handle requests before hitting the web servers. When a server acts as a caching proxy, it stores content and shares those resources with more users. Therefore this type of cache is also known as a shared cache. When a user sends a request, the proxy server checks for a recent copy of the resource. If it exists, it is then sent back to the user, otherwise the proxy sends a request to the source server and caches the resulting content.

CDNs (Content Delivery Networks) are one of the most popular kinds of proxy cache. A CDN is a large network of servers geographically distributed around the world to serve content from a server closest to the user sending a request. When configured properly, CDNs can also help a web service mitigate DDoS (Distributed Denial of Service) attacks.

What is cached?

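HTTP caches usually cache responses to GET requests. These can be HTML documents, images, style sheets or files such as media, JavaScript files etc. Responses to authenticated requests (those carrying an Authorization header) are not stored by shared caches unless the response explicitly allows it, and traffic sent over HTTPS cannot be cached by intermediaries that do not terminate the encrypted connection. It is also possible to cache permanent redirects and error responses such as 404 (Not Found).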

  • If the cached content is fresh (not expired, or still within the lifetime given by the max-age directive), then it is served directly from the cache. There are other ways to determine freshness and perform cache validation, but we won’t go into the details here. I encourage you to read up on them if you’re interested.
  • If the content is stale, the cache validates it with the origin server before reusing it. The must-revalidate Cache-Control directive makes this mandatory: the cache must not serve the stale response without a successful validation.
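
The freshness check in the first bullet can be sketched as follows. This is a simplified illustration with hypothetical names (Freshness, maxAgeSeconds, isFresh). It only looks at max-age and ignores Expires, heuristic freshness and other directives that a real cache would also consider.

```java
import java.util.Locale;
import java.util.OptionalLong;

// Hypothetical helper, invented for illustration.
class Freshness {

    // Extracts the max-age value (in seconds) from a Cache-Control header value, if present.
    static OptionalLong maxAgeSeconds(String cacheControl) {
        for (String directive : cacheControl.split(",")) {
            String d = directive.trim().toLowerCase(Locale.ROOT);
            if (d.startsWith("max-age=")) {
                try {
                    return OptionalLong.of(Long.parseLong(d.substring("max-age=".length())));
                } catch (NumberFormatException e) {
                    return OptionalLong.empty(); // malformed value: treat as absent
                }
            }
        }
        return OptionalLong.empty();
    }

    // A stored response is fresh while its age is below the max-age lifetime.
    static boolean isFresh(String cacheControl, long ageSeconds) {
        OptionalLong maxAge = maxAgeSeconds(cacheControl);
        return maxAge.isPresent() && ageSeconds < maxAge.getAsLong();
    }

    public static void main(String[] args) {
        System.out.println(isFresh("public, max-age=60", 30));
        System.out.println(isFresh("public, max-age=60", 90));
    }
}
```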

The primary key used to cache contains the request method (GET) and the target URI (Uniform Resource Identifier). HTTP Caches are limited mostly to GET, so caches mostly ignore other methods and use the URI as the primary caching key.
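
A tiny in-memory sketch of this keying scheme is shown below. SimpleHttpCache is a hypothetical class written for this example. It stores only response bodies and leaves out freshness and validation entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical cache, invented for illustration; real caches also track headers and freshness.
class SimpleHttpCache {
    private final Map<String, String> store = new HashMap<>();

    // The cache key combines the request method and the target URI.
    private static String key(String method, String uri) {
        return method + " " + uri;
    }

    void put(String method, String uri, String body) {
        if (!"GET".equals(method)) {
            return; // only responses to GET requests are stored in this sketch
        }
        store.put(key(method, uri), body);
    }

    Optional<String> get(String method, String uri) {
        return Optional.ofNullable(store.get(key(method, uri)));
    }

    public static void main(String[] args) {
        SimpleHttpCache cache = new SimpleHttpCache();
        cache.put("GET", "/styles.css", "body { margin: 0; }");
        System.out.println(cache.get("GET", "/styles.css").isPresent());
        System.out.println(cache.get("GET", "/missing.css").isPresent());
    }
}
```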

Caching Best Practices

  • Consistent URLs – Use the same URL for serving same content on different pages and sites to users.
  • Library of content – Use a single source of truth library to store images and other shared content such as style sheets etc and refer to the same library from any page or site.
  • Avoid bulk modifications – The Last-Modified date will be set to a very recent one when you update too many files at the same time, so take care to change only the necessary ones.
  • Cache control – Use the appropriate cache control policies. If the response is private to the user, allow private caching and for generic content, set caching policy to public.
  • Use caching validators – Use the validation headers we learnt about in the table above such as ETag and Last-Modified so that caches can validate their content without having to download the resources from the server unnecessarily.
  • Max-age cache control – Set a long max-age in the Cache-Control header for pages and images that will be updated only rarely.

I hope you enjoyed learning about the basics of web caching! In the next article, we will learn how to implement a simple cache from scratch.

Posted in Better Programming, Software Engineering

Fundamentals of HTTP Requests, Cookies and Sessions

What is HTTP?

Client Server Architecture

Hypertext Transfer Protocol is an application layer (layer 7 in the OSI model) protocol to transfer hypermedia (graphics, audio, plain text and hyperlinks etc) over the network. There are several iterations of the HTTP protocol namely

  • HTTP/1
  • HTTP/2
  • HTTP/3

The majority of websites use HTTP/1.1 and HTTP/2.

HTTP is a request-response based protocol for the purpose of communication in a Client-Server based architecture. HTTP commonly uses TCP (Transmission Control Protocol) underneath as its transport layer protocol to enable reliable communication.

How is HTTP stateless?

A stateless protocol is one in which the receiver does not retain any state or session information from previous requests. This means that each HTTP request is processed in isolation. IP (Internet Protocol) is another example of a stateless protocol.

On the other hand, TCP on top of which HTTP is built is a stateful protocol. This is because the client and server agree on

  • how much data will be transferred
  • the order in which the packets are reassembled at either end

which makes TCP a very reliable transport layer protocol. Within the scope of an HTTP request, the TCP connection is stateful thus ensuring reliable transfer of data. However once that request is processed and a response is sent back, no information about the request is retained. To store state information, various session management techniques are used by web servers.

What are Sessions?

Session Management is used to implement state on top of the stateless HTTP. For example: if a user logged in to a website and is authenticated, the server should not repeatedly ask for the user’s credentials with every subsequent interaction. This is accomplished by using HTTP cookies or session IDs.

HTTP Cookies

Cookies enable web browsers to store stateful information about a user session. They are chunks of data about a user’s session that are sent by the web server to a client device. More than one cookie can be stored by the browser on the user’s device.

Although authorization cookies are essential, another type called tracking cookies has come under much scrutiny due to privacy concerns. Tracking cookies, especially third-party tracking cookies, are used to track your browsing history, enabling behavioral advertising. Therefore European law requires that all websites targeting European Union member states gain “informed consent” from users before storing non-essential cookies on their device. So go ahead and click no when websites prompt you to accept third party cookies. Here’s a detailed article on third party cookies if you are interested.

Session ID

Session IDs or tokens are typically used in HTTP based connections to identify a user session. For example: when you are adding items to the Amazon shopping cart, the server should have a way of retaining items added to the cart even though you browse through various pages. In this case session ID or token is a way of keeping track of the user’s shopping cart.
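
The shopping cart idea can be sketched as a server-side map from session ID to cart contents. SessionStore and its methods are hypothetical names for this example, and a random UUID stands in for whatever ID scheme a real server would use. Expiry and persistence are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory session store, invented for illustration.
class SessionStore {
    private final Map<String, List<String>> carts = new HashMap<>();

    // Creates a new session with an empty cart and returns its ID.
    String createSession() {
        String sessionId = UUID.randomUUID().toString();
        carts.put(sessionId, new ArrayList<>());
        return sessionId;
    }

    // Adds an item to the cart of a known session; returns false for unknown IDs.
    boolean addToCart(String sessionId, String item) {
        List<String> cart = carts.get(sessionId);
        if (cart == null) {
            return false;
        }
        cart.add(item);
        return true;
    }

    // Returns a read-only copy of the cart (empty for unknown IDs).
    List<String> cart(String sessionId) {
        return List.copyOf(carts.getOrDefault(sessionId, List.of()));
    }

    public static void main(String[] args) {
        SessionStore store = new SessionStore();
        String id = store.createSession();
        store.addToCart(id, "book");
        store.addToCart(id, "pen");
        System.out.println(store.cart(id));
    }
}
```

The client only needs to send the session ID with each request. The state itself stays on the server.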

Components of an HTTP Request

An HTTP request contains the following

  • Request Line
  • Request Headers
  • Body

Let’s look at an example GET request.

HTTP Request Line

The request line starts with the name of the HTTP method to be used. We will look at all the HTTP methods in detail in another post. In the example below, GET is the HTTP method. Following the method is the request target, which for most requests is the path part of the URI (Uniform Resource Identifier), the address used to locate a resource. The host name (thatgirlcoder.com here) is sent separately in the Host header. The final part refers to the version of the HTTP protocol.

GET / HTTP/1.1
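
To see the three parts of a request line in code, here is a minimal parsing sketch. RequestLine is a hypothetical class written for this example. It simply splits on single spaces and does no further validation.

```java
// Hypothetical value class, invented for illustration.
final class RequestLine {
    final String method;
    final String target;
    final String version;

    private RequestLine(String method, String target, String version) {
        this.method = method;
        this.target = target;
        this.version = version;
    }

    // Splits "METHOD target HTTP/x.y" into its three space-separated parts.
    static RequestLine parse(String line) {
        String[] parts = line.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine requestLine = RequestLine.parse("GET / HTTP/1.1");
        System.out.println(requestLine.method);
        System.out.println(requestLine.target);
        System.out.println(requestLine.version);
    }
}
```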

Here’s a detailed example from inspecting the GET request in Google Chrome.

Request URL: https://thatgirlcoder.com/
Request Method: GET
Status Code: 200 
Remote Address: 100.0.00.00:111
Referrer Policy: strict-origin-when-cross-origin

HTTP headers

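Headers contain metadata to provide more information about a request. In the following example, captured over HTTP/2, accept and user-agent are headers. The names beginning with a colon (:authority, :method, :path, :scheme) are HTTP/2 pseudo-headers that carry the request line information, with :authority playing the role of the HTTP/1.1 Host header.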

:authority: thatgirlcoder.com
:method: GET
:path: /
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
user-agent: Mozilla/5.0 (<system-information>) <platform> (<platform-details>) <extensions>
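
If you want to set such headers from code, the java.net.http API (available since Java 11) is one option. The sketch below only builds a request and prints it without sending anything over the network. RequestBuilderDemo and the chosen header values are assumptions for this example.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical demo class, invented for illustration.
class RequestBuilderDemo {

    // Builds (but does not send) a GET request with two request headers.
    static HttpRequest buildRequest() {
        return HttpRequest.newBuilder(URI.create("https://thatgirlcoder.com/"))
                .header("Accept", "text/html")
                .header("Accept-Language", "en-US,en;q=0.9")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest();
        System.out.println(request.method() + " " + request.uri());
        // Print every header name with its list of values.
        request.headers().map().forEach(
                (name, values) -> System.out.println(name + ": " + values));
    }
}
```

The Host header is not set by hand here. The client derives it from the URI.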

Request Body

A request body is used alongside HTTP methods that change the state of the server, such as PUT, POST etc. GET requests normally do not carry a request body.

Components of an HTTP Response

An HTTP response contains the following

  • Status Line
  • Response Header
  • Body

Status

An HTTP response contains a status code to indicate the outcome of a request, such as its successful completion. For example:

HTTP/1.1 200 OK

Here’s a list of possible status codes and their descriptions.

Status Codes | Description
100 – 199 | Informational response
200 – 299 | Successful response
300 – 399 | Redirect response
400 – 499 | Errors on client side
500 – 599 | Errors on server side
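
The ranges in the table translate directly into a small lookup. StatusCodes and category are hypothetical names for this sketch.

```java
// Hypothetical helper, invented for illustration.
class StatusCodes {

    // Maps a numeric status code to the category of its range.
    static String category(int code) {
        if (code >= 100 && code <= 199) return "Informational response";
        if (code >= 200 && code <= 299) return "Successful response";
        if (code >= 300 && code <= 399) return "Redirect response";
        if (code >= 400 && code <= 499) return "Errors on client side";
        if (code >= 500 && code <= 599) return "Errors on server side";
        return "Unknown";
    }

    public static void main(String[] args) {
        for (int code : new int[] { 200, 301, 404, 503 }) {
            System.out.println(code + ": " + category(code));
        }
    }
}
```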

Response Header

The server responds with some HTTP headers as well. A popular one is the Set-Cookie header, which the server uses to hand a cookie to the client. The client returns it in the Cookie header of later requests so that the server can recognize the session.

Set-Cookie: key=fkhKFHlfhF; expires=Sat, 09 Sep 2023 12:00:00 GMT; Max-Age=4823982; Path=/; secure
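
As a rough sketch, a header of this shape can be read with the JDK’s java.net.HttpCookie class. The cookie name and value below (sessionId, abc123) and the SetCookieDemo class are made up for the example, and the header uses only Max-Age rather than an expires date to keep the parsing simple.

```java
import java.net.HttpCookie;
import java.util.List;

// Hypothetical demo class, invented for illustration.
class SetCookieDemo {

    // Parses a Set-Cookie header and returns the first cookie it contains.
    static HttpCookie parseFirst(String header) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        return cookies.get(0);
    }

    public static void main(String[] args) {
        HttpCookie cookie = parseFirst(
                "Set-Cookie: sessionId=abc123; Max-Age=3600; Path=/; Secure");
        System.out.println("name:    " + cookie.getName());
        System.out.println("value:   " + cookie.getValue());
        System.out.println("max-age: " + cookie.getMaxAge());
        System.out.println("path:    " + cookie.getPath());
        System.out.println("secure:  " + cookie.getSecure());
    }
}
```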

Response Body

The body contains the content requested by the client. In the below example we requested an HTML document.

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>That Girl Coder – Learn and Grow Everyday!</title>

I hope you learnt some basics of HTTP requests today. Let’s keep diving deeper into this topic over the next few posts!