
Programming Basics Prep


Most of the concepts on this page are taken from Data Structures and Algorithms in Java by Robert Lafore.

These are basic programming concepts everyone should know. I will keep this page updated with new code when I have time.

Linear Search vs Binary Search

Linear search checks each item in a list one at a time. The average number of steps needed to find an item is N/2; in the worst case (or when the item is absent from an unordered list), it takes N steps.

Binary search is like a guess-a-number game: it checks the value in the middle of the range, sees whether the target is less than, greater than, or equal to that middle value, and thereby cuts the remaining range in half with each guess. It can be applied only to a sorted list.

Guess a number between 1-100 (number 33 guessed)

Step | Number Guessed | Result   | Range of Possible Values
-----|----------------|----------|-------------------------
 0   |                |          | 1-100
 1   | 50             | Too High | 1-49
 2   | 25             | Too Low  | 26-49
 3   | 37             | Too High | 26-36
 4   | 31             | Too Low  | 32-36
 5   | 34             | Too High | 32-33
 6   | 32             | Too Low  | 33-33
 7   | 33             | Correct  |

Binary search provides a significant speed increase over linear search. A linear search would have taken us 33 guesses here, but binary search took only 7. For a small number of items the difference is not significant, but for large numbers of items binary search is far faster than linear search.

Logarithms

Comparisons needed in Binary Search

10 -> 4, 100 -> 7, 1000 -> 10, 10,000 -> 14 and so on…

Steps (s) | Range (r) | Range as a power of 2
----------|-----------|----------------------
 0        | 1         | 2^0
 1        | 2         | 2^1
 2        | 4         | 2^2
 3        | 8         | 2^3
 4        | 16        | 2^4
 5        | 32        | 2^5
 6        | 64        | 2^6
 7        | 128       | 2^7

As we can see, 7 steps cover a range of 100 (up to 128 values in total).

Doubling the range each time creates a series that’s the same as raising two to a power. We can express this power as a formula. If s represents the steps and r represents the range, the equation is

r = 2^s

(eg. 128 = 2^7)

The inverse of raising to a power is the logarithm. This equation says that the number of steps (comparisons) is equal to the logarithm to base 2 of the range:

s = log2(r)

Logarithms are usually given to base 10, but we can easily convert to base 2 by multiplying by 3.322.

[e.g. log10(100) = 2, so log2(100) ≈ 2 * 3.322 = 6.644 (approx. 7)]
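The conversion above is easy to check in code. A minimal sketch (the class name `LogDemo` is my own):

```java
// Base conversion: log2(r) ≈ log10(r) * 3.322
public class LogDemo {
    public static double log2(double r) {
        return Math.log10(r) * 3.322; // approximate conversion factor
    }
    public static void main(String[] args) {
        System.out.println(log2(100)); // about 6.644, so ~7 comparisons
        System.out.println(log2(128)); // about 7
    }
}
```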

Big O Notation

Big O Notation is used to measure how efficient a computer algorithm is.

T is time, K is constant

Insertion in unordered array: T = K (insertion requires the same amount of time no matter how big the array is)

Linear Search: T = K * N/2 (the number of comparisons that must be made to find a specified item is, on average, half the total number of items). We can fold the 2 into the constant, giving the simpler formula T = K * N, where K is the new constant.

Big O Notation dispenses with the constant K. All we want to compare is how T changes for different values of N, not what the actual numbers are.

We read O(1) as "order of 1," O(N) as "order of N," and so on.

Therefore we can say that linear search takes O(N) time and binary search takes O(log N) time. Insertion into an unordered array takes O(1), or constant, time, which is ideal!

Algorithm                    | Running Time in Big O
-----------------------------|----------------------
Linear search                | O(N)
Binary search                | O(log N)
Insertion in unordered array | O(1)
Insertion in ordered array   | O(N)
Deletion in unordered array  | O(N)
Deletion in ordered array    | O(N)

 

Time Complexity

O(1) is excellent, O(log N) is good, O(N) is fair, O(N^2) is poor. Bubble sort is O(N^2).

Simple Sorting

Bubble Sort

Bubble sort is the simplest of the sorting algorithms, and very slow. You start at the left end of the list and compare the values at index 0 and index 1. If the value at 0 is greater than the value at 1, you swap them; if not, you do nothing. You then move over one position and compare the values at index 1 and index 2. You continue this until the pass reaches the end of the array. As the algorithm progresses, the biggest items "bubble up" to the top end of the array. After this first pass through all the data, we've made N-1 comparisons and somewhere between 0 and N-1 swaps.

Now we go back and start another pass from the left end of the array toward the right. However, we stop at N-2, because we know position N-1 already contains the largest number. On each pass, the effective size of N decreases by one.
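The passes described above can be sketched in Java. A minimal version (class and variable names are my own):

```java
import java.util.Arrays;

public class BubbleSort {
    public static void bubbleSort(int[] a) {
        // After each outer pass, the largest remaining item has bubbled
        // to the end, so the inner loop can stop one position earlier.
        for (int out = a.length - 1; out > 0; out--) {
            for (int in = 0; in < out; in++) {
                if (a[in] > a[in + 1]) { // compare neighbors
                    int tmp = a[in];     // swap if out of order
                    a[in] = a[in + 1];
                    a[in + 1] = tmp;
                }
            }
        }
    }
    public static void main(String[] args) {
        int[] a = {77, 99, 44, 55, 22};
        bubbleSort(a);
        System.out.println(Arrays.toString(a)); // [22, 44, 55, 77, 99]
    }
}
```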

Efficiency of Bubble Sort

In general, where N is the number of items in the array, there are N-1 comparisons on the first pass, N-2 on the second and so on…

(N-1)+(N-2)+(N-3)+…+1 = N*(N-1)/2

Thus, the algorithm makes about N^2/2 comparisons (we drop the -1, since it makes little difference for large N). Because a swap is necessary about half the time on average for random data, there will be about N^2/2 * 1/2 = N^2/4 swaps. Ignoring the constants 2 and 4 gives the Big O Notation of O(N^2), which is very poor.

Selection Sort

This sort reduces the number of swaps necessary from O(N^2), as in bubble sort, to O(N). Unfortunately, the number of comparisons remains O(N^2).

In this type of sorting, we look for the lowest number in the list and swap it with the value at index 0. We then start again from index 1, look for the lowest number among the remaining items, and swap it with the value at index 1. We continue forward until we reach index N-1. In this algorithm, the sorted numbers accumulate on the left side.
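A minimal Java sketch of the procedure above (names are my own):

```java
import java.util.Arrays;

public class SelectionSort {
    public static void selectionSort(int[] a) {
        for (int out = 0; out < a.length - 1; out++) {
            int min = out;                // index of the smallest so far
            for (int in = out + 1; in < a.length; in++) {
                if (a[in] < a[min]) min = in;
            }
            int tmp = a[out];             // only one swap per pass
            a[out] = a[min];
            a[min] = tmp;
        }
    }
    public static void main(String[] args) {
        int[] a = {64, 25, 12, 22, 11};
        selectionSort(a);
        System.out.println(Arrays.toString(a)); // [11, 12, 22, 25, 64]
    }
}
```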

Efficiency of Selection Sort

Selection sort performs the same number of comparisons as bubble sort, N*(N-1)/2, but it is faster because there are so few swaps. For smaller values of N, selection sort may in fact be much faster than bubble sort. For large values of N, the comparison times dominate, so we still say that selection sort runs in O(N^2) time, just like bubble sort.

Insertion Sort

In insertion sort, we divide the list into a sorted part on the left and an unsorted part on the right. We take the value at the start of the unsorted part, compare it against the sorted part, and insert it in the appropriate position, shifting the larger sorted values one slot to the right. We continue the process until all the numbers in the unsorted part have been inserted.
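The shifting-and-inserting described above can be sketched as follows (a minimal version; names are my own):

```java
import java.util.Arrays;

public class InsertionSort {
    public static void insertionSort(int[] a) {
        // Items to the left of 'out' are the partially sorted part.
        for (int out = 1; out < a.length; out++) {
            int temp = a[out];            // value to insert
            int in = out;
            while (in > 0 && a[in - 1] > temp) {
                a[in] = a[in - 1];        // shift larger items right
                in--;
            }
            a[in] = temp;                 // insert into the hole
        }
    }
    public static void main(String[] args) {
        int[] a = {5, 2, 4, 6, 1, 3};
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 5, 6]
    }
}
```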

Efficiency of Insertion Sort

On the first pass it compares a maximum of 1 item, on the second pass 2, and so on up to N-1:

1 + 2 + 3 + … + N-1 = N*(N-1)/2

On each pass, only about half of the maximum number of items are actually compared before the insertion point is found, so we can divide by 2, giving N*(N-1)/4.

Insertion sort runs in O(N^2) time for random data; when the data is almost sorted, it runs in close to O(N).

 

Stacks and Queues

Stacks and queues are more restricted and abstract structures: we can access only one data item at a time – the top of a stack or the front of a queue. We can use arrays to implement them, but linked lists and heaps work as well.

Stacks

A stack allows access to only one data item: the last item inserted. Think of it as a stack of books on a table: you put a book on top of the stack, and you take a book from the top of the stack. In other words, it is Last-In-First-Out (LIFO). Examples: a stack can be used to check whether parentheses, braces, and brackets are balanced in a computer program; in binary trees, it can be used to traverse the nodes of a tree; in graphs, it can be used to search the vertices. Most microprocessors use a stack-based architecture: when a method is called, its return address and arguments are pushed onto a stack, and when it returns, they are popped off.

 

Efficiency of Stacks: push and pop take constant O(1) time, independent of the number of items.
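An array-based stack is enough to show the O(1) push and pop. A minimal sketch (class and method names are my own):

```java
public class ArrayStack {
    private int[] data;
    private int top = -1; // index of the last item pushed

    public ArrayStack(int capacity) { data = new int[capacity]; }

    public void push(int value) { data[++top] = value; } // O(1)
    public int pop()            { return data[top--]; }  // O(1)
    public int peek()           { return data[top]; }
    public boolean isEmpty()    { return top == -1; }

    public static void main(String[] args) {
        ArrayStack s = new ArrayStack(10);
        s.push(1); s.push(2); s.push(3);
        System.out.println(s.pop()); // 3 -- last in, first out
        System.out.println(s.pop()); // 2
    }
}
```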

Queues

A queue is a data structure somewhat like a stack, except that the first item inserted is the first item to be removed (First-In-First-Out, FIFO). Examples: a queue works like the line in an Apple store – the first person in line gets the first iPhone; a printer's job queue when you hit the print command; keystrokes buffered as you type; and so on.

Circular queue: when you insert an item into a queue, it goes to the back of the line, and when you remove an item, a spot opens up at the front. You could shift all the values forward to fill that space, but this is inefficient. Instead, the front and rear indices simply wrap around to the beginning of the array.

Efficiency of Queues: insertion and removal both take O(1) time.
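The wrap-around idea can be sketched with the modulo operator (a minimal version; names are my own):

```java
public class CircularQueue {
    private int[] data;
    private int front = 0, rear = -1, count = 0;

    public CircularQueue(int capacity) { data = new int[capacity]; }

    public void insert(int value) {          // O(1)
        rear = (rear + 1) % data.length;     // wrap around the end
        data[rear] = value;
        count++;
    }
    public int remove() {                    // O(1)
        int value = data[front];
        front = (front + 1) % data.length;   // wrap around the end
        count--;
        return value;
    }
    public boolean isEmpty() { return count == 0; }

    public static void main(String[] args) {
        CircularQueue q = new CircularQueue(3);
        q.insert(10); q.insert(20); q.insert(30);
        System.out.println(q.remove()); // 10 -- first in, first out
        q.insert(40);                   // reuses the vacated front slot
        System.out.println(q.remove()); // 20
    }
}
```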

Priority Queues

This is a more specialized form of a queue. Like an ordinary queue, priority queue has a front and a rear, and items are removed from the front. However, in a priority queue, items are ordered by key value so that the item with the lowest key is always at the front. Example: Sorting of letters in order of priorities, minimum spanning trees, weighted graphs.

Efficiency of Priority Queues: insertion runs in O(N) time because the item must be placed in sorted position. Deletion takes O(1). We can improve insertion time using heaps.
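A sorted-array version shows the O(N) insert and O(1) remove described above. A minimal sketch (names are my own; the array is kept in descending order so the lowest key is always at the end):

```java
public class PriorityQueue {
    private long[] data;
    private int nItems = 0;

    public PriorityQueue(int capacity) { data = new long[capacity]; }

    // O(N): shift larger items up so the array stays sorted descending;
    // the smallest key always ends up at index nItems-1.
    public void insert(long item) {
        int j;
        for (j = nItems - 1; j >= 0 && item > data[j]; j--) {
            data[j + 1] = data[j];
        }
        data[j + 1] = item;
        nItems++;
    }

    // O(1): the minimum is always at the end of the array.
    public long remove() { return data[--nItems]; }

    public static void main(String[] args) {
        PriorityQueue pq = new PriorityQueue(10);
        pq.insert(30); pq.insert(10); pq.insert(20);
        System.out.println(pq.remove()); // 10 -- lowest key first
        System.out.println(pq.remove()); // 20
    }
}
```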

Arrays

Arrays are the most commonly used data structure, but they are not always ideal. In an unordered array, insertion takes O(1) time but searching is slow at O(N). In an ordered array, searching is quick at O(log N) but insertion takes O(N). For both, deletion takes O(N). Linked lists are another viable option.

  • Important operations to understand: insertion, searching, deletion
  • The average number of steps needed to find an item is N/2 (worst case N)
  • A deletion requires searching through an average of N/2 elements and then moving the remaining (on average N/2) elements to fill the resulting hole, for a total of about N steps

Creating an array:
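A minimal Java sketch (variable names are my own):

```java
public class CreateArray {
    public static void main(String[] args) {
        int[] sizes = new int[5];        // fixed size, initialized to zeros
        int[] primes = {2, 3, 5, 7, 11}; // declared and initialized at once
        String[] names = new String[3];  // object arrays start as nulls
        System.out.println(sizes.length);  // 5 -- length is fixed at creation
        System.out.println(primes.length); // 5
    }
}
```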

Accessing Array:
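A minimal Java sketch (variable names are my own):

```java
public class AccessArray {
    public static void main(String[] args) {
        int[] primes = {2, 3, 5, 7, 11};
        System.out.println(primes[0]);                 // 2  -- indices start at 0
        System.out.println(primes[primes.length - 1]); // 11 -- last element
        for (int p : primes) {                         // iterate without an index
            System.out.print(p + " ");
        }
    }
}
```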

Inserting Values in Array:
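A minimal sketch of inserting into an array while shifting later elements right (the `insert` helper and its parameters are my own):

```java
public class InsertIntoArray {
    // Insert 'value' at 'index', shifting later elements right.
    // 'count' is how many slots are currently in use; returns the new count.
    public static int insert(int[] a, int count, int index, int value) {
        for (int i = count; i > index; i--) {
            a[i] = a[i - 1];              // make room: O(N) shifts
        }
        a[index] = value;
        return count + 1;
    }
    public static void main(String[] args) {
        int[] a = new int[5];
        int count = 0;
        count = insert(a, count, 0, 10);  // append at the end: no shifts
        count = insert(a, count, 1, 30);
        count = insert(a, count, 1, 20);  // insert in the middle: O(N)
        System.out.println(java.util.Arrays.toString(a)); // [10, 20, 30, 0, 0]
    }
}
```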

Some common examples:

Reverse a String
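One possible sketch, using the two-pointer swap approach (names are my own):

```java
public class ReverseString {
    public static String reverse(String s) {
        char[] chars = s.toCharArray();
        int left = 0, right = chars.length - 1;
        while (left < right) {            // swap the ends, move inward
            char tmp = chars[left];
            chars[left++] = chars[right];
            chars[right--] = tmp;
        }
        return new String(chars);
    }
    public static void main(String[] args) {
        System.out.println(reverse("hello")); // olleh
    }
}
```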

Sorting numbers in an Array
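A minimal sketch using the standard library sort (names are my own):

```java
import java.util.Arrays;

public class SortArray {
    public static void main(String[] args) {
        int[] nums = {42, 7, 19, 3, 88};
        Arrays.sort(nums); // library sort, O(N log N)
        System.out.println(Arrays.toString(nums)); // [3, 7, 19, 42, 88]
    }
}
```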

Sum of Numbers in an Array
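A minimal sketch (names are my own):

```java
public class SumArray {
    public static int sum(int[] a) {
        int total = 0;
        for (int value : a) {             // single pass: O(N)
            total += value;
        }
        return total;
    }
    public static void main(String[] args) {
        System.out.println(sum(new int[]{1, 2, 3, 4, 5})); // 15
    }
}
```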

Fibonacci Series with and without Recursion
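One possible sketch of both variants (names are my own):

```java
public class Fibonacci {
    // Iterative: O(N) time, O(1) space.
    public static long fibIterative(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }
    // Recursive: simple, but exponential time without memoization.
    public static long fibRecursive(int n) {
        if (n < 2) return n;
        return fibRecursive(n - 1) + fibRecursive(n - 2);
    }
    public static void main(String[] args) {
        System.out.println(fibIterative(10)); // 55
        System.out.println(fibRecursive(10)); // 55
    }
}
```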

Check prime number
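A minimal sketch using trial division up to the square root (names are my own):

```java
public class PrimeCheck {
    public static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int i = 2; (long) i * i <= n; i++) { // only test up to sqrt(n)
            if (n % i == 0) return false;
        }
        return true;
    }
    public static void main(String[] args) {
        System.out.println(isPrime(13)); // true
        System.out.println(isPrime(15)); // false
    }
}
```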

Binary Search
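A minimal sketch of the guess-the-middle approach described earlier (names are my own; the array must be sorted):

```java
public class BinarySearch {
    // Returns the index of 'key' in sorted array 'a', or -1 if absent. O(log N).
    public static int search(int[] a, int key) {
        int low = 0, high = a.length - 1;
        while (low <= high) {
            int mid = (low + high) / 2;      // guess the middle
            if (a[mid] == key) return mid;
            if (a[mid] < key) low = mid + 1; // too low: discard left half
            else high = mid - 1;             // too high: discard right half
        }
        return -1;
    }
    public static void main(String[] args) {
        int[] a = {11, 22, 33, 44, 55, 66, 77};
        System.out.println(search(a, 44)); // 3
        System.out.println(search(a, 50)); // -1
    }
}
```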

Linear Search
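A minimal sketch (names are my own):

```java
public class LinearSearch {
    // Check every item in turn: O(N).
    public static int search(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1; // not found after examining all N items
    }
    public static void main(String[] args) {
        int[] a = {9, 4, 7, 1, 8};
        System.out.println(search(a, 7)); // 2
        System.out.println(search(a, 5)); // -1
    }
}
```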

Factorial
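A minimal recursive sketch (names are my own):

```java
public class Factorial {
    public static long factorial(int n) {
        if (n <= 1) return 1;        // base case
        return n * factorial(n - 1); // n! = n * (n-1)!
    }
    public static void main(String[] args) {
        System.out.println(factorial(5)); // 120
    }
}
```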

Palindrome
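A minimal two-pointer sketch (names are my own):

```java
public class Palindrome {
    public static boolean isPalindrome(String s) {
        int left = 0, right = s.length() - 1;
        while (left < right) { // compare the ends, move inward
            if (s.charAt(left++) != s.charAt(right--)) return false;
        }
        return true;
    }
    public static void main(String[] args) {
        System.out.println(isPalindrome("racecar")); // true
        System.out.println(isPalindrome("hello"));   // false
    }
}
```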

Shuffle a deck of cards
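One possible sketch using the Fisher-Yates shuffle (names and the 0-51 card encoding are my own):

```java
import java.util.Random;

public class ShuffleDeck {
    // Fisher-Yates shuffle: O(N), every permutation equally likely.
    public static void shuffle(int[] deck, Random rng) {
        for (int i = deck.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // pick from the not-yet-shuffled part
            int tmp = deck[i];
            deck[i] = deck[j];
            deck[j] = tmp;
        }
    }
    public static void main(String[] args) {
        int[] deck = new int[52];
        for (int i = 0; i < 52; i++) deck[i] = i; // 0-51 encode suit and rank
        shuffle(deck, new Random());
        System.out.println(java.util.Arrays.toString(deck));
    }
}
```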

Check for Anagrams
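One possible sketch: two strings are anagrams if their sorted characters match (names are my own):

```java
import java.util.Arrays;

public class AnagramCheck {
    public static boolean isAnagram(String a, String b) {
        if (a.length() != b.length()) return false;
        char[] ca = a.toCharArray();
        char[] cb = b.toCharArray();
        Arrays.sort(ca); // anagrams sort to the same character sequence
        Arrays.sort(cb);
        return Arrays.equals(ca, cb);
    }
    public static void main(String[] args) {
        System.out.println(isAnagram("listen", "silent")); // true
        System.out.println(isAnagram("hello", "world"));   // false
    }
}
```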

Find first non repeating char in a string
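A minimal two-pass sketch (names are my own; assumes ASCII input for the count array):

```java
public class FirstUniqueChar {
    public static char firstNonRepeating(String s) {
        int[] counts = new int[256];                // assumes ASCII characters
        for (char c : s.toCharArray()) counts[c]++; // first pass: count each char
        for (char c : s.toCharArray()) {            // second pass: first with count 1
            if (counts[c] == 1) return c;
        }
        return 0;                                   // no non-repeating char found
    }
    public static void main(String[] args) {
        System.out.println(firstNonRepeating("swiss")); // w
    }
}
```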

 

Linked Lists

Arrays have certain disadvantages as data storage structures. In an unordered array, searching is slow, whereas in an ordered array, insertion is slow. In both, deletion is slow. Also, the size of an array can’t be changed after it’s created.

Linked Lists can replace an array as the basis for other storage structures such as stacks and queues. In fact, you can use a linked list in many cases in which you use an array, unless you need frequent random access to individual items using an index.

Links

In a linked list, each data item is embedded in a link. A link is an object of a class Link (or something similar). Each link object contains a reference to next link in the list. A field in the list itself contains a reference to the first link. This kind of class definition is sometimes called self-referential because it contains a field (called next) of the same type as itself.

In Java, a Link object doesn’t really contain another Link object. The next field of type Link is only a reference to another link, not an object.

Linked List


How Linked List differs from Array

In an array, each item occupies a particular position that can be accessed directly using an index number – like a row of houses: once you know the address, you know the position. In a list, the only way to find a particular element is to follow along the chain of elements.

A Simple Linked List

  • Inserting an item at the beginning of the list
  • Deleting the item at the beginning of the list
  • Iterating through the list to display contents

Finding and Deleting Specified Link
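The operations above can be sketched with a self-referential `Link` class, as described earlier (a minimal version; names are my own):

```java
public class LinkedList {
    static class Link {                    // self-referential class
        int data;
        Link next;                         // reference to the next link
        Link(int data) { this.data = data; }
    }

    private Link first;                    // reference to the first link

    public void insertFirst(int data) {    // O(1)
        Link link = new Link(data);
        link.next = first;
        first = link;
    }
    public int deleteFirst() {             // O(1)
        int data = first.data;
        first = first.next;
        return data;
    }
    public Link find(int key) {            // O(N): follow the chain
        for (Link cur = first; cur != null; cur = cur.next) {
            if (cur.data == key) return cur;
        }
        return null;
    }
    public boolean delete(int key) {       // O(N): search, then bypass the link
        Link cur = first, prev = null;
        while (cur != null && cur.data != key) {
            prev = cur;
            cur = cur.next;
        }
        if (cur == null) return false;     // not found
        if (prev == null) first = cur.next; // deleting the first link
        else prev.next = cur.next;
        return true;
    }
    public void display() {
        for (Link cur = first; cur != null; cur = cur.next) {
            System.out.print(cur.data + " -> ");
        }
        System.out.println("null");
    }

    public static void main(String[] args) {
        LinkedList list = new LinkedList();
        list.insertFirst(30);
        list.insertFirst(20);
        list.insertFirst(10);
        list.display();                     // 10 -> 20 -> 30 -> null
        System.out.println(list.deleteFirst()); // 10
    }
}
```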

Double-Ended Lists

A double-ended list is similar to an ordinary linked list, but it has one additional feature: a reference to the last link as well as to the first. The reference to the last link permits you to insert a new link directly at the end of the list as well as at the beginning. You can insert at the end of an ordinary linked list too, but you would have to traverse the whole list to reach the last link first, which is inefficient.

Linked List Efficiency: insertion and deletion at the very beginning of a linked list are very fast – O(1). Finding, deleting, or inserting next to a specific item requires searching through an average of half the items in the list, which takes O(N) comparisons. An array is also O(N) for these operations, but a linked list is nevertheless faster because nothing needs to be moved when an item is inserted or deleted. A linked list uses exactly as much memory as it needs and can expand to fill all available memory; the size of an array is fixed (a vector can grow, but expansion is still inefficient).