What is an insertion sort?
Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort.
Pseudocode:
We use a procedure INSERTION_SORT. It takes as parameters an array A[1..n] and the length n of the array. The array A is sorted in place: the numbers are rearranged within the array, with at most a constant number stored outside the array at any time.
INSERTION_SORT(A)
1. FOR j ← 2 TO length[A]
2.     DO key ← A[j]
3.        {Put A[j] into the sorted sequence A[1..j − 1]}
4.        i ← j − 1
5.        WHILE i > 0 and A[i] > key
6.            DO A[i + 1] ← A[i]
7.               i ← i − 1
8.        A[i + 1] ← key
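The pseudocode translates directly into code. Here is a Python sketch (the function name and 0-based indexing are my own; the pseudocode's A[1..n] becomes a[0..n-1], so the outer loop starts at index 1 rather than 2):

```python
def insertion_sort(a):
    """Sort list a in place, mirroring the INSERTION_SORT pseudocode
    (0-based indices instead of the 1-based A[1..n])."""
    for j in range(1, len(a)):      # pseudocode line 1: j <- 2 to length[A]
        key = a[j]                  # line 2
        i = j - 1                   # line 4
        # Lines 5-7: shift elements of the sorted prefix a[0..j-1]
        # that are greater than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key              # line 8
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```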
Analysis :
Since the running time of an algorithm on a particular input is the number of steps executed, we must define "step" independently of any particular machine. We say that a statement that takes c_i steps to execute and is executed n times contributes c_i·n to the total running time of the algorithm. To compute the running time T(n), we sum the products of the cost and times columns [see CLRS page 26]. That is, the running time of the algorithm is the sum of the running times of each statement executed. So we have
T(n) = c_1·n + c_2(n − 1) + 0·(n − 1) + c_4(n − 1) + c_5·Σ_{j=2..n} t_j + c_6·Σ_{j=2..n} (t_j − 1) + c_7·Σ_{j=2..n} (t_j − 1) + c_8(n − 1)
(The 0·(n − 1) term is line 3, the comment, which costs nothing.)
In the above equation we let t_j be the number of times the while-loop test (in line 5) is executed for that value of j. Note that the value of j runs from 2 to n. We have
T(n) = c_1·n + c_2(n − 1) + c_4(n − 1) + c_5·Σ_{j=2..n} t_j + c_6·Σ_{j=2..n} (t_j − 1) + c_7·Σ_{j=2..n} (t_j − 1) + c_8(n − 1) ---------------------(1)
Best-Case:
The best case occurs if the array is already sorted. For each j = 2, 3, ..., n, we find that A[i] is less than or equal to the key when i has its initial value of j − 1. In other words, when i = j − 1, the test A[i] > key fails the first time the WHILE loop is run. Therefore, t_j = 1 for j = 2, 3, ..., n, and the best-case running time can be computed using equation (1) as follows:
T(n) = c_1·n + c_2(n − 1) + c_4(n − 1) + c_5·Σ_{j=2..n} 1 + c_6·Σ_{j=2..n} (1 − 1) + c_7·Σ_{j=2..n} (1 − 1) + c_8(n − 1)
T(n) = c_1·n + c_2(n − 1) + c_4(n − 1) + c_5(n − 1) + c_8(n − 1)
T(n) = (c_1 + c_2 + c_4 + c_5 + c_8)·n − (c_2 + c_4 + c_5 + c_8)
This running time can be expressed as an + b for constants a and b that depend on the statement costs c_i. Therefore, T(n) is a linear function of n. The punch line here is that the while-loop in line 5 executes only once for each j. This happens when the given array A is already sorted.
T(n) = an + b = O(n)
It is a linear function of n.
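The claim that t_j = 1 on sorted input can be checked empirically. The following Python sketch is an illustrative instrumentation of the pseudocode (the function name is my own); it counts how many times the while-loop test of line 5 runs in total:

```python
def insertion_sort_counting(a):
    """Insertion sort that returns the total number of while-loop
    tests executed (the sum of the t_j values from the analysis)."""
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1                  # one execution of the line-5 test
            if i >= 0 and a[i] > key:
                a[i + 1] = a[i]
                i -= 1
            else:
                break
        a[i + 1] = key
    return tests

# Already-sorted input: the test fails immediately for every j,
# so t_j = 1 and the total is n - 1.
print(insertion_sort_counting(list(range(10))))       # 9, i.e. n - 1
# Reverse-sorted input: t_j = j, total n(n+1)/2 - 1.
print(insertion_sort_counting(list(range(9, -1, -1))))  # 54
```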
Worst-Case:
The worst case occurs if the array is sorted in reverse order, i.e., in decreasing order. In reverse order, we always find that A[i] is greater than the key in the while-loop test, so we must compare each element A[j] with every element in the sorted subarray A[1..j − 1]. Equivalently, since the while-loop exits only when i reaches 0, there is one additional test after the j − 1 element comparisons. Therefore, t_j = j for j = 2, 3, ..., n, and the worst-case running time can be computed using equation (1) as follows:
T(n) = c_1·n + c_2(n − 1) + c_4(n − 1) + c_5·Σ_{j=2..n} j + c_6·Σ_{j=2..n} (j − 1) + c_7·Σ_{j=2..n} (j − 1) + c_8(n − 1)

And using the summations in CLRS on page 27 (Σ_{j=2..n} j = n(n + 1)/2 − 1 and Σ_{j=2..n} (j − 1) = n(n − 1)/2), we have
T(n) = c_1·n + c_2(n − 1) + c_4(n − 1) + c_5·(n(n + 1)/2 − 1) + c_6·(n(n − 1)/2) + c_7·(n(n − 1)/2) + c_8(n − 1)
T(n) = (c_5/2 + c_6/2 + c_7/2)·n² + (c_1 + c_2 + c_4 + c_5/2 − c_6/2 − c_7/2 + c_8)·n − (c_2 + c_4 + c_5 + c_8)
This running time can be expressed as an² + bn + c for constants a, b, and c that again depend on the statement costs c_i. Therefore, T(n) is a quadratic function of n. Here the punch line is that the worst case occurs when line 5 executes j times for each j. This happens if the array A starts out in reverse order.
T(n) = an² + bn + c = O(n²)
It is a quadratic function of n.
The graph shows the n² complexity of insertion sort.
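The worst-case behaviour can also be checked by counting the element moves of line 6. Below is a small Python sketch (count_shifts is an illustrative helper name of my own):

```python
def count_shifts(a):
    """Count the element moves (line 6 of the pseudocode,
    A[i+1] <- A[i]) insertion sort performs on a copy of a."""
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            shifts += 1
            i -= 1
        a[i + 1] = key
    return shifts

n = 10
print(count_shifts(list(range(n, 0, -1))))   # reverse order: n(n-1)/2 = 45 shifts
print(count_shifts(list(range(1, n + 1))))   # already sorted: 0 shifts
```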
Worst-case and average-case Analysis
We usually concentrate on finding the worst-case running time: the longest running time for any input of size n. The reasons for this choice are as follows:
- The worst-case running time gives a guaranteed upper bound on the running time for any input. That is, the upper bound guarantees that the algorithm will never take any longer.
- For some algorithms, the worst case occurs often. For example, when searching, the worst case often occurs when the item being searched for is not present, and searches for absent items may be frequent.
- Why not analyze the average case? Because it's often about as bad as the worst case.
Example: Suppose that we randomly choose n numbers as the input to insertion sort. On average, the key in A[j] is less than half the elements in A[1..j − 1] and greater than the other half. This implies that, on average, the while-loop has to look halfway through the sorted subarray A[1..j − 1] to decide where to drop the key, so t_j ≈ j/2. Although the average-case running time is approximately half of the worst-case running time, it is still a quadratic function of n.
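This "about half the worst case" behaviour can be seen empirically by counting comparisons on random inputs. The following Python sketch is illustrative only; the seed, array size, and trial count are arbitrary choices of mine:

```python
import random

def count_comparisons(a):
    """Count the while-loop element comparisons (A[i] > key) that
    insertion sort makes on a copy of a."""
    a = list(a)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1
            if a[i] > key:
                a[i + 1] = a[i]
                i -= 1
            else:
                break
        a[i + 1] = key
    return comparisons

random.seed(1)                 # fixed seed so the experiment is repeatable
n = 200
trials = 50
avg = sum(count_comparisons(random.sample(range(10 * n), n))
          for _ in range(trials)) / trials
worst = n * (n - 1) / 2        # comparisons on a reverse-sorted array
print(avg, worst)              # the average lands near half the worst case
```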
Another Example: Analysis of Insertion Sort
Like selection sort, insertion sort loops over the indices of the array. It just calls insert on the elements at indices 1, 2, 3, …, n−1. Just as each call to indexOfMinimum took an amount of time that depended on the size of the sorted subarray, so does each call to insert. Actually, the word "does" in the previous sentence should be "can," and we'll see why.

Let's take a situation where we call insert and the value being inserted into a subarray is less than every element in the subarray. For example, if we're inserting 0 into the subarray [2, 3, 5, 7, 11], then every element in the subarray has to slide over one position to the right. So, in general, if we're inserting into a subarray with k elements, all k might have to slide over by one position. Rather than counting exactly how many lines of code we need to test an element against a key and slide the element, let's agree that it's a constant number of lines; let's call that constant c. Therefore, it could take up to c·k lines to insert into a subarray of k elements.

Suppose that upon every call to insert, the value being inserted is less than every element in the subarray to its left. When we call insert the first time, k = 1. The second time, k = 2. The third time, k = 3. And so on, up through the last time, when k = n−1. Therefore, the total time spent inserting into sorted subarrays is

c·1 + c·2 + c·3 + ⋯ + c·(n−1) = c·(1 + 2 + 3 + ⋯ + (n−1)).

That sum is an arithmetic series, except that it goes up to n−1 rather than n. Using our formula for arithmetic series, we get that the total time spent inserting into sorted subarrays is

c·(n−1+1)·((n−1)/2) = c·n²/2 − c·n/2.

Using big-Θ notation, we discard the low-order term c·n/2 and the constant factors c and 1/2, getting the result that the running time of insertion sort, in this case, is Θ(n²).
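The arithmetic-series step above can be sanity-checked numerically. A small Python sketch (the value of c here is an arbitrary illustrative constant):

```python
# Check the arithmetic-series identity used above for several n:
# 1 + 2 + ... + (n-1) = n(n-1)/2
for n in (2, 3, 10, 100):
    assert sum(range(1, n)) == n * (n - 1) // 2

# Total worst-case insertion cost for a per-element constant c:
# c*(1 + 2 + ... + (n-1)) should equal c*n^2/2 - c*n/2.
c = 5
n = 100
total = c * sum(range(1, n))
print(total, c * n * n // 2 - c * n // 2)  # both 24750
```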
Can insertion sort take less than Θ(n²) time? The answer is yes. Suppose we have the array [2, 3, 5, 7, 11], where the sorted subarray is the first four elements, and we're inserting the value 11. Upon the first test, we find that 11 is greater than 7, and so no elements in the subarray need to slide over to the right. Then this call of insert takes just constant time. Suppose that every call of insert takes constant time. Because there are n−1 calls to insert, if each call takes time that is some constant c, then the total time for insertion sort is c·(n−1), which is Θ(n), not Θ(n²).

Can either of these situations occur? Can each call to insert cause every element in the subarray to slide one position to the right? Can each call to insert cause no elements to slide? The answer is yes to both questions. A call to insert causes every element to slide over if the key being inserted is less than every element to its left. So, if every element is less than every element to its left, the running time of insertion sort is Θ(n²). What would it mean for every element to be less than the element to its left? The array would have to start out in reverse sorted order, such as [11, 7, 5, 3, 2]. So a reverse-sorted array is the worst case for insertion sort.

How about the opposite case? A call to insert causes no elements to slide over if the key being inserted is greater than or equal to every element to its left. So, if every element is greater than or equal to every element to its left, the running time of insertion sort is Θ(n).
This situation occurs if the array starts out already sorted, and so an already-sorted array is the best case for insertion sort.

What else can we say about the running time of insertion sort? Suppose that the array starts out in a random order. Then, on average, we'd expect that each element is less than half the elements to its left. In this case, on average, a call to insert on a subarray of k elements would slide k/2 of them. The running time would be half of the worst-case running time. But in asymptotic notation, where constant coefficients don't matter, the running time in the average case would still be Θ(n²), just like the worst case.

What if you knew that the array was "almost sorted": every element starts out at most some constant number of positions, say 17, from where it's supposed to be when sorted? Then each call to insert slides at most 17 elements, and the time for one call of insert on a subarray of k elements would be at most 17·c. Over all n−1 calls to insert, the running time would be 17·c·(n−1), which is Θ(n), just like the best case. So insertion sort is fast when given an almost-sorted array.

To sum up the running times for insertion sort:
- Worst case: Θ(n²).
- Best case: Θ(n).
- Average case for a random array: Θ(n²).
- "Almost sorted" case: Θ(n).

If you had to make a blanket statement that applies to all cases of insertion sort, you would have to say that it runs in O(n²) time. You cannot say that it runs in Θ(n²) time in all cases, since the best case runs in Θ(n) time. And you cannot say that it runs in Θ(n) time in all cases, since the worst-case running time is Θ(n²).
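The "almost sorted" case in the summary can also be demonstrated empirically. In the Python sketch below, max_shifts_per_insert and the pair-swapping construction are my own illustration: swapping disjoint adjacent pairs of a sorted array puts every element at most 1 position from its sorted place, so no single insertion ever slides more than one element.

```python
def max_shifts_per_insert(a):
    """Run insertion sort on a copy of a and return the largest number
    of element moves any single insertion performed."""
    a = list(a)
    worst = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        moves = 0
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            moves += 1
            i -= 1
        a[i + 1] = key
        worst = max(worst, moves)
    return worst

# Build an "almost sorted" array: swap disjoint adjacent pairs of a
# sorted array, so every element is at most 1 position out of place.
n = 1000
a = list(range(n))
for k in range(0, n - 1, 2):
    a[k], a[k + 1] = a[k + 1], a[k]
print(max_shifts_per_insert(a))   # 1: each insertion does at most one move
```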
Stability:
Since multiple keys with the same value are placed in the sorted array in the same order that they appear in the input array, insertion sort is stable.
Extra Memory:
This algorithm does not require extra memory.
- For insertion sort we say the worst-case running time is Θ(n²), and the best-case running time is Θ(n).
- Insertion sort uses no extra memory: it sorts in place.
- The running time of insertion sort depends on the original order of the input. It takes Ω(n²) time in the worst case, despite the fact that time on the order of n is sufficient to solve large instances in which the items are already sorted.
The insertion sort, although still O(n²), works in a slightly different way: it always maintains a sorted sublist in the lower positions of the list. We begin by assuming that a list with one item (position 0) is already sorted. On each pass, the current item is checked against those in the already sorted sublist. As we look back into the already sorted sublist, we shift those items that are greater to the right. When we reach a smaller item or the end of the sublist, the current item can be inserted.
Figure 2 shows the fifth pass in detail. At this point in the algorithm, a sorted sublist of five items consisting of 17, 26, 54, 77, and 93 exists. We want to insert 31 back into the already sorted items. The first comparison against 93 causes 93 to be shifted to the right. 77 and 54 are also shifted. When the item 26 is encountered, the shifting process stops and 31 is placed in the open position. Now we have a sorted sublist of six items.
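The fifth pass described above can be sketched as a single insertion step in Python; insert_into_sorted is my own illustrative helper, not part of the text's algorithm:

```python
def insert_into_sorted(sub, key):
    """Insert key into the already-sorted list sub, shifting larger
    items one place to the right, as in the fifth pass above."""
    sub = sub + [key]
    i = len(sub) - 2
    while i >= 0 and sub[i] > key:
        sub[i + 1] = sub[i]   # 93, 77, 54 each slide right for key = 31
        i -= 1
    sub[i + 1] = key          # 31 lands in the open position after 26
    return sub

print(insert_into_sorted([17, 26, 54, 77, 93], 31))  # [17, 26, 31, 54, 77, 93]
```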
Some Sets for Practice Insertion Sort
Original set:72 52 77 83 44 65 51 40
Pass 1: 52 72 77 83 44 65 51 40
Pass 2: 52 72 77 83 44 65 51 40
Pass 3: 52 72 77 83 44 65 51 40
Pass 4: 44 52 72 77 83 65 51 40
Pass 5: 44 52 65 72 77 83 51 40
Pass 6: 44 51 52 65 72 77 83 40
Pass 7: 40 44 51 52 65 72 77 83
Original set:
88 32 41 11 52 10 86 59
Pass 1: 32 88 41 11 52 10 86 59
Pass 2: 32 41 88 11 52 10 86 59
Pass 3: 11 32 41 88 52 10 86 59
Pass 4: 11 32 41 52 88 10 86 59
Pass 5: 10 11 32 41 52 88 86 59
Pass 6: 10 11 32 41 52 86 88 59
Pass 7: 10 11 32 41 52 59 86 88
Original set:
24 64 55 38 89 64 52 97
Pass 1: 24 64 55 38 89 64 52 97
Pass 2: 24 55 64 38 89 64 52 97
Pass 3: 24 38 55 64 89 64 52 97
Pass 4: 24 38 55 64 89 64 52 97
Pass 5: 24 38 55 64 64 89 52 97
Pass 6: 24 38 52 55 64 64 89 97
Pass 7: 24 38 52 55 64 64 89 97
Original set:
92 78 98 69 31 35 96 60
Pass 1: 78 92 98 69 31 35 96 60
Pass 2: 78 92 98 69 31 35 96 60
Pass 3: 69 78 92 98 31 35 96 60
Pass 4: 31 69 78 92 98 35 96 60
Pass 5: 31 35 69 78 92 98 96 60
Pass 6: 31 35 69 78 92 96 98 60
Pass 7: 31 35 60 69 78 92 96 98
Original set:
68 40 85 11 60 38 50 46
Pass 1: 40 68 85 11 60 38 50 46
Pass 2: 40 68 85 11 60 38 50 46
Pass 3: 11 40 68 85 60 38 50 46
Pass 4: 11 40 60 68 85 38 50 46
Pass 5: 11 38 40 60 68 85 50 46
Pass 6: 11 38 40 50 60 68 85 46
Pass 7: 11 38 40 46 50 60 68 85
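Each of the traces above can be reproduced (or your own practice answers checked) with a short Python helper; insertion_sort_passes is my own name for it:

```python
def insertion_sort_passes(a):
    """Return the list state after each pass of insertion sort,
    matching the 'Pass 1 ... Pass 7' traces above."""
    a = list(a)
    snapshots = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
        snapshots.append(list(a))
    return snapshots

# Reproduce the first practice set:
for p, state in enumerate(insertion_sort_passes([72, 52, 77, 83, 44, 65, 51, 40]), 1):
    print("Pass", p, ":", *state)
```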
Some basic question of insertion sort
Question: Insertion sort is
1. in-place 2. out-place 3. stable 4. non-stable
a. 1 and 3 b. 1 and 4 c. 2 and 3 d. 2 and 4
Ans: a
The array is sorted in place because no extra memory is required. A stable sort retains the original ordering of keys when identical keys are present in the input data.
Question: Sorting of playing cards is an example of
a. Bubble sort b. Insertion sort c. Selection sort d. Quick sort
Ans: b
Question: For insertion sort, the number of entries we must index through when there are n elements in the array is
a. n entries b. n*n entries c. n-1 entries d. none of the above
Ans: c
Assuming there are n elements in the array, we must index through n − 1 entries. For each entry we may need to examine and shift up to n − 1 other entries, resulting in an O(n²) algorithm.
Question: What is the average case complexity for the insertion sort algorithm?
a. O(n) b. O(n*n) c. O(n log n) d. O(log n)
Ans: b
Question: What is the worst case complexity for the insertion sort algorithm?
a. O(n) b. O(n*n) c. O(n log n) d. O(log n)
Ans: b
Question: What is the best case complexity for the insertion sort algorithm?
a. O(n) b. O(n*n) c. O(n log n) d. O(log n)
Ans: a
Question: What is the output of insertion sort after the 1st iteration given the following sequence of numbers: 7 3 5 1 9 8 4 6
a. 3 7 5 1 9 8 4 6
b. 1 3 7 5 9 8 4 6
c. 3 4 1 5 6 8 7 9
d. 1 3 4 5 6 7 8 9
Ans: a
Question: The following loop is used for
for i ← 1 to N−1
    for j ← i+1 to N
        if A[j] < A[i]
            temp ← A[j]
            for k ← j downto i+1
                A[k] ← A[k−1]
            A[i] ← temp
a. bubble sort b. selection sort c. insertion sort d. merge sort
Ans: c
Question: The total number of comparisons for insertion sort in the worst case is
a. n(n-1)/2
b. n(n+1)/2
c. n
d. n*n
Ans: a
Question: In which cases are the time complexities same in insertion sort?
a. Worst and Best
b. Best and Average
c. Worst and Average
d. Worst, average and Best
Ans:c
Question: What is the output of insertion sort after the 2nd iteration given the following sequence of numbers: 7 3 5 1 9 8 4 6
a. 3 5 7 1 9 8 4 6
b. 1 3 7 5 9 8 4 6
c. 3 4 1 5 6 8 7 9
d. 1 3 4 5 6 7 8 9
Ans:a
Question: What is the output of insertion sort after the 3rd iteration given the following sequence of numbers: 7 3 5 1 9 8 4 6
a. 3 7 5 1 9 8 4 6
b. 1 3 7 5 9 8 4 6
c. 3 4 1 5 6 8 7 9
d. 1 3 5 7 9 8 4 6
Ans:d
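The three iteration questions above can be checked with a short Python helper (after_k_iterations is my own name; iteration i inserts the element at index i into the sorted prefix):

```python
def after_k_iterations(a, k):
    """Return the array state after the first k insertion-sort
    iterations."""
    a = list(a)
    for i in range(1, k + 1):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

seq = [7, 3, 5, 1, 9, 8, 4, 6]
print(after_k_iterations(seq, 1))  # [3, 7, 5, 1, 9, 8, 4, 6]  -> answer a
print(after_k_iterations(seq, 2))  # [3, 5, 7, 1, 9, 8, 4, 6]  -> answer a
print(after_k_iterations(seq, 3))  # [1, 3, 5, 7, 9, 8, 4, 6]  -> answer d
```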
Question: What changes need to be made to the Insertion Card Sort algorithm so that it can be used to sort numbers on a computer?
Answer: We only need to change the algorithm specifications so that we are
sorting lists of numbers rather than hands of cards.
Question: What operation does the Insertion Sort use to move numbers from the
unsorted section to the sorted section of the list?
Answer: The Insertion Sort uses the swap operation since it is ordering numbers
within a single list.
Question: Consider the following lists of partially sorted numbers. The double bars represent the sort marker. For each list, state how many comparisons and swaps are needed to sort the next number.
[1 3 4 8 9 || 5 2]
[1 3 4 5 8 9 || 2]
[1 3 || 4 9 2 8]
[1 2 3 8 || 7 6]
[2 3 4 6 7 8 9 || 1]
Answer: The comparisons and swaps needed are listed below.
- 3 comparisons, 2 swaps
- 6 comparisons, 5 swaps
- 1 comparison, 0 swaps
- 2 comparisons, 1 swap
- 7 comparisons, 7 swaps
Question: How can the Insertion Sort Algorithm be modified so that it sorts from highest to lowest?
Answer: To modify the Insertion Sort Algorithm so that it sorts from highest to lowest, we need to swap the next unsorted card to the left until we reach a smaller card. The modifications to the original algorithm are marked in italics; notice we must simply redefine "correct sorted position".
Insertion Sort Algorithm (sorting highest to lowest)
- Get a list of unsorted numbers
- Set a marker for the sorted section after the first number in the list
- Repeat steps 4 through 6 until the unsorted section is empty
- Select the first unsorted number
- Swap this number to the left until it arrives in the correct sorted position. (i.e. until the next number to the left is smaller or it has reached the head of the list.)
- Advance the marker to the right one position
- Stop
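The steps above can be sketched in Python. In this sketch (insertion_sort_desc is my own name) the unsorted item is swapped left until the item on its left is larger, exactly as step 5 redefines "correct sorted position":

```python
def insertion_sort_desc(a):
    """Insertion sort, highest to lowest: swap the next unsorted item
    left until the item to its left is larger (or it reaches the head)."""
    for j in range(1, len(a)):
        i = j
        while i > 0 and a[i - 1] < a[i]:
            a[i - 1], a[i] = a[i], a[i - 1]   # swap left, per the card algorithm
            i -= 1
    return a

print(insertion_sort_desc([2, 9, 5, 1, 7]))   # [9, 7, 5, 2, 1]
```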
Algorithm:
// Sort an arr[] of size n
insertionSort(arr, n)
    Loop from i = 1 to n-1:
        a) Pick element arr[i] and insert it into the sorted sequence arr[0…i-1]
Question: Illustrate the working of insertion sort on the array 12, 11, 13, 5, 6.
Answer:
12, 11, 13, 5, 6
Let us loop for i = 1 (second element of the array) to 4 (index of the last element)
i = 1. Since 11 is smaller than 12, move 12 and insert 11 before 12
11, 12, 13, 5, 6
i = 2. 13 will remain at its position as all elements in A[0..i-1] are smaller than 13
11, 12, 13, 5, 6
i = 3. 5 will move to the beginning and all other elements from 11 to 13 will move one position ahead of their current position.
5, 11, 12, 13, 6
i = 4. 6 will move to position after 5, and elements from 11 to 13 will move one position ahead of their current position.
5, 6, 11, 12, 13
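The trace above can be reproduced with a short Python sketch (trace is my own helper name):

```python
def trace(a):
    """Snapshot the array after each insertion, reproducing the
    step-by-step answer above."""
    a = list(a)
    out = []
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
        out.append(list(a))
    return out

for state in trace([12, 11, 13, 5, 6]):
    print(*state)
# 11 12 13 5 6
# 11 12 13 5 6
# 5 11 12 13 6
# 5 6 11 12 13
```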