Sorting and searching algorithms

After studying this section you should be able to:

  • understand algorithms to search for an item
  • describe methods of sorting data
  • explain how to merge two sets of data

The different types of sort and search

Searching algorithms are used to find an item in a list. The algorithm will be given the key of the item to be found and the search will work through the list looking

for a match.

Sorting algorithms are used to place a list of items into key sequence (either ascending or descending).

Linear search

A procedure to search an ordered list of items for an item can be performed by starting with the first item and proceeding through the list until the item is found.

The search is completed when the item is found or when it comes across an item that has a larger valued key than the required item. An algorithm to search an array

Arr might be:

PROCEDURE LinearSearch(RequiredItem)

     Found := false

     Failed := false

     Current := 1

     WHILE (Current <= MaxSize) AND (Found = false) AND (Failed = false)

          IF RequiredItem = Arr[Current] THEN Found := true

          ELSE

               IF RequiredItem < Arr[Current] THEN

                    Add 1 to Current

                    IF Current > MaxSize THEN Failed := true

                    ENDIF

               ELSE Failed := true

               ENDIF

          ENDIF

     ENDWHILE

ENDPROC

Binary search

A linear search requires that, on average, half the items will be inspected. A binary search is more efficient. It operates on the principle that the middle item divides the list into two halves. By inspecting the middle item the program can establish which half contains the required item. The algorithm continues by repeating the process on the half that contains the required item. A pseudocode procedure to perform this process is given below.

A binary search is dramatically faster than a linear search when there are a large number of items.

/* Low is the number of the first element in the array and

High is the number of the last element */

PROCEDURE BinarySearch(Low, High, RequiredItem)

     Found := False

     REPEAT

          Middle := Integer part of ((Low + High)/2)

          IF Arr[Middle] = RequiredItem

               THEN Found := true

          ELSE IF Arr[Middle] > RequiredItem

                    THEN High := Middle – 1

                ELSE Low := Middle + 1

                ENDIF

          ENDIF

       UNTIL (Found = True) OR (Low > High)

ENDPROC

Merging

It is often necessary to merge two lists of items. This can only be done efficiently if the lists are both in the same sequence. The algorithm is simpler if terminating values that are larger than any value in the lists are added to the end of the lists before the merge takes place. A pseudocode procedure to merge two lists into a third is as follows:

PROCEDURE Merge(Arr1,Arr2,Arr3)

     Current1 := 1

     Current2 := 1

     Current3 := 1

     WHILE Arr1[Current1] is not a terminator

          OR Arr2[Current2] is not a terminator do

          IF Arr1[Current1] < Arr2[Current2]

               THEN

                    Arr3[Current3] := Arr1[Current1]

                    Add 1 to Index1

               ELSE

                    Arr3[Current3] := Arr2[Current2]

                    Add 1 to Index2

          ENDIF

          Add 1 to Index3

     ENDWHILE

ENDPROC

Bubble sort

The bubble sort is not very efficient but it is simple to program and a useful algorithm when there are not many items involved. The bubble sort performs a number of passes through the items, each pass putting the items into a slightly better sequence. During a pass the algorithm compares each item with the following item and, if they are out of sequence, it swaps their positions. The effect is shown below.

Merging can equally apply to sequential files. The bubble sort is so called as the small numbers bubble slowly to the surface while the large numbers drop like stones to the bottom.

Image

The maximum number of passes required to obtain a correct sequence is N–1, where N is the number of items. The algorithm can be improved by halting if no items are swapped during a pass. This improved algorithm might be:

PROCEDURE Bubblesort

     Pass := 1

     REPEAT

          SwapFlag := false

          FOR Count := 1 TO NoOfItems

               IF Arr[Count] > Arr[Count+1] THEN

                    Temp := Arr[Count]

                    Arr[Count] := Arr[Count+1]

                    Arr[Count+1] := Temp

                   SwapFlag := true

              ENDIF

          ENDFOR

          Add 1 to Pass

     UNTIL Pass = NoOfItems OR SwapFlag = false

ENDPROC

Insertion sort

The insertion sort is an improvement on the bubble sort but it is still relatively slow and suitable only when there is a limited number of items. The items are copied into a new array one at a time. Each item is inserted in the correct place and all the succeeding items that are already in the new array are moved up one place to make space. It is generally more efficient to start at the end of the list, rather than the start, when inserting.

Quicksort

As its name might suggest, the quicksort is, indeed, very fast. It operates as follows:

  • 1 - Select an item at random (often the last item).
  • 2 - Create two new lists of items, one with all the items smaller than the chosen item and the other with the rest.
  • 3 - Repeat step 1 with each list until the lists have a single item.

You should not be asked to write a quicksort algorithm but you may have to trace one.

Category
sign up to revision world banner
Southampton University
Slot