Unix Review, July 2002

Shell Corner: The Soundex System in C and Shell

Hosted by Ed Schaefer

This month, Mendel Cooper (thegrendel@theriver.com) submits a bash script called soundex.ss. The Soundex system is a hashing algorithm for surnames. It catalogs variants of a given surname, such as Smith, Smyth, and Smythe, and encodes them according to the way they are pronounced. Following Cooper's presentation of soundex.ss, I present my surname tests, a character-class problem with non-GNU versions of the tr command, and a Perl programming nugget.

Cooper explains:

Soundex is an error-minimizing code, as a misspelled name will often have the same or nearly the same Soundex representation as the correct version. This feature tends to mitigate the inevitable errors of workers who listen to and transcribe many names per work shift. It also corrects, to some extent, for typographical drift in name spelling over the years, which can be very useful when tracing one's ancestors (as an example, Smathers --> Smothers --> Smithers ... same Soundex code).

Soundex has been in use for over 80 years -- since well before the adoption of modern information handling techniques -- and can simplify filing procedures if used creatively. The National Archives, the Census Bureau, and some state governmental agencies use Soundex. It is also used for genealogical research -- by hobbyists and by adoptees attempting to find their birth parents.
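To make the "same Soundex code" claim above concrete, here is a minimal bash sketch of the encoding. It is not Cooper's soundex.ss; it assumes the commonly documented American Soundex rules (keep the first letter, map consonants to the digits 1-6, collapse adjacent repeats, ignore H and W, let vowels separate repeats, pad to four characters). The function names soundex and letter_code are hypothetical, chosen only for this illustration.

```shell
#!/bin/bash
# Illustrative Soundex sketch under the assumptions stated above.
# This is NOT the soundex.ss script discussed in the column.

# letter_code (hypothetical helper): print the Soundex digit for one
# uppercase letter, "-" for H and W, and "0" for vowels and Y.
letter_code() {
    case $1 in
        [BFPV])     echo 1 ;;
        [CGJKQSXZ]) echo 2 ;;
        [DT])       echo 3 ;;
        L)          echo 4 ;;
        [MN])       echo 5 ;;
        R)          echo 6 ;;
        [HW])       echo - ;;
        *)          echo 0 ;;
    esac
}

# soundex (hypothetical name): print the four-character code for a surname.
soundex() {
    local name code prev digit len i

    # Uppercase with explicit letter lists (no ranges or classes, so any
    # tr should behave the same), then drop everything that is not A-Z.
    name=$(printf '%s' "$1" |
        LC_ALL=C tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' |
        LC_ALL=C tr -cd 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')
    [ -n "$name" ] || return 1

    code=${name:0:1}               # the first letter is kept as a letter
    prev=$(letter_code "$code")    # but its digit still suppresses a repeat
    len=${#name}

    for (( i = 1; i < len; i++ )); do
        digit=$(letter_code "${name:i:1}")
        case $digit in
            -)  ;;                 # H or W: skipped, prev is left alone
            0)  prev=0 ;;          # vowel: not coded, but separates repeats
            *)  if [ "$digit" != "$prev" ]; then
                    code=$code$digit
                fi
                prev=$digit ;;
        esac
    done

    # Pad with zeros, then truncate to the fixed length of four.
    code=${code}000
    printf '%s\n' "${code:0:4}"
}

for n in Smith Smyth Smythe Smathers Smothers Smithers; do
    printf '%-10s %s\n' "$n" "$(soundex "$n")"
done
```

Under these rules the three Smith variants all encode as S530, and Smathers, Smothers, and Smithers all encode as S536, so each group would be filed together.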
