EFFICIENT TRANSLATION FROM EDIT DISTANCE TO HAMMING DISTANCE
UCLA Technology Available For Licensing

UCLA researchers have developed a new approach to more efficiently search and compare strings for approximate matching according to so-called "edit distance," allowing for much more efficient search for applications such as bioinformatics, music and video search, and data backup and retrieval.

BACKGROUND:  Many data-intensive applications require computationally intensive algorithms for approximate string matching. Examples include text editors, database archiving, internet search-engines, and bioinformatics applications. For example, sequences of DNA or proteins are routinely searched against one another to determine biological similarity. The edit distance between two strings, the minimum number of character changes, insertions and deletes to map from one string to another, is usually hailed as the one of the best measures for accuracy. Unfortunately, calculating edit distances for hundreds of sequences, which is often the case, is extremely inefficient.

Many heuristic algorithms such as BLAST and FASTA have been developed to overcome this inefficiency. However, the innovation disclosed here provides a faster way to handle edit distance (by transforming into a much simpler form) thus potentially speeding up a host of applications that need approximate matching using the edit distance.

INNOVATION:  UCLA researchers have developed a novel method for more efficiently searching and comparing sequences. The innovation lies in a method for mapping the sequences to a new set of strings, whose similarity can be compared using hamming distances instead of edit distances. The hamming distance similarity between these new strings is proportional the similarity of the original sequences, and more importantly, there are already existing far more efficient algorithms that can search and index strings using hamming distances.

POTENTIAL APPLICATIONS: 

DEVELOPMENT-TO-DATE:  The mathematical method has been proven, and accepted by Symposium on Theory of Computing (STOC) 2005. Provisional patent has been filed.

ABOUT THE LAB:   This innovation was created by the researchers associated with the Center for Information and Computation Security (CICS) at UCLA which is involved in a wide range of research involving cryptography and computer security. The web site for the lab is http://www.cs.ucla.edu/security/.

INVENTORS:   Dr. Rafail Ostrovsky is a Professor in the Department of Computer Science, UCLA's Henry Samueli School of Engineering and Applied Science, and is the director of the Center For Information And Computation Security (CICS) at UCLA.

Reference: UCLA Case No. 2005-431 PCT Application Number: WO 2006/094016

For additional technical details and current licensing
availability, please contact the following UCLA office:

UCLA Office of Intellectual Property
11000 Kinross Avenue, Suite #200
Los Angeles, CA 90095-7231
Tel: 310-794-0558 Fax: 310-794-0638
email: ncd@research.ucla.edu
NCD URL:   http://www.research.ucla.edu/tech/ucla05-431.htm

Lead Inventor: Rafail Ostrovsky

UCLA Technologies Available for Licensing
http://www.research.ucla.edu/tech

Copyright © 2005 The Regents of the University of California.

keywords: bioinformatics, music search, video search, data archive, HIS, uclancd ucla technologies intellectual property patents technology transfer invention business card