Failure to canonicalize input can introduce a vulnerability, and so can inadvertently canonicalizing input more than once.
Canonicalization is the process of transforming a potentially flexible data structure into one that has guaranteed characteristics. It is a common technique in input validation. For example, the same input characters can be encoded in many ways, ranging from 7-bit ASCII to variable-width multibyte Unicode encodings. Before a program uses such input, it frequently must transform the input into a canonical form that is universal within the context of the program. Otherwise, even simple text operations (e.g., length, equality, ordering) cannot be performed reliably.
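As a minimal sketch of why text comparisons need a canonical form, the following Python example decodes two byte sequences that render as the same word and compares them before and after normalization. The choice of UTF-8 and of Unicode NFC as the canonical form is an assumption made for illustration, and `canonical_text` is a hypothetical helper name; a real program would pick the form that suits its own context.

```python
import unicodedata


def canonical_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode strictly, then normalize to NFC (assumed canonical form)."""
    # errors="strict" raises UnicodeDecodeError on malformed input
    # instead of silently repairing it.
    text = raw.decode(encoding, errors="strict")
    return unicodedata.normalize("NFC", text)


# Two encodings of the same visible word.
composed = "caf\u00e9".encode("utf-8")      # e-acute as a single code point
decomposed = "cafe\u0301".encode("utf-8")   # 'e' followed by a combining accent

raw_equal = composed == decomposed                                     # False
canonical_equal = canonical_text(composed) == canonical_text(decomposed)  # True
canonical_length = len(canonical_text(decomposed))                     # 4
```

Without the normalization step, the equality test and the length of the decomposed string (5 code points) would both disagree with what a user sees on screen.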
Failure to Canonicalize (When It Was Needed)
When input with identical semantics can be supplied in multiple syntaxes, then it is usually wise to define one of the syntaxes as "canonical" and transform all of the other representations into that one before using the input. Even better is to disallow all input that is not canonical [Hoglund 04].
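The stricter option, rejecting anything that is not already canonical, can be sketched as follows. The example uses POSIX-style file paths as the input with multiple syntaxes; that choice, the function name `require_canonical_path`, and the sample paths are assumptions for illustration only. It relies on `posixpath.normpath`, which collapses `.`, `..`, repeated slashes, and trailing slashes without touching the file system.

```python
import posixpath


def require_canonical_path(path: str) -> str:
    """Return the path unchanged if it is canonical; otherwise reject it.

    "Canonical" here is an assumed definition: an absolute POSIX-style
    path that normalization leaves unchanged.
    """
    # normpath preserves exactly two leading slashes, so reject those first.
    if not path.startswith("/") or path.startswith("//"):
        raise ValueError(f"not a canonical absolute path: {path!r}")
    if posixpath.normpath(path) != path:
        raise ValueError(f"not in canonical form: {path!r}")
    return path


accepted = require_canonical_path("/srv/data/report.txt")

rejected = []
for candidate in ("/srv/data/../secret.txt", "/srv//data", "/srv/data/", "data/x"):
    try:
        require_canonical_path(candidate)
    except ValueError:
        rejected.append(candidate)
```

Rejecting rather than repairing means that later code never sees two spellings of the same resource, so checks made on the string remain valid for the resource it names.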
Redundant Canonicalization (Which Is Not Idempotent)
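One way a non-idempotent canonicalization step can cause trouble is percent-decoding applied twice. The sketch below is illustrative only and is not a reconstruction of any specific vulnerability: the input string and the `is_safe` check are hypothetical, while `urllib.parse.unquote` is the standard-library decoder and performs a single decoding pass per call.

```python
from urllib.parse import unquote


def is_safe(path: str) -> bool:
    """Hypothetical check: reject any parent-directory segment."""
    return ".." not in path.split("/")


raw = "/files/%252e%252e/secret.txt"

once = unquote(raw)    # '/files/%2e%2e/secret.txt'
twice = unquote(once)  # '/files/../secret.txt'

# The check is made after the first decoding pass and succeeds...
checked_ok = is_safe(once)        # True
# ...but a later component decodes again and sees a different string.
still_ok = is_safe(twice)         # False

# Decoding is not idempotent for this input: a second pass changes it.
idempotent_here = unquote(once) == once   # False
```

Under these assumptions, the remedy is to decode exactly once, validate the result, and pass that validated value onward without decoding it again.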
References
Hoglund, Greg & McGraw, Gary. Exploiting Software: How to Break Code. Boston, MA: Addison-Wesley, 2004.
Howard, Michael & LeBlanc, David. Writing Secure Code. 2nd ed. Redmond, WA: Microsoft Press, 2002.
MacInnis, Ken. Vulnerability Note VU#580299: Microsoft Internet Explorer contains URL decoding cross-domain vulnerability. June 14, 2005. http://www.kb.cert.org/vuls/id/580299.
Copyright © Carnegie Mellon University 2005-2012.
This material may be reproduced in its entirety, without modification, and freely distributed in written or electronic form without requesting formal permission. Permission is required for any other use. Requests for permission should be directed to the Software Engineering Institute at firstname.lastname@example.org.
The Build Security In (BSI) portal is sponsored by the U.S. Department of Homeland Security (DHS), National Cyber Security Division. The Software Engineering Institute (SEI) develops and operates BSI. DHS funding supports the publishing of all site content.