Trimming (computer programming): Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 12:45, 11 January 2017

In computer programming, trimming (trim) or stripping (strip) is a string manipulation in which leading and trailing whitespace is removed from a string.

For example, the string (enclosed by apostrophes)

'  this is a test  '

would be changed, after trimming, to

'this is a test'
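
The same example can be reproduced in Python as a minimal sketch, using the built-in `str.strip` method, which removes leading and trailing whitespace by default. The variable names are illustrative only.

```python
# The example string from above, with two spaces on each side.
original = "  this is a test  "

# str.strip() returns a new string; the original is left unchanged.
trimmed = original.strip()

print(repr(original))  # '  this is a test  '
print(repr(trimmed))   # 'this is a test'
```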

Variants

Left or right trimming
The most common variants of the trim function strip only the beginning or only the end of the string. They are typically named ltrim and rtrim respectively, or, in Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal and Java do not have these variants built in, although Object Pascal (Delphi) has TrimLeft and TrimRight functions.[1]
Whitespace character list parameterization
Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. With Common Lisp's string-trim function, the parameter (called character-bag) is required. The C++ Boost library defines space characters according to locale, as well as offering variants with a predicate parameter (a functor) to select which characters are trimmed.
Special empty string return value
An uncommon variant of trim returns a special result if no characters remain after the trim operation. For example, the StringUtils class of Apache Commons Lang (formerly part of Apache Jakarta) has a function called stripToNull, which returns null in place of an empty string.
Space normalization
Space normalization is a related string manipulation in which, in addition to removing surrounding whitespace, any sequence of whitespace characters within the string is replaced with a single space. Space normalization is performed by the function named Trim() in spreadsheet applications (including Excel, Calc, Gnumeric, and Google Docs), and by the normalize-space() function in XSLT and XPath.
In-place trimming
While most algorithms return a new (trimmed) string, some alter the original string in place. Notably, the Boost library supports both: it can trim a string in place or return a trimmed copy.
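
Several of the variants above can be illustrated with Python's standard string methods. This is a rough sketch rather than a reference implementation: `strip_to_none` and `normalize_space` are hypothetical helper names chosen here to mirror the behaviors described, not functions from any library.

```python
from typing import Optional

text = "  \t hello   world \n"

# Left or right trimming: only one end of the string is stripped.
left_trimmed = text.lstrip()     # "hello   world \n"
right_trimmed = text.rstrip()    # "  \t hello   world"

# Character list parameterization: the optional argument is a set of
# characters to remove, not a prefix or suffix to match.
custom_trimmed = "xxhixyx".strip("xy")   # "hi"


def strip_to_none(s: str) -> Optional[str]:
    """Hypothetical helper: return None when nothing remains after trimming."""
    stripped = s.strip()
    return stripped if stripped else None


def normalize_space(s: str) -> str:
    """Hypothetical helper: trim and collapse inner whitespace runs to one space."""
    # str.split() with no argument splits on runs of whitespace and
    # discards empty strings at both ends.
    return " ".join(s.split())


print(repr(left_trimmed))
print(repr(right_trimmed))
print(repr(custom_trimmed))
print(strip_to_none("   "))          # None
print(repr(normalize_space(text)))   # 'hello world'
```

Python strings are immutable, so each of these calls returns a new string; in-place trimming applies only in languages with mutable string types.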

Definition of whitespace

The characters that are considered whitespace vary between programming languages and implementations. For example, C traditionally counts only space, tab, line feed, and carriage return characters, while languages that support Unicode typically include all Unicode space characters. Some implementations also include ASCII control codes (non-printing characters) along with whitespace characters.

Java's trim method considers ASCII spaces and control codes as whitespace, contrasting with the Java isWhitespace() method,[2] which recognizes all Unicode space characters.

Delphi's Trim function considers characters U+0000 (NULL) through U+0020 (SPACE) to be whitespace.
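
To make the difference between these definitions concrete, the sketch below emulates in Python a trim that treats every character from U+0000 through U+0020 as whitespace, as described above for Delphi, and compares it with Python's own Unicode-aware `str.strip`. The function name `trim_through_space` is an assumption made for this illustration.

```python
def trim_through_space(s: str) -> str:
    """Hypothetical helper: trim characters with code points U+0000..U+0020."""
    start, end = 0, len(s)
    # Advance past leading characters at or below U+0020 (SPACE).
    while start < end and s[start] <= "\u0020":
        start += 1
    # Step back over trailing characters at or below U+0020.
    while end > start and s[end - 1] <= "\u0020":
        end -= 1
    return s[start:end]


# Leading NULL and space, trailing U+2003 EM SPACE.
sample = "\x00 hi\u2003"

control_code_style = trim_through_space(sample)
unicode_style = sample.strip()

print(repr(control_code_style))  # 'hi\u2003'  (NULL removed, EM SPACE kept)
print(repr(unicode_style))       # '\x00 hi'   (EM SPACE removed, NULL kept)
```

The two results differ at both ends of the same input, which is why code ported between languages can change behavior even when the function is called trim in both.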

Non-space blanks

The Braille Patterns Unicode block contains U+2800 BRAILLE PATTERN BLANK (HTML &#10240;), a Braille pattern with no dots raised. The Unicode standard explicitly states that it does not act as a space.
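
As an illustration of how one Unicode-aware implementation handles this character, the following Python sketch checks U+2800 with the standard `unicodedata` module. It shows Python's behavior only; other languages may differ.

```python
import unicodedata

blank = "\u2800"  # BRAILLE PATTERN BLANK

# The character is classified as a symbol ("So"), not as a space separator.
category = unicodedata.category(blank)
is_space = blank.isspace()

padded = blank + "hi" + blank
default_trim = padded.strip()        # unchanged: U+2800 is not whitespace
explicit_trim = padded.strip(blank)  # removed only when named explicitly

print(category, is_space)      # So False
print(default_trim == padded)  # True
print(repr(explicit_trim))     # 'hi'
```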

Usage

References

  1. ^ "Trim". Freepascal.org. 2013-02-02. Retrieved 2013-08-24.
  2. ^ "Character (Java 2 Platform SE 5.0)". Java.sun.com. Retrieved 2013-08-24.