Fundamental text manipulations in SQL
Learn practical MySQL string functions for cleaning, extracting, and combining text.
Context
Text data is often inconsistent: mixed casing, extra whitespace, and values packed into a single column from multiple source systems.
String functions help, but the real skill is knowing which assumptions your query depends on. Positional extraction is fast and simple when the format is fixed; delimiter-based parsing is safer when the structure is stable but the length is not. If neither is true, the data likely needs to be normalized upstream.
Trimming
TRIM()removes leading and trailing spaces.LTRIM()removes leading spaces.RTRIM()removes trailing spaces.
A small but important detail: trimming is often a data-quality boundary, not just formatting. Use it when comparing values or storing canonicalized text, especially before joins or uniqueness checks.
SET @value = ' hello ';
SELECT TRIM(@value), LTRIM(@value), RTRIM(@value)
FROM dual;
Replacing
REPLACE(input, from, to)substitutes every occurrence of one substring with another.
This is useful for normalization, but it is literal replacement, not pattern matching. If you need rules like “remove all non-digits,” you need a different tool.
select REPLACE('John Doe', ' ', '') from dual;
Extracting parts of complex strings
LEFT(input, n)extracts the firstncharacters.RIGHT(input, n)extracts the lastncharacters.SUBSTRING_INDEX(input, delimiter, count)extracts text before or after a delimiter.- A negative count works from the right side.
Suppose one column stores both currency and amount:
SET @value = 'PLN 150.00';
If the format is strictly fixed — three-letter currency code, one space, then the amount — LEFT() is simple:
SELECT LEFT(@value, 3) FROM dual; -- PLN
But the fragile part is the assumption: this only works if the currency code is always exactly three characters. If the format changes, positional slicing breaks silently. A more robust option is delimiter-based parsing:
SELECT SUBSTRING_INDEX(@value, ' ', 1) AS currency,
SUBSTRING_INDEX(@value, ' ', -1) AS monetary_value
FROM dual;
This is usually better when the delimiter is stable and the field lengths are not. One caveat: if the delimiter is missing, SUBSTRING_INDEX() returns the whole string, so validate the input if malformed values are possible.
To compute on the amount, cast it to a numeric type:
SELECT CAST(SUBSTRING_INDEX(@value, ' ', -1) AS DECIMAL(10, 2)) FROM dual;
Alternately, RIGHT could be combined with LENGTH function to extract the monetary value:
SELECT RIGHT(@value, LENGTH(@value) - 4) FROM dual;
Also note the difference between LENGTH() and CHAR_LENGTH() in MySQL: LENGTH() returns bytes, while CHAR_LENGTH() returns characters.
If your data can contain multibyte characters, prefer CHAR_LENGTH() when you mean “visible characters.”
Concatenation
CONCAT(...inputs)joins strings together.CONCAT_WS(delimiter, ...inputs)joins strings using a delimiter and skips NULL values.
That NULL behaviour is the main design difference: CONCAT() returns NULL if any argument is NULL, while CONCAT_WS() is often safer for optional fields such as middle names or address lines.
SET @firstName = 'John';
SET @lastName = 'Doe';
SELECT CONCAT(@firstName, ' ', @lastName),
CONCAT_WS(' ', @firstName, @lastName)
FROM dual;