Using Regular Expressions in SAP ABAP (REPLACE, FIND REGEX)

Regular Expressions in SAP ABAP

Regular expressions can make your ABAP code shorter and often simpler. This article covers the FIND REGEX and REPLACE statements, with samples and a real-world case.

In this article, we will start with the common Regular Expressions Operators used in ABAP.

Then, you will find some useful ABAP statements that use regular expressions, with details on the two main ones: FIND REGEX and REPLACE.

List of Regular Expressions Operators (in ABAP)

Let’s start with the list of regular expression operators.

Operator   Purpose
.          Matches any single character.
?          Denotes zero or one occurrence of a character or set of characters.
*          Denotes any number of occurrences (zero, one, or more) of a character or set of characters.
+          Matches one or more occurrences of a character or set of characters.
\<         Matches the start of a word.
\>         Matches the end of a word.
^          Marks the start of a line; as the first character inside square brackets, it negates the character set.
(?=...)    Used as a preview (lookahead) condition.
(?!...)    Used as a negated preview (lookahead) condition.
\1         Placeholder for the register of the first subgroup.
\2         Placeholder for the register of the second subgroup.
$          Denotes the end of a line.
\d         Denotes a digit (0-9).
\w         Denotes an alphanumeric character, including the underscore.
\u         Matches a single uppercase letter.
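
To make the subgroup placeholders from the table more concrete, here is a minimal sketch. The variable names and sample values are assumptions chosen for illustration. Inside the pattern, \1 refers back to what the first subgroup matched; in the replacement text of REPLACE, the subgroups are addressed as $1, $2, and so on.

```abap
DATA: lv_date TYPE string VALUE '2024-03-15',
      lv_off  TYPE i,
      lv_len  TYPE i.

" Three subgroups capture year, month and day; $3, $2 and $1 reuse
" them in the replacement text in reversed order.
REPLACE FIRST OCCURRENCE OF REGEX '(\d{4})-(\d{2})-(\d{2})'
        IN lv_date WITH '$3.$2.$1'.
" lv_date now contains '15.03.2024'.

" \1 inside the pattern repeats whatever the first subgroup matched,
" so this search finds a doubled character ('tt' in 'letter').
FIND REGEX '(\w)\1' IN 'letter'
     MATCH OFFSET lv_off
     MATCH LENGTH lv_len.
" lv_off is 2 and lv_len is 2.
```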

Useful ABAP Statement using Regular Expressions

Below are some useful applications of regular expressions in ABAP.


REPLACE OF REGEX

Syntax of REPLACE pattern IN

The syntax of REPLACE pattern IN in ABAP is the following:

REPLACE [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
        IN [section_of] dobj WITH new
        [IN {CHARACTER|BYTE} MODE]
        [replace_options].

Look at this example of how to use REPLACE with regular expressions.
This ABAP snippet replaces every match of u* in the text string with x. Because u* also matches the empty string, an x is inserted at the empty matches as well, not only in place of the u characters.

DATA text TYPE string VALUE '-uu-'.

REPLACE ALL OCCURRENCES OF REGEX 'u*' IN text WITH 'x'.
" After the replacement, text contains the value "x-xx-x".
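
For comparison, here is a small sketch of the same replacement with u+ instead of u*. It reuses the sample value from the example above. Since u+ needs at least one u, there are no empty matches and only the run of u characters is replaced.

```abap
DATA text TYPE string VALUE '-uu-'.

" u+ requires at least one u, so only the run 'uu' is matched
" and replaced by a single x.
REPLACE ALL OCCURRENCES OF REGEX 'u+' IN text WITH 'x'.
" text now contains '-x-'.
```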

Exceptions to REPLACE OF REGEX

To handle regular expression errors in ABAP, SAP provides three catchable exception classes.

Here is the list of exceptions to REPLACE OF REGEX:

  • CX_SY_INVALID_REGEX
    • Cause: Illegal expression after the addition REGEX.
      Runtime Error: INVALID_REGEX
  • CX_SY_REGEX_TOO_COMPLEX
    • Cause: The regular expression is too complex to be processed. More information: Exceptions in Regular Expressions.
      Runtime Error: REGEX_TOO_COMPLEX
  • CX_SY_INVALID_REGEX_FORMAT
    • Cause: Invalid replacement pattern after the addition WITH.
      Runtime Error: INVALID_FORMAT
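
The exceptions listed above can be caught with a TRY block. The following is only a sketch: the variable names and the deliberately malformed pattern are assumptions for illustration. The pattern is held in a variable because these exceptions are relevant mainly when the expression is only known at runtime.

```abap
DATA: lv_pattern TYPE string VALUE '([a-z',   " deliberately malformed
      lv_text    TYPE string VALUE 'abc'.

TRY.
    REPLACE ALL OCCURRENCES OF REGEX lv_pattern IN lv_text WITH 'x'.
  CATCH cx_sy_invalid_regex.
    " The expression after REGEX is not valid.
    WRITE: / 'Invalid regular expression'.
  CATCH cx_sy_regex_too_complex.
    " The expression is valid but too complex to be processed.
    WRITE: / 'Regular expression too complex'.
  CATCH cx_sy_invalid_regex_format.
    " The replacement pattern after WITH is not valid.
    WRITE: / 'Invalid replacement pattern'.
ENDTRY.
```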

Removing Extra Characters in a Phone Number using REPLACE with REGEX in ABAP

Here is another useful example. To remove all non-numeric characters from a phone number, use the following line of ABAP code:

REPLACE ALL OCCURRENCES OF REGEX '([^\d])'
   IN LV_PHONE WITH ''.
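
Put into a complete snippet, the statement above could look like this. The sample phone number is an assumption for illustration.

```abap
DATA lv_phone TYPE string VALUE '+1 (555) 010-9999'.

" [^\d] matches any character that is not a digit; every such
" character is replaced by nothing, so only the digits remain.
REPLACE ALL OCCURRENCES OF REGEX '([^\d])'
   IN lv_phone WITH ''.
" lv_phone now contains '15550109999'.
```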

FIND REGEX statement in ABAP

With the REGEX addition, the FIND statement searches for a match with the regular expression specified in regex.

Character-like expression position

For regex, you can specify either a character-like operand that contains a valid regular expression when the statement is executed, or an object reference variable that points to an instance of the class CL_ABAP_REGEX.

If specified directly, regex is a character-like expression position.
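
As a sketch of the second option, the snippet below passes an instance of CL_ABAP_REGEX to FIND instead of a character-like operand. The variable names and sample text are assumptions for illustration.

```abap
DATA: lo_regex TYPE REF TO cl_abap_regex,
      lv_text  TYPE string VALUE 'Invoice 4711',
      lv_off   TYPE i,
      lv_len   TYPE i.

" Build the regular expression object once; it can then be reused
" in several FIND or REPLACE statements.
CREATE OBJECT lo_regex
  EXPORTING
    pattern = '\d+'.

FIND REGEX lo_regex IN lv_text
     MATCH OFFSET lv_off
     MATCH LENGTH lv_len.
" The digits '4711' are found: lv_off is 8 and lv_len is 4.
```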

A regular expression lets you describe search patterns with additional conditions, including preview (lookahead) conditions.

The occurrences are determined according to the “leftmost-longest” rule.

ABAP Regex: Multiple Matches

Of all the possible matches between the regular expression and the searched character string, the substring starting at the leftmost position is selected.

If there are multiple matches at this position, the longest of these substrings is selected.

An empty character string in regex is not a valid regular expression and raises an exception. A character string is empty if regex is either an empty string or is of type c, n, d, or t and contains only blanks.

Some regular expressions that are not empty, such as a*, are used to search for empty character strings. This is possible when searching for the first occurrence or all occurrences.

The relevant empty substrings are found before the first character, between all characters, and after the last character of the search range. A search of this type is always successful.

A regular expression can have correct syntax but be too complex for the execution of the FIND statement, which raises a handleable exception of the class CX_SY_REGEX_TOO_COMPLEX. Refer to Exceptions in Regular Expressions.
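
A short sketch of such an always-successful search follows. The searched text 'xyz' is an assumption for illustration and contains no a at all.

```abap
DATA: moff TYPE i,
      mlen TYPE i.

" a* matches the empty string, so the search succeeds even though
" 'xyz' contains no a.
FIND REGEX 'a*' IN 'xyz'
     MATCH OFFSET moff
     MATCH LENGTH mlen.
" sy-subrc is 0; the empty match lies before the first character,
" so moff is 0 and mlen is 0.
```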

Example of FIND REGEX

The following search finds the substring ‘ababb’ at offset 3, so moff is 3 and mlen is 5.

Because of the “leftmost-longest” rule, the other matching substring ‘babboo’, which starts at offset 4, is not found.

DATA: moff TYPE i, 
      mlen TYPE i. 

FIND REGEX 'a.|[ab]+|b.*' IN 'oooababboo' 
     MATCH OFFSET moff 
     MATCH LENGTH mlen.
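
MATCH OFFSET and MATCH LENGTH return a single match. To collect every match, FIND ALL OCCURRENCES can be combined with the RESULTS addition, as in this sketch. The variable names and sample text are assumptions for illustration.

```abap
DATA: lv_text    TYPE string VALUE 'Order 4711, item 20, qty 3',
      lt_matches TYPE match_result_tab,
      ls_match   TYPE match_result.

" Each row of lt_matches holds the offset and length of one match.
FIND ALL OCCURRENCES OF REGEX '\d+' IN lv_text
     RESULTS lt_matches.

LOOP AT lt_matches INTO ls_match.
  " Writes 4711, 20 and 3, each on its own line.
  WRITE: / lv_text+ls_match-offset(ls_match-length).
ENDLOOP.
```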

Another example, where a single regular expression replaces a chain of IF checks:

FIND REGEX '[0-9]' IN LV_DATA.
IF SY-SUBRC = 0.
 WRITE: 'Numeric Found'.
ELSE.
 WRITE: 'No Numeric Found'.
ENDIF.

Note also that you can use FIND REGEX with the IGNORING CASE addition if the difference between uppercase and lowercase does not matter in your case.

You can also combine several alternatives with OR (|) in the regular expression.
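
Both options can be combined, as in this minimal sketch. The message text and the two keywords are assumptions for illustration.

```abap
DATA lv_msg TYPE string VALUE 'WARNING: disk almost full'.

" The pattern matches either 'error' or 'warning'; IGNORING CASE
" makes it match 'WARNING' as well.
FIND REGEX 'error|warning' IN lv_msg IGNORING CASE.
IF sy-subrc = 0.
  WRITE: 'Problem reported'.
ELSE.
  WRITE: 'No problem reported'.
ENDIF.
```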