Removing part of string before a sub-string and its first occurrence

I have a text output like:

str <- '=== AAAA === B§BBB === remove === remove1 === remove2 === AAAA === AAAA'

I would like to remove everything from the first === remove (inclusive) up to the next === AAAA, so that I get:

str_2 <- '=== AAAA === B§BBB === AAAA === AAAA'

I tried gsub():

gsub("=== B§BBB*.*=== AAAA","",str)

But it doesn't work.

Any help is appreciated.

2 Answers

Answer to the updated question

str <- '=== AAAA === B§BBB === remove === remove1 === remove2 === AAAA === AAAA'
sub("(?:\\s*===\\s*remove\\S*)+", "", str)

See the R demo online and an online regex demo.

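The pattern matches one or more consecutive occurrences of the following sequence:

\s* - zero or more whitespace characters
=== - a literal === substring
\s* - zero or more whitespace characters
remove - a literal remove substring
\S* - zero or more non-whitespace characters (this covers suffixes such as the 1 in remove1)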
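As a quick check of the explanation above, the following sketch prints the part of the string that the pattern matches and then the result of sub(). The variable names pattern, matched and result are only illustrative.

```r
str <- '=== AAAA === B§BBB === remove === remove1 === remove2 === AAAA === AAAA'

# In an R string literal every regex backslash has to be doubled
pattern <- "(?:\\s*===\\s*remove\\S*)+"

# Extract the first (and here only) run that the pattern matches
matched <- regmatches(str, regexpr(pattern, str))
print(matched)

# sub() replaces only that first run and leaves the rest untouched
result <- sub(pattern, "", str)
print(result)
```

Because the group is repeated with +, the three adjacent "=== remove..." items are consumed as a single match, which is why sub() is enough here and gsub() is not required.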

Answer to the original question

You may use

sub("=== remove.*?(\\n\\s*?=== AAAA)", "\\1", str)

Details

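=== remove - a literal substring
.*? - any zero or more characters, as few as possible
(\n\s*?=== AAAA) - capturing group 1 (\1 in the replacement refers to this value): a newline, zero or more whitespace characters (\s*?, as few as possible), and the literal === AAAA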
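The original, newline-separated input is not shown in this thread, so the sketch below uses a hypothetical multi-line string (str_nl is an assumed name and an assumed layout) to illustrate how the lazy match and the \1 backreference behave with R's default regex engine.

```r
# Hypothetical input: one "=== ..." item per line (assumed, not from the question)
str_nl <- "=== AAAA\n=== B§BBB\n=== remove\n=== remove\n=== AAAA\n=== AAAA"

# Match from the first "=== remove" up to the nearest newline + "=== AAAA",
# then put the captured newline + "=== AAAA" back via \\1
result <- sub("=== remove.*?(\\n\\s*?=== AAAA)", "\\1", str_nl)

cat(result, "\n")
```

Note that the newline in front of the first "=== remove" is not part of the match, so with this assumed input an empty line is left where the removed items used to be.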

An alternative PCRE regex can also be used:

sub("(?m)(?:(?:^|\\R)\\h*===\\h*remove)+", "", str, perl=TRUE)

Details

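(?m) - multiline mode, in which ^ matches at the start of each line
(?:(?:^|\R)\h*===\h*remove)+ - one or more consecutive occurrences of:
  (?:^|\R) - the start of a line or a line break sequence
  \h*===\h* - === surrounded by zero or more horizontal whitespace characters
  remove - a literal remove substring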
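A minimal sketch of the PCRE variant, again on a hypothetical newline-separated input (str_nl and its layout are assumptions, since the original multi-line string is not shown here):

```r
# Hypothetical input: one "=== ..." item per line (assumed, not from the question)
str_nl <- "=== AAAA\n=== B§BBB\n=== remove\n=== remove\n=== AAAA\n=== AAAA"

# perl = TRUE switches to PCRE, which is needed for (?m), \R and \h
result <- sub("(?m)(?:(?:^|\\R)\\h*===\\h*remove)+", "", str_nl, perl = TRUE)

cat(result, "\n")
```

Since \R consumes the line break in front of each "=== remove" line, the removed lines disappear entirely and no empty line is left behind.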

I am sorry, I don't have a newline in this string. Editing my question @Wiktor
– thequietus
Aug 22 at 13:02

@thequietus I updated the answer.
– Wiktor Stribiżew
Aug 22 at 18:09

You could use the stringi package.

library(stringi)
stri_replace_all_fixed(str, " === remove", "")
[1] "=== AAAA === B§BBB === AAAA === AAAA"
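If the text after "remove" can vary, a fixed pattern no longer covers every item. One possible adaptation, offered only as a sketch, is to switch to stri_replace_all_regex() with a pattern that allows any trailing non-whitespace characters. The base R fallback is there only so the snippet still runs when stringi is not installed.

```r
str <- '=== AAAA === B§BBB === remove === remove1 === remove2 === AAAA === AAAA'

# Optional whitespace, "===", optional whitespace, "remove" plus any suffix
pattern <- "\\s*===\\s*remove\\S*"

if (requireNamespace("stringi", quietly = TRUE)) {
  cleaned <- stringi::stri_replace_all_regex(str, pattern, "")
} else {
  # Base R equivalent: gsub() replaces every match of the same pattern
  cleaned <- gsub(pattern, "", str)
}

print(cleaned)
```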

I edited the question: the "remove" string may also vary, e.g. "remove1" or anything else, @milan. Sorry about that.
– thequietus
Aug 22 at 13:27
