Catch regex catastrophic backtracking Node.js




I have this regex in my Node.js script:


const commentPattern = new RegExp(
  '(\\/\\*([^*]|[\\r\\n]|(\\*+([^*/]|[\\r\\n])))*\\*+/)|(//.*)',
  'g'
);



which I use to extract comments from open source Java projects.



I've found that some commits make my script hang. This is due to catastrophic backtracking, and I'm looking for a way to catch or prevent it so that my code keeps running after such cases.



Here is an example of code that blocks the execution of the script:


import android.content.res.Resources;
import android.os.Handler;
import android.preference.PreferenceFragment;
import android.view.ViewGroup;
* Provides the regex to identify domain HTTP(S) protocol and/or 'www' sub-domain.
*
* Used to format user-facing @link String's in certain preferences.
*/
public static final String ADDRESS_FORMAT_REGEX = "^(https?://(w3)?|www\\.)";

/**
// Used to ensure that settings are only fetched once throughout the lifecycle of the fragment
private boolean mShouldFetch;

public View onCreateView(@NonNull LayoutInflater inflater,
ViewGroup container,
Bundle savedInstanceState)
// use a wrapper to apply the Calypso theme
Context themer = new ContextThemeWrapper(getActivity(), R.style.Calypso_SiteSettingsTheme);
LayoutInflater localInflater = inflater.cloneInContext(themer);
View view = super.onCreateView(localInflater, container, savedInstanceState);

if (view != null)
setupPreferenceList((ListView) view.findViewById(android.R.id.list), getResources());


return view;


@Override
public void onChildViewAdded(View parent, View child)
if (child.getId() == android.R.id.title && child instanceof TextView)
// style preference category title views
TextView title = (TextView) child;
WPPrefUtils.layoutAsBody2(title);
else
// style preference title views
TextView title = (TextView) child.findViewById(android.R.id.title);
if (title != null) WPPrefUtils.layoutAsSubhead(title);



@Override
public void onChildViewRemoved(View parent, View child)
// NOP


@Override



I'm using Node.js v8.6.0; I also tried v9.8.0.





To match multiline comments, use RegExp('/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/', 'g')
– Wiktor Stribiżew
Mar 20 at 17:20







Together with a single-line comment, it will look like RegExp('/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/|//.*', 'g'). You should be aware that this regex might not work correctly in 100% of cases (e.g. it can match inside string literals).
– Wiktor Stribiżew
Mar 20 at 17:44








Unfortunately it does not work with this code :( regex101.com/r/GaWSyh/1. As you can see, the code is pretty strange from a comments point of view: there is a regex which contains //, a comment that starts but is never terminated (/**), and other regular comments.
– Riccardo
Mar 20 at 20:38





You did not use my regex pattern in the regex tester; here is my pattern fiddle. As I have already mentioned, // and /*...*/ will also be found in string literals. You cannot parse code safely with a single regex, only under some assumptions.
– Wiktor Stribiżew
Mar 20 at 21:45








You are right, I probably made some mistakes while trying to escape some characters. Thank you!
– Riccardo
Mar 21 at 7:46




1 Answer



You can't safely parse code with a single regex, so fixing the catastrophic backtracking won't really solve the issue.



Using a JavaScript code parser would be the right solution.
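To give a rough idea of what a parser-style approach involves, here is a minimal single-pass scanner that skips string and character literals before looking for comments. It is a sketch under assumptions, not a full Java lexer: the name extractComments is hypothetical, text blocks and Unicode escapes are ignored, and an unterminated /* comment is reported up to the end of the input. Because it never backtracks, its running time is linear in the input length.

```javascript
// Hypothetical helper: a single-pass scanner, not a complete Java lexer.
function extractComments(source) {
  const comments = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === '"' || ch === "'") {
      // Skip a string or char literal, honoring backslash escapes.
      const quote = ch;
      i++;
      while (i < source.length && source[i] !== quote && source[i] !== '\n') {
        if (source[i] === '\\') i++; // skip the escaped character
        i++;
      }
      i++; // step past the closing quote (or the newline)
    } else if (ch === '/' && next === '/') {
      // Line comment: runs to the end of the line.
      let end = source.indexOf('\n', i);
      if (end === -1) end = source.length;
      comments.push(source.slice(i, end).replace(/\r$/, ''));
      i = end;
    } else if (ch === '/' && next === '*') {
      // Block comment: runs to the next "*/", or to end of input if unterminated.
      const close = source.indexOf('*/', i + 2);
      const end = close === -1 ? source.length : close + 2;
      comments.push(source.slice(i, end));
      i = end;
    } else {
      i++;
    }
  }
  return comments;
}
```

Unlike a regex, this keeps track of whether the current position is inside a literal, so a // inside a quoted URL is not reported as a comment.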



If you are fine with matching comment-like substrings inside string literals, comments, etc., you may use


var rx = new RegExp('/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/|//.*', 'g')



See the online JS regex demo. Note that the RegExp constructor is preferable here because the pattern contains many / characters; as a consequence, all regex escapes are written with double backslashes.
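As a quick local check of this kind of pattern, the sketch below runs it against a small made-up Java snippet (the javaSource sample is an assumption for illustration, not code from the question). It shows both the intended block-comment match and the string-literal false positive mentioned in the comments.

```javascript
// Pattern built with the RegExp constructor, so backslashes are doubled.
const rx = new RegExp('/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/|//.*', 'g');

// Made-up sample input: a Javadoc block, a URL in a string, a trailing comment.
const javaSource = [
  '/** Javadoc',
  ' * spanning lines */',
  'String url = "http://example.com"; // trailing note',
  'int x = 1;',
].join('\n');

const matches = javaSource.match(rx);
console.log(matches);
```

Here the first match is the whole two-line Javadoc block, while the second match starts at the // inside the URL string rather than at the real trailing comment, which is exactly the limitation of a regex-only approach.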






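Details

/\*[^*]*\*+(?:[^/*][^*]*\*+)*/ matches a /* ... */ block comment, possibly spanning several lines
| means "or"
//.* matches // and the rest of the line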
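If the goal is also to keep the script alive when some pattern does run away, one option is to execute the match through Node's built-in vm module, whose timeout option aborts code that runs too long. The following is a sketch under assumptions: the helper name matchWithTimeout and the 100 ms budget are hypothetical, and the slow pattern is a textbook backtracking example rather than the regex from the question.

```javascript
const vm = require('vm');

// Hypothetical helper: run String.prototype.match inside a vm context so that
// the vm `timeout` option can abort a runaway regex.
function matchWithTimeout(pattern, input, timeoutMs) {
  const sandbox = { pattern, input, result: null };
  try {
    vm.runInNewContext('result = input.match(pattern)', sandbox, { timeout: timeoutMs });
    return { timedOut: false, matches: sandbox.result ? Array.from(sandbox.result) : [] };
  } catch (err) {
    // Newer Node versions set err.code; older ones only carry the message.
    if (err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT' || /timed out/i.test(err.message)) {
      return { timedOut: true, matches: [] };
    }
    throw err;
  }
}

// Nested quantifiers plus a non-matching tail force exponential backtracking.
const slow = matchWithTimeout(/^(a+)+$/, 'a'.repeat(40) + '!', 100);
// An ordinary pattern finishes well within the budget.
const fast = matchWithTimeout(/\/\/.*/g, 'int x; // note', 100);

console.log(slow.timedOut, fast.matches);
```

The call still blocks the event loop for up to the timeout, so this only bounds the damage per input; it does not make the pattern itself any safer.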


