Regular expression for converting RTF to text in Python

  python, regex, rtf

I want to use this regular expression on my RTF file:

((?:^|\s)[^\s]+(?:\\(?!line)[A-Za-z]+\n?(?:-?\d+)?[ ]?)+)(\b[^\s])

As you can see when testing it on https://regexr.com/, xxxparfi-240li720 is not matched completely because it is followed by "-->" in my RTF file. The regex only matches "xxxparfi-".

Do you have any idea how to solve this?

This is my RTF file:

{rtf1ansiansicpg1252cocoartf2513
cocoatextscaling0cocoaplatform0{fonttblf0fromanfcharset0 Times-Roman;}
{colortbl;red255green255blue255;}
{*expandedcolortbl;;}
paperw15000paperh15840margl1440margt1440margr1440margb1440deftab1134widowctrllytexcttpformshadeheadery720footery720pgwsxn15000pghsxn15840marglsxn1440margtsxn1440margrsxn1440margbsxn1440pgbrdropt32pardpardfi-240li720tx1200tx1920tx2640tx3360tx4080tx4800tx5520tx6240tx6960tx7680tx8400tx9120tx9840tx10560itap0nowidctlparplainf2fs20bchshdng0chcfpat0{XX, XX   XXplainf2fs20chshdng0chcfpat0parfi-240li720tx1200tx1920tx2640tx3360tx4080tx4800tx5520tx6240tx6960tx7680tx8400tx9120tx9840tx10560 URN: xxx  DOB: xx  Sex: XXparfi-240li720tx1200tx1920tx2640tx3360tx4080tx4800tx5520tx6240tx6960tx7680tx8400tx9120tx9840tx10560 Home address: 3 xxx xx, xxxxx 3134parpardfi-240li720pardpardfi-240li720itap0nowidctlpar Home Phone:   Mobile Phone:}
xxxxparfi-240li720 swab xxxparfi-240li720 to d/w xxxxparfi-240li720 -->case x/  XXparfi-240li720 to x/x xxx}
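One possible direction, sketched under assumptions rather than offered as a definitive fix: instead of requiring a word character after the run of control words, remove each control word on its own, so that whatever follows (including "-->") is left untouched. The sketch assumes that in the real file every control word starts with a backslash (for example \par, \fi-240, \li720), which the listing above does not show. The names CONTROL_WORD, PARAGRAPH, strip_control_words and sample are hypothetical, and the sample string is a hand-written fragment modelled on the last line of the file. It does not handle groups in braces, font tables, or escaped characters.

```python
import re

# Assumption: control words in the real file look like \par, \fi-240, \li720,
# i.e. a backslash, letters, an optional signed number, and an optional space.
CONTROL_WORD = re.compile(r"\\[A-Za-z]+(?:-?\d+)? ?")

# The paragraph control word is handled separately so it becomes a line break.
# The lookahead keeps longer words such as "pard" from being treated as "par".
PARAGRAPH = re.compile(r"\\par(?![A-Za-z]) ?")


def strip_control_words(rtf_fragment: str) -> str:
    """Turn paragraph marks into newlines, then drop all other control words."""
    text = PARAGRAPH.sub("\n", rtf_fragment)
    return CONTROL_WORD.sub("", text)


# Hypothetical fragment with the backslashes written out explicitly.
sample = r"xxxx\par\fi-240\li720 swab xxx\par\fi-240\li720 -->case x/"
print(strip_control_words(sample))
```

Because each control word is consumed independently, the text after \li720 does not have to start with a word character, so "-->case" survives intact. For full documents a dedicated RTF parser is likely more robust than a regex.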

Source: Python Questions
