Removing backslash coming in the string
I'm creating a dictionary in which one of the values is a string containing backslashes. I know that Python automatically adds escape sequences, but when I print the dictionary at the end, that value is still shown with multiple backslashes. I have to pass this dictionary to another tool that does not expect multiple backslashes, so currently I'm forced to remove the extra backslashes manually. Is there a way to remove the backslashes from the dictionary value automatically?
(Pdb) print value2
"\x01\x02\x03\x04\x05"
(Pdb) value2
'"\\x01\\x02\\x03\\x04\\x05"'
@CristiFati: no, these are literal backslashes. That's why they are shown when using print, and the repr() shows the literal value with the slashes doubled. – Martijn Pieters♦
Sep 6 '18 at 9:50
You do not have a value with double backslashes. The repr() output gives you valid Python syntax to recreate the value. There is no problem here, you have the value you need, just pass it on. – Martijn Pieters♦
Sep 6 '18 at 9:50
For your understanding: a double backslash is the technical representation of a single backslash, and \x01 is the technical representation of a byte with value 01 (hex). These are called escape sequences. – Klaus D.
Sep 6 '18 at 9:51
@MartijnPieters: Ah right, I was typing value2 instead of print(value2). – CristiFati
Sep 6 '18 at 9:52
1 Answer
You are confusing string representations with string values.
When you echo a string object in the Python interpreter, the output is really produced by printing the result of the repr() function. This function outputs debugging-friendly representations, and for strings, that representation is valid Python syntax you can copy and paste back into Python.
print, on the other hand, just writes the actual value in the string to the terminal. That's rather different from the Python syntax that creates the value. A backslash in the string prints as a backslash; you wouldn't see a backslash if there wasn't one in the value.
In Python string literal syntax, the backslash character has special meaning: it's the first character of an escape sequence. So if you want to have an actual backslash in the value of the string, you need to use \\ to 'escape the escape'. There is other syntax where the backslash does not have special meaning, but the repr() representation of a string doesn't use that other syntax. So it'll output any backslash in the value as the escape sequence \\.
That doesn't mean that the value has two backslashes. It just means that you can copy the output, and paste it into Python, and it'll produce the same string value.
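As a minimal sketch of the "other syntax" mentioned above, assuming it refers to raw string literals and using Python 3's print() function rather than the Python 2 print statement: the two literals below are spelled differently but produce the same value. The names escaped and raw are only illustrative.

```python
# Two spellings of the same string value.
escaped = '"\\x01\\x02"'  # each \\ in the source is ONE backslash in the value
raw = r'"\x01\x02"'       # raw literal: backslashes are taken as-is

print(escaped == raw)     # True: both literals create the same value
print(len(escaped))       # 10 characters: two quotes plus \x01 and \x02
print(raw)                # the value itself, with single backslashes
print(repr(raw))          # the representation, with each backslash doubled
```

Whichever literal you type, repr() reports the value using the doubled form.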
You can see that your string value doesn't have double backslashes by looking at individual characters:
>>> value2 = '"\\x01\\x02\\x03\\x04\\x05"'
>>> value2
'"\\x01\\x02\\x03\\x04\\x05"'
>>> print value2
"\x01\x02\x03\x04\x05"
>>> print value2[0]
"
>>> print value2[1]
\
>>> print value2[2]
x
>>> value2[0]
'"'
>>> value2[1]
'\\'
>>> value2[2]
'x'
Printing value2[1] shows that that single character is a backslash. Echoing that single character shows '\\', the Python syntax to recreate a string containing that single character.
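Another way to check this, sketched here in Python 3 syntax, is to count characters instead of reading the echoed output. The counts come from the value itself, so escaping in the display cannot mislead you.

```python
value2 = '"\\x01\\x02\\x03\\x04\\x05"'

print(len(value2))               # 22: two quotes plus five 4-character groups
print(value2.count('\\'))        # 5: one backslash per group in the value
print(repr(value2).count('\\'))  # 10: the representation doubles each one
print(len(value2[1]))            # 1: value2[1] is a single character
```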
When you echo dictionaries or lists or other standard Python containers, they too are echoed using valid Python syntax, so their contents are all shown by using repr() on them, including strings:
>>> d = {'foo': value2}
>>> d
{'foo': '"\\x01\\x02\\x03\\x04\\x05"'}
Again, that's not the value, that's the representation of the string contents.
On top of that, container types have no string value, so printing a dictionary or list or other standard container type will only ever show their representation:
>>> print d  # shows a dictionary representation
{'foo': '"\\x01\\x02\\x03\\x04\\x05"'}
>>> print d['foo']  # shows the value of the d['foo'] string
"\x01\x02\x03\x04\x05"
You'd have to print individual values (such as d['foo'] above), or create your own string value from the components (which involves accessing all the contents and building a new string from that). Containers are not meant to be end-user-friendly values, so Python doesn't provide you with a string value for them either.
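Building your own string value from the components could look roughly like the sketch below (Python 3 syntax). The helper format_mapping, the extra 'bar' key, and the "key: value" output layout are all hypothetical; the layout you need depends on what the receiving side expects.

```python
value2 = r'"\x01\x02\x03\x04\x05"'
d = {'foo': value2, 'bar': 'plain'}


def format_mapping(mapping):
    """Hypothetical helper: join key/value pairs using str(), not repr()."""
    return ", ".join("{}: {}".format(key, value) for key, value in mapping.items())


text = format_mapping(d)
print(text)    # string values appear as-is, with single backslashes
print(str(d))  # the dictionary representation still doubles them
```

This flat version does not descend into nested dictionaries; a nested structure would need a recursive variant.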
Strings can also contain non-printable characters, characters that don't have a human-readable value, such as a newline or a tab character, and even the BELL character that'll make most terminals beep when you write one to them. And in Python 2, the str type really holds bytes, and only printable characters in the ASCII range (values 00 - 7F) are considered printable when producing the repr() output. Anything outside that range is always considered unprintable, even if you could decode those bytes as Latin-1 or another commonly used codec.
So when you do have special characters other than a backslash in the string, you'd see this in the representation:
>>> value_with_no_backslashes = "This is mostly ASCII with a \a bell and a newline:\nSome UTF-8 data: 🦊"
>>> print value_with_no_backslashes  # works because my terminal accepts UTF-8
This is mostly ASCII with a  bell and a newline:
Some UTF-8 data: 🦊
>>> value_with_no_backslashes
'This is mostly ASCII with a \x07 bell and a newline:\nSome UTF-8 data: \xf0\x9f\xa6\x8a'
Now, when I echo the value, there are backslashes, to make sure the non-printable characters can easily be copied and reproduce the same value again. Note that those backslashes are not doubled in the echoed syntax.
Note that representations are Python-specific and should only be used to aid debugging. Writing them to logs is fine, using them to pass values between programs is not. Always use a serialisation format to communicate between programs, including command-line tools started as subprocesses or by writing output to the terminal. Python comes with JSON support built in, and for Python-to-Python serialisation with no chance of third-party interference, pickle can be used for almost any Python data structure.
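A minimal sketch of passing the dictionary through JSON, assuming the receiving tool can parse JSON (Python 3 syntax; payload and restored are illustrative names). Note that JSON text also escapes backslashes and double quotes, so the serialised payload contains doubled backslashes too; any JSON parser undoes that escaping and returns the original value.

```python
import json

value2 = r'"\x01\x02\x03\x04\x05"'
d = {'foo': value2}

payload = json.dumps(d)         # serialised text to hand to the other program
print(payload)                  # {"foo": "\"\\x01\\x02\\x03\\x04\\x05\""}

restored = json.loads(payload)  # what the receiving side gets after parsing
print(restored == d)            # True: the round trip preserves the value
print(restored['foo'])          # "\x01\x02\x03\x04\x05"
```

json.dumps() handles nested dictionaries and lists as well, so a dictionary of dictionaries can be serialised in one call.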
I'm passing the value for value2 from an Excel sheet, where I had entered it as "\x01\x02\x03\x04\x05", but Python interprets it as (Pdb) value2 '"\\x01\\x02\\x03\\x04\\x05"' (Pdb) new_d={'apple':value2} (Pdb) new_d {'apple': '"\\x01\\x02\\x03\\x04\\x05"'}
– Ashwin kumar
Sep 6 '18 at 10:08
(Pdb) print new_d {'apple': '"\\x01\\x02\\x03\\x04\\x05"'}
– Ashwin kumar
Sep 6 '18 at 10:09
@Ashwinkumar: that's explained in my answer. You are printing the dictionary, and dictionaries have no str() string conversion, only a repr() representation. Printing the dictionary will show the representation. Use print new_d['apple'] to access just the string value. – Martijn Pieters♦
Sep 6 '18 at 10:12
Now it works when I print a single item individually. My end task is to copy a dictionary of dictionaries where this value could be anywhere, and I have to pass that whole dictionary to another tool which doesn't accept double backslashes.
– Ashwin kumar
Sep 6 '18 at 10:18
@tripleee: I always look for such a dupe, and can never find a good one. I wrote this one because I got fed up with looking. Perhaps this one can be the canonical.
– Martijn Pieters♦
Sep 6 '18 at 11:12
The string value seems to be value2 = "\x01\x02\x03\x04\x05" (contains 5 characters having the ASCII codes from 1 to 5). The 2nd form seems to be just repr(value2). – CristiFati
Sep 6 '18 at 9:48