A user's answer to the question "Python: Convert a text file of binary literals to normal text"

Python: Convert a text file of binary literals to normal text

b'Chapter 1 \xe2\x80\x93 BlaBla'
b'Boy\xe2\x80\x99s Dead.'

Chapter 1 - BlaBla
Boy's Dead.

strings = [
    b'Chapter 1 \xe2\x80\x93 BlaBla',
    b'Boy\xe2\x80\x99s Dead.',
]
for string in strings:
    # Decode the UTF-8 bytes; 'ignore' drops any bytes that are not valid UTF-8
    print(string.decode('utf-8', 'ignore'))
--output:--
Chapter 1 – BlaBla
Boy’s Dead.
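
The decoded output above keeps the en dash and the curly apostrophe, while the desired output in the question uses a plain hyphen and a straight apostrophe. One possible way to close that gap is to translate the typographic characters after decoding. This is only a sketch: the names ASCII_PUNCTUATION and to_plain_ascii_punctuation are made up for illustration, and the mapping is an assumption that covers a few common characters, not a complete list.

```python
# Hypothetical mapping (an assumption); extend it with whatever
# characters your own text actually contains.
ASCII_PUNCTUATION = str.maketrans({
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def to_plain_ascii_punctuation(raw: bytes) -> str:
    """Decode UTF-8 bytes, then swap typographic punctuation for ASCII."""
    return raw.decode("utf-8").translate(ASCII_PUNCTUATION)


strings = [
    b'Chapter 1 \xe2\x80\x93 BlaBla',
    b'Boy\xe2\x80\x99s Dead.',
]
for string in strings:
    print(to_plain_ascii_punctuation(string))
```

With the two sample lines this prints "Chapter 1 - BlaBla" and "Boy's Dead.", which matches the desired output shown in the question.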

import fileinput as fi
# Create a sample file that contains raw UTF-8 bytes.
with open('data.txt', 'wb') as f:
    f.write(b'Chapter 1 \xe2\x80\x93 BlaBla\n')
    f.write(b'Boy\xe2\x80\x99s Dead.\n')
# Show the raw bytes of each line in the terminal window.
with open('data.txt', 'rb') as f:
    for line in f:
        print(line)
# Rewrite the file in place: with inplace=True, print() writes to the file.
with fi.input(
        files = 'data.txt',
        inplace = True,
        backup = '.bak',
        mode = 'rb') as f:
    for line in f:
        string = line.decode('utf-8', 'ignore')
        print(string, end="")

~/python_programs$ python3.4 prog.py
b'Chapter 1 \xe2\x80\x93 BlaBla\n'
b'Boy\xe2\x80\x99s Dead.\n'
~/python_programs$ cat data.txt
Chapter 1 – BlaBla
Boy’s Dead.

import fileinput as fi
import re
pattern = r"""
    \\              #Match a literal backslash...
    x               #Followed by an x...
    [a-f0-9]{2}     #Followed by any hex character, 2 times
"""
repl = ''
# Create a sample file whose lines are the *text* of bytes literals.
with open('data.txt', 'w') as f:
    print(r"b'Chapter 1 \xe2\x80\x93 BlaBla'", file=f)
    print(r"b'Boy\xe2\x80\x99s Dead.'", file=f)
with open('data.txt') as f:
    for line in f:
        print(line.rstrip()) #Output goes to terminal window
with fi.input(
        files = 'data.txt',
        inplace = True,
        backup = '.bak') as f:
    for line in f:
        line = line.rstrip()[2:-1]  #Strip the leading b' and the trailing '
        new_line = re.sub(pattern, repl, line, flags=re.X)
        print(new_line) #Writes to file, not your terminal window

~/python_programs$ python3.4 prog.py 
b'Chapter 1 \xe2\x80\x93 BlaBla'
b'Boy\xe2\x80\x99s Dead.'
~/python_programs$ cat data.txt
Chapter 1  BlaBla
Boys Dead.
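
The regex above deletes the \xNN sequences, so the dash and the apostrophe disappear from the result. If you would rather keep those characters, a different technique (not the one used above) is to let Python parse each line as a bytes literal with ast.literal_eval and then decode the resulting bytes. The sketch below assumes every line is a well-formed bytes literal holding UTF-8 data; the helper name literal_line_to_text is made up for illustration.

```python
import ast


def literal_line_to_text(line: str) -> str:
    """Parse the text of a bytes literal and decode the bytes as UTF-8."""
    # ast.literal_eval only evaluates Python literals, not arbitrary code.
    value = ast.literal_eval(line.strip())
    if not isinstance(value, bytes):
        raise ValueError("expected a bytes literal, got %r" % (value,))
    return value.decode("utf-8")


# The same two lines the example above writes to data.txt.
lines = [
    r"b'Chapter 1 \xe2\x80\x93 BlaBla'",
    r"b'Boy\xe2\x80\x99s Dead.'",
]
for line in lines:
    print(literal_line_to_text(line))
```

With the sample lines this prints the text with the en dash and the curly apostrophe intact, the same text as the first example's output.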

print(r"b'Chapter 1 \xe2\x80\x93 BlaBla'", file=f)

\xNN  #=> e.g. \xe2

\\xNN

r"...."

\xe2

"\\xe2"

\xe2

string = "\\xe2"
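
The snippets above contrast three spellings: an escape sequence, a hand-escaped backslash, and a raw string. A small check, written here as an illustrative sketch with made-up variable names, shows how Python treats each one.

```python
escaped = "\\xe2"      # backslash escaped by hand: 4 characters
raw = r"\xe2"          # raw string: the backslash is kept as-is
interpreted = "\xe2"   # escape sequence: a single character, U+00E2

print(escaped == raw)       # True: both hold backslash, x, e, 2
print(len(raw))             # 4
print(len(interpreted))     # 1
print(list(raw))            # ['\\', 'x', 'e', '2']
```

This is why the sample lines are written to the file with r"...": without the raw prefix, Python would convert each \xNN into a single character before the text ever reached the file.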
