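Converting Between Full-Width and Half-Width Characters in Python 3: Example Methods

Author: 陈鹏 Date: 2021-06-16 13:08:36

Preface

This article shows how to convert between full-width and half-width characters in Python 3. It is shared here for reference and study; the details follow.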

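1. Background

Problem solved: converting text between full-width and half-width characters quickly and automatically.

Use case: replacing full-width characters with half-width characters in student answer data.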
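To make the answer-data scenario concrete, here is a minimal sketch: the same answer typed with full-width and half-width characters compares unequal until both are normalized. It uses the standard library's `unicodedata.normalize` with the NFKC form as a stand-in for a dedicated converter. NFKC is broader than a pure full-width to half-width mapping (it also folds ligatures, circled digits, and similar compatibility characters), so treat this as an illustration only. The helper name `normalize_answer` and the sample answers are hypothetical.

```python
import unicodedata


def normalize_answer(text):
    """Fold full-width characters to half-width and trim outer whitespace.

    Hypothetical helper; NFKC also folds other compatibility characters.
    """
    return unicodedata.normalize("NFKC", text).strip()


# The same answer typed with a full-width and a half-width input method.
# \uff21\uff22\uff23 is full-width "ABC", \u3000 is the full-width space,
# and \uff11\uff12 is full-width "12".
typed_fullwidth = "\uff21\uff22\uff23\u3000\uff11\uff12"
typed_halfwidth = "ABC 12"

# Raw comparison fails; comparison after normalization succeeds.
print(typed_fullwidth == typed_halfwidth)
print(normalize_answer(typed_fullwidth) == normalize_answer(typed_halfwidth))
```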

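2. How Full-Width and Half-Width Encoding Works

Full-width: Double Byte Character, abbreviated DBC.

Half-width: Single Byte Character, abbreviated SBC.

On Windows, under the legacy Simplified Chinese code page (GB2312/GBK), Chinese characters and full-width characters each occupy two bytes, and both bytes fall in the extended ASCII range (codes 128–255). The first byte of a full-width character is always 163, and the second byte is the code of the corresponding half-width character plus 128. The space is the exception: the full-width space does not follow this rule, so full-width and half-width spaces have to be handled separately.

For a Chinese character, the first byte is greater than 163; for example, '阿' is 176 162. Chinese characters are detected and left unconverted.

For example, half-width A is 65, so full-width A is 163 (first byte) and 193 (second byte, 128 + 65).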
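The byte values above can be checked from Python by encoding a few characters with the `gbk` codec. This is a small sketch; picking GBK as the code page is an assumption based on the byte values quoted in the text, and the `samples` dictionary is hypothetical. Python 3 strings hold Unicode code points rather than GBK bytes, which is why the conversion code in the next section works with the code-point offset 0xFEE0 instead of the "plus 128" byte rule.

```python
# Inspect the GBK byte pairs described above.
# Assumption: GBK is the legacy Windows code page the text refers to.
samples = {
    "half-width A": "A",
    "full-width A": "\uff21",
    "full-width space": "\u3000",
    "Chinese character": "\u963f",  # the character used in the text's example
}

# Map each label to the list of byte values produced by the gbk codec.
gbk_bytes = {name: list(ch.encode("gbk")) for name, ch in samples.items()}

for name, values in gbk_bytes.items():
    print(f"{name}: {values}")
```

The full-width space encodes with a first byte of 161 rather than 163, which is the exception the text mentions.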

Full-width and half-width example (the text file test.txt contains both full-width and half-width characters):


F:\test>type test.txt
123456
123456
abcdefg
abcdefg
中国你好
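To see which characters in a file like this are full-width, one option is the standard library's `unicodedata.east_asian_width`, which reports a width class per character. The sketch below runs it on short inline strings rather than on test.txt itself; the helper name `describe_widths` and the sample strings are hypothetical.

```python
import unicodedata


def describe_widths(line):
    """Return the East Asian Width class of each character in line."""
    return [unicodedata.east_asian_width(ch) for ch in line]


# 'F'  = fullwidth form, 'Na' = narrow (plain ASCII),
# 'W'  = wide (for example CJK ideographs).
samples = [
    "\uff11\uff12\uff13",  # full-width digits 1, 2, 3
    "123",                 # half-width digits
    "\u4e2d\u56fd",        # two Chinese characters
]

for line in samples:
    print(repr(line), describe_widths(line))
```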

3. Implementing Full-Width and Half-Width Conversion in Python 3


# -*- coding:utf-8 -*-
# i@mail.chenpeng.info

'''
Full-width: Double Byte Character, abbreviated DBC
Half-width: Single Byte Character, abbreviated SBC
'''

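def DBC2SBC(ustring):
    ''' Convert full-width characters to half-width '''
    rstring = ""
    for uchar in ustring:
        inside_code = ord(uchar)
        if inside_code == 0x3000:
            # The full-width space maps directly to the ASCII space
            inside_code = 0x0020
        else:
            # Other full-width forms sit 0xFEE0 above their ASCII counterparts
            inside_code -= 0xfee0
            if not (0x0021 <= inside_code <= 0x7e):
                # Not a full-width form: keep the original character
                rstring += uchar
                continue
        rstring += chr(inside_code)
    return rstring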

def SBC2DBC(ustring):
    ''' Convert half-width characters to full-width '''
    rstring = ""
    for uchar in ustring:
        inside_code = ord(uchar)
        if inside_code == 0x0020:
            # The ASCII space maps directly to the full-width space
            inside_code = 0x3000
        else:
            if not (0x0021 <= inside_code <= 0x7e):
                # Not a printable ASCII character: keep it unchanged
                rstring += uchar
                continue
            inside_code += 0xfee0
        rstring += chr(inside_code)
    return rstring

s = '''
array('0' => '0', '1' => '1', '2' => '2', '3' => '3', '4' => '4',
 '5' => '5', '6' => '6', '7' => '7', '8' => '8', '9' => '9',
 'A' => 'A', 'B' => 'B', 'C' => 'C', 'D' => 'D', 'E' => 'E',
 'F' => 'F', 'G' => 'G', 'H' => 'H', 'I' => 'I', 'J' => 'J',
 'K' => 'K', 'L' => 'L', 'M' => 'M', 'N' => 'N', 'O' => 'O',
 'P' => 'P', 'Q' => 'Q', 'R' => 'R', 'S' => 'S', 'T' => 'T',
 'U' => 'U', 'V' => 'V', 'W' => 'W', 'X' => 'X', 'Y' => 'Y',
 'Z' => 'Z', 'a' => 'a', 'b' => 'b', 'c' => 'c', 'd' => 'd',
 'e' => 'e', 'f' => 'f', 'g' => 'g', 'h' => 'h', 'i' => 'i',
 'j' => 'j', 'k' => 'k', 'l' => 'l', 'm' => 'm', 'n' => 'n',
 'o' => 'o', 'p' => 'p', 'q' => 'q', 'r' => 'r', 's' => 's',
 't' => 't', 'u' => 'u', 'v' => 'v', 'w' => 'w', 'x' => 'x',
 'y' => 'y', 'z' => 'z',
 '(' => '(', ')' => ')', '〔' => '[', '〕' => ']', '【' => '[',
 '】' => ']', '〖' => '[', '〗' => ']', '“' => '[', '”' => ']',
 '‘' => '[', '’' => ']', '{' => '{', '}' => '}', '《' => '<',
 '》' => '>',
 '%' => '%', '+' => '+', '—' => '-', '-' => '-', '~' => '-',
 ':' => ':', '。' => '.', '、' => ',', ',' => '.', '、' => '.',
 ';' => ',', '?' => '?', '!' => '!', '…' => '-', '‖' => '|',
 '”' => '"', '’' => '`', '‘' => '`', '|' => '|', '〃' => '"',
 ' ' => ' ');
 '''

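# Full-width to half-width
print(DBC2SBC(s))

# Half-width to full-width
print(SBC2DBC(s))

# Chinese characters fall outside both ranges and pass through unchanged
s = '''中文测试'''

# Full-width to half-width
print(DBC2SBC(s))

# Half-width to full-width
print(SBC2DBC(s))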
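As an alternative to the character-by-character loops in DBC2SBC and SBC2DBC, the same mapping can be sketched with `str.translate` and two lookup tables built once. This is not the author's method, only a compact variant under the same assumptions: ASCII 0x21 to 0x7E corresponds to U+FF01 to U+FF5E at an offset of 0xFEE0, and the space pair is U+0020 and U+3000. The names `FULL_TO_HALF`, `HALF_TO_FULL`, `to_halfwidth`, and `to_fullwidth` are hypothetical.

```python
# Translation tables keyed by code point, as str.translate expects.
# Full-width forms U+FF01..U+FF5E map to ASCII 0x21..0x7E (offset 0xFEE0).
FULL_TO_HALF = {code + 0xFEE0: code for code in range(0x21, 0x7F)}
FULL_TO_HALF[0x3000] = 0x20  # full-width space -> ASCII space

# Invert the table for the opposite direction.
HALF_TO_FULL = {half: full for full, half in FULL_TO_HALF.items()}


def to_halfwidth(text):
    """Replace full-width forms with ASCII; other characters are kept."""
    return text.translate(FULL_TO_HALF)


def to_fullwidth(text):
    """Replace printable ASCII and space with full-width forms."""
    return text.translate(HALF_TO_FULL)


# Full-width "Abc", full-width space, full-width "123!", then two
# Chinese characters that neither table touches.
sample = "\uff21\uff42\uff43\u3000\uff11\uff12\uff13\uff01\u4e2d\u6587"
print(to_halfwidth(sample))
print(to_fullwidth(to_halfwidth(sample)) == sample)
```

Building the tables once avoids repeated string concatenation inside a loop, which may matter when converting large answer files.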

4. Summary

5. References

http://thinkerou.com/2015-06/covert-dbc-sbc/

Source: http://chenpeng.info/html/3795

