[R: New Features on pinyin] Convert Chinese Characters into Sijiao and Wubi codes

Repo

https://github.com/pzhaonet/pinyin

Brief Intro and curriculum

The 'pinyin' package is written in R. It converts Chinese characters into Latin letters using pinyin, the official romanization system for Standard Chinese in mainland China, Malaysia, Singapore, and Taiwan. A brief introduction is given in the post "pinyin: an R package that converts Chinese characters into Latin letters".

New Features

What features did I add?

  • Conversion is now about four times faster.
  • At the beginning of 2018 I received an issue report from psychelzh about a polyphone error, i.e. a character with more than one reading being converted wrongly. A new pinyin library has now been added, which more or less solves the polyphone problem.
  • Chinese characters can be converted into Sijiao codes (literally, four-corner codes).
  • Chinese characters can be converted into Wubi codes (literally, five-stroke codes).
  • Some minor bugs were fixed.


Figure 1: Test the new features in RStudio IDE

How did I implement them?

  • Following Qu Cheng's suggestions in personal communications, I changed the pylib() function so that it loads the pinyin library into an environment, which speeds up the conversion.
  • A new pinyin library, '/inst/lib/zh2.txt', was added. The parameter dic = c('zh', 'zh2') of the pylib() function lets users choose which library to use for polyphones.
  • The new functions fclib() and four_corner() import a four-corner library and convert Chinese characters into four-corner codes, following Qu Cheng's suggestions.
  • A new function wubi() imports a five-stroke library and converts Chinese characters into five-stroke codes, again according to Qu Cheng's suggestions.
  • The downstream functions bookdown2py(), file.rename2py(), file2py() were updated to support the updates mentioned above.
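The idea behind the first two points can be sketched in base R. This is only a minimal illustration, not the package's actual code: the toy dictionaries, their readings, and the helper names load_lib() and convert() are made up for this example, and the real libraries are read from files that are far larger.

```r
# Toy dictionaries, made up for illustration (not the package's real data).
# The two differ only in the reading chosen for the polyphone "行".
toy_tables <- list(
  zh  = c("银" = "yin", "行" = "xing"),
  zh2 = c("银" = "yin", "行" = "hang")
)

# Hypothetical helper: load one dictionary into an environment.
# Environments are hashed, so looking up a character by name is fast.
load_lib <- function(dic = c("zh", "zh2")) {
  dic <- match.arg(dic)  # defaults to "zh" and rejects unknown names
  list2env(as.list(toy_tables[[dic]]),
           envir = new.env(hash = TRUE, parent = emptyenv()))
}

# Hypothetical helper: convert a string character by character.
# Characters missing from the library are passed through unchanged.
convert <- function(x, lib, sep = "_") {
  parts <- strsplit(x, "")[[1]]
  out <- vapply(parts, function(ch) {
    code <- lib[[ch]]  # NULL when the character is not in the library
    if (is.null(code)) ch else code
  }, character(1), USE.NAMES = FALSE)
  paste(out, collapse = sep)
}

convert("银行", load_lib("zh"))   # "yin_xing"
convert("银行", load_lib("zh2"))  # "yin_hang"
```

Building the environment once and reusing it for every lookup is what avoids scanning the whole table for each character. The script assumes it is saved and run in UTF-8.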

Every function is documented. The other files were updated automatically when the package was built.

Links to the relevant lines of code on GitHub can be found mainly in my latest commit.

GitHub Account

https://github.com/pzhaonet
