Using LaTeX to make PDF documents with Japanese characters
Even if you know nothing about LaTeX, you can make your first Japanese PDF document by taking a copy of this example file JIS.tex, going to a shell command line and typing "pdflatex JIS.tex". That should produce this output: JIS.pdf.
If that doesn’t work for you, then you need to install some LaTeX software or Japanese fonts. On my Debian GNU/Linux system, I think I just installed texlive-latex-base and latex-cjk-japanese, and the package manager automatically installed the other packages needed by those two. I don’t remember if I also had to install a fonts package.
Once you’ve got that working, you can start modifying and removing lines from that example file to see what you really need. I trimmed it down to eight lines:
\documentclass[12pt]{scrartcl}
\usepackage{CJK}
\begin{document}
\begin{CJK*}[dnp]{JIS}{min}
\section{What I learned today}
I can write this 私はキランです in Japanese.
\end{CJK*}
\end{document}
%%% Local Variables:
%%% coding: euc-japan
Ok ok, that’s ten lines since I included two commands at the end to tell Emacs which character encoding to use when saving the file. This seems important since when I saved it as utf-8, the pdflatex program failed. Because these two lines start with percent signs, they will be ignored by LaTeX processors such as pdflatex, so it’s safe to leave them there even if you’re not using Emacs.
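The UTF-8 failure makes sense once you notice that the CJK* environment above declares the JIS encoding, so pdflatex expects EUC-JP bytes in the file. If you would rather keep the file in UTF-8, a commenter below reports that switching the environment to UTF8 gets past the error, although it then needs a Unicode font for "min" (the udmj79 font in their case). Here is a rough sketch of that variant. I haven't tested it myself, and the "coding: utf-8" hint for Emacs is my own assumption rather than something from the original file:

```latex
% Save this file as UTF-8, not EUC-JP.
\documentclass[12pt]{scrartcl}
\usepackage{CJK}
\begin{document}
% UTF8 replaces the [dnp]{JIS} arguments used in the EUC-JP version.
\begin{CJK*}{UTF8}{min}
\section{What I learned today}
I can write this 私はキランです in Japanese.
\end{CJK*}
\end{document}
%%% Local Variables:
%%% coding: utf-8
```

The point is that the encoding named in the CJK* environment and the encoding the file is actually saved in have to agree.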
In the sixth line of my small example you should see seven mostly-simple Japanese characters. If that’s not what you see, try setting your browser’s character encoding to EUC-JP or maybe UTF-8. (This might be in [menu-bar]->View->Character Encoding->…)
Once you have this working, you should look at the other examples that came with the LaTeX CJK package. On my system, the examples are installed in the directory /usr/share/doc/latex-cjk-japanese/examples/ (thanks for the tip, LUK ShunTim). This is probably also the best way to get started with other complex scripts such as Chinese and Korean.
It took me four hours to figure out how to use LaTeX to make a PDF document with Japanese characters. At one point, I became so frustrated with the LaTeX documentation that I gave up and decided to use DocBook instead. Unfortunately, DocBook’s documentation was just as bad.
I think I learned something from all this about what makes a good tutorial: get the user to a working example as quickly as possible. Once you have something working, then you can experiment and learning becomes fun.
For a start, I think I’ll put the "ruby" commands from JIS.tex back in since they’re a pretty useful reading aid for learners. "Ruby" here refers to the little superscript phonetic kana characters, usually called furigana. It has no relation to the Ruby programming language, which was developed by a Japanese guy.
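For the curious, here is roughly what that looks like when added to the small example. This is only a sketch: I'm assuming the ruby package that ships with CJK, loaded with the "overlap" and "CJK" options as I recall the bundled examples doing, and its \ruby{base}{reading} command. The reading わたし for 私 is my own addition. Check the files in the examples directory if it doesn't compile for you.

```latex
% Save this file as EUC-JP, like the small example above.
\documentclass[12pt]{scrartcl}
\usepackage{CJK}
% Assumption: ruby.sty from the CJK bundle, with options as in its examples.
\usepackage[overlap, CJK]{ruby}
\begin{document}
\begin{CJK*}[dnp]{JIS}{min}
\section{What I learned today}
% \ruby{base}{reading} sets the reading in small kana above the base text.
I can write this \ruby{私}{わたし}はキランです in Japanese.
\end{CJK*}
\end{document}
%%% Local Variables:
%%% coding: euc-japan
```

Only the kanji needs a reading here, since the rest of the phrase is already in kana.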
To write Japanese hiragana, katakana, and kanji in Emacs, you just use the function M-x set-input-method and then type japanese at the prompt. The usual command (C-h I) will show the documentation for how the input method works. While using the japanese input method, typing qq will put you into the japanese-ascii input method, which you’ll need for typing LaTeX commands and symbols "\{}". Typing qq again will bring you from the japanese-ascii input method back to the normal japanese input method.
If you want to use other applications, then you’ll need to install some separate input method software. I installed the packages "anthy", "scim", and "scim-canna" and then was able to write Japanese in GNOME applications by right-clicking in a text box and, from the "Input Methods" submenu, choosing "SCIM Input Method". It’s annoying that SCIM uses Ctrl+Space as its activation sequence. You can change this by going to "Show command menu->SCIM Setup->Global Setup". I wasn’t able to get OpenOffice.org to work. From looking around, it seems OpenOffice only supports "IIIMP", but I can’t see any package that provides IIIMP.
You might find useful info on these pages:
- How to create Japanese language documents under GNU/Linux using LaTeX
- related discussion on an Ubuntu bug forum
- Inputting Japanese text in [GNU/]Linux and some BSDs
- On my Learning Japanese page I’ve added some links about computing and Japanese and that’s where I’ll add more in the future.
Hope that helps!
–
Ciarán O’Riordan,
Support free software: Join FSFE’s Fellowship
Comments
When I downloaded your webpage, cropped out the tex example that includes the Japanese text, and ran pdflatex on it, it complained:
! Argument of \CJK@XX has an extra }.
\par
l.10 ç§ã
¯ãã©ã³ã§ã
I understand that might be caused by using wrong or mixed characters.
I am not using an xterm or editor that recognizes the characters or displays them correctly. (They look fine in firefox though).
In my editor in my xterm they look like
\xe7\xa7\x81\xe3\x81\xaf\xe3\x82\xad\xe3\x83\xa9\xe3\x83\xb3\xe3\x81\xa7\xe3\x81\x99
(each code is represented by four characters but is only one character)
In my xterm, they look like
ç§ã¯ãã©ã³ã§ã
instead of 私はキランです
Any tool recommendation(s) for converting it to something that pdflatex and CJK will like? Or maybe some cjk-latex package option to work with it?
I assume it works for you since you put it on your webpage, but something converted it? (Note that I used tnftp to download your webpage, which does no processing or changing of its content.)
Thank you for your examples. By the way your JIS.tex works fine for me.
Thanks
Which webpage? JIS.tex?
If you perform any editing, then your editor might be messing up the coding. The commands at the end “coding: euc-japan” are commented out (%%%), so they’re ignored by pdflatex, but Emacs uses those commands for a hint about what coding system to use.
When I open JIS.tex with OpenOffice.org and select UTF-8 coding, the fonts are a mess, but if I select “Japanese (EUC-JP)”, then they display perfectly.
When I download JIS.tex with wget and ‘cat’ the file at the shell, the fonts don’t display correctly – but that’s probably either because my shell is UTF-8 or maybe it’s because I changed computers a while ago and might not have all the Japanese fonts installed.
So, the problem is probably that – if you do any editing, cropping etc. – you have to either use Emacs or OpenOffice, and select the EUC-JP coding system, or you have to find how to instruct your text editor to use that coding system.
Hope that helps.
The JIS.tex works well. It was the http://blogs.fsfe.org/ciaran/?p=150 page itself that had the other brief example. I fetched it, edited it with nvi (to remove HTML from top and bottom), and saved it. Fetching with tnftp and resaving with nvi shouldn’t have made any change.
Here is YOUR example as saved by me:
http://reedmedia.net/~reed/tmp-cjk-98d57893tyncixhtr/test2.tex
Time for me to read the cjk-latex docs.
I changed YOUR
\begin{CJK*}[dnp]{JIS}{min}
to:
\begin{CJK*}{UTF8}{min}
and now I don’t have that problem, but I need to get the udmj79 font installed.
Thanks again for your examples here!