Python+pdfLaTeX+ebook

After using my iLiad for a while, I found that it is great for reading things you find on the internet. But first you have to get them onto the device.

I came up with an OpenOffice template with the proper page dimensions, into which I can paste whatever I find interesting to read. But for blogs and other things that I read more or less regularly, this is a pain. So I used Python to retrieve the feeds and reformat them for the program that does the real formatting, pdfLaTeX.

Well, the very first code looks like this:


import feedparser
import re
import os
import locale

# Default locale of this machine (not used further down yet).
language, output_encoding = locale.getdefaultlocale()


def remove_html_tags(data):
    # Strip anything that looks like an HTML tag.
    p = re.compile(r'<.*?>')
    return p.sub('', data)


def remove_html_special_char(data):
    # Drop non-breaking-space entities.
    p = re.compile(r'&nbsp;')
    return p.sub('', data)


def br2dobleLine(data):
    # Turn HTML line breaks into blank lines, which LaTeX reads as paragraph breaks.
    p = re.compile(r'<br\s*/?>')
    return p.sub('\n\n', data)


d = feedparser.parse("http://voglioscendere.ilcannocchiale.it/blogs/feeds/blogrss20.aspx?blogid=32495")
print("feed parsed")
os.chdir("/Users/paskino/temp/")

# Raw strings keep the LaTeX backslashes literal.
header = r"""\documentclass[iliad,12pt,oneside,onecolumn,final,openany]{iliad}
\usepackage[latin1]{inputenc}
\usepackage[italian]{babel}
\usepackage{hyperref}
\setlength{\hoffset}{-0.8 in}
%remove
\setlength{\voffset}{-1 in}
\setlength{\textwidth}{\paperwidth}
\addtolength{\textwidth}{-9mm}
\setlength{\textheight}{\paperheight}
\addtolength{\textheight}{-22mm}
\title{Voglioscendere}
\begin{document}
\tableofcontents
"""
footer = r"\end{document}"

section = header
for entry in d.entries:
    # Line breaks must become paragraph breaks before the remaining tags are stripped.
    desc = remove_html_special_char(remove_html_tags(br2dobleLine(entry.description)))
    section += r"\section{" + entry.title + "}" + desc + "\n"
section += footer

# Write bytes so that the Latin-1 encoding matches \usepackage[latin1]{inputenc}.
f = open("prova.tex", "wb")
f.write(section.encode("ISO-8859-1", "ignore"))
f.close()

You can download the script here.
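
The script passes feed titles and descriptions to LaTeX as they are, so a character such as & or % in a post could stop pdfLaTeX or silently swallow text. Below is a minimal sketch of one way to guard against that. The helper name escape_latex and the replacement table are my own assumptions and are not part of the original script.

```python
import re

# Characters with a special meaning in LaTeX, each mapped to a printable form.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}

# One character class matching any of the keys above.
_SPECIAL_RE = re.compile(r"[\\&%$#_{}~^]")


def escape_latex(text):
    """Hypothetical helper: escape LaTeX special characters in one pass.

    A single regex pass means the braces and backslashes introduced by the
    replacements are never escaped a second time.
    """
    return _SPECIAL_RE.sub(lambda m: LATEX_SPECIALS[m.group(0)], text)


print(escape_latex("Tasse & tagli: 50% in meno"))
# prints: Tasse \& tagli: 50\% in meno
```

In the script it would be applied to the title and to the description after the HTML tags have been stripped, just before the text is appended to the document.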

The class iliad.cls is just a slightly modified article.cls from standard LaTeX, in which I defined the dimensions of the iLiad screen as:

\DeclareOption{iliad}
{\setlength\paperheight {163mm}%
\setlength\paperwidth {122mm}}
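
As a quick check of the resulting layout, the paper size above can be combined with the two \addtolength lines from the header in the script. The small sketch below only redoes that arithmetic; the function name text_area is hypothetical, and the numbers come from the class option and the header.

```python
# Paper size declared by the iliad option in iliad.cls, in millimetres.
PAPER_WIDTH_MM = 122
PAPER_HEIGHT_MM = 163

# Amounts subtracted from the paper size in the script's LaTeX header.
WIDTH_TRIM_MM = 9
HEIGHT_TRIM_MM = 22


def text_area(paper_w, paper_h, trim_w, trim_h):
    """Hypothetical helper: return (textwidth, textheight) in millimetres."""
    return paper_w - trim_w, paper_h - trim_h


text_width, text_height = text_area(
    PAPER_WIDTH_MM, PAPER_HEIGHT_MM, WIDTH_TRIM_MM, HEIGHT_TRIM_MM
)
print("textwidth = %d mm, textheight = %d mm" % (text_width, text_height))
# prints: textwidth = 113 mm, textheight = 141 mm
```

So the text block fills almost the whole screen, which is why the header also pulls the page origin up and to the left with negative \hoffset and \voffset values.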

So far it works impressively well, with minor bugs/problems.

UPDATE:
The code depends on Universal Feed Parser.
