Contents
- Introduction
- Reference
- Module Documentation
- Manual Pages
The Blurb
EtText is a simple plain-text format which allows conversion to and from HTML.
Instead of editing HTML directly, it provides an easy-to-edit, easy-to-read and
intuitive way to write HTML, based on the plain-text markup conventions we've
been using for years.
Like most simple text markup formats (POD, setext, etc.), EtText markup handles
the usual things: insertion of P tags, header recognition and markup. However
it also adds a powerful link markup system.
EtText markup is simple and effective; it's very similar to setext, WikiWikiWeb
TextFormattingRules or Zope's StructuredText.
EtText is distributed under the same licensing terms as Perl itself.
Contributors to Text::EtText
Here's a list of people who've contributed to Text::EtText:
- Justin Mason <jm /at/ jmason.org>: original author and maintainer
- Caolan McNamara <caolan /at/ csn.ul.ie>: EtText contributions; lists,
  pre-formatted text, lots of suggestions
- rudif /at/ bluemail.ch: lots of help with supporting Windows
- Chris Barrett <chris /at/ getfrank.com>: suggested CSS class support for the
  Latte-style balanced tags
Thanks all! Patches and suggestions are welcomed -- send them in!
(By the way, patch contributors get listed at the top, 'cos patches save
me writing the code ;)
Using EtText
Like most simple text markup formats (POD, setext, etc.), EtText markup
handles the usual things: insertion of <P> tags, header
recognition and markup. However it adds a powerful link markup system
and several other useful features.
EtText markup is simple and effective; it's based loosely on setext, with bits
of WikiWikiWeb TextFormattingRules thrown in.
EtText was previously part of WebMake, but is now distributed
as a standalone component.
Basic Text Markup
If you leave blank lines between paragraphs, <p> and
</p> tags will be inserted in the correct places.
EtText does quite a good job of this.
Words wrap and fill automatically, so there's no need to worry about wrapping
before 80 characters. (It's good form to do so anyway, in case other people
ever need to edit your text, or you need to mail it around.)
A paragraph consisting of a line of 10 or more consecutive - or _ signs will be
converted to a HR tag.
Sections of text between pairs of certain characters will be turned into
markup, as follows:
EtText    | Tag Used | Result
**text**  | <strong> | text (rendered in bold)
__text__  | <em>     | text (rendered in italics)
##text##  | <code>   | text (rendered in a monospaced font)
& signs that have whitespace on either side will be converted
to &amp; entities automatically.
Text indented from the left margin will be converted into a <P>
paragraph wrapped in a <blockquote> -- unless it starts with a
*, -, + or o character followed by whitespace, or is numbered
(1., A) or a., etc.) -- in which case it's interpreted as a list item; see
Lists below.
Another exception to the above rule is that text indented by only 1 space, or
on lines starting in the first column with two colon characters, will be
surrounded by <pre> tags.
If you find writing HTML tag-pairs manually annoying, EtText includes an idea
from Latte: balanced-tag generation. Wrap the text to be tagged with
the name of the tag followed immediately by a { character on the left, and a }
character on the right. In other words,
strong{text}
will be rendered as
<strong>text</strong>
This can be nested, so strong{text with i{italic} bits} will be rendered as
<strong>text with <i>italic</i> bits</strong>
In addition, the balanced-tag support has a bonus feature, in that it supports
CSS classes; follow the name of the tag with a full stop and the class, and
it will use that class, like so:
i.green{foo}
will be rendered as
<i class="green">foo</i>
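The balanced-tag rules can be illustrated with a short Perl sketch. This is not the code EtText itself uses; expand_balanced_tags and _tag are hypothetical helpers written for this example, and the sketch assumes that tag names are alphanumeric and that bodies contain only balanced braces. It expands the innermost tag{...} pair first and repeats until nothing is left to expand.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Text::EtText.

# Build one HTML tag pair, with an optional CSS class attribute.
sub _tag {
    my ($name, $class, $body) = @_;
    my $attr = defined $class ? qq{ class="$class"} : '';
    return "<$name$attr>$body</$name>";
}

# Expand name{...} and name.class{...}, innermost braces first.
sub expand_balanced_tags {
    my ($text) = @_;
    # tag name, optional ".class", then a brace pair with no braces inside
    my $innermost = qr/\b([A-Za-z][A-Za-z0-9]*)(?:\.([A-Za-z][\w-]*))?\{([^{}]*)\}/;
    # Each pass expands the pairs that have no nested braces left;
    # stop when a pass makes no substitution.
    1 while $text =~ s/$innermost/_tag($1, $2, $3)/ge;
    return $text;
}

print expand_balanced_tags('strong{text with i{italic} bits}'), "\n";
print expand_balanced_tags('i.green{foo}'), "\n";
```

Working from the inside out keeps the pattern simple: each pass only ever has to match a brace pair with no further braces in it.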
Mail headers, and mail messages, are now marked up automatically.
Lists
A paragraph indented from the left margin (by either spaces or tabs, or both),
and starting with a *, -, + or o character followed by whitespace, will be
converted into a list item (<li> tag).
The same goes for indented paragraphs that start with the string
1., a., A., 1), A) or a), followed by
whitespace. However the default list tag in this case will be an
<ol>...</ol> list. Any positive integer followed
immediately by a full stop and a space will do the trick. The <ol>
tag will use the correct type attribute to match the indexing you're using.
(Compatibility note: previous versions of EtText required that the
<ul> or <ol> tags be written manually. This is no
longer the case; they will be added automatically.)
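As a rough illustration of the list rules, the following Perl sketch classifies an indented paragraph by its leading marker. It is not EtText's own parser; classify_list_item is a hypothetical name, and the mapping to the type values 1, a and A is an assumption about what "the correct type attribute" means. It also accepts any positive integer before a ), which goes slightly beyond what the text above states.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::EtText.
# Returns ('ul') for bullet items, ('ol', $type) for numbered items,
# and an empty list for anything that is not a list item.
sub classify_list_item {
    my ($para) = @_;

    # List items must be indented and have whitespace after the marker.
    return unless $para =~ /^[ \t]+(\S+)\s/;
    my $marker = $1;

    return ('ul')      if $marker =~ /^[-*+o]$/;        # * - + o
    return ('ol', '1') if $marker =~ /^[1-9]\d*[.)]$/;  # 1. 2) 10.
    return ('ol', 'a') if $marker =~ /^[a-z][.)]$/;     # a. b)
    return ('ol', 'A') if $marker =~ /^[A-Z][.)]$/;     # A. B)
    return;   # indented, but not a list item
}

for my $para ("  - a bullet", "\t3. a numbered item", "  A) a lettered item", "  just indented text") {
    my @kind = classify_list_item($para);
    print @kind ? "@kind" : "not a list item", "\n";
}
```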
When you're writing <ul> lists, note that some text editors (such as
vim) will reformat list items automatically, assuming that you want the
text to line up with the start of the text, instead of the bullet-point
character, on the previous line, like so:
  - this is a list item. We should make sure that
    blah blah etc. etc.
This is pretty handy, so using a - as the list bullet point character is
recommended.
Indented paragraphs of the form term:, followed by a tab and then the rest
of the paragraph, will be converted into definition lists (this is another
StolenFromWikiIdea). As a result, this (where <tab> stands for a tab
character):
    Foo:<tab>Blah blah blah etc.
will look like this:
Foo
    Blah blah blah etc.
Sidebars and Side Images
If you wish to display an image, or small sidebar, beside a paragraph of text,
use the <etleft> and <etright>
tags. These are rendered as a one-row, two-column
<table> wrapping the paragraph and the sidebar, as
follows:
<etleft><img src=bubba.png></etleft>This is the main
paragraph body. Foo bar baz blah blah blah etc.
Is displayed as:
[image: bubba.png] | This is the main paragraph body.
                   | Foo bar baz blah blah blah etc.
<etright><img src=bubba.png></etright>This is the
main paragraph body. Foo bar baz blah blah blah etc.
Is displayed as:
This is the main paragraph body. | [image: bubba.png]
Foo bar baz blah blah blah etc.  |
Links in EtText
As well as the standard <a href=url>...</a> link
specification used in HTML, EtText will automatically turn URLs and email
addresses that occur in the text into links. In addition, EtText supports its
own link format, as follows.
To use labelled links, you surround the link text with double-square-brackets,
and (optionally) use a single open-square-bracket on the right-hand side with
the link label.
Here's an example:
WebMake's home page is [[at this website [WebMake]].
Alternatively, if the link text matches the link label, the link label is
optional.
Here's an example: [[WebMake]].
The href used in the link is then defined at another point in the document, as
an indented line like this:
    [WebMake]: http://webmake.taint.org/
Even simpler: if the link label has been set as an Auto link, you can omit the square
brackets altogether:
Here's an example: WebMake.
Text and markup can be enclosed in the double-square-brackets; everything
inside them will become part of the link text. Unlike the older form of EtText
links (see below), even single words need to be enclosed in brackets
to become links. This protects against accidentally interpreting normal
text as a broken link.
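The double-square-bracket syntax and its label definitions can be approximated with two regular expressions. The sketch below is an illustration only, not EtText's implementation: render_links and _anchor are hypothetical helpers, Auto links are not handled, and it assumes that link text contains no square brackets and that an undefined label simply leaves the link text as plain text.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Text::EtText.

# Turn link text into an <a> tag if its label has a definition.
sub _anchor {
    my ($href, $link_text, $label) = @_;
    $label = $link_text unless defined $label;   # [[label]] short form
    my $url = $href->{$label};
    # Assumption: an unknown label leaves the link text untouched.
    return defined $url ? qq{<a href="$url">$link_text</a>} : $link_text;
}

sub render_links {
    my ($text) = @_;
    my %href;

    # Collect indented definition lines such as "  [WebMake]: http://..."
    # and remove them from the text.
    $text =~ s/^[ \t]+\[([^\]]+)\]:[ \t]*(\S+)[ \t]*\n?/$href{$1} = $2; ''/gme;

    # Expand [[link text [label]] and the short form [[label]].
    $text =~ s/\[\[([^\[\]]+?)(?:\s*\[([^\[\]]+))?\]\]/_anchor(\%href, $1, $2)/ge;
    return $text;
}

my $doc = <<'END';
WebMake's home page is [[at this website [WebMake]].
Here's an example: [[WebMake]].

    [WebMake]: http://webmake.taint.org/
END

print render_links($doc);
```

Collecting the definitions before expanding the links means a label can be defined anywhere in the document, including after its first use.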
EtText Linking, Backwards Compatibility
The following text describes the old style for EtText links. Since it
was way too easy to produce links this way where they were not intended
to be, it has now been obsoleted by the method described above. However,
support for it will remain on by default for a few revisions.
To turn off this backwards compatibility, set the EtTextOldLinkStyle option
to 0, either using WebMake's <option> tag, or from your code.
The basic concept is of a word or "quoted set of words" followed by an
optional link label in [square brackets], like this: "this is a
link" [label].
The href used in the link is then defined at another point in the document, as
above.
Text and markup can be enclosed in the quotes; everything quoted will become
part of the link text. Single words or HTML tags do not need to be quoted, so
<img src="/license_plate.jpg" width="10" height="10"> [homepage]
will work correctly.
Glossary Links
EtText also supports a concept called glossary links; if you define a link,
the name of that link will automatically become a link if enclosed in
double-square-brackets or quotes. For example:
[Justin Mason]: http://jmason.org/
will mean that any occurrence of [[Justin Mason]], or
"Justin Mason", in any EtText content chunk or file in the
site, becomes a link to that address.
These links are stored in the WebMake cache file, if WebMake is being used.
If you use EtText in a standalone mode, without WebMake, you can provide an
implementation of the Text::EtText::LinkGlossary interface to store
defined links so that they can be used in other EtText files.
Quoted bits of text that do not map to an entry in the glossary are not
converted to links (unless they're followed by a square-bracketed link-label
reference).
Auto Links - Even More Convenient
In addition, if the link definition is preceded with Auto: , the quotes are
not required, and any occurrence of the link label -- with or without quotes or
double-square-brackets -- will become a link.
Auto: [WebMake]: http://webmake.taint.org/
Auto: [any occurrence of the words]: http://webmake.taint.org/
URLs and Email Addresses
URLs, such as http://webmake.taint.org/ , and email addresses, such as
jm@nospam-jmason.org, are automatically converted into links to that same
address.
Blocking EtText Link Interpretation
To block interpretation as a link, replace square brackets with the HTML
entities &etsqi; and &etsqo;, which map to [ and ]
respectively; replace quote characters, ", with two apostrophes,
''. If that doesn't do the trick, wrap the entire section of text
with the <!--etsafe-->...<!--/etsafe--> tags.
Similar Systems
EtText-like plain-text-to-markup conversion systems have a long history. The
first time I came across the concept was with Setext, which was
included with Tony Sanders' Plexus web server, back in September 1993.
Yes, 1993. Setext has been around for a while!
WikiWikiWeb is a more recent, but well-established, system which uses
a similar markup style.
txt2html provided a lot of impetus to rewrite the core of EtText since 2.0,
since its list-parsing engine was much better. However EtText is now up to
scratch again ;)
The real inspiration for EtText was Userland's Frontier; Dave
Winer's evangelisation of its easily-editable markup system convinced me that
it was worth polishing up the rudimentary EtText system I had then. In
addition, the name "EtText" is derived from "Edit This Text", in
a tip of the hat to Dave's "Edit This Page" concept.
Some well-known sites that use their own converters to convert
plain-text to markup include http://www.blogger.com/, http://slashdot.org/
(for comments) and http://www.advogato.org/.
Jorn Barger maintains an impressive summary of etext formats at his Robot
Wisdom site. Skip down to section 3, Internet etext
standards, for the directly-relevant stuff.
Zope and ZWiki use a format called StructuredText, which again comes from
WikiLand. There's some interesting work going on there with the STXDocument
object, which is "a web-managable object that contains information marked up
in the structured text format".
When HTML and EtText Collide
HTML tags can be used freely throughout an EtText document. However, in some
situations, you may wish to preserve whitespace, avoid paragraph tags being
added, etc.; to use your own HTML without meddling from EtText, wrap it in an
<!--etsafe-->...<!--/etsafe-->
tag pair; this will protect it.
Note that text blocks wrapped in <pre>,
<listing> and <xmp> tags are
automatically protected in this way; the <!--etsafe-->
tag pair is not required.
EtText adds two entities, &etsqi; and &etsqo;. These represent
[ and ] respectively, and are used to protect a square-bracketed
piece of text from being interpreted as a link label (see Links in EtText
above).
If this is insufficient, and you're using WebMake, the <safe> tag
will escape any type of code to protect it from interpretation by WebMake,
EtText or HTML.
Text::EtText::DefaultGlossary
Text::EtText::DefaultGlossary - default, non-persistent link glossary
The Text::EtText::DefaultGlossary is an implementation of
Text::EtText::LinkGlossary which is used if no other implementation is registered.
It will not save glossary link details persistently.
Text::EtText::EtText2HTML
Text::EtText::EtText2HTML - convert from the simple EtText editable-text
format into HTML
my $t = new Text::EtText::EtText2HTML;
print $t->text2html ($text);
or
my $t = new Text::EtText::EtText2HTML;
print $t->text2html (); # from STDIN
ettext2html will convert a text file in the EtText editable-text format
into HTML.
For more information on the EtText format, check the WebMake documentation
on the web at http://webmake.taint.org/ .
$f = new Text::EtText::EtText2HTML
    Constructs a new Text::EtText::EtText2HTML object.

$f->set_option ($optname, $optval);
    Set an EtText option. (Options can also be set on the WebMake object
    itself, or from inside the WebMake file.) Currently supported options are:

    EtTextOneCharMarkup (default: 0)
        Allow one-character sets of asterisks etc. to mark up as strong,
        emphasis etc., instead of the default two-character markup.

    EtTextOldLinkStyle (default: 1)
        Use the older EtText link-markup style, with quote characters and
        single square brackets. This is easy to type, but if you're using
        text from other people, it can easily destroy formatting; so the new
        link-markup style, with double square brackets, can be used instead.

    EtTextBaseHref (default: '')
        The base HREF to use for relative links. If set, all relative links
        in tags with HREF attributes will be rewritten as absolute links,
        making the output HTML independent of the URL tree structure.

    EtTextHrefsRelativeToTop (default: 0)
        Indicates that all EtText links are relative to the top of the
        WebMake document tree. This (obviously) is only relevant if you are
        using EtText in conjunction with WebMake, and WebMake sets it by
        default. If set, all relative links in tags with HREF attributes will
        be rewritten as relative to the ''top'' of the WebMake site, making
        the output HTML independent of the URL tree structure.

$html = $f->set_glossary ($glosobj)
Provide a glossary for shared link definitions, allowing link definitions
to be shared and reused across multiple EtText files. $glosobj must implement the interface defined by Text::EtText::LinkGlossary .
See below for more information on this interface.
$html = $f->text2html( [$text] )
Convert text, either from the argument or from STDIN, into HTML.
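A short script can tie these methods together. The sketch below uses only the calls documented above (new, set_option and text2html); the sample text and the base URL http://www.example.com/docs/ are made up for illustration, and the module is loaded at run time so that the script still runs, and says so, where Text::EtText is not installed.

```perl
use strict;
use warnings;

# Load the converter at run time so the script degrades gracefully
# when Text::EtText is not installed.
my $have_ettext = eval { require Text::EtText::EtText2HTML; 1 };

# Sample EtText input (made up for this example).
my $source = <<'END';
Release Notes
=============

This release is **much faster**. See [[the home page [WebMake]] for details.

    [WebMake]: http://webmake.taint.org/
END

if ($have_ettext) {
    my $t = Text::EtText::EtText2HTML->new;

    # Use only the double-square-bracket link style.
    $t->set_option('EtTextOldLinkStyle', 0);

    # Rewrite relative links against a (hypothetical) base URL.
    $t->set_option('EtTextBaseHref', 'http://www.example.com/docs/');

    print $t->text2html($source);
}
else {
    print "Text::EtText::EtText2HTML is not installed\n";
}
```

Turning EtTextOldLinkStyle off is a reasonable choice when the input comes from other people, for the reason given in the option's description.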
See also http://webmake.taint.org/
for more information.
See also: webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText::EtText2HTML, Text::EtText::HTML2EtText, Text::EtText::LinkGlossary, Text::EtText::DefaultGlossary
Justin Mason <jm /at/ jmason.org>
WebMake is distributed under the terms of the GNU Public License.
The latest version of this library is likely to be available from CPAN as
well as:
http://webmake.taint.org/
Text::EtText::HTML2EtText
Text::EtText::HTML2EtText - convert from HTML to the EtText editable-text
format
my $t = new Text::EtText::HTML2EtText;
print $t->html2text ($html);
or
my $t = new Text::EtText::HTML2EtText;
print $t->html2text (); # from STDIN
ethtml2text will convert a HTML file into the EtText editable-text format,
for use with webmake or ettext2html.
For more information on the EtText format, check the WebMake documentation
on the web at http://webmake.taint.org/ .
$f = new Text::EtText::HTML2EtText
    Constructs a new Text::EtText::HTML2EtText object.

$text = $f->html2text( [$html] )
    Convert HTML, either from the argument or from STDIN, into EtText.

See also http://webmake.taint.org/
for more information.
See also: webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText::EtText2HTML, Text::EtText::HTML2EtText
Justin Mason <jm /at/ jmason.org>
Text::EtText::LinkGlossary
Text::EtText::LinkGlossary - interface for EtText link glossaries to
implement.
use Text::EtText::LinkGlossary;
@ISA = qw(Text::EtText::LinkGlossary);
sub open { ... }
sub close { ... }
...
The Text::EtText::LinkGlossary is an interface which allows EtText to support ''link glossaries'',
persistent collections of link text and its corresponding HREF.
The interface which needs to be implemented is as follows:
$g->open()
    Open the link glossary $g for reading and writing.

$g->close()
    Close the link glossary; no more links can be written or read.

$url = $g->get_link ($name)
    Get a named link from the glossary.

$g->put_link ($name, $url)
    Put a named link to the glossary.

$url = $g->get_auto_link ($name)
    Get a named automatic link from the glossary.

$g->put_auto_link ($name, $url)
    Put a named automatic link to the glossary.

@keys = $g->get_auto_link_keys ()
    Get a list of the names of automatic links stored in the glossary.

$g->add_auto_link_keys (@keys)
    Add to the list of names of automatic links stored in the glossary.

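To make the interface concrete, here is a minimal in-memory glossary that provides the eight methods listed above. It is a sketch under stated assumptions: the package name My::MemoryGlossary and the hash layout are invented for this example, nothing is saved to disk, and the inheritance lines are left as comments so that the example runs without Text::EtText installed.

```perl
use strict;
use warnings;

# Hypothetical in-memory glossary; the package name is an assumption.
package My::MemoryGlossary;

# In real use, inherit from the interface as the synopsis shows:
#   use Text::EtText::LinkGlossary;
#   our @ISA = qw(Text::EtText::LinkGlossary);

sub new {
    my ($class) = @_;
    return bless { links => {}, auto => {}, auto_keys => [] }, $class;
}

# Nothing to open or close for an in-memory store.
sub open  { return 1 }
sub close { return 1 }

sub get_link {
    my ($self, $name) = @_;
    return $self->{links}{$name};
}

sub put_link {
    my ($self, $name, $url) = @_;
    $self->{links}{$name} = $url;
    return;
}

sub get_auto_link {
    my ($self, $name) = @_;
    return $self->{auto}{$name};
}

sub put_auto_link {
    my ($self, $name, $url) = @_;
    $self->{auto}{$name} = $url;
    return;
}

sub get_auto_link_keys {
    my ($self) = @_;
    return @{ $self->{auto_keys} };
}

sub add_auto_link_keys {
    my ($self, @keys) = @_;
    push @{ $self->{auto_keys} }, @keys;
    return;
}

package main;

my $g = My::MemoryGlossary->new;
$g->open;
$g->put_link('Justin Mason', 'http://jmason.org/');
$g->put_auto_link('WebMake', 'http://webmake.taint.org/');
$g->add_auto_link_keys('WebMake');
print $g->get_link('Justin Mason'), "\n";
print join(', ', $g->get_auto_link_keys), "\n";
$g->close;
```

An object like this could then be handed to set_glossary on a Text::EtText::EtText2HTML object; a persistent implementation would load and save its hashes in open and close.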
ethtml2text(1)
ethtml2text - convert from HTML to the EtText editable-text format
ethtml2text file.html > file.txt
ethtml2text will convert a HTML file into the EtText editable-text format,
for use with webmake or ettext2html.
For more information on the EtText format, check the WebMake documentation
on the web at http://ettext.taint.org/ .
The ethtml2text command is part of the HTML::WebMake Perl module set. Install this as a normal Perl module, using perl -MCPAN -e shell, or by installing WebMake.
No environment variables, aside from those used by perl, are required to be
set.
See also: webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText
Justin Mason <jm /at/ jmason.org>
HTML::Entities
ettext2html(1)
ettext2html - convert from the simple EtText editable-text format into HTML
ettext2html file.txt > file.html
ettext2html will convert a text file in the EtText editable-text format
into HTML.
For more information on the EtText format, check the WebMake documentation
on the web at http://ettext.taint.org/ .
The ettext2html command is part of the HTML::WebMake Perl module set. Install this as a normal Perl module, using perl -MCPAN -e shell, or by installing WebMake.
No environment variables, aside from those used by perl, are required to be
set.
See also: webmake, ettext2html, ethtml2text, HTML::WebMake, Text::EtText
Justin Mason <jm /at/ jmason.org>
HTML::Entities