Technical Note 10 - Weave your own description languages

Web page construction as an example

G.C.Wraith

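This note was inspired by Anna Hester's HTMLToolkit and by Roberto Ierusalimschy's Technical Note 9 on the inefficiency of concatenating strings. The mathematical abstraction that underlies grammars and the structure of, for example, HTML documents is the tree. We present here a short Lua program, called weave, for building trees out of descriptions and writing them out. This document, for example, is constructed by weave adapted to HTML. The big drawback of HTML is that it has neither variables nor abstraction facilities. I make all my web pages with weave now. The weave source of this web page is shown below.

By a 'tree' we mean a table whose indices are integers and whose values are either strings (the leaves of the tree) or trees. Here is the function 'node', which constructs trees. Each tree it returns has two methods, 'push' and 'walk'. Push inserts a new string or tree. Walk applies a function recursively to the leaves.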

 

node = function (list)
       local a
       if list then 
         a = list
         a.n = getn(list)
       else a = { n = 0 } end -- if
       if not a.push then
         a.push = function (self,x) tinsert(self,x) end
       end -- if
       if not a.walk then  
        a.walk = function (self,f)
                 local b
                 for i = 1,self.n do
                   b = self[i]
                   if type(b) == "string" then f(b)
                   else b:walk(f) end -- if
                 end -- for
               end -- function
       end -- if        
       return a
       end -- function
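
The listing above is written for Lua 4.0 (getn and tinsert are Lua 4.0 built-ins). As a rough sketch only, here is how the same idea might look in Lua 5.1 or later, where the length operator replaces the n field. The helper collect_leaves is a hypothetical name, not part of weave; it shows walk gathering the leaves into a table so that they are joined once with table.concat.

```lua
-- Sketch for Lua 5.1 or later; the note's own code targets Lua 4.0.
-- Globals are used to match the style of the note.
function node(list)
  local a = list or {}
  -- push appends a string or a subtree
  a.push = a.push or function (self, x) self[#self + 1] = x end
  -- walk applies f to every leaf, depth first, left to right
  a.walk = a.walk or function (self, f)
    for i = 1, #self do
      local b = self[i]
      if type(b) == "string" then f(b) else b:walk(f) end
    end
  end
  return a
end

-- collect_leaves is a hypothetical helper, not part of weave
function collect_leaves(tree)
  local buf = {}
  tree:walk(function (s) buf[#buf + 1] = s end)
  return table.concat(buf) -- join all leaves in one step
end

t = node { "a", node { "b", "c" } }
t:push("d")
result = collect_leaves(t)
print(result) --> abcd
```

The leaves come out in the order a left-to-right, depth-first traversal visits them, which is the order in which they should appear in the output file.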
       
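This is the core of weave. Some specific definitions are then required to interpret the structure in question, in this case HTML. It makes sense to modularise weave by reading in these definitions using dofile. The final part of weave writes the leaves to its output, 'out', which I am presuming is specified by the document being processed (which is given by 'arg[1]'). Here it is:

page,out = PAGE(dofile(arg[1])) -- read the page description
assert(page,"Cannot read "..arg[1])
writeto(out)
page:walk(write) -- write the HTML file
writeto()
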
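writeto and write are Lua 4.0 functions. A minimal sketch of the same write-out step with the io library of Lua 5.1 or later might look as follows. The name write_page is an assumption made for illustration, and a temporary file stands in for the real output path.

```lua
-- Sketch for Lua 5.1 or later; write_page is a hypothetical name.
function node(list)
  local a = list or {}
  a.push = a.push or function (self, x) self[#self + 1] = x end
  a.walk = a.walk or function (self, f)
    for i = 1, #self do
      local b = self[i]
      if type(b) == "string" then f(b) else b:walk(f) end
    end
  end
  return a
end

-- Write every leaf of the tree to the file at path, in order.
function write_page(page, path)
  local out = assert(io.open(path, "w"))
  page:walk(function (s) out:write(s) end)
  out:close()
end

-- Usage: write a tiny tree to a temporary file and read it back.
local path = os.tmpname()
write_page(node { "<p>", node { "hello" }, "</p>\n" }, path)
local inp = assert(io.open(path, "r"))
written = inp:read("*a")
inp:close()
os.remove(path)
```

Each leaf is written as it is visited, so no intermediate string for the whole page is ever built.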
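It remains to describe the HTML-specific part of weave and, in particular, the function PAGE. This function takes three arguments: the title, the pathname of the file in which the document is to reside, and the tree describing the body of the document. It returns the whole tree and the pathname.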

PAGE = function (title, saveas, body)
       local x = node {
       [[<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">]],
       HTML {
             HEAD {
                   [[<meta name="MSSmartTagsPreventParsing" content="TRUE">]],
                   TITLE { title }
                  }, -- HEAD
             BODY(body)
            } -- HTML
                    } -- node
       return x,saveas
       end -- function

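To make the data flow concrete: the page description is a Lua chunk that returns three values (title, pathname, body), and PAGE receives them directly. The sketch below, for Lua 5.1 or later, compiles a description held in a string instead of calling dofile on arg[1]; the simplified PAGE (no DOCTYPE or meta line), the sample description and the helper render are all assumptions made for illustration.

```lua
-- Sketch for Lua 5.1 or later, with a simplified PAGE.
function node(list)
  local a = list or {}
  a.push = a.push or function (self, x) self[#self + 1] = x end
  a.walk = a.walk or function (self, f)
    for i = 1, #self do
      local b = self[i]
      if type(b) == "string" then f(b) else b:walk(f) end
    end
  end
  return a
end

function tag(name)
  return function (obj)
    local x = node()
    x:push("<"); x:push(name); x:push(">")
    x:push(node(obj))
    x:push("</"); x:push(name); x:push(">\n")
    return x
  end
end

HTML, HEAD, TITLE, BODY = tag("html"), tag("head"), tag("title"), tag("body")

function PAGE(title, saveas, body)
  local x = node { HTML { HEAD { TITLE { title } }, BODY(body) } }
  return x, saveas
end

-- render is a hypothetical helper that joins the leaves into one string
function render(tree)
  local buf = {}
  tree:walk(function (s) buf[#buf + 1] = s end)
  return table.concat(buf)
end

-- A hypothetical page description; in weave it would live in its own file.
local description = [[ return "Demo", "demo.html", { "Hello" } ]]
local compile = loadstring or load -- loadstring in 5.1, load in 5.2+
local chunk = assert(compile(description))
page, out = PAGE(chunk()) -- the three return values become PAGE's arguments
html_text = render(page)
```

Because the call to the chunk is the last argument, all three of its return values are passed on, which is what lets the description file end with a single return statement.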
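We have adopted the convention of using upper-case versions of HTML tag names for constructors (i.e. nodes of the tree), to contrast with the lower-case variable names that users choose for their arguments, much in the style of Haskell. These are defined by: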

tag = function (name,attr)
      local f = function (obj)
                  local n,att,x = %name,%attr,node()
                  x:push "<"
                  x:push(n)
                  if att then
                    x:push " "
                    x:push(att)
                  end -- if
                  x:push ">"
                  x:push(node(obj))
                  x:push "</"
                  x:push(n)
                  x:push ">\n"
                  return x
                end -- function
      return f
      end -- function

monotags = { "p","b","hr" }

for _,s in monotags do
  setglobal(strupper(s),"\n<"..s..">\n")
end -- for

tagwords = { "html","head","body","title","center",
             "h1","h2","h3","h4","h5","h6","tt","b",
             "ul","ol","li","dl","dd","dt","tr","td",
             "th","pre","small"}

for _,s in tagwords do
  setglobal(strupper(s),tag(s))
end -- for

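The %name upvalue syntax and setglobal belong to Lua 4.0. As a sketch under that caveat, the same constructor scheme in Lua 5.1 or later could rely on ordinary closures and on the global table _G. The shortened tag list and the helper render are assumptions made to keep the example small.

```lua
-- Sketch for Lua 5.1 or later: closures capture name and attr directly.
function node(list)
  local a = list or {}
  a.push = a.push or function (self, x) self[#self + 1] = x end
  a.walk = a.walk or function (self, f)
    for i = 1, #self do
      local b = self[i]
      if type(b) == "string" then f(b) else b:walk(f) end
    end
  end
  return a
end

function tag(name, attr)
  return function (obj)
    local x = node()
    x:push("<"); x:push(name)
    if attr then x:push(" "); x:push(attr) end
    x:push(">")
    x:push(node(obj)) -- the argument becomes a subtree
    x:push("</"); x:push(name); x:push(">\n")
    return x
  end
end

-- Define upper-case global constructors (shortened list for illustration).
for _, s in ipairs { "html", "body", "h1", "tt" } do
  _G[string.upper(s)] = tag(s)
end

-- render is a hypothetical helper that joins the leaves into one string
function render(tree)
  local buf = {}
  tree:walk(function (s) buf[#buf + 1] = s end)
  return table.concat(buf)
end

html_text = render(HTML { BODY { H1 { "Title" }, TT { "code" } } })
io.write(html_text)
```

Each constructor only pushes short fragments onto a tree, so nesting constructors never concatenates the growing document.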
You can probably now write your own definitions for anchors, images and other bits of HTML. A useful non-HTML addition is the facility to INCLUDE another file as a piece of text in the document.

INCLUDE = function (fname)
          readfrom(fname)
          local text = read("*a")
          readfrom()
          return text
          end -- function

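The weave source of this page uses constructors A and LABEL whose definitions are not shown. One possible way to write them, sketched here for Lua 5.1 or later, is to reuse tag with its attr argument. That LABEL should produce a named anchor is an assumption, and render is a hypothetical helper.

```lua
-- Sketch for Lua 5.1 or later; these A and LABEL definitions are guesses.
function node(list)
  local a = list or {}
  a.push = a.push or function (self, x) self[#self + 1] = x end
  a.walk = a.walk or function (self, f)
    for i = 1, #self do
      local b = self[i]
      if type(b) == "string" then f(b) else b:walk(f) end
    end
  end
  return a
end

function tag(name, attr)
  return function (obj)
    local x = node()
    x:push("<"); x:push(name)
    if attr then x:push(" "); x:push(attr) end
    x:push(">")
    x:push(node(obj))
    x:push("</"); x:push(name); x:push(">\n")
    return x
  end
end

-- A(href) returns a constructor for a link to href.
function A(href)
  return tag("a", 'href="' .. href .. '"')
end

-- LABEL(name) returns a constructor for a named anchor (assumed meaning).
function LABEL(name)
  return tag("a", 'name="' .. name .. '"')
end

-- render is a hypothetical helper that joins the leaves into one string
function render(tree)
  local buf = {}
  tree:walk(function (s) buf[#buf + 1] = s end)
  return table.concat(buf)
end

link_text = render(A("#source") { "source" })
label_text = render(LABEL("source") { "here" })
```

Both are curried: the first call fixes the attribute and the second supplies the content, which matches the way A("#source") {get_source} is written in the source of this page.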
By way of example, if you want to read this whole document all over again, but in the form of weave source code, here it is.

-- Technote source for weave
--  GCW August 2001

-- Please set value of filename for your own computer
filename = "ltn010.html"

-- Please adjust XXX yourself
title = "Technical Note 10 - Weave your own description languages"

this = "article" -- filename of this file. Please adjust if necessary.

subtitle = "Web page construction as an example"
author = "G.C.Wraith"
mailto = "mailto:gavin@wraith.u-net.com"

code = function (x)
       return B(TT(PRE {x}))
       end
       
intro = [[This note was inspired by Anna Hester's HTMLToolkit and 
Roberto Ierusalimschy's Technical Note 9 on the inefficiencies of
concatenating strings. The mathematical abstraction that underlies
grammars and the structure of, for example HTML documents, is that  
of a tree. We present here a short Lua program, called weave, for
building trees out of descriptions and writing them out. This
document, for example, is constructed by weave adapted to HTML.
The big drawback of HTML is that it has neither variables nor
abstraction facilities. I make all my web pages with weave now.]]   

para1 = [[By a 'tree' we mean a table, whose indices are integers,
and whose values are either strings (leaves of the tree) or trees.
Here is the function 'node' which constructs trees. It has two
methods, 'push' and 'walk'. Push inserts a new string or tree.
Walk applies a function recursively to the leaves.]]

node_def = [[ 

node = function (list)
       local a
       if list then 
         a = list
         a.n = getn(list)
       else a = { n = 0 } end -- if
       if not a.push then
         a.push = function (self,x) tinsert(self,x) end
       end -- if
       if not a.walk then  
        a.walk = function (self,f)
                 local b
                 for i = 1,self.n do
                   b = self[i]
                   if type(b) == "string" then f(b)
                   else b:walk(f) end -- if
                 end -- for
               end -- function
       end -- if        
       return a
       end -- function
       
]]  

para2 = [[This is the core of weave. Then some specific definitions
are required to interpret the structure in question, in this case
HTML. It makes sense to modularise weave by reading in these
definitions using dofile. The final part of weave writes out the
leaves to its output, 'out', which I am presuming is specified by the
document being processed (which is given by 'arg[1]'). Here it is:]]

out_def = [[

page,out = PAGE(dofile(arg[1])) -- read the page description
assert(page,"Cannot read "..arg[1])
writeto(out)
page:walk(write) -- write the HTML file
writeto() 

]]

para3 = [[It remains to describe the HTML-specific part of weave,
and, in particular, the function PAGE. This function takes three
arguments, the title, the pathname of the file that the document is to 
reside in, and the tree describing the body of the document. It 
outputs the whole tree and the pathname.]]

PAGE_def = [[

PAGE = function (title, saveas, body)
       local x = node {
       [[<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">]],
       HTML { 
              HEAD {
                   [[<meta name="MSSmartTagsPreventParsing" content="TRUE">]],
                   TITLE { title }
                    }, -- HEAD   
             BODY(body)
            } -- HTML
                    } -- node
       return x,saveas
       end -- function 
       
]]  

para4 = [[We have used the convention of using upper case versions of
HTML tag names for constructors (i.e. nodes of the tree) to contrast
with lower case variable names for the user to choose for their
arguments, much in the style of Haskell. These are defined by:]]

tag_def = [[

tag = function (name,attr)
      local f = function (obj)
                  local n,att,x = %name,%attr,node()
                  x:push "<"
                  x:push(n)
                  if att then
                    x:push " "
                    x:push(att)
                  end -- if
                  x:push ">"
                  x:push(node(obj))
                  x:push "</"
                  x:push(n)
                  x:push ">\n"
                return x  
                end -- function
      return f
      end -- function        

monotags = { "p","b","hr" }

for _,s in monotags do
  setglobal(strupper(s),"\n<"..s..">\n")
end -- for                

tagwords = { "html","head","body","title","center",
             "h1","h2","h3","h4","h5","h6","tt","b",
             "ul","ol","li","dl","dd","dt","tr","td",
             "th","pre","small"} 
             
for _,s in tagwords do             
 setglobal(strupper(s),tag(s))
end -- for  

]]  

para5 = [[You can probably now write your own definitions for
anchors, images and other bits of HTML. A useful non-HTML addition
is the facility to INCLUDE another file as a piece of text in
the document.]]

INCLUDE_def = [[

INCLUDE = function (fname)
           readfrom(fname)
           local text = read("*a")
           readfrom()
           return text
          end -- function 
          
]]  

para6 = [[By way of example, if you want to read this whole document
all over again, but in the form of weave source code, here it is.]] 

margin = function (x)
          return TABLE(TR(TD(x)))
         end
         
get_source = " The weave source of this web page is shown below."         
       
body = { CENTER {
                  H1 {title},
                  H3 {subtitle},
                  H6 {author}
                 }, -- CENTER
         margin {         
          P, intro, 
          A("#source") {get_source},
          P, para1, code(node_def), 
          para2, code(out_def),
          para3, code(PAGE_def), 
          para4, code(tag_def), 
          para5, code(INCLUDE_def),
          para6, LABEL("source") {P}, code(INCLUDE(this)) -- self reference !
                }, -- margin
         HR,
         SMALL { "by", A(mailto) {"gcw"} }
       } -- body

         
return title, filename, body
-- end weave source --------

by gcw