
SiSU: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
tagged
 
(47 intermediate revisions by 20 users not shown)
Line 1: Line 1:
{{short description|Unix command line-oriented framework}}
{{unreferenced}}
{{Other uses}}
{{Other uses|Sisu (disambiguation)}}
{{Infobox software
{{Infobox software
| name = SiSU
| name = SiSU
| logo = [[Image:SiSU logo.png|100px|SiSU logo]]
| logo = SiSU (software) logo.png
| logo_size = 100px
| logo_alt = SiSU logo
| developer = Ralph Amissah
| developer = Ralph Amissah
| released = {{Start date and age|2005|01|05}}
| latest_release_version = 2.7.7
| latest_release_version = 7.1.11
| latest_release_date = {{release date|2010|10|17}}
| latest_release_date = {{Start date and age|2017|07|14}}
| operating_system = [[Unix-like]]
| operating_system = [[Unix-like]]
| genre = Text Structuring, [[Publishing]], [[Search]]
| genre = Text Structuring, [[Publishing]], [[Search engine technology|Search]]
| license = [[GNU General Public License#Version 3|GPLv3]]
| license = [[GNU General Public License#Version 3|GPLv3]]
| website = [http://www.jus.uio.no/sisu www.jus.uio.no/sisu]
}}
}}
'''SiSU''' ("SiSU information Structuring Universe" or "Structured information, Serialized Units"),<ref>also chosen for the meaning of the [[Finnish language|Finnish]] term "[[sisu]]".</ref> is a [[Unix]] [[command line interface|command line]]-oriented framework for document structuring, publishing and search. Using [[Markup language|markup]] applied to a document, SiSU can produce [[plain text]], [[HTML]], [[XHTML]], [[EPUB]], [[XML]], [[OpenDocument]], [[LaTeX]] or [[Portable Document Format|PDF]] files, and populate an [[SQL]] database with ''objects''<ref>objects are described more accurately below.</ref> (equating generally to paragraph-sized chunks) so searches may be performed and matches returned with that degree of granularity (e.g. your search criteria are met by these documents and at these locations within each document). Document output formats share a common object numbering system for locating content. This is particularly suitable for "published" works (finalized texts as opposed to works that are frequently changed or updated) for which it provides a fixed means of reference of content.
'''SiSU''' ('''SiSU information structuring universe''' or '''Structured information, serialized units'''),<ref>also chosen for the meaning of the [[Finnish language|Finnish]] term ''[[sisu]]''.</ref> is a [[Unix]] [[command line interface|command line]]-oriented framework for document structuring, publishing and search.


==Usage==
==Summary of features==
Using [[Markup language|markup]] applied to a document, or a collection of documents, SiSU can produce [[plain text]], [[HTML]], [[XHTML]], [[EPUB]], [[XML]], [[OpenDocument]], [[LaTeX]] or [[Portable Document Format|PDF]] files, and populate an [[SQL]] database.


===Document structuring===
* documents are prepared in a single [[UTF-8]] file using a minimalistic mnemonic syntax. Typical literature, documents like "War and Peace" require almost no markup, and most of the headers are optional.
SiSU offers its user a way to structure plain text and to add graphics, hyperlinks, endnotes, footnotes etc. with simple text editing programs such as Notepad (Windows), TextEdit (Mac) or Gedit (Linux). The [[lightweight markup language]] is mnemonic and [[human readable]].


To process the marked up document(s) with SiSU, the user issues a command via the [[Command-line interface|command-line]] of the computer terminal. The output can be generated in multiple formats (html, pdf, epub, and others) with one single command.
* markup is easily readable/parsable by the human eye, (basic markup is simpler and more sparse than the most basic HTML), [this may also be converted to XML representations of the same input/source document].


===Publishing and self-publishing===
* markup defines document structure (this may be done once in a header pattern-match description, or for heading levels individually); basic text attributes (bold, italics, underscore, strike-through etc.) as required; and semantic information related to the document (header information, extended beyond the Dublin core and easily further extended as required); the headers may also contain processing instructions.
A document, or a collection of documents, which has been processed by SiSU is technically ready to be published on the web, or printed on paper. Canadian author [[Cory Doctorow]], for instance, has used SiSU as a publishing tool and blogged about it.<ref>{{cite web|url=http://craphound.com/walh/e-book/browse-all-versions|title=Doctorow: Browse all versions|date=2010-10-03|access-date=2011-08-11|work= With a Little Help}}</ref> In a newspaper article, Doctorow called SiSU an "automated ebook workflow tool".<ref>{{cite news|url=https://www.theguardian.com/technology/blog/2010/dec/17/internet-problem-choice-self-publishing|title=The Internet Problem: when an abundance of choice becomes an issue|last=Doctorow|first=Cory|date=2010-12-17|access-date=2011-08-11|location=London|work=The Guardian}}</ref>


Earlier examples of webpublishing with SiSU are ''Projet de traité instituant l'Union Européenne / Draft Treaty Establishing the European Union''<ref>{{cite web|url=http://www.spinellisfootsteps.info|date=2005-11-28|title=Spinelli's Footsteps|access-date=2011-08-11}}</ref> and the novel [[Tainaron (novel)|Tainaron]] by Finnish author [[Leena Krohn]].<ref>http://www.kaapeli.fi/krohn/tainaron/english/3/leena_krohn/tainaron.leena_krohn.1998/ This example was created with SiSU in February 1999. Accessed 2011-08-11.</ref>
* for output utilises established industry and institutionally accepted open standard formats.<ref>outputs currently include: plaintext (utf-8), HTML, XML, ODF (open document format text), EPUB, LaTeX, pdf, sql type databases, concordance files, document content certificates (md5 or sha256 digests of headings, paragraphs, images etc.), (currently postgresql and sqlite).</ref>


===Search===
* the outputs share a common numbering system that is meaningful (to man and machine) across all digital outputs whether paper, screen, or database oriented, (pdf, HTML, EPUB, XML, sqlite, postgresql), this numbering system can be used to reference content.
SiSU can populate an [[SQL]] database with ''objects'' (equating generally to paragraph-sized chunks) so searches may be performed and matches returned with that degree of granularity (e.g. your search criteria are met by these documents and at these locations within each document). Document output formats share a common object numbering system for locating content. This is particularly suitable for "published" works (finalized texts as opposed to works that are frequently changed or updated) for which it provides a fixed means of reference of content.


==History==
* sql databases are populated at a paragraph level and become searchable at that level of granularity, the output information provides the object/paragraph numbers which are relevant across all generated outputs; it is also possible to look at just the matching paragraphs of the documents in the database; output indexing also work well with search indexing tools like [http://hyperestraier.sourceforge.net/ HyperEstraier].


SiSU has been under development since 1997, and has been written in [[Ruby (programming language)|Ruby]] since 2000. It was released under the GPL in January 2005. SiSU grew out of a project, begun in 1993, on documents related to (primarily private) [[international commercial law]] and international trade law, hosted on a site then known as Ananse and more recently as [http://www.jus.uio.no/lm/ LexMercatoria].
* there is a considerable degree of future-proofing, output representations are "upgradeable", and new document formats may be added.


SiSU was first released as open source on January 5, 2005,<ref>{{cite web|url= http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/125110|title=Announce SiSU - publishing for e-documents, books, libraries, relational databases|date=2005-01-05|access-date=2015-05-05|work= Ruby Maillist}}</ref> and its first release to [[Debian]] followed in July 2005. SiSU version 1 was released December 2009. SiSU version 2 was released March 2010. Version 2 features a new processing engine. Markup remains substantially identical between versions, apart from changes to the markup for document headers (which contain document metadata and processing instructions). Both version 1 and 2 text processing engines are available in the version 2 tarball. Development takes place on the version 2 branch. Version 1 is available to guarantee compatibility with older prepared texts (prior to the updating of document headers), and as an earlier reference implementation.
* SQL search aside, documents are generated as required and static once generated.

* documents produced are static files, and may be batch processed, this needs to be done only once but may be repeated for various reasons as desired (updated content, addition of new output formats, updated technology document presentations/representations)

* document source (plaintext utf-8) if shared on the net may be used as input and processed locally to produce the different document outputs

* document source may be bundled together (automatically) with associated documents (multiple language versions or master document with inclusions) and images and sent as a zip file called a sisupod, if shared on the net these too may be processed locally to produce the desired document outputs

* generated document outputs may automatically be posted to remote sites.

* for basic document generation, the only software dependency is Ruby, and a few standard Unix tools (this covers plaintext, HTML, XML, ODF, EPUB, LaTeX). To use a database you of course need that, and to convert the LaTeX generated to pdf, a latex processor like tetex or texlive.

* as a developers tool it is flexible and extensible

[[Syntax highlighting]] for SiSU markup is available for a number of [[text editors]].

==How it works==

SiSU markup is fairly minimalistic, it consists of: a (largely optional) document header, made up of information about the document (such as when it was published, who authored it, and granting what rights) and any processing instructions; and markup within text which is related to document structure and typeface. SiSU must be able to discern the structure of a document, (text headings and their levels in relation to each other), either from information provided in the instruction header or from markup within the text (or from a combination of both). Processing is done against an abstraction of the document comprising of information on the document's structure and its objects<ref>objects include: headings, paragraphs, verse, tables, images, but not footnotes/endnotes which are numbered separately and tied to the object from which they are referenced.</ref>, which the program serializes (providing the object numbers) and which are assigned [[Cryptographic hash function|hash sum]] values based on their content. This abstraction of information about document structure, objects, (and hash sums), provides considerable flexibility in representing documents different ways and for different purposes (e.g. search, document layout, publishing, content certification, concordance etc.), and makes it possible to take advantage of some of the strengths of established ways of representing documents, (or indeed to create new ones).

==Short history==

SiSU has been under development since 1997, and written in [[Ruby (programming language)|Ruby]] since 2000. It was released under the GPL in January 2005. SiSU developed out of work done on a project started earlier on documents related to (primarily private) international commercial law and international trade law started in 1993 on a site known then as Ananse, and more recently as [http://www.jus.uio.no/lm/ LexMercatoria]

SiSU version 1 was released December 2009. SiSU version 2 was released March 2010. Version 2 features a new processing engine. Markup remains substantially identical between versions, apart from changes to the markup for document headers (which contain document metadata and processing instructions). Both version 1 and 2 text processing engines are available in the version 2 tarball. Development takes place on the version 2 branch. Version 1 is available to guarantee compatibility with older prepared texts (prior to the updating of document headers), and as an earlier reference implementation.


==Notes and references==
==Notes and references==
<!--See http://en.wikipedia.org/wiki/Wikipedia:Footnotes for an explanation of how to generate footnotes using the <ref(erences/)> tags-->
<!--See http://en.wikipedia.org/wiki/Wikipedia:Footnotes for an explanation of how to generate footnotes using the <ref(erences/)> tags-->

{{Reflist|2}}
{{Reflist}}


==External links==
==External links==
{{Portal|Free software}}
{{Portal|Free and open-source software}}
*{{Official website}}
*[http://sisudoc.org/ SiSU] homepage
*[http://jus.uio.no/sisu SiSU <http://jus.uio.no/sisu/>] original homepage


{{DEFAULTSORT:Sisu}}
{{DEFAULTSORT:Sisu}}
Line 68: Line 52:
[[Category:Lightweight markup languages]]
[[Category:Lightweight markup languages]]
[[Category:Linux text-related software]]
[[Category:Linux text-related software]]
[[Category:Free software]]
[[Category:Software using the GPL license]]

Latest revision as of 10:09, 24 December 2023

SiSU
Developer(s): Ralph Amissah
Initial release: January 5, 2005
Stable release: 7.1.11 / July 14, 2017
Operating system: Unix-like
Type: Text Structuring, Publishing, Search
License: GPLv3
Website: sisudoc.org

SiSU (SiSU information structuring universe or Structured information, serialized units)[1] is a Unix command line-oriented framework for document structuring, publishing and search.

Usage


Using markup applied to a document, or a collection of documents, SiSU can produce plain text, HTML, XHTML, EPUB, XML, OpenDocument, LaTeX or PDF files, and populate an SQL database.

Document structuring


SiSU lets users structure plain text and add graphics, hyperlinks, endnotes, footnotes, etc., using simple text editors such as Notepad (Windows), TextEdit (Mac) or Gedit (Linux). The lightweight markup language is mnemonic and human-readable.

To process the marked-up document(s) with SiSU, the user issues a command at the command line of a terminal. Output can be generated in multiple formats (HTML, PDF, EPUB, and others) with a single command.

Publishing and self-publishing


Once processed by SiSU, a document or collection of documents is technically ready to be published on the web or printed on paper. Canadian author Cory Doctorow, for instance, has used SiSU as a publishing tool and blogged about it.[2] In a newspaper article, Doctorow called SiSU an "automated ebook workflow tool".[3]

Earlier examples of web publishing with SiSU include Projet de traité instituant l'Union Européenne / Draft Treaty Establishing the European Union[4] and the novel Tainaron by Finnish author Leena Krohn.[5]

Search

SiSU can populate an SQL database with objects (generally paragraph-sized chunks), so that searches can be performed and matches returned at that level of granularity (for example, reporting which documents meet the search criteria and at which locations within each document). All document output formats share a common object numbering system for locating content. This is particularly suitable for "published" works (finalized texts, as opposed to works that are frequently changed or updated), for which it provides a stable way to reference content.
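
The idea of paragraph-level search can be sketched in a few lines of Python. This is only an illustration and not SiSU's own code or database schema (SiSU itself is written in Ruby): the table name `objects`, the column names, and the helper functions are hypothetical, and the sketch assumes that paragraphs are separated by blank lines.

```python
import sqlite3


def split_into_objects(text):
    """Split text into paragraph-sized objects, numbered from 1.

    Assumption for this sketch: a blank line separates paragraphs.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return list(enumerate(paragraphs, start=1))


def build_index(documents):
    """Load a {title: text} mapping into an in-memory SQLite table,
    storing one row per numbered object (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (doc TEXT, obj_num INTEGER, body TEXT)")
    for title, text in documents.items():
        conn.executemany(
            "INSERT INTO objects (doc, obj_num, body) VALUES (?, ?, ?)",
            [(title, num, body) for num, body in split_into_objects(text)],
        )
    return conn


def search(conn, term):
    """Return (document, object number) pairs whose text contains term."""
    rows = conn.execute(
        "SELECT doc, obj_num FROM objects WHERE body LIKE ? ORDER BY doc, obj_num",
        (f"%{term}%",),
    )
    return rows.fetchall()


if __name__ == "__main__":
    sample = {
        "a": "First paragraph about law.\n\nSecond paragraph about trade.",
        "b": "Trade comes first here.\n\nNothing relevant.",
    }
    index = build_index(sample)
    # Each hit names a document and the numbered object within it.
    print(search(index, "trade"))
```

Because each match is reported as a document plus an object number, the same number could be used to find the passage in any other output of that document, which is the point of a shared numbering system.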

History


SiSU has been under development since 1997, and has been written in Ruby since 2000. It was released under the GPL in January 2005. SiSU grew out of a project, begun in 1993, on documents related to (primarily private) international commercial law and international trade law, hosted on a site then known as Ananse and more recently as LexMercatoria.

SiSU was first released as open source on January 5, 2005,[6] and its first release to Debian followed in July 2005. SiSU version 1 was released in December 2009, and version 2 in March 2010. Version 2 features a new processing engine. Markup remains substantially identical between versions, apart from changes to the markup for document headers (which contain document metadata and processing instructions). Both the version 1 and version 2 text processing engines are available in the version 2 tarball. Development takes place on the version 2 branch. Version 1 remains available to guarantee compatibility with older prepared texts (those written before document headers were updated), and as an earlier reference implementation.

Notes and references

  1. Also chosen for the meaning of the Finnish term sisu.
  2. "Doctorow: Browse all versions". With a Little Help. 2010-10-03. Retrieved 2011-08-11.
  3. Doctorow, Cory (2010-12-17). "The Internet Problem: when an abundance of choice becomes an issue". The Guardian. London. Retrieved 2011-08-11.
  4. "Spinelli's Footsteps". 2005-11-28. Retrieved 2011-08-11.
  5. http://www.kaapeli.fi/krohn/tainaron/english/3/leena_krohn/tainaron.leena_krohn.1998/ This example was created with SiSU in February 1999. Retrieved 2011-08-11.
  6. "Announce SiSU - publishing for e-documents, books, libraries, relational databases". Ruby Maillist. 2005-01-05. Retrieved 2015-05-05.