The ParaVoz package provides a simple yet effective interface for a parallel corpus using OpenCWB (http://cwb.sourceforge.net). It should work on any Linux machine with only minimal changes to the settings files to reflect paths and language codes. All settings are found in the settings directory.
ParaVoz 2.0 extends (but does not replace) ParaVoz 1.0. It is more intuitive, but probably less suited to corpora with a large number of languages; it is best used with a corpus of two or three languages. Unlike ParaVoz 1.0, ParaVoz 2.0 encodes the parallel corpus as a single corpus file for each language, rather than one for each text in the corpus. ParaVoz 2.0 now supports both sentence and word alignment.
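To make this layout concrete, here is a minimal Python sketch. It is not ParaVoz code: it only illustrates the idea of one token sequence per language, with sentence alignment stored as paired ranges of corpus positions. All names (`Bead`, `aligned_region`, `translate_hit`) and the toy data are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Bead(NamedTuple):
    """One alignment unit: inclusive token ranges in source and target."""
    src_start: int
    src_end: int
    tgt_start: int
    tgt_end: int


# Hypothetical toy data: one token sequence per language, not per text.
corpus_de = ["Guten", "Morgen", ".", "Wie", "geht", "es", "?"]
corpus_pl = ["Dzień", "dobry", ".", "Jak", "się", "masz", "?"]

# Sentence alignment expressed as ranges over corpus positions.
beads = [Bead(0, 2, 0, 2), Bead(3, 6, 3, 6)]


def aligned_region(position: int, beads: List[Bead]) -> Optional[Tuple[int, int]]:
    """Return the target range aligned to the sentence containing `position`."""
    for bead in beads:
        if bead.src_start <= position <= bead.src_end:
            return bead.tgt_start, bead.tgt_end
    return None


def translate_hit(position: int) -> List[str]:
    """Return the target-language tokens aligned to a source-side hit."""
    region = aligned_region(position, beads)
    if region is None:
        return []
    start, end = region
    return corpus_pl[start:end + 1]
```

In this sketch, a hit on the German token at position 4 ("geht") resolves to the second Polish sentence. Word alignment could be modeled the same way with single-token ranges.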
For ParaVoz 2.0, see the demo at http://parasolcorpus.org/ParaVoz. For ParaVoz 1.0, see the movie on the ParaSol website (http://parasolcorpus.org; movie at http://parasolcorpus.org/ParaSol_demo.mp4).
This web interface to CWB was initially written by Roland Meyer in 2006 for use with the ParaSol corpus (then the Regensburg Parallel Corpus) and has since been developed by successive authors. The JavaScript-based functionality was mainly added by Andreas Zeman, and XSLT support in the new modular interface mainly by Ruprecht von Waldenfels, who has supervised the publication as open source. Part of the architecture is described in Waldenfels (2011). We thank the Center for the Study of Language and Society, University of Berne (http://www.csls.unibe.ch), for granting the financial support that enabled the publication of ParaVoz as open source at this stage.
ParaVoz 2.0 was then developed during work on a German-Polish parallel corpus supported by a grant from the Johannes Gutenberg University Mainz, mostly by Michal Wozniak, with valuable input from Jan Machalica and under the supervision of Ruprecht von Waldenfels.
Source: https://bitbucket.org/rvwfels/paravoz2 [accessed: 21/05/2015]
Cite the tool as:
- Roland Meyer, Ruprecht von Waldenfels, Michal Wozniak, Andreas Zeman (2006-2015): ParaVoz – a simple web interface for querying parallel corpora. Second Version. Bern, Regensburg, Berlin, Krakow.
- Ruprecht von Waldenfels (2011): Recent Developments in ParaSol: Breadth for Depth and XSLT based web concordancing with CWB. In: Daniela Majchráková and Radovan Garabík (eds.): Natural Language Processing, Multilinguality. Proceedings of Slovko 2011, Bratislava: Tribun, 156-162. Available online.
> Date: Thu, 21 May 2015 14:41:13 +0200
> From: ruprecht.waldenfels _(at)_ gmx.net
> To: cwb _(at)_ sslmit.unibo.it
> Subject: [CWB] Interface for parallel corpora
> Dear colleagues,
> we would like to let you know that a new version of the ParaVoz corpus
> interface for parallel corpora hosted with CWB has been released.
> ParaVoz 2.0 has a user friendly interface, it features basic metadata
> management and supports word alignment.
> ParaVoz 2.0 extends (but not replaces) Paravoz 1.0; it is open-source
> and found here: https://bitbucket.org/rvwfels/paravoz2
> A demo version is found here: www.parasolcorpus.org/ParaVoz
> Ruprecht von Waldenfels
> Michał Woźniak
> Institute of Polish, Polish Academy of Sciences, Cracow