Language Technology for Cultural Heritage: Selected Papers from the LaTeCH Workshop Series

By Martin Volk, Lenz Furrer, Rico Sennrich (auth.), Caroline Sporleder, Antal van den Bosch, Kalliopi Zervanou (eds.)

The digital age has had a profound impact on our cultural heritage and the academic research that studies it. Vast numbers of objects, many of them textual, are being digitised to make them more readily accessible to both experts and laypersons. Besides a vast potential for more effective and efficient preservation, management, and presentation, digitisation offers opportunities to work with cultural heritage data in ways that were never feasible or even imagined.

To explore and exploit these possibilities, an interdisciplinary approach is needed, bringing together experts from cultural heritage, the social sciences and humanities on the one hand, and information technology on the other. Because textual data is so prevalent in these domains, language technology has a crucial role to play in this endeavour. Language technology can break through the "Google barrier" by making it possible to analyse texts at advanced levels, extracting information and knowledge at the level of the humanities or social sciences researcher, who wants to know about the who, what, where, and when, but also the how and the why. At the same time, cultural heritage data poses considerable challenges for existing language technology: technology aimed at "generic" language has to face such disparate problems as historical language variation, OCR digitisation errors, and near-extinct academic expertise.

This book is primarily intended for researchers in information technology and language processing who want a state-of-the-art overview of the full breadth of the new and vibrant field of language technology for cultural heritage and its associated academic research in the humanities and social sciences. Researchers working in the target domains of cultural heritage, the social sciences and humanities will also find this book useful, as it provides an overview of how language technology can help them with their information needs. The book covers applications ranging from pre-processing and data cleaning, to the adaptation and compilation of linguistic resources, to personalisation, narrative analysis, visualisation and retrieval.


Similar technology books

Compression Schemes for Mining Large Datasets: A Machine Learning Perspective

This book addresses the challenges of generating data abstractions using a minimal number of database scans, compressing data through novel lossy and non-lossy schemes, and carrying out clustering and classification directly in the compressed domain. Schemes are presented that are shown to be efficient in terms of both space and time, while simultaneously providing the same or better classification accuracy.

Subsurface Sediment Mobilization

Sedimentary facies in the subsurface are usually interpreted from a depositional/stratigraphical perspective: the depositional layering is generally considered to remain undisturbed, except in a few settings. But there is growing evidence that subsurface sediment mobilization (SSM) is more widespread than previously thought, as new observations arise from the ever-increasing resolution of subsurface data.

Chemistry of Nanocarbons

Over the past decade, fullerenes and carbon nanotubes have attracted special interest as new nanocarbons with novel properties. Because of their hollow caged structure, they can be used as containers for atoms and molecules, and nanotubes can be used as miniature test-tubes. Chemistry of Nanocarbons presents the most up-to-date research on chemical aspects of nanometer-sized forms of carbon, with emphasis on fullerenes, nanotubes and nanohorns.

Gene silencing by RNA interference : technology and application

Contents: Gene silencing by RNA interference and the role of small interfering RNAs -- Basics of siRNA design and chemical synthesis -- Oligonucleotide scanning arrays in the design of small interfering RNAs -- siRNA production by in vitro transcription -- Production of siRNAs with the application of deoxyribozymes -- Production of siRNA in vitro by enzymatic digestion of double-stranded RNA -- Plasmid-mediated intracellular expression of siRNAs -- Lentiviral vector-mediated delivery of si/shRNA -- Exogenous siRNA delivery: protocols for optimizing delivery to cells -- RNAi in Drosophila cell cultures -- RNAi in Caenorhabditis elegans -- Delivery of RNAi reagents in C.

Extra resources for Language Technology for Cultural Heritage: Selected Papers from the LaTeCH Workshop Series

Sample text


As a consequence, we not only consider the output of FineReader (Recensione-»,) and OmniPage (Rccensionen), but also the combinations Rccensione-», and Recensionen. In this way, the correct word form Recensionen can be constructed from two wrong alternatives. Our decision procedure is based on a unigram language model trained on the latest release of the Text+Berg corpus. The choice to bootstrap the decision procedure with noisy data generated by Abbyy FineReader bears the potential risk of skewing the selection in Abbyy FineReader’s favor.
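The combination step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the two OCR outputs are aligned at character level with Python's `difflib`, that every differing span contributes two alternatives, and that the unigram model uses add-one smoothing over a toy word-count table. The function names and the counts are hypothetical.

```python
import math
from collections import Counter
from difflib import SequenceMatcher
from itertools import product


def candidate_combinations(a: str, b: str) -> set[str]:
    """Return every string built by choosing, for each span where the two
    OCR outputs differ, either the first or the second system's reading."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    slots = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            slots.append([a[i1:i2]])            # both systems agree here
        else:
            slots.append([a[i1:i2], b[j1:j2]])  # keep both readings
    return {"".join(parts) for parts in product(*slots)}


def unigram_logprob(candidate: str, counts: Counter, total: int) -> float:
    """Log-probability under a unigram model with add-one smoothing
    (the smoothing scheme is an assumption, not taken from the text)."""
    vocab = len(counts) + 1  # one extra slot for unknown words
    return sum(
        math.log((counts[token] + 1) / (total + vocab))
        for token in candidate.split()
    )


def choose(a: str, b: str, counts: Counter) -> str:
    """Pick the combination with the highest unigram log-probability."""
    total = sum(counts.values())
    return max(
        candidate_combinations(a, b),
        key=lambda c: unigram_logprob(c, counts, total),
    )


# Toy counts standing in for a model trained on a real corpus (hypothetical).
toy_counts = Counter({"Recensionen": 3, "und": 5})
best = choose("Recensione-»,", "Rccensionen", toy_counts)
print(best)  # Recensionen
```

With the two outputs from the example, the alignment yields two differing spans, so four candidates are scored: the two original outputs plus Rccensione-», and Recensionen.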

Punctuation marks and other special characters are thus penalized in our decision module, which we found to be an improvement. A language model approach is problematic for cases in which the alternatives are tokenized differently. Generally, alternatives with fewer tokens obtain a higher probability. We try to counter this bias with a second score that prefers alternatives with a high ratio of known words. This means that in Göschenen is preferred over inGöschenen, even if we assume that both Göschenen (the name of a village) and inGöschenen are unknown words in our language model.
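The two adjustments in this passage can be illustrated with a small sketch. The excerpt does not say how the scores are combined or how large the penalty is, so the following rests on assumptions: a fixed log-penalty per special character, add-one smoothing, and a ranking that compares the known-word ratio first and the penalized language-model score second. All names and numbers are hypothetical.

```python
import math
from collections import Counter

# Assumed penalty per punctuation mark or other special character.
SPECIAL_PENALTY = math.log(0.1)


def unigram_logprob(tokens: list[str], counts: Counter, total: int) -> float:
    """Unigram log-probability with add-one smoothing (an assumption)."""
    vocab = len(counts) + 1
    return sum(math.log((counts[t] + 1) / (total + vocab)) for t in tokens)


def special_char_penalty(text: str) -> float:
    """Negative score that grows with the number of characters that are
    neither alphanumeric nor whitespace."""
    specials = sum(1 for ch in text if not ch.isalnum() and not ch.isspace())
    return specials * SPECIAL_PENALTY


def known_word_ratio(tokens: list[str], counts: Counter) -> float:
    """Share of tokens that occur in the language model's vocabulary."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if counts[t] > 0) / len(tokens)


def rank_key(candidate: str, counts: Counter, total: int) -> tuple[float, float]:
    """Sort key: known-word ratio first, penalized LM score second.
    This ordering is one possible combination, chosen for illustration."""
    tokens = candidate.split()
    lm_score = unigram_logprob(tokens, counts, total) + special_char_penalty(candidate)
    return (known_word_ratio(tokens, counts), lm_score)


# Toy model in which "in" is known but neither place-name form is.
toy_counts = Counter({"in": 50, "der": 30})
toy_total = sum(toy_counts.values())
alternatives = ["in Göschenen", "inGöschenen"]

by_lm_only = max(alternatives, key=lambda c: unigram_logprob(c.split(), toy_counts, toy_total))
by_both = max(alternatives, key=lambda c: rank_key(c, toy_counts, toy_total))
print(by_lm_only)  # inGöschenen
print(by_both)     # in Göschenen
```

In this toy setting the plain unigram score favours the single-token alternative, while the known-word ratio (1 of 2 tokens known versus 0 of 1) reverses the decision, which is the behaviour the passage describes.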
