remko caprio



International Year of Astronomy 2009

 
iOptron SmartStar G-MC90   Norton’s Star Atlas 20th Edition

- International Year of Astronomy 2009
- Galileo at the Franklin Institute, Philadelphia
- Ältester Sternenatlas: Jupiterstationen und Mondhäuser [Oldest star atlas: Jupiter stations and lunar mansions] (faz.de)
- Sir John Ritblat Gallery: Treasures of the British Library: Astronomy
- Dunhuang Map (wikipedia.org)
- Sky & Telescope

category: Research

Literary Word Comparison

Introduction
This is one of the small research projects I am currently conducting. I do not pretend to add any scientific value to the research community in the field of Natural Language Processing (NLP); I humbly submit my efforts for the sake of my own learning. While the research remains unfinished, and until I publish it formally, I will keep this post short. As a Universal Man, a Humanist, a Renaissance Man, each individual has an obligation to question and further his or her knowledge and understanding, as far as it lies within our capacities. Learning is a tool to humble our heart, and most of all we should mistrust brave hearts.

Matt Ridley, in his book Nature via Nurture, says (as quoted by Richard Dawkins in The Mouse’s Tale chapter of The Ancestor’s Tale) that “the list of words in David Copperfield is almost the same as the list of words in The Catcher in the Rye.” Starting from this remark, I concluded that it would be an interesting project to create two diagrams. The first is a scatter plot in which the major works of literature (written in, translated into, or edited into modern English for ease of comparison) are set out by total number of words versus number of different words used. The second is a network graph that displays the relative closeness of literary works by the words they use. The first diagram is the easier one to create, so I will start with it and then move on to the network graph.
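As a rough illustration of the numbers behind both diagrams, here is a minimal Python sketch. It is not the project's parser: the function names `word_counts` and `vocabulary_overlap` are made up for this example, a "word" is assumed to be a run of ASCII letters with optional straight apostrophes inside, and the Jaccard ratio is only one possible way to measure how similar two word lists are.

```python
import re
from collections import Counter

# Assumption: a word is a run of letters, optionally joined by straight
# apostrophes (curly apostrophes would need normalizing first).
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")


def word_counts(text):
    """Return (total_words, distinct_words) for one text."""
    counts = Counter(WORD_RE.findall(text.lower()))
    return sum(counts.values()), len(counts)


def vocabulary_overlap(text_a, text_b):
    """Jaccard overlap of two vocabularies: shared words / all words."""
    vocab_a = set(WORD_RE.findall(text_a.lower()))
    vocab_b = set(WORD_RE.findall(text_b.lower()))
    union = vocab_a | vocab_b
    # Two empty texts are treated as identical (an arbitrary convention).
    return len(vocab_a & vocab_b) / len(union) if union else 1.0
```

Each work would contribute one point `(total_words, distinct_words)` to the first diagram, and the overlap between two works could serve as a first, crude edge weight in the network graph.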

In the network diagram, several pieces of information can weigh into the closeness of one point to another: number of words, word length, the number of long words versus the number of short words, and so on. I will create a list of possible factors to include in the calculation of closeness, starting from a simple calculation and growing more complex over time, based on feedback from more educated specialists.
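To make the idea of "factors" concrete, the sketch below computes a few of the candidates named above and combines them into a single distance. Everything in it is an assumption for illustration: the names `features` and `closeness_distance`, the threshold of seven letters for a "long" word, and the choice of a weighted Euclidean distance are not settled parts of the project.

```python
import math
import re

LONG_WORD_MIN = 7  # assumption: words of 7+ letters count as "long"


def features(text):
    """Compute simple, meaning-blind statistics for one text."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    total = len(words)
    long_words = sum(1 for w in words if len(w) >= LONG_WORD_MIN)
    return {
        "total_words": total,
        "distinct_words": len(set(words)),
        "mean_word_length": sum(len(w) for w in words) / total,
        "long_word_share": long_words / total,
    }


def closeness_distance(feat_a, feat_b, weights):
    """Weighted Euclidean distance over the features named in `weights`.

    Smaller values mean the two texts are closer.
    """
    return math.sqrt(
        sum(w * (feat_a[k] - feat_b[k]) ** 2 for k, w in weights.items())
    )
```

Raw counts such as `total_words` live on a much larger scale than ratios such as `long_word_share`, so they would need normalizing (or a very small weight) before being mixed into the same distance. A first experiment might use only the ratio-like features, for example `weights = {"mean_word_length": 1.0, "long_word_share": 1.0}`.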

In principle, however, I will treat the texts without semantic interpretation: as blind data, as numbers, not as complex thoughts that erupted from a spark of genius.

Planning:

  1. Background Reading
  2. Write thesis and project description
  3. Evaluate planning
  4. Write a simple parser that separates the words of a text and eliminates punctuation marks
  5. Evaluate planning
  6. Use third party tool or compare similar projects
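Step 4 of the plan could start from something like the following sketch. It is one possible shape for the parser, not the finished one: the name `parse_words` is hypothetical, and the decisions to keep apostrophes inside words, to split hyphenated words in two, and to keep digits are assumptions that may well change after the background reading.

```python
import unicodedata


def parse_words(lines):
    """Yield lowercase words from an iterable of text lines.

    Letters and digits are kept; every other character (punctuation,
    quotation marks, hyphens) is treated as a word separator, except
    apostrophes, which are kept so that "don't" stays one word.
    """
    for line in lines:
        cleaned = []
        for ch in line:
            if ch in "'\u2019":
                # Normalize straight and curly apostrophes to one form.
                cleaned.append("'")
            elif unicodedata.category(ch).startswith(("L", "N")):
                cleaned.append(ch.lower())
            else:
                cleaned.append(" ")
        for token in "".join(cleaned).split():
            # Drop apostrophes used as single quotation marks.
            token = token.strip("'")
            if token:
                yield token
```

Because the function accepts any iterable of lines, it can be fed an open file object directly, so a long novel never has to be held in memory as a single string.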

Texts:

Resources:

Lexical Data:

category: Research