On the knowledge extraction problem

I thought of a weighted graph of the dictionary learned by the system, where each node corresponds to a word, each link's value corresponds to the weight of the corresponding phrase, and the number of links going out of each node shows how frequently the given word was used in the learned texts.
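
To make the idea a bit more concrete, here is a minimal Python sketch of such a graph. It assumes that a "phrase" is simply a pair of adjacent words and that a link's weight is the number of times the pair was seen; the names build_word_graph and out_degree are made up for illustration and are not part of any existing system.

```python
from collections import defaultdict

def build_word_graph(sentences):
    """Return {word: {next_word: weight}}; weight counts adjacent word pairs."""
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        # Assumption: a "phrase" is a pair of neighbouring words.
        for current, following in zip(words, words[1:]):
            graph[current][following] += 1
    return graph

def out_degree(graph, word):
    """Number of distinct links going out of a word's node."""
    return len(graph.get(word, {}))

texts = ["fred saw the plane", "fred saw the mountains", "the plane flew"]
graph = build_word_graph(texts)
```

Here graph["fred"]["saw"] is the weight of the phrase "fred saw", and out_degree(graph, "the") is the number of different words that followed "the" in the learned texts.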

The same graph can be built out of a new input sentence and the already learnt data (as if from memory), and it will correspond to the meaning of the phrase for the given system. It is still quite rough in my head, but I have a direction to think in.

What I'm still a bit stuck on is response generation. Suppose I have determined the meaning of a question sentence. I expect the response to be similar to what search engines output, but it should be quite different for questions like "where to swim on vacation in Moscow" or "do you like Stanislav Lem" and other similar kinds of questions.

Having a local system memory means a subjective decision on each question, but I have trouble imagining how the system should respond: how to program its own decision on what to answer, and how, when there is some kind of understanding of what was said.

Going to sleep. They say the brain thinks about a problem during the short sleep stage, so maybe I will be enlightened. I think it's time to create an AI tag for this kind of idea :)
And it absolutely does not mean I stopped my other projects!

Imagine who's asking this question?
Is this question "really a question"?
What kind of data do you have to match against?
What is left that you can never know because it does not reside in your system?

The short answer is to create a kind of mind map to understand relations. You only have entities like classic RDF structures coming from the web. You have a grammar part to analyse sentences. But!
You haven't got a "human-like" connection between those entities. There should be a meaningful intersection to refer to.
I recommend the term "search mind maps" to bring all of those together.

Parts:
1. "do you like Stanislav Lem" -> get web search results.
2. English grammar check -> lives on the system as a tree of rules.
3. search mind map -> has metadata definitions; it only holds abstractions like question types, behavioral definitions, etc., which we really need to discuss and evolve.
4. Matching is now done against this mind map, not only against basic words.

Until we can use quantum systems, that is what I could figure out for now.
Peace
Kunthar

I wanted to have a kind of RDF graph but on a larger scale than a single word, i.e. one where knowledge itself is 'located' in sentences rather than words, and thus to operate with a weighted graph of relations of one term to another.

I suppose that's it - not mine, but some other creation.

NLP based on a grammatical rules engine, while an interesting toy, is essentially a dead-end when it comes to developing an approach to cognition. Language is a complex system that has evolved over time and continues to evolve each and every day. Grammar is an artificial construct that we have developed as a vehicle to describe language but describing something doesn't mean you understand it or that it can be used to extract knowledge or understanding from what it attempts to describe.

Take the example from Cyc (a link you were given earlier in response to another post):
* Fred saw the plane flying over Zurich.
* Fred saw the mountains flying over Zurich.

Grammar itself will help develop a weighted tree of the sentences and you'll be able to describe the scene - but the system will lack enough reference to be able to respond. In such a situation what is the proper response?

To answer we need a reference model - which luckily we have all around us every day - people. What do people do when they encounter a phrase and don't have enough information to process it? They ask a question. What question would they ask? Who's Fred? What's a plane? What's Zurich? Or would they laugh out loud as they exclaim (and picture) the mountains flying? (in itself a valid hypothesis)

Knowledge is obtained from the answer to the question - as it provides an addendum - a relationship between the phrase, the question and the answer. Additionally the question itself often gets corrected - providing a short-circuit feedback loop to the knowledge acquisition process. The description of the answer also provides information about the relationship of items in the phrase to other information stored within the system.

What's Zurich? Zurich is the name of a city in a country called Switzerland.

(assuming that there is some information about what a plane is or that there is some relationship that interprets planes as machines like a car)
What color is the planes? Planes are all shapes and colors but this plane is bright green.
(note in this example the question indicates the singular but uses the plural - which is corrected in the answer)

The question provides insight into the internal state of the system we are interacting with (be it a computer program, a child we're reading a story to or a colleague we are interacting with). Inherent in any interaction is feedback, correction, elucidation of terms and phrases to assist understanding with those we are interacting with. Often it happens in a subconscious way and tends to be in the style of continuous correcting feedback (the same approach we use when we reach down to pick up an object off of a surface).

A system needs to adapt & correct, to provide feedback (both to itself and with the other party it is interacting with) in a way that's more than just updating state - but that also affects the very rules that make up the system itself. This, however, is where many people tend to start going wrong. A common pitfall is that the rules are considered to be the weightings between nodes of information or its relationships. This however means that the underlying reference system (often implemented as grammar rules) rarely changes - which in essence lobotomizes the system. It's an indicator that you've put too much forward knowledge into the system.

Take how children learn - not the mechanics but the approach that's used, and not just for language or understanding (which is what we are trying to replicate when we implement the system) but for everything they do. Nature, bless her cotton socks, is frugal with how she expends energy - so she reuses as much as possible (in essence cutting things down to their most common denominator). You'll see the same approach being used for walking, talking, breathing, looking and following objects - in everything that we see, do or think. Over time the system specializes domains of knowledge - further compartmentalizing - but also reusing that which has been learned and found to be valid in the domain. Which in turn allows for further specialization and compartmentalization.

Anyway - I've rambled on a little more than I initially intended - this is a subject that holds great interest for me :)

This was my main question - how the system is supposed to act when it encounters some new knowledge.
Or actually how the human brain decides how to act.

I plan to develop a weighted grammar tree and thus allow the computer to extract knowledge from the provided data. While people have many sources of information (life experience, digital information, talks and so on), a machine has only digital 'input', which may be enough to form some kind of memory.

But I still do not know exactly how to program the action model. So far I have ended up with a solution where the weighted graph of the received information 'highlights' some inner tree in the long-term memory storage, and the system randomly selects a direction from the most active node. Or, if this is new knowledge (like mountains flying over Zurich), it becomes the main point of the reply action.

So in the example above the system will highlight nodes related to Fred, flying planes and Zurich (the first sentence, without mountains). Depending on its previous experience, the system will select one of the objects to be the main subject of the reply action. For example it can be Fred, and the reply can be something like 'Fred likes to go fishing', which in turn can be tied to the next extracted object, like 'There is a good fishing lake in Zurich', and so on...
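
A rough Python sketch of this selection step could look like the following. Everything here is an assumption made for illustration: the MEMORY contents and their weights are invented, a node's "activity" is taken to be the sum of its fact weights, and the random direction is a weighted random choice among the facts of the most active node.

```python
import random

# Hypothetical long-term memory: term -> {remembered fact: weight}.
MEMORY = {
    "fred": {"Fred likes to go fishing": 3, "Fred lives in Bern": 1},
    "zurich": {"There is a good fishing lake in Zurich": 2},
    "plane": {"Planes are machines that fly": 1},
}

def highlight(sentence, memory):
    """Return {term: activity} for every input word that memory knows."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return {w: sum(memory[w].values()) for w in words if w in memory}

def choose_reply(sentence, memory, rng=random):
    """Pick the most active node, then a weighted random fact from it."""
    active = highlight(sentence, memory)
    if not active:
        return None  # nothing was highlighted, so there is nothing to say
    subject = max(active, key=active.get)
    facts = memory[subject]
    return rng.choices(list(facts), weights=list(facts.values()), k=1)[0]

reply = choose_reply("Fred saw the plane flying over Zurich", MEMORY)
```

With this toy memory "fred" is the most active node, so the reply is one of the two remembered facts about Fred, with the fishing one being the more likely.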

That's how I want to test it, but the results of course may turn out very different.

How the system should respond and how the brain works are two very different questions :)
There's an old system built around query/response which you can read about here http://en.wikipedia.org/wiki/ELIZA

From what it appears you're trying to achieve - this should point you in the right direction. It is nowhere near cognition however but it's a handy road to explore. The true problem is somewhat different - which I'll explain in a moment.

It's also worth looking into the SOAR project - a project that has the right approach to simulating cognitive processing - the project home page is here: http://sitemaker.umich.edu/soar/home

While it's a hard concept to get around - don't think about human input as being different from digital input. This is what I alluded to previously - the pattern, reduction & relationship mechanisms used are essentially the same.

There are several underlying problems with cognition which are different from what most expect.

The primary issue is due to perception where too much emphasis is attributed to the human senses (primarily sight and sound) - which as I've mentioned before - are just inputs. As you'll know from physics - you'll often see simple patterns repeated in many different fields - it's unlikely that cognitive processes will be any different when dealing with sound/sight and thought.

The next issue is that many fall foul of attempting to describe the system in terms they can understand - a natural approach but essentially it boils down to the pushing of grammar parsers and hand lexers with too much forward weighting to identify external grammar (essentially pre-weighting the lexers with formal grammar). An approach that can produce interesting results but isn't cognition and fails as an end game for achieving it. Essentially this is the approach used in current machine translation processes in its various forms.

The key fundamental issue is much simpler and related to issues around: pattern, reduction & relationship. An area that had some activity a while ago in various forms (cellular networks, etc) but fell to the wayside generally due to poor conceptual reference frameworks and the over-emphasis on modelling approaches used in nature (neural networks).

Now comes the time of definitions - a vehicle to ensure we're on the same page :)

Pattern:
Cognitive processes thrive on them - and it's one of the main drivers behind how it perceives, processes and responds to information. There's a constant search to find similarities between what is perceived and what is known. It's a fuzzy matching system that is rewarded, in the sense that it promotes change or adaptation, as much by differences as it is with finding similarities. When thinking about similarities - a handy term is to think about something being true or false. Don't confuse true/false as the general definitions of the terms - it's more about the sense of confidence. If something has a high confidence of being valid then it is true. The threshold of confidence is something that evolves and adapts within the cognition over time (essentially as a result of experience).
The development of patterns is both external (due to an external perception or input) and internal. To avoid turning this comment into something massive (and boring you :) ) - think along the lines of the human cognitive process and the subconscious or dreams.
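
As a toy illustration of "true" meaning "confidence above a threshold", here is a small Python sketch. It is only one possible reading of the above: string similarity from the standard library's difflib stands in for whatever similarity measure a real system would use, the threshold is fixed rather than evolving with experience, and the function names are invented.

```python
from difflib import SequenceMatcher

def confidence(perceived, known):
    """Similarity between 0.0 and 1.0; a stand-in for pattern confidence."""
    return SequenceMatcher(None, perceived.lower(), known.lower()).ratio()

def best_match(perceived, known_patterns, threshold=0.8):
    """Return (pattern, score) if the best score reaches the threshold,
    otherwise (None, score). Assumes known_patterns is not empty."""
    score, pattern = max((confidence(perceived, k), k) for k in known_patterns)
    if score >= threshold:
        return pattern, score
    return None, score

known = ["Zurich", "Fred", "plane"]
match, score = best_match("zurich", known)
```

A perceived item that falls below the threshold is the interesting case: it is the difference, not the similarity, that would prompt a question or an adaptation.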

Reduction:
Reduction happens at several key stages - essentially it's when a domain of experience breaches a threshold. It's a way of reducing the processing required to a more automatic response. Think along the lines of short-circuit expressions. It's a fundamental part of the cognitive process. From a human cognitive perspective you have probably seen it in your climbing and in your learning of the trumpet. We often express it as "having the knack" or "getting the hang" of something.
It's important for 2 reasons: a) it means it has gained knowledge about a domain; b) it allows the cognitive process to further explore a domain. While Reduction is a desirable end-game - it is not The End from a cognitive process perspective. The meta information for this node of Reduction combines again and again with Pattern and Relationship allowing the process to reuse both the knowledge itself but more importantly the lessons learned when achieving reduction.
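
In programming terms, one loose analogy for Reduction is a slow, deliberate routine that gets replaced by a stored reflex once it has been exercised often enough. The Python sketch below is only that analogy, under the assumption that "breaching a threshold" means solving the same problem a fixed number of times; the class name and the threshold value are invented.

```python
from collections import Counter

class ReducingSolver:
    """Solve deliberately at first; answer from reflex after enough practice."""

    def __init__(self, slow_solve, threshold=3):
        self.slow_solve = slow_solve  # the expensive, deliberate process
        self.threshold = threshold    # practice needed before reduction
        self.seen = Counter()         # how often each problem was worked through
        self.reflex = {}              # reduced (automatic) responses

    def solve(self, problem):
        if problem in self.reflex:
            return self.reflex[problem], "reflex"
        answer = self.slow_solve(problem)
        self.seen[problem] += 1
        if self.seen[problem] >= self.threshold:
            self.reflex[problem] = answer  # short-circuit from now on
        return answer, "deliberate"

solver = ReducingSolver(lambda x: x * 2, threshold=2)
history = [solver.solve(3) for _ in range(3)]
```

The sketch leaves out the more important half of the description above, namely reusing the lessons learned on the way to reduction in other domains.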

Relationship:
Relationship is really a meta process for drawing together apparently unrelated information into something that's cohesive and is likely to either help with identifying patterns or for bringing about Reduction. Relationship at first looks very similar to Pattern but differs in its ability to ask itself "what if" and by being able to adjust things (facts, perception, knowledge, Pattern, Reduction and versions of these [versions are actually quite important]) to suit the avenue that is being explored. When expressed in human cognitive terms think of Relationship as the subconscious, dreams or the unfolding of events in thought. The unfolding of events is an example of versions. Essentially Relationship is a simulation that allows the testing of something.

I've once again written far more than I expected - and I've still a few things I'd like to touch on - but I'll save them for another post and hope it's not been too dull for you :)

ELIZA is not really what I'm interested in - it is a simple program which rephrases what it gets as input; there is no knowledge and no opinion of its own created out of some internal state. And while it can serve my particular purpose of the mailing list bot, it is not really what I want to implement.

SOAR looks much more interesting; I downloaded some documentation about its theory of cognitive operation. Although it is a rather short description of the idea, I hope to get something interesting out of it.

As for cognition itself, my main confusion currently concerns the way an action starts. The brain continuously receives information, which may or may not fire up patterns that look related or unrelated at first view. But how does it decide that it is time to say something, to ask, to move or to take some other action?

Taking a digital application - how can it determine that from this input (which may be about some new fantasy book or elections in Iran) this particular action should be taken?
If the input implies a question, then the answer is quite obvious - make a reply based on existing knowledge. But what if it is a definitive statement or just a plain set of data?

In the example 'Fred saw the mountains flying over Zurich', what direction should the system take to reply to this sentence? The most obvious is to highlight the unknown facts: I do not know about flying mountains, so I will take this direction and either ask how it is possible (if I'm 3 years old) or laugh at it and say that this is nonsense.

But for the more common 'Fred saw the plane flying over Zurich' digital input, I do not know what to answer if I already know Fred, have seen planes and know Zurich.

Looks like I answered my own question - if the provided input implies a question, then answer it; otherwise highlight the unknown facts in the data and concentrate on them. Failing that, accept and enjoy :)
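
Written down as code, this rule is only a few lines. The Python sketch below is a deliberate simplification: it treats "unknown facts" as unknown words (so it would not notice that mountains cannot fly if both words were known separately), and its question test is a crude heuristic. The names decide_action and QUESTION_WORDS are made up.

```python
QUESTION_WORDS = {"who", "what", "where", "when", "why", "how",
                  "do", "does", "is", "are"}

def decide_action(sentence, known_terms):
    """Return (action, details) following: answer, else ask, else accept."""
    words = [w.strip(".,!?'\"").lower() for w in sentence.split()]
    words = [w for w in words if w]
    # Crude assumption: a question ends with "?" or starts with a question word.
    if sentence.strip().endswith("?") or (words and words[0] in QUESTION_WORDS):
        return ("answer", [])
    unknown = [w for w in words if w not in known_terms]
    if unknown:
        return ("ask_about", unknown)  # concentrate on what is not known
    return ("accept", [])              # nothing new: accept and enjoy

known = {"fred", "saw", "the", "plane", "flying", "over", "zurich"}
action = decide_action("Fred saw the mountains flying over Zurich.", known)
```

For the plane sentence the same call returns ("accept", []), which matches the conclusion that not every input needs a reply.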

Eliza brings a couple of things to the table that other systems don't - mostly as it allows a way to quickly load some structure into systems - which then allow the running of test data against those structures. It's often a way to short-circuit starting from 0 knowledge (new born infant) and to boot-strap yourself a 3 year old. A simple example is extracting sentences from a paragraph. It can be used as a pre-parser or a post parser or as a way of rephrasing data. Rephrasing is a handy tool for testing validity.

It provides a vehicle for asking questions but also provides an approach to determining the relevance of information within the available context. The term available context was used as it's often interesting to limit the available information to cognitive processes.

You often ask questions about statements you encounter: Who, What, Where, When, Why

You'll also have an operational mode that you'll switch between: operational modes help to define how a cognitive process should approach the problem.

In the human model - think along the lines of how your state of mind changes based on the situational aspects of the encounter. The context of the situation can be external, reflective or constructed.

External contexts are where we are expected to respond - maybe not to all input - but to some. Often these situations are where an action or consensus is required.
Reflective contexts are where information is absorbed and processed - generally to bring out understanding or knowledge but also when a pattern is reverse fit - not proving a fact but re-assimilating input so that it correlates.
Constructed contexts are the "what if" situations & problem solving. Similar to the reflective context but more about adjusting previous input to test fitness to something new while attempting to maintain its validity to other knowledge.

You'll often start in a reflective context as you assimilate information and then move into a constructed context to maximise knowledge domains. Then you'll often edge into the external context - while running reflective contexts in the background. Periodically you'll create constructed contexts to boot-strap knowledge domains and to learn from how knowledge domains are created (which in turn will tune how the reflective domains obtain information).

Essentially this is a lot of talk for saying that you don't always need to provide an output. :)

Now I mentioned at the beginning that it's often interesting to limit the information available to a context - often it's not only interesting but also important. The available context is the set of prior knowledge (and the rules, or the approach, for applying relationships between the information and its surrounding knowledge).

If all knowledge is available to an available context and the same approach is used for processing that information - then it's hard for a system to determine relevance or importance of which facts to extract from data. In essence the system can't see the wood for the trees.

Think about how you tackle a problem you encounter - you start with one approach based on your experience (so you're selecting and limiting the tools you're going to apply to deal with the situation) and based on how the interaction with the situation goes - you'll adjust. Sometimes you'll find that you adjust to something very basic (keep it simple stupid or one step at a time) - at others you'll employ more complex toolsets.

The Eliza approach can be used not just as a processing engine - but also as a way of allowing cognitive systems to switch or activate the contexts I mentioned earlier. It's also a handy pre-parser for input into SOAR.

One of the reasons I visited your site was to understand more about POHMELFS, Elliptics and your DST implementation. I've been looking for a while for a parallel distributed storage mechanism that is fast and supports a decent approach to versioning, for an NLP & MT approach. Distribution and parallelism are required as I implement a virtualised agent approach which allows me to run modified instances of knowledge domains and/or rules to create dynamic contexts. Versioning is important as it allows working with information from earlier time periods, replaying the formation of rules and assumptions and greatly helps to roll back processing should the current decision tree appear fruitless. In human cognitive terms these act as subconscious processing domains.

I plan to use a regexp parser to extract common knowledge. If there is enough time I would definitely like to develop a proper grammar-based analyzer, but this may require more time. The ELIZA approach does not 'scale', i.e. it will not grow into what I would really like to develop. So I will play with a weighted graph of words as long-term memory and with analysis of the input based on that knowledge.
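
A first regexp-based extractor could be as small as the Python sketch below. The single pattern, the "is_a" relation name and the extract_facts function are assumptions made up for illustration; a real parser would need many more patterns and would still miss most sentences.

```python
import re

# One hypothetical pattern: "<Name> is a/an/the name of a <noun>".
IS_A = re.compile(r"\b([A-Z][a-z]+) is (?:a|an|the name of a) ([a-z]+)")

def extract_facts(text):
    """Return (subject, relation, object) triples found by the pattern."""
    return [(subject, "is_a", obj) for subject, obj in IS_A.findall(text)]

facts = extract_facts(
    "Zurich is the name of a city in a country called Switzerland."
)
```

Triples like these could then be added as weighted links to the long-term memory graph.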

oops -forgot to login when I posted the above :)

GF

Maybe you'd be interested in GF: http://www.cs.chalmers.se/Cs/Research/Language-technology/GF/

Thanks for the link, I will study it.

I also thought about a slightly different thing: not only how to create a phrase, but what the system should say of its own accord. How could the system determine that it wants to answer or to express its own opinion on the problem?