To find the documents that match a given query, a search engine iterates over inverted indexes and intersects the results. An inverted index is usually built over the terms found in a document; a term can be a word, its transformation, its normal form, a sentence and so on.
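As a rough illustration of the lookup described above, here is a minimal sketch of an inverted index with an AND query. It assumes terms are simply lowercased whitespace-separated words (real engines normalize terms much more carefully), and the names `build_inverted_index` and `search` are made up for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map every term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: a term is a lowercased whitespace-separated word.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Fetch the posting set of every query term (empty if the term is unknown).
    postings = [index.get(term, set()) for term in terms]
    # Intersect starting from the smallest set to keep intermediate results small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


docs = {
    1: "realtime search engine",
    2: "batch search index",
    3: "realtime stream of documents",
}
index = build_inverted_index(docs)
print(search(index, "realtime search"))
```

Only document 1 contains both "realtime" and "search", so the intersection of the two posting sets leaves just that id.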
This indexing process is rather costly and slow, but usually that is not a problem: the documents in a set do not change frequently. Indexing is performed in batches, for example by iterating over all documents downloaded since the last indexing batch.
Realtime search in this context is basically a reduction of the batch size, down to 1 document per batch in the extreme case.
But there is a completely different search pattern, where a realtime flow of documents has to be matched against a set of queries. The most common example is live Twitter search.
They use Kafka to store the streams of documents and queries, plus a tricky optimization so that not every query has to be run against every new document. This is achieved by building a query index which selects only those queries that might potentially match a given document, and then running only those queries.
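The query index idea can be sketched roughly as follows. This is not the actual implementation used in production, only an illustration under simple assumptions: every query is a conjunction of lowercased words, and Kafka is left out entirely. The class `QueryIndex` and its methods are hypothetical names.

```python
from collections import defaultdict


class QueryIndex:
    """Index stored queries so that a new document is checked only against
    the queries that could possibly match it."""

    def __init__(self):
        self.queries = {}                # query id -> set of required terms
        self.by_term = defaultdict(set)  # term -> ids of queries filed under it

    def add_query(self, query_id, text):
        terms = frozenset(text.lower().split())
        if not terms:
            return
        self.queries[query_id] = terms
        # An AND query matches only documents that contain all of its terms,
        # so filing it under a single term (here the alphabetically first)
        # is enough to never miss a match.
        self.by_term[min(terms)].add(query_id)

    def candidates(self, doc_terms):
        """Ids of the queries that might match a document with these terms."""
        found = set()
        for term in doc_terms:
            found |= self.by_term.get(term, set())
        return found

    def match(self, text):
        """Run only the candidate queries against the document."""
        doc_terms = set(text.lower().split())
        return {
            query_id
            for query_id in self.candidates(doc_terms)
            if self.queries[query_id] <= doc_terms
        }


query_index = QueryIndex()
query_index.add_query("q1", "kafka stream")
query_index.add_query("q2", "realtime search")
query_index.add_query("q3", "kafka search")
print(query_index.match("Kafka stores a stream of documents"))
```

For this document only "q1" and "q3" become candidates, because both are filed under "kafka"; "q2" is never evaluated, and "q3" is rejected in the verification step since the document lacks "search".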