Global search finds Objects that no longer contain the search string

What’s The Bug?

I have a Page that contains a very large amount of text.
The text was once even larger, but I deleted a lot of it after moving that content into other Objects.

The strange thing:
The global search still shows this specific Page in its results when I search for a certain word that was once in it.
But this word is in fact no longer in this Page; it was moved into other Objects weeks ago.
(These other Objects also appear in the search results.)

At first I suspected that the search function finds terms in the Page’s history.
But I couldn’t reproduce this behavior in a test Space (tested with a much shorter Page).

As already discussed with @Razor in the Detective Bureau on May 29, the reason is that Anytype uses a “text index” for the search instead of the actual content of the Pages.
According to Razor “it’s middleware side”.

Nothing has happened since then, which is why I am writing this bug report now.

How To Reproduce It

  1. Repeatedly copy a lot of content into a Page.
  2. Work with the Page in different ways.
  3. Remove parts of that content.
  4. Search globally for a term that no longer exists in this Page.

If the search function shows this Page in its results, you’ve reproduced the bug.
It doesn’t happen in all cases. It seems to be necessary to work with the Page for a while, making some edits and so on.
But once the error occurs, it doesn’t matter what you do with that Page afterwards: the search will keep finding it forever.

That means that the “text index” is never updated; from then on it seemingly contains a shadow copy of everything that was once in this Page.
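A minimal sketch of how such a stale index could arise. This is only an illustration of the suspected behavior, not Anytype’s actual middleware code; the class `AppendOnlyIndex` and the page id `page-1` are hypothetical names.

```python
from collections import defaultdict


class AppendOnlyIndex:
    """Hypothetical inverted index that only ever adds postings."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of page ids

    def index_page(self, page_id, text):
        # Flaw being illustrated: terms from the previous version of the
        # page are never removed before the new ones are added.
        for term in text.lower().split():
            self.postings[term].add(page_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


index = AppendOnlyIndex()
index.index_page("page-1", "meeting notes with secret password")
index.index_page("page-1", "meeting notes")  # the user deleted three words

stale_hits = index.search("password")
print(stale_hits)  # ['page-1'], although the page no longer contains the word
```

In this sketch the Page stays in the results for every term it has ever contained, which matches what I observe.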

The Expected Behavior

In my opinion it is a bad concept to use an index that isn’t 100 percent in sync with the real content.
A user clearly expects that the search function only finds Objects that actually contain the search string.
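One possible way to keep an index in sync, sketched under the assumption that the index remembers which terms it stored for each Page. The names `SyncedIndex`, `page_terms`, `page-1`, and `page-2` are hypothetical; the real middleware may work quite differently.

```python
from collections import defaultdict


class SyncedIndex:
    """Hypothetical inverted index that drops postings for removed terms."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of page ids
        self.page_terms = {}              # page id -> terms currently indexed

    def index_page(self, page_id, text):
        new_terms = set(text.lower().split())
        old_terms = self.page_terms.get(page_id, set())
        # Remove the page from every term it no longer contains.
        for term in old_terms - new_terms:
            self.postings[term].discard(page_id)
            if not self.postings[term]:
                del self.postings[term]
        # Add the page to every term that is new in this version.
        for term in new_terms - old_terms:
            self.postings[term].add(page_id)
        self.page_terms[page_id] = new_terms

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


synced = SyncedIndex()
synced.index_page("page-1", "meeting notes with secret password")
synced.index_page("page-2", "secret password")  # content moved here
synced.index_page("page-1", "meeting notes")    # and deleted from page-1

hits = synced.search("password")
print(hits)  # ['page-2']
```

Here only the Object that still contains the word is returned, which is the behavior I would expect from the global search.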

Additional Context

I even see this bug as a security threat.
If you once temporarily stored some very sensitive information in a Page but deleted it afterwards, the search function still shows part of that information in its results.
I don’t know if under certain circumstances such information can even “leak” into other (shared) Spaces, for example after exporting some content and importing it into another Space. If not (yet), it may become a relevant problem in the future if we get more advanced functions for moving content into other Spaces, or functions for linking from Space to Space etc.

See also this discussion:

Device

Desktop PC

OS

Win 10

Anytype Version

0.41.44-beta (and older versions)

Network Mode

AnySync

Technical Information

OS version: win32 x64 10.0.19045
App version: 0.41.44-beta
Build number: build on 2024-07-26 09:26:16 +0000 UTC at #880ac3d5134fe60092e57266300db2831da32e9c (dirty)
Library version: v0.35.0-rc11
Anytype Identity: AAtt6aReARByswg2CbvZhveoJEEcnydfDu7U2VAkgJALsE7D
Analytics ID: 8a514008-4f5d-40d3-970f-a0d241b63af2
Device ID: 12D3KooWGqd6JSafCCcEwvnke5JAVgBNnuR9gGQeSBQ5HdUboWtG

This report has been added to our issue tracker and received by the Development Team.

This issue has been fixed by the Development Team and will be implemented in an upcoming release.

The bug is finally gone with today’s version! :+1:

@Razor btw:
Today’s version still has the same version number as the previous beta:
v0.42.38-beta.
The technical information also shows this version number and a build number from four days ago:

OS version: win32 x64 10.0.19045
App version: 0.42.38-beta
Build number: build on 2024-10-11 13:08:06 +0000 UTC at #1f6513697af29f7e061c0ac001376b99793bcbc3 (dirty)
Library version: v0.36.0-rc8
Anytype Identity: AAtt6aReARByswg2CbvZhveoJEEcnydfDu7U2VAkgJALsE7D
Analytics ID: 8a514008-4f5d-40d3-970f-a0d241b63af2
Device ID: 12D3KooWGqd6JSafCCcEwvnke5JAVgBNnuR9gGQeSBQ5HdUboWtG

And “of course”, as often mentioned, the “What’s new” section never has entries for what is new in beta versions …

That makes it not so easy to report issues when the data in these three menu items seems to belong to an earlier version.