Reply to thread

Message: <blockquote data-quote="MJGraf" data-source="post: 1033050" data-attributes="member: 17886">Hi everyone,breese has given me so much information that I can hardly cope with it <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" class="smilie smilie--sprite smilie--sprite8" alt=":D" title="Big Grin    :D" loading="lazy" data-shortname=":D" /> Thanks for that - very interesting results.But before I get into a deep analysis of this (real live is quite stressful at the moment) I wanted to report about some more things I noticed while playing with SQLite.First of all, I have implemented a VERY verbose additional logging possibility. It hooks into the System.Data.SQLite logging facility and creates a new SQLiteDebug.log file. I started an import while I was logging and stopped it 2 minutes later because the log file was already > 250MB. So this is really just for debugging purposes. I therefore added a new setting (EnableTraceLogging) which is off by default. The logger reports all errors that may come directly from SQLite (i.e. the C native library) as well as all information logged by System.Data.SQLite (i.e. the C# wrapper). This is interesting because you not only see the commands sent by MP2, but also the commands that are transparently generated by System.Data.SQLite (e.g. if you dispose a SQLiteTransaction object, you can see that this triggers a "ROLLBACK" to the database; or instantiating a new SQLiteTransaction object issues a "BEGIN IMMEDIATE" to the database because we use isolationmode.serializable, etc.). So this is very interesting information and I even plan to make it more verbose. In SQLite you can start every "regular" SQL command with "EXPLAIN QUERY PLAN ...", which does not actually execute the following query, but just tells you what it would do exactly, if your query was executed. This information is even more interesting (see below) and I will try to get that into the SQLiteDebug.log as well.What I did with that information so far is the following: As you can read some pages before in this thread, our ReadTest 2 (i.e. just entering the audio section in MP2 Client) has a very long QueryTime. Actually the QueryTime is a big part of the whole ReadTime, which is not the case for all other queries (there reading takes much more time than the query itself). I wanted to find out why that's the case.As a recap, the SQL query, which is executed when you enter the audio section is the following:[CODE]SELECT  COUNT(  V.C) C,  V.A0FROM (  SELECT  DISTINCT T1.MEDIA_ITEM_ID C,  T0.ALBUM A0FROM M_PROVIDERRESOURCE T1  INNER JOIN M_MEDIAITEM T2 ON T2.MEDIA_ITEM_ID = T1.MEDIA_ITEM_ID  INNER JOIN M_AUDIOITEM T0 ON T0.MEDIA_ITEM_ID = T1.MEDIA_ITEM_IDWHERE  (  T1.SYSTEM_ID = @V0  OR T1.SYSTEM_ID = @V1)) VGROUP BY  V.A0-------------------------------------------------------"V0" [String]: '62133952-2a3d-4e55-8b9c-cc3fc52b4d14'"V1" [String]: 'cbe4e0ac-a21d-4d18-be63-7bf6b27cb13a'[/CODE](actually this exact query is executed twice - which I already discovered as a likely bug, still have to put this into Jira so that it is not forgotten...)Now I put "EXPLAIN QUERY PLAN" in front of this query and got the following result:[CODE]1|0|0|SEARCH TABLE M_PROVIDERRESOURCE AS T1 USING INDEX IDX_0 (SYSTEM_ID=?)1|0|0|EXECUTE LIST SUBQUERY 21|1|1|SEARCH TABLE M_MEDIAITEM AS T2 USING COVERING INDEX IDX_6 (MEDIA_ITEM_ID=?)1|2|2|SEARCH TABLE M_AUDIOITEM AS T0 USING INDEX IDX_25 (MEDIA_ITEM_ID=?)1|0|0|USE TEMP B-TREE FOR DISTINCT0|0|0|SCAN SUBQUERY 1 AS V0|0|0|USE TEMP B-TREE FOR GROUP BY[/CODE]Those of you, who are interested enough, can read what that means in detail <a href="http://www.sqlite.org/eqp.html" target="_blank">here</a>.The important part now is "USE TEMP B-TREE...". This means not less than SQLite creates a temporary index to perform the "DISTINCT" operation and an additional temporary index to perform the "GROUP BY" operation. This not only explains why breese is seeing a lot of temporary files (these temporary indices are actually written to temporary files by SQLite), it also explains why this query takes so long, because creating a temporary index is extremely expensive.But why does it create those indices? To be honest, I don't really know...First of all, the "DISTINCT" keyword in the context of this query doesn't make sense in my opinion. We want all the distinct rows from "T1.MEDIA_ITEM_ID C". But I hope there are not multiple songs WITH THE SAME MEDIA_ITEM_ID in one album. The MEDIA_ITEM_ID should be unique so that to my understanding the results of this query are already unique without using DISTINCT. If we leave out the "DISTINCT" keyword, we should get exactly the same query result, but the query plan is reduced by "1|0|0|USE TEMP B-TREE FOR DISTINCT".Secondly, I don't know why it uses a temp index for GROUP BY V.A0. V.A0 is just our album name - and there is already an index created by MP2 for the album name. Most likely, SQLite doesn't recognize this, because of the subquery. Maybe this changes if I perform an "ANALYZE" on the database, which is said to analyze the database structure and thereby improve the query optimizer of SQLite. We will see...Now what I want to say is: This is again outside the scope of the SQLiteDatabase plugin. But it shows that there are still a lot of performance improvement possibilities in the MediaLibrary system of MP2 itself. Let's see if I have a look at this after SQLiteDatabase v0.08 is ready...</blockquote>

Verification